Tim Sherratt

Sharing recent updates and work-in-progress

Apr 2019

All 9,738 OCR’d text files harvested from books, pamphlets and leaflets in @TroveAustralia’s ‘book’ zone have been uploaded to @aarnet’s CloudStor for easy browsing/download. There’s also a 400MB zip file if you want the whole lot.

The harvesting method and code are available in this notebook. All this and more will be documented soon in my GLAM Workbench. #dhhacks