Sharing recent updates and work-in-progress
I’m in the midst of updating my harvest of OCR’d text from Trove’s digitised books (more about that soon!). But amongst the items catalogued as ‘books’ is a wide assortment of ephemera, posters, advertisements, and other oddities. There’s no consistent way of identifying these items through the search interface, but because I’ve found the number of pages in each ‘book’ as part of the harvesting process, I can limit results to items with just a single digitised page, and there are more than 1,500 of them! To make it easy to explore this collection of odds and ends, I’ve downloaded all the single page images and compiled them into one big PDF with links back to their entries in Trove. Enjoy your browsing!
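If you want to try a similar slice yourself, here is a minimal sketch of the filtering step in Python. It is not the code used for the harvest; it simply assumes you have a CSV of harvested metadata with a page count for each item. The column names (`title`, `pages`, `url`), the function name `single_page_items`, and the sample rows are all made up for illustration.

```python
import csv
import io


def single_page_items(rows, pages_field="pages"):
    """Return only the rows whose page count is exactly 1."""
    selected = []
    for row in rows:
        try:
            pages = int(row[pages_field])
        except (KeyError, TypeError, ValueError):
            # Skip rows with a missing or non-numeric page count.
            continue
        if pages == 1:
            selected.append(row)
    return selected


# Made-up sample data standing in for a harvested metadata file.
# The column names are assumptions, not the real harvest's schema.
SAMPLE_CSV = """title,pages,url
Example poster,1,https://example.org/item/1
Example novel,312,https://example.org/item/2
Example handbill,1,https://example.org/item/3
Example pamphlet,,https://example.org/item/4
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
singles = single_page_items(rows)

for item in singles:
    print(f"{item['title']}: {item['url']}")
```

With a real metadata file you would swap the `io.StringIO` sample for `open("your-file.csv", newline="")`. The resulting list of titles and links is the raw material for something like a browsable PDF or a CSV of odds and ends.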
This is another example of the ways in which we can extend and enrich existing collection interfaces using simple technologies like PDFs and CSVs. We can create slices across existing categories to expose interesting features, and provide new entry points for researchers. Some other examples in the GLAM Workbench are the collection of editorial cartoons from The Bulletin, the list of Trove newspapers with non-English content, the harvest of ABC Radio National programs, and the recent collection of politicians talking about COVID. Let me know if you have any ideas for additional slices! #dhhacks