Tim Sherratt

Sharing recent updates and work-in-progress

glamworkbench

02 Apr 2020

The GLAM CSV Explorer has had a few updates — you can now filter by organisation, and upload your own CSV files! #GLAMWorkbench Try it live on Binder.

31 Mar 2020

Buildings might be closed, but the data is open – explore hundreds of datasets from Australian GLAM organisations!

For a couple of years I’ve been harvesting datasets created or published by Australian GLAM organisations through government data portals. I’ve just completed the latest harvest, and there’s now 369 datasets, containing 983 files, from 23 G...

11 Mar 2020

My harvest of OCRd text from @TroveAustralia digitised books, ephemera, and parliamentary papers has been updated! There’s now 19,795 text files (about 3gb) to explore! Harvesting details and links to browse/download files from Cloudstor are in the #GLAMWorkbench. #dhhacks

03 Mar 2020

I’ve added some more documentation to the Trove Newspaper Harvester page in the #GLAMWorkbench. Get your @TroveAustralia newspaper articles in bulk! #dhhacks #collectionsasdata

27 Feb 2020

New section added to the #GLAMWorkbench with examples from @Library_Vic! #slvdata #dhhacks #collectionsasdata

27 Feb 2020

More fun with @iiif_io and images from @library_vic – resize, rotate, crop and more! Try it out with this new notebook in the #GLAMWorkbench. #slvdata #dhhacks

26 Feb 2020

New #GLAMWorkbench notebook! Download images from @Library_Vic using IIIF and Handle… #dhhacks

21 Feb 2020

Want to save @TroveAustralia newspaper articles as images (that aren’t sliced up in annoying ways)? There’s an app for that in the #GLAMWorkbench. #dhhacks

17 Feb 2020

New ‘Trove images’ section added to the #GLAMWorkbench! Here you’ll find my latest Jupyter notebook harvesting data about the use of standard licences & rights statements in Trove’s picture zone. #dhhacks

14 Feb 2020

Voting in the 2019 @dhawards is now open! Go and check out all the cool #DigitalHumanities projects from around the world. And while you’re there, you might like to vote for my #GLAMWorkbench in the ‘Tools’ category!

20 Nov 2019

New #GLAMWorkbench section with examples of how to get random-ish works and newspaper articles from @TroveAustralia. #dhhacks

04 Sep 2019

The @naagovau RecordSearch section of the #GLAMWorkbench has been updated with more notebooks to help you get Australian archives data in a usable form. glam-workbench.github.io/recordsea… Useful for #twitterstorians, #ozhist, & #govhack!

25 Aug 2019

I’ve updated my harvest of OCRd text from digitised journals in @TroveAustralia. The complete dataset now includes 33,035 issues from 720 titles – about 8gb of text to explore. Details in the #GLAMWorkbench: glam-workbench.github.io/trove-jou… #dhhacks

09 Aug 2019

There’s a new section of the GLAM Workbench devoted to the National Museum of Australia collection API! Harvest @nma data, then explore it by time and place. #dhhacks

24 Jul 2019

Updates to the Trove newspapers section of GLAM Workbench – adding links to app-ified versions of some notebooks, & direct links to @mybinderteam for everything. If you work with @TroveAustralia newspapers you might find it useful.

24 Jul 2019

Download & explore 1,499,259 rows of open data from NSW State Archives Online Indexes

NSW State Archives publishes a number of detailed indexes containing data manually extracted from their records. These provide additional entry points to the records, such as a person’s name, or a place. But they also provide useful data fo...

11 Jul 2019

New in GLAM Workbench! Notebooks to harvest, index, analyse, and aggregate transcripts of speeches & interviews by Australian prime ministers. Plus links to harvested data and aggregated files. #dhhacks

11 Jul 2019

Reorganising things a little at GLAM Workbench. @statelibrarynsw gets its own section. Hansard and @datagovau GLAM datasets now under ‘Australian government’. Making some space for further additions…

17 Jun 2019

Kicked off a new GLAM Workbench repository dedicated to @SLSA with a quick notebook hack to get higher res versions of digitised photos. #dhhacks

07 Jun 2019

Recent additions to the Trove Newspapers section of the GLAM Workbench: getting images from @TroveAustralia newspaper articles, and uploading articles to @Omeka-S: glam-workbench.github.io/trove-new…

26 May 2019

More GLAM Workbench updates! More full text of Australian books! I’ve added the notebook & data from my harvest of @TroveAustralia books in the @InternetArchive. There’s metadata and text of 1,153 books to explore. #dhhacks

19 May 2019

Some overdue updates to the GLAM Workbench. First here’s details, data, and code from a harvest of GLAM datasets on @datagovau. Includes details of more than 400 CSV datasets. #dhhacks

09 May 2019

Over the last week I’ve been downloading editorial cartoons published in The Bulletin from @TroveAustralia. There’s 3,471 cartoons – at least one from every issue published between 4 Sep 1886 and 17 Sep 1952. And you can browse them all…

To make it easier to explore the images, I’ve compiled them into a series of PDFs – one PDF for each decade. The PDFs include lower resolution versions of the images together with their publication details and a link to Trove. They’re all available from Dropbox:

The complete collection of high resolution images (about 60gb in total) can be downloaded from CloudStor. The name of each image file provides useful contextual metadata. For example, the file name 19330412-2774-nla.obj-606969767-7.jpg tells you:

  • 19330412 – the cartoon was published on 12 April 1933
  • 2774 – it was published in issue number 2774
  • nla.obj-606969767 – the Trove identifier for the issue, which can be used to make a URL, e.g. [nla.gov.au/nla.obj-6...](https://nla.gov.au/nla.obj-606969767)
  • 7 – on page 7
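
If you want to pull that metadata out of the file names in bulk, something like the following might help. It's only a minimal sketch in Python, assuming every file follows the date-issue-identifier-page pattern described above; the `CartoonFile` class and `parse_cartoon_filename` function are names made up for this illustration and aren't taken from the notebook.

```python
from datetime import date, datetime
from typing import NamedTuple


class CartoonFile(NamedTuple):
    """Metadata recovered from a cartoon image file name (hypothetical helper)."""
    published: date
    issue_number: int
    trove_id: str
    page: int
    url: str


def parse_cartoon_filename(filename: str) -> CartoonFile:
    """Split a name like 19330412-2774-nla.obj-606969767-7.jpg into its parts."""
    # Drop the file extension, then split on hyphens.
    stem = filename.rsplit(".", 1)[0]
    parts = stem.split("-")
    # The Trove identifier itself contains a hyphen (nla.obj-NNN),
    # so a well-formed name yields exactly five pieces.
    if len(parts) != 5:
        raise ValueError(f"Unexpected file name format: {filename}")
    date_str, issue, id_prefix, id_number, page = parts
    trove_id = f"{id_prefix}-{id_number}"
    return CartoonFile(
        published=datetime.strptime(date_str, "%Y%m%d").date(),
        issue_number=int(issue),
        trove_id=trove_id,
        page=int(page),
        # The identifier can be appended to nla.gov.au to link to the issue.
        url=f"https://nla.gov.au/{trove_id}",
    )


info = parse_cartoon_filename("19330412-2774-nla.obj-606969767-7.jpg")
print(info.published.isoformat())  # 1933-04-12
print(info.url)  # https://nla.gov.au/nla.obj-606969767
```

Raising an error on names that don't split into five pieces is a deliberate choice here, so that any stray files get noticed rather than silently mis-parsed.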

There’s some details of the method that I used to find the cartoons in this notebook. I’ve also documented everything in the Trove Journals section of my GLAM Workbench.

Be warned – the language, images, and ideas presented in The Bulletin were often racist, anti-Semitic, and sexist. You won’t have to look far within this collection to find something offensive. This was, after all, the journal whose slogan for many years was ‘Australia for the white man’. This is our history… #dhhacks

27 Apr 2019

And now my GLAM Workbench has a ‘Trove Maps’ section to document examples and explorations using data from @TroveAustralia’s ‘map’ zone: glam-workbench.github.io/trove-map… Includes a list of 20,158 maps with high-res downloads. #dhhacks

23 Apr 2019

I’ve been busy lately harvesting LOTS of full text data from @TroveAustralia’s digitised journals – so many opportunities for research! You should be able to get to all the code & data from the new Trove journals section of my GLAM Workbench. #dhhacks