Tim Sherratt

Sharing recent updates and work-in-progress

glamworkbench

24 Jul 2019

Updates to the Trove newspapers section of GLAM Workbench – adding links to app-ified versions of some notebooks, & direct links to @mybinderteam for everything. If you work with @TroveAustralia newspapers you might find it useful.

24 Jul 2019

Download & explore 1,499,259 rows of open data from NSW State Archives Online Indexes

NSW State Archives publishes a number of detailed indexes containing data manually extracted from their records. These provide additional entry points to the records, such as a person’s name, or a place. But they also provide useful data fo...

11 Jul 2019

New in GLAM Workbench! Notebooks to harvest, index, analyse, and aggregate transcripts of speeches & interviews by Australian prime ministers. Plus links to harvested data and aggregated files. #dhhacks

11 Jul 2019

Reorganising things a little at GLAM Workbench. @statelibrarynsw gets its own section. Hansard and @datagovau GLAM datasets now under ‘Australian government’. Making some space for further additions…

17 Jun 2019

Kicked off a new GLAM Workbench repository dedicated to @SLSA with a quick notebook hack to get higher res versions of digitised photos. #dhhacks

07 Jun 2019

Recent additions to the Trove Newspapers section of the GLAM Workbench: getting images from @TroveAustralia newspaper articles, and uploading articles to @Omeka-S: glam-workbench.github.io/trove-new…

26 May 2019

More GLAM Workbench updates! More full text of Australian books! I’ve added the notebook & data from my harvest of @TroveAustralia books in the @InternetArchive. There’s metadata and text of 1,153 books to explore. #dhhacks

19 May 2019

Some overdue updates to the GLAM Workbench. First, here are details, data, and code from a harvest of GLAM datasets on @datagovau. Includes details of more than 400 CSV datasets. #dhhacks

09 May 2019

Over the last week I’ve been downloading editorial cartoons published in The Bulletin from @TroveAustralia. There are 3,471 cartoons – at least one from every issue published between 4 Sep 1886 and 17 Sep 1952. And you can browse them all…

To make it easier to explore the images, I’ve compiled them into a series of PDFs – one PDF for each decade. The PDFs include lower resolution versions of the images together with their publication details and a link to Trove. They’re all available from Dropbox.

The complete collection of high resolution images (about 60 GB in total) can be downloaded from CloudStor. The name of each image file provides useful contextual metadata. For example, the file name 19330412-2774-nla.obj-606969767-7.jpg tells you:

  • 19330412 – the cartoon was published on 12 April 1933
  • 2774 – it was published in issue number 2774
  • nla.obj-606969767 – the Trove identifier for the issue, which can be used to make a URL, e.g. https://nla.gov.au/nla.obj-606969767
  • 7 – on page 7
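
If you want to pull that metadata out of the file names programmatically, something like the following might work. This is only a minimal sketch based on the naming pattern described above; the function name `parse_cartoon_filename` and the `CartoonFile` type are hypothetical and are not taken from the GLAM Workbench notebooks. It assumes every file name follows the date-issue-identifier-page pattern of the example.

```python
from datetime import date, datetime
from typing import NamedTuple


class CartoonFile(NamedTuple):
    """Metadata recovered from a cartoon image file name (hypothetical type)."""
    published: date
    issue_number: int
    trove_id: str
    page: int
    issue_url: str


def parse_cartoon_filename(filename: str) -> CartoonFile:
    """Split a name like 19330412-2774-nla.obj-606969767-7.jpg into its parts.

    Assumes the pattern <yyyymmdd>-<issue>-<trove id>-<page>.<extension>.
    """
    # Drop the file extension (everything after the last dot).
    stem = filename.rsplit(".", 1)[0]
    # The first two hyphen-separated fields are the date and issue number.
    date_part, issue_part, rest = stem.split("-", 2)
    # The Trove identifier itself contains a hyphen, so take the page
    # number from the right-hand end instead.
    trove_id, page_part = rest.rsplit("-", 1)
    return CartoonFile(
        published=datetime.strptime(date_part, "%Y%m%d").date(),
        issue_number=int(issue_part),
        trove_id=trove_id,
        page=int(page_part),
        issue_url=f"https://nla.gov.au/{trove_id}",
    )


if __name__ == "__main__":
    info = parse_cartoon_filename("19330412-2774-nla.obj-606969767-7.jpg")
    print(info.published.isoformat(), info.issue_number, info.page)
    print(info.issue_url)
```

Splitting from both ends keeps the hyphen inside the Trove identifier intact, which a plain split on every hyphen would not.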

There are some details of the method that I used to find the cartoons in this notebook. I’ve also documented everything in the Trove Journals section of my GLAM Workbench.

Be warned – the language, images, and ideas presented in The Bulletin were often racist, anti-Semitic, and sexist. You won’t have to look far within this collection to find something offensive. This was, after all, the journal whose slogan for many years was ‘Australia for the white man’. This is our history… #dhhacks

27 Apr 2019

And now my GLAM Workbench has a ‘Trove Maps’ section to document examples and explorations using data from @TroveAustralia’s ‘map’ zone: glam-workbench.github.io/trove-map… Includes a list of 20,158 maps with high-res downloads. #dhhacks

23 Apr 2019

I’ve been busy lately harvesting LOTS of full text data from @TroveAustralia’s digitised journals – so many opportunities for research! You should be able to get to all the code & data from the new Trove journals section of my GLAM Workbench. #dhhacks

22 Apr 2019

I’ve added a section for the @TroveAustralia ‘book’ zone to the GLAM Workbench.

22 Apr 2019

All 9,738 OCRd text files harvested from books, pamphlets and leaflets in @TroveAustralia’s ‘book’ zone have been uploaded to @aarnet’s CloudStor for easy browsing/download. There’s also a 400 MB zip file if you want the whole lot.

The harvesting method and code are available in this notebook. All this and more will be documented soon in my GLAM Workbench. #dhhacks

31 Mar 2019

Train from Canberra to Melbourne booked for #VALATechCamp. I’ll be hanging around both days, so let me know if you’d like to chat about the GLAM Workbench, Jupyter, Trove data, or any of the other things I fiddle with…

24 Feb 2019

I’ve updated the notebook for harvesting records from @archivesnz’s Archway database in my GLAM Workbench. I just used it to harvest more than 8,000 records from series 8333 relating to naturalisation. #dhhacks

21 Feb 2019

New section added to my GLAM Workbench for the Queensland State Archives (@qsarchives). Includes a notebook to add series information into their Naturalisations 1851-1904 index. #dhhacks

17 Feb 2019

Suggestions of new topics and collections for my GLAM Workbench are welcome!

17 Feb 2019

I’ve added a section for Library and Archives Canada to my GLAM Workbench. The first notebook extracts records of people from a specific country from their naturalisations database and saves the results as a CSV file. #dhhacks

15 Feb 2019

Current status — extracting data from Library and Archives Canada’s 1915-1946 naturalisation database. Coming soon to my GLAM Workbench…

01 Feb 2019

I’ve added a ‘save chart’ option to the QueryPic app in my GLAM Workbench. Visualise your searches in @TroveAustralia newspapers, then save the results as HTML for easy download. #dhhacks

23 Jan 2019

One more and I’m done for the night… New GLAM Workbench page for the ‘Trove API introduction’ notebooks.

23 Jan 2019

I’ve finished putting details of all the current GLAM Workbench repositories into the new documentation site. Still a few notebooks to migrate from the original workbench, but getting there! There are about 50 Jupyter notebooks so far. #dhhacks

23 Jan 2019

Added a ‘data’ section to the GLAM Workbench docs, with info on harvests from government data portals, as well as series from @naagovau relating to ASIO and the White Australia Policy.

23 Jan 2019

And now a GLAM Workbench page for @Te_Papa…

23 Jan 2019

Added a page for @ArchivesNZ’s Archway to the GLAM Workbench docs…