Tim Sherratt - Sharing recent updates and work-in-progress

21 Nov 2022

The Australian history industry and the impact of digitisation (open access preprint chapter)

The Australian History Industry was published recently. Edited by Paul Ashton and Paula Hamilton, the book ‘explores the complex, multi-roomed house of Australian history’, covering academic, school, and public history, the impact of digit...
21 Nov 2022

Recent updates to trove-newspaper-harvester and trove-newspaper-images

Catching up on some software package updates over the last few months. The trove-newspaper-harvester package is now at v0.6.5. Recent changes include:

  • Fix to handle articles with missing metadata
  • Don’t try to re-download existing text and ...
22 Sep 2022

Do you want your Trove newspaper articles in bulk? Meet the new Trove Newspaper Harvester Python package!

The Trove Newspaper Harvester has been around in different forms for more than a decade. It helps you download all the articles in a Trove newspaper search, opening up new possibilities for large-scale analysis. You can use it as a command-...
15 Sep 2022

From 48 PDFs to one searchable database – opening up the Tasmanian Post Office Directories with the GLAM Workbench

A few weeks ago I created a new search interface to the NSW Post Office Directories from 1886 to 1950. Since then, I’ve used the same process on the Sydney Telephone Directories from 1926 to 1954. Both of these publications had been digitis...
05 Sep 2022

Fresh harvest of OCRd text from Trove's digitised periodicals – 9gb of text to explore and analyse!

I’ve updated the GLAM Workbench’s harvest of OCRd text from Trove’s digitised periodicals. This is a completely fresh harvest, so should include any corrections made in recent months. It includes:

  • 1,430 periodicals
  • OCRd text from 41,645 is...
05 Sep 2022

Explore Trove's digitised newspapers by place

I’ve updated my map displaying places where Trove digitised newspapers were published or distributed. You can view all the places on a single map – zoom in for more markers, and click on a marker for title details and a link back to Trove. If...
01 Sep 2022

Making NSW Postal Directories (and other digitised directories) easier to search with the GLAM Workbench and Datasette

As part of my work on the Everyday Heritage project I’m looking at how we can make better use of digitised collections to explore the everyday experiences woven around places such as Parramatta Road in Sydney. For example, the NSW Postal Di...
29 Aug 2022

Interested in Victorian shipwrecks? Kim Doyle and Mitchell Harrop have added a new notebook to the Heritage Council of Victoria section of the GLAM Workbench exploring shipwrecks in the Victorian Heritage Database: glam-workbench.net/heritage-…

29 Aug 2022


25 Aug 2022

Minor update to RecordSearch Data Scraper – now captures ‘institution title’ for agencies if it is present. pypi.org/project/r…

16 Aug 2022

Many thanks to the British Library – sponsors of the GLAM Workbench’s web archives section!

You might have noticed some changes to the web archives section of the GLAM Workbench. I’m very excited to announce that the British Library is now sponsoring the web archives section! Many thanks to the British Library and the UK Web Archi...
15 Aug 2022

New GLAM data to search, visualise and explore using the GLAM Workbench!

There’s lots of GLAM data out there if you know where to look! For the past few years I’ve been harvesting a list of datasets published by Australian galleries, libraries, archives, and museums through open government data portals. I’ve jus...
09 Aug 2022

Zotero now saves links to digitised items in Trove from the NLA catalogue!

I’ve made a small change to the Zotero translator for the National Library of Australia’s catalogue. Now, if there’s a link to a digitised version of the work in Trove, that link will be saved in Zotero’s url field. This makes it quicker an...
01 Aug 2022

View embedded JSON metadata for Trove's digitised books and journals

The metadata for digitised books and journals in Trove can seem a bit sparse, but there’s quite a lot of useful metadata embedded within Trove’s web pages that isn’t displayed to users or made available through the Trove API. This notebook ...
29 Jul 2022

Where did all those NSW articles go? Trove Newspapers Data Dashboard update!

I was looking at my Trove Newspapers Data Dashboard again last night trying to figure out why the number of newspaper articles from NSW seemed to have dropped by more than 700,000 since my harvesting began. It took me a while to figure out,...
28 Jul 2022

Catching up – some recent GLAM Workbench updates!

There’s been lots of small updates to the GLAM Workbench over the last couple of months and I’ve fallen behind in sharing details. So here’s an omnibus list of everything I can remember…

Data

Weekly harvests of basic Trove newspaper data c...
14 Jul 2022

Calling all Tasmanian historians – you can now save resources from Libraries Tasmania into Zotero!

I’ve created a Zotero translator for the Libraries Tasmania catalogue. Using it, you can save metadata and digital resources to your own research database with a single click. Libraries Tasmania actually has three catalogues rolled into one...
14 Jul 2022

Updated dataset! Harvests of Trove list metadata from 2018, 2020, and 2022 are now available on Zenodo: doi.org/10.5281/z… Another addition to the growing collection of historical Trove data. #GLAMWorkbench

Screen capture of version information from Zenodo showing that there are three available versions, v1.0, v1.1, and v1.2.
10 Jul 2022

Updated dataset! Details of 2,201,090 unique public tags added to 9,370,614 resources in Trove between August 2008 and July 2022. Useful for exploring folksonomies, and the way people organise and use massive online resources like Trove. doi.org/10.5281/z…

09 Jul 2022

Ok, I’ve created a Zenodo community for datasets documenting changes in the content and structure of Trove. Lots more to add… zenodo.org/communiti…

09 Jul 2022

Coz I love making work for myself, I’ve started pulling datasets out of #GLAMWorkbench code repos & creating new data repos for them. This way they’ll have their own version histories in Zenodo. Here’s the first: github.com/GLAM-Work…

28 Jun 2022

Ahead of my session at #OzHA2022 tomorrow, I’ve updated the NAA section of the #GLAMWorkbench. Come along to find out how to harvest file details, digitised images, and PDFs, from a search in RecordSearch! github.com/GLAM-Work…

26 Jun 2022

55,633 items digitised by the National Archives of Australia last week. Including:

  • Bonegilla name index cards (A2571 & A2572): +42,434
  • CMF Personnel Dossiers (B884): +10,150
  • Aust Women’s Land Army personnel cards (C610): +961


  • A2571, Name Index Cards, Migrants Registration [Bonegilla], 33686 files digitised
  • B884, Citizen Military Forces Personnel Dossiers, 1939-1947, 10150 files digitised
  • A2572, Name Index Cards, Migrants Registration [Bonegilla], 8748 files digitised
  • C610, Australian Women's Land Army - personnel cards, alphabetical series, 961 files digitised
  • A9301, RAAF Personnel files of Non-Commissioned Officers (NCOs) and other ranks, 1921-1948, 735 files digitised
  • D874, Still photograph outdoor and studio negatives, annual single number series with N prefix (and progressive alpha infix A-K from 1948-1957), 624 files digitised
  • B883, Second Australian Imperial Force Personnel Dossiers, 1939-1947, 163 files digitised
  • J853, Architectural plans, annual single number series with alpha (denoting Papua New Guinea and discipline) prefix and/or alpha/numeric (denoting size and amendment) suffix, 161 files digitised
  • A14487, Royal Australian Air Force Air Board and Air Council Agendas, Submissions and Determinations - Master Copy, 102 files digitised
  • A2478, Non-British European migrant selection documents, 21 files digitised
  • D4881, Alien registration cards, alphabetical series, 18 files digitised
  • A471, Courts-Martial files [including war crimes trials], single number series, 10 files digitised
  • A1877, British migrants - Selection documents for free or assisted passage (Commonwealth nominees), 9 files digitised
  • A13860, Medical Documents - Army (Department of Defence Medical Documents), 9 files digitised
  • A1196, Correspondence files, multiple number series [Class 501] [501-539] [Classified] [Main correspondence files series of the agency], 9 files digitised
  • B78, Alien registration documents, 8 files digitised
  • A712, Letters received, annual single number series with letter prefix or infix, 6 files digitised
  • A12372, RAAF Personnel files - All Ranks [Main correspondence files series of the agency], 6 files digitised
  • AP476/4, Applications etc. for registration of copyright of literary, dramatic and musical productions, pictures etc., 6 files digitised
  • A714, Books of duplicate certificates of naturalization A(1)[Individual person] series, 6 files digitised
26 Jun 2022

Newspapers added to Trove last week

  • Freelance (WA)
  • The Standard (WA)
  • Berrigan Advocate (NSW)
  • Baileys Sporting & Dramatic Weekly (WA)
  • Farmers' Weekly (WA)
  • Harvey-Waroona Mail (WA)
  • W.A. Family Sphere (WA)
  • Coonabarabran Times (NSW)


26 Jun 2022

Noticed that QueryPic was having a problem with some date queries. Should be fixed in the latest release of the Trove Newspapers section of the #GLAMWorkbench: glam-workbench.net/trove-new… #maintenance #researchinfrastructure