Tim Sherratt

Mining for meanings

Monday, June 30, 2025

In 2012, I was lucky enough to be awarded a Harold White Fellowship by the National Library of Australia. I used my time to explore ways of using Trove’s digitised newspapers as data, and presented my work at a public lecture in May 2012. I spoke from notes and never got round to writing it all up. The recording made by the NLA has disappeared from their website, but is still available in the Internet Archive.

Continue reading →

A brief and biased history of Trove Twitter bots

Thursday, June 19, 2025

The socials recently alerted me to an interesting article by Dominique Carlon, Jean Burgess, and Kateryna Kasianenko on the history of community-created Twitter bots. The article explores bot-making within the context of Twitter’s rise and fall, and provides a handy taxonomy of bot species. However, it doesn’t include any Australian bots amidst the examples. That’s a bit disappointing, as I remember the bot-building years as a time of great fun and creativity.

Continue reading →

Some Archives Week goodies

Wednesday, June 11, 2025

It’s International Archives Week and I’m feeling a bit crook after being double-vaxxed yesterday, so instead of doing something productive, I’m just going to make a list of potentially handy archives-related resources from the Wonderful World of Wragge(TM). The theme of Archives Week is #ArchivesAreAccessible, which you’d have to regard as rather aspirational given the various ways access is limited by law, policy, practice, technology, and history. But what the heck, discussions about the meaning of access are always welcome.

Continue reading →

New dataset – Trove links shared on Twitter, 2009 to 2020

Tuesday, June 10, 2025

A few years ago, I harvested the details of tweets that included links to Trove. The data has just been sitting on my computer, so I thought I should package it up and share, in case it’s of use to anyone. The story is that back in 2021, I was working on the article ‘More than newspapers’ for a special section of History Australia focusing on Trove. I was thinking that I might include something about the way Trove newspaper articles were mobilised within online discussions about history – a topic I first explored in ‘Life on the outside: connections, contexts, and the wild, wild web’, my keynote for the Annual Conference of the Japanese Association of Digital Humanities in 2014.

Continue reading →

GLAM Workbench – preprint for 'Building User-Friendly Toolkits and Platforms for Digital Humanities'

Thursday, June 5, 2025

This is a preprint of my contribution to the publication ‘Building User-Friendly Toolkits and Platforms for Digital Humanities’. It provides a brief overview of the GLAM Workbench. I had to leave a lot out, but hopefully it provides a useful summary of what the GLAM Workbench is, and what I’d like it to be. The GLAM Workbench is a collection of tools and resources created to help researchers use and explore the digital collections of GLAM organisations (galleries, libraries, archives, and museums).

Continue reading →

No more harvesting data from the National Archives of Australia

Monday, May 19, 2025

A couple of weeks ago I bid farewell to Trove due to the cancellation of my API keys and the NLA’s lack of transparency around changes to API access. Now it seems I have to wave goodbye to 16+ years of work on RecordSearch, the National Archives of Australia’s online database. I noticed this morning that my weekly harvest of recently digitised files in RecordSearch had failed. A quick check showed that my harvester was being blocked by Cloudflare’s bot protection software.

Continue reading →

Farewell Trove

Wednesday, May 7, 2025

Over the last few months I’ve been grappling with the cancellation of my Trove API keys by the National Library of Australia. It may seem like a minor technical hiccup from the outside, but it’s had a major personal impact. For the sake of my health, I’ve decided to stop work on Trove, archive all my code repositories related to Trove, and move on. Farewell Trove. But don’t panic! All of my Trove tools and resources available through the GLAM Workbench and elsewhere will remain online.

Continue reading →

SLV LAB and GLAM Workbench updates

Monday, May 5, 2025

Last week the State Library of Victoria launched SLV LAB, a prototyping and innovation lab that ‘experiment[s] with technology to open access to collections, data and spaces’. The SLV LAB encourages collaboration, and is sharing code, datasets, and tutorials. It’s an exciting development and I’m looking forward to seeing what they get up to. I’ve added SLV LAB to the GLAM data portals & repositories section of my Australian GLAM data list.

Continue reading →

New PROV section added to the GLAM Workbench

Wednesday, April 30, 2025

There’s a brand new GLAM Workbench section to help you work with data from the Public Record Office Victoria! Over the past couple of months, I’ve been poking around in the PROV’s collection API. The API provides data about PROV’s archival holdings in a machine readable format. This makes it possible to use, analyse, and visualise the collection in new ways. I’ve already shared a few of the results of my explorations.

Continue reading →

The GLAM Workbench introduction to how notebooks work now runs in Jupyter Lite

Monday, April 28, 2025

I’ve just updated my introduction to using Jupyter notebooks in the GLAM Workbench so that it runs in Jupyter Lite – that means no more waiting for cloud services to spin up, it all happens in your browser! All the Jupyter notebooks in GLAM Workbench can be run in the cloud using the free Binder service – either through the ARDC (requires authentication), or through the public, community-run service. While it’s usually just a matter of clicking a link, Binder can take a while to build the necessary computing environment, and sometimes it just fails.

Continue reading →

Update on Trove data access and my suspended API keys

Friday, April 11, 2025

On 21 February, my Trove API keys were cancelled without warning. A week later, I met with NLA staff and was shocked to be told that downloading ‘content’, such as the text of digitised newspaper articles, was regarded as a breach of the API terms of use. Without API access I can’t continue my work helping researchers make use of Trove. More generally though, the NLA’s actions threaten innovative digital research.

Continue reading →

Using the Public Record Office Victoria's API to build an overview of their collection

Thursday, April 10, 2025

Over the past few weeks I’ve been exploring the Public Record Office Victoria’s public API. There’s not a lot of documentation, but there is a lot of data! What’s not immediately obvious is that the API includes information about a variety of different entities within the PROV’s model for archival description – not just items, but functions, agencies, series and more. You can limit your API requests to a particular entity using the category field.
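As a rough illustration of limiting requests by entity type, here's a minimal sketch that builds a query URL filtered on the category field. The endpoint address, the Lucene-style `category:"..."` syntax, and the `wt` and `rows` parameters are all assumptions on my part, not details confirmed in this post, so check them against the PROV API before relying on them.

```python
from urllib.parse import urlencode

# ASSUMPTION: the endpoint and parameter names below are illustrative guesses,
# not taken from the post. Verify them against the PROV API itself.
API_URL = "https://api.prov.vic.gov.au/search/query"


def build_query_url(category, rows=10):
    """Build a search URL restricted to one entity type via the category field."""
    params = {
        # ASSUMPTION: Lucene-style field query, e.g. category:"Series"
        "q": f'category:"{category}"',
        "wt": "json",  # ASSUMPTION: ask for a JSON response
        "rows": rows,  # ASSUMPTION: number of results per request
    }
    return f"{API_URL}?{urlencode(params)}"


# Build one URL per entity type mentioned above
for entity in ["Item", "Function", "Agency", "Series"]:
    print(build_query_url(entity))
```

Building the URLs separately from fetching them makes it easy to inspect each request before sending anything to the API.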

Continue reading →

More than 6 million rows of data from Public Record Office Victoria added to the GLAM Name Index Search

Wednesday, April 9, 2025

The GLAM Name Index Search now includes more than 6 million rows of data from the Public Record Office Victoria, downloaded using their public API. The GLAM Name Index Search brings together records that include the names of people from 10 Australian GLAM organisations. With a single search, you can find information about individuals across millions of rows of data. Previous versions of the GLAM Name Index Search included a few datasets from the Public Record Office Victoria that had been shared through government open data portals.

Continue reading →

Introducing PROVBot – sharing photos from Public Record Office Victoria

Wednesday, April 9, 2025

With poor old TroveNewsBot killed by the NLA, my Mastodon feed has had less GLAM goodness of late. To try and fill the void I’ve created PROVBot, sharing photos from the Public Record Office Victoria. PROVBot makes use of the Public Record Office Victoria’s public API. At this stage it just selects and shares a random photograph once a day, but in the future I’ll probably add more features, such as the ability to respond to search queries.

Continue reading →

Trove API users beware! – the latest in the saga of my cancelled API keys

Sunday, March 2, 2025

After my Trove API keys were cancelled without warning on 21 February, I reluctantly agreed to a meeting with the National Library of Australia. They had provided so little information in their emails that it seemed to be the only way to find out what was really going on. I came out of the meeting shocked by the NLA’s change in attitude towards API use. TL;DR – you’re probably breaching the API terms of use. All Trove API users need to be aware that the NLA now insists that accessing the ‘content’ of resources, rather than just the descriptive metadata, is a breach of the API terms of use.

Continue reading →

15 years of work on Trove threatened by the NLA

Monday, February 24, 2025

See my latest post for an update! On Friday, without warning, I received an email from the National Library of Australia informing me that my Trove API keys had been suspended. This threatens the future of 15 years of work helping people use and understand the possibilities of Trove for new types of research. What’s happened? Here’s the full text of the email: Your recently published work on the GLAM Workbench regarding extracting metadata and text from a National e-Deposit (NED) periodical has been brought to the Library’s attention.

Continue reading →

The Primary Source – GLAM collection news and help

Thursday, February 20, 2025

I’ve created a new site (or in fact, renovated an old site) to aggregate news from GLAM collections (that’s galleries, libraries, archives, and museums) and help researchers using those collections. It’s called The Primary Source, which is a bit of a bad history pun. Why is it needed? Before the nazi takeover of the old bird site, I had a list of GLAM organisation accounts which made it pretty easy to follow what was going on in Australia’s galleries, libraries, archives, and museums.

Continue reading →

National Archives of Australia Digitisation Dashboard

Thursday, February 20, 2025

Since March 2021, I’ve been harvesting details of newly-digitised files in the National Archives of Australia to help document long-term changes to online access. A few weeks ago, I summarised the data from 2024, and published annual compilations in Zenodo. I’ve now created an automatically-updated dashboard which displays digitisation progress in the past week, the current year, and since my harvests began. Each week, after the latest data harvest, a GitHub action runs a Jupyter notebook that pulls in the data, generates some visualisations and summaries, and saves the results as an HTML page.
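The summarising step might look something like this minimal sketch. It's not the notebook's actual code: the function names, the list-of-dates input, and the HTML layout are all hypothetical, and it just shows the three reporting windows (past week, current year, everything harvested) being counted and written out as an HTML fragment.

```python
from datetime import date, timedelta
from html import escape


def summarise(digitised_dates, today):
    """Count files digitised in the past week, the current year, and overall."""
    week_start = today - timedelta(days=7)
    return {
        "past_week": sum(1 for d in digitised_dates if week_start < d <= today),
        "this_year": sum(1 for d in digitised_dates if d.year == today.year),
        "total": len(digitised_dates),
    }


def render_html(summary):
    """Turn the summary dict into a small HTML list (hypothetical layout)."""
    items = "".join(
        f"<li>{escape(label)}: {count}</li>" for label, count in summary.items()
    )
    return f"<ul>{items}</ul>"


# Made-up sample dates, for illustration only
sample = [date(2025, 2, 18), date(2025, 1, 3), date(2024, 6, 1)]
print(render_html(summarise(sample, today=date(2025, 2, 20))))
```

Passing `today` in as an argument, rather than calling `date.today()` inside the function, keeps the summary reproducible when a scheduled job is re-run.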

Continue reading →

Search the content of periodicals uploaded to Trove through the National eDeposit service

Wednesday, February 19, 2025

I’ve added a notebook to the GLAM Workbench that walks through the steps involved in creating a fully searchable database of content extracted from a periodical uploaded to Trove through the National eDeposit service (NED). Why is this needed? I was contacted recently by a member of the team that publishes The Triangle, a community newsletter from the south coast of NSW. Issues of The Triangle from 2007 to the present have been uploaded to Trove through the National eDeposit service, but they were wondering whether it was possible to search across all their newsletters in Trove.

Continue reading →

Ten years of data! The files you're not allowed to see in the National Archives of Australia

Wednesday, February 5, 2025

I’ve created a new dataset containing 10 years of data that can be used to explore the workings of the National Archives of Australia’s access examination system. Australian government records become available for public access after 20 years. But before being opened to the public, records go through a process known as access examination to determine whether they should be withheld, either partially or completely. The grounds for exemption are laid out in the Archives Act and include things like national security and personal privacy.

Continue reading →