Tim Sherratt

Sharing recent updates and work-in-progress

Aug 2021

New dataset – Politicians talking about COVID

The Trove Journals section of the GLAM Workbench includes a notebook that helps you download press releases, speeches, and interview transcripts by Australian federal politicians. These documents are compiled and published by the Parliamentary Library, and the details are regularly harvested into Trove.

Using this notebook, I’ve created a collection of documents that include the words ‘COVID’ or ‘Coronavirus’. It includes all the metadata from Trove, as well as the full text of each document downloaded from the Parliamentary Library. There are 3,995 documents in total, covering the period up until early April 2021. You can download them all as a zip file (12 MB).

While I was compiling this dataset, I also made a few improvements to the notebook. You can now filter the results to weed out false positives and identify duplicates. #dhhacks
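
To give a rough idea of what that kind of filtering can look like, here is a minimal sketch in Python. It is not the notebook's actual code: the record structure (dictionaries with `id` and `text` keys) and the function names are assumptions made up for illustration. It drops documents whose full text never mentions the search terms, and treats documents as duplicates when their text is identical after normalising case and whitespace.

```python
import hashlib
import re

# Hypothetical pattern: matches words starting with 'covid' or 'coronavirus', in any case.
KEYWORDS = re.compile(r"\b(covid|coronavirus)", re.IGNORECASE)


def mentions_keywords(text):
    """Return True if the full text actually contains one of the search terms."""
    return bool(KEYWORDS.search(text))


def fingerprint(text):
    """Hash the text after lowercasing and collapsing whitespace."""
    normalised = " ".join(text.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def filter_and_dedupe(records):
    """Keep the first copy of each document that mentions the search terms.

    `records` is assumed to be a list of dicts with a 'text' key holding
    the full text of each document.
    """
    seen = set()
    kept = []
    for record in records:
        text = record.get("text", "")
        if not mentions_keywords(text):
            continue  # false positive: matched in search, but not in the text
        digest = fingerprint(text)
        if digest in seen:
            continue  # duplicate of a document already kept
        seen.add(digest)
        kept.append(record)
    return kept
```

Exact-match hashing will only catch copies that are identical apart from case and spacing. Releases that were reissued with small edits would need a fuzzier comparison.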