My program of rolling out new features and integrations across the GLAM Workbench continues. The latest section to be updated is the Web Archives section!
There are no new notebooks with this update, but there are some important changes under the hood. If you haven’t used it before, the Web Archives section contains 16 notebooks providing documentation, tools, apps, and examples to help you make use of web archives in your research. The notebooks are grouped by the following topics: Types of data, Harvesting data and creating datasets, and Exploring change over time.
There’s no doubt that Trove’s digitised newspapers have had a significant impact on the practice of history in Australia. But analysing that impact is difficult when Trove itself is always changing – more newspapers and articles are being added all the time.
In an attempt to chart the development of Trove, I’ve created a dataset that shows (approximately) when particular newspaper titles were first added. This gives a rough snapshot of what Trove contained at any point in the last 12 years.
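As a rough illustration of how such a dataset could be used to reconstruct a snapshot, here is a minimal sketch. It assumes the data has been loaded into a mapping from newspaper title to the date it was first seen in Trove; the function name, the data structure, and the placeholder titles and dates are all hypothetical and are not taken from the actual dataset.

```python
from datetime import date


def titles_as_of(first_seen, when):
    """Return the titles whose (approximate) first-seen date is on or before `when`."""
    return sorted(title for title, seen in first_seen.items() if seen <= when)


# Placeholder values for illustration only, not real Trove data.
first_seen = {
    "Title A": date(2010, 5, 1),
    "Title B": date(2014, 9, 15),
    "Title C": date(2019, 2, 20),
}

snapshot = titles_as_of(first_seen, date(2015, 1, 1))
print(snapshot)
```

Because the first-seen dates are approximate, any snapshot built this way is only as precise as the dates themselves.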
To make it easier for people to suggest additions, I’ve created a GitHub repository for my list of GLAM Jupyter examples and resources. Contributions are welcome!
This list is automatically pulled into the GLAM Workbench’s help documentation. #dhhacks
I recently made some changes in the GLAM Workbench’s Help documentation, adding a new Running notebooks section. This section provides detailed information on running and managing GLAM Workbench repositories using Reclaim Cloud and Docker.
I’m still rolling out this functionality across all the repositories, but it’s going to take a while. When I’m finished you’ll be able to create your own persistent environment on Reclaim Cloud from any repository with just the click of a button.
As I foreshadowed some weeks ago, I’ve shut down my Patreon page. Thanks to everyone who has supported me there over the last few years!
I’ve now shifted across to GitHub Sponsors, which is focused on supporting open source projects. This seems like a much better fit for the things that I do, which are all free and open by default.
So if you think things like the GLAM Workbench, Historic Hansard, OzGLAM Help, and The Real Face of White Australia are worth supporting, you can sign up using my GitHub Sponsors page.
I’ve updated, refreshed, and reorganised the Trove newspapers section of the GLAM Workbench. There are currently 22 Jupyter notebooks organised under the following headings:
Trove newspapers in context – Notebooks in this section look at the Trove newspaper corpus as a whole, to try and understand what’s there, and what’s not.
Visualising searches – Notebooks in this section demonstrate some ways of visualising searches in Trove newspapers – seeing everything rather than just a list of search results.
It was way back in 2009 that I created my first scraper for getting machine-readable data out of the National Archives of Australia’s online database, RecordSearch. Since then I’ve used versions of this scraper in a number of different projects such as The Real Face of White Australia, Closed Access, and Redacted (including the recent update). The scraper is also embedded in many of the notebooks that I’ve created for the RecordSearch section of the GLAM Workbench.
Here’s the video of my presentation, ‘Secrets and lies’, for the (Re)create symposium at the University of Canberra, 21 April 2021. It’s mainly about finding and resisting redactions in ASIO surveillance files held by the National Archives of Australia.
Secrets and lies from Tim Sherratt on Vimeo.
Here are links to the various sites and resources mentioned in the video:
For more on records relating to the White Australia policy see The Real Face of White Australia
Some summary information on ASIO records in the National Archives of Australia
Jenny Holzer’s Mass MoCA exhibition, including redaction paintings
CIA Realizes It’s Been Using Black Highlighters All These Years, The Onion, 2005
Fun with the Petrovs, a Trove list that brings together photos from ASIO files in the NAA
FOIA Facelift, MuckRock, 2020
Redaction Hall of Shame, MuckRock, 2016
Withheld Pending Advice, Inside Story, 2017, looks at ‘closed’ files in the NAA; the 2020 update is in this Twitter thread
redacted, 2017 – browse my original collection of redactions
The original #redactionart story, 2016-2017 – part 1 and part 2
The Redaction Zoo, 2017
DIY #redactionart, 2017 – repository of images
Edward Shaddow’s #redactionart cookie cutters
Wearing access, 2018, talk by Bonnie Wildie describing the creation of her #redactionart dress
#redactionart jigsaw
Some new #redactionart critters, 2021
#redactionart hardcover journal, 2021, Redbubble
#redactionart quilt cover, 2021, Redbubble
#redactionart scarf, 2021, Redbubble
DIY #redactionart collage – make your own collages of recycled ASIO redactions (includes at least one redaction art critter)
I haven’t yet written up the details of training my latest redaction finder.
I’m interested in understanding what gets digitised and when by our cultural institutions, but accessible data is scarce. The National Archives of Australia lists ‘newly scanned’ records in RecordSearch, so I thought I’d see if I could convert that list into a machine-readable form for analysis. I’ve had a lot of experience trying to get data out of RecordSearch, but even so it took me a while to figure out how the ‘newly scanned’ page worked.
Over the last few years, I’ve been very grateful for the support of my Patreon subscribers. Financially, their contributions have helped me cover a substantial proportion of the cloud hosting costs associated with projects like Historic Hansard and The Real Face of White Australia. But, more importantly, just knowing that they thought my work was of value has helped keep me going, and inspired me to develop a range of new resources.
You might have noticed some changes to the GLAM Workbench home page recently. One of the difficulties has always been trying to explain what the GLAM Workbench actually is, so I thought it might be useful to put more examples up front. The home page now lists about 25 notebooks under the headings:
Finding GLAM data
Asking different questions
Hacking heritage
Bringing documentation alive
Hopefully they give a decent representation of the sorts of things you can do using the GLAM Workbench.
I’ve been doing a bit of work behind the scenes lately to prepare for a major update to the GLAM Workbench. My plan is to provide one click installation of any of the GLAM Workbench repositories on the Reclaim Cloud platform. This will provide a useful step up from Binder for any researcher who wants to do large-scale or sustained work using the GLAM Workbench. Reclaim Cloud is a paid service, but they do a great job supporting digital scholarship in the humanities, and it’s fairly easy to minimise your costs by shutting down environments when they’re not in use.
I’ve given a couple of talks lately on the GLAM Workbench and some of my other work relating to the construction of online access to GLAM collections. Videos and slides are available for both:
From collections as data to collections as infrastructure: Building the GLAM Workbench, seminar for the Centre for Creative and Cultural Research, University of Canberra, 22 February 2021 – video (40 minutes) and slides
Building the GLAM Workbench (and various other projects such as The Real Face of White Australia, Closed Access, and redacted), guest lecture for the Cultural Data Sculpting course, EPFL, Switzerland, 18 March 2021 – video (1hr 40mins) and slides
I’ve also updated the presentations page in the GLAM Workbench.
It was Open Data Day on Saturday 6 March. Here are some of the ready-to-go datasets you can find in the GLAM Workbench – there’s something for historians, humanities researchers, teachers & more!
First here’s a list of Australian GLAM (that’s galleries, libraries, archives & museums) data sources. It includes APIs, portals, and downloadable datasets. Suggested additions welcome!
There’s also a list of Australian GLAM datasets that are available through government open data portals.
The recent change of labels from ‘Barcode’ to ‘Item ID’ in the National Archives of Australia’s RecordSearch database broke the Zotero translator. I’ve now updated the translator, and the new version has been merged into the Zotero translators repository. It should be updated when you restart Zotero, but if not you can go to Preferences > Advanced > Files and folders and click on the Reset translators button.
The translator lets you:
@TroveNewsBot has been sharing Trove newspaper articles on Twitter for over 7 years. With its latest upgrade the bot now has an ‘on this day’ function. Every day at 9.00am AEST, TroveNewsBot will share an article published on that day in the past.
Even better, you can make your own ‘on this day’ queries by tweeting to @TroveNewsBot with the hashtag #onthisday. For example:
Tweeting ‘#onthisday #luckydip’ will return a random article published on this day in the past.
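To give a sense of the date logic behind an ‘on this day’ lucky dip, here is a minimal sketch. It is not the bot’s actual code: the function name is hypothetical, and the default earliest year of 1803 is an assumption made for illustration. It simply picks a random earlier year in which today’s day and month exist, so 29 February only ever lands on a leap year.

```python
import calendar
import random
from datetime import date


def on_this_day(today, first_year=1803, rng=random):
    """Pick the same day and month in a random earlier year.

    `first_year` is an assumed lower bound, not a value taken from the bot.
    """
    is_leap_day = today.month == 2 and today.day == 29
    years = [
        year
        for year in range(first_year, today.year)
        if not is_leap_day or calendar.isleap(year)
    ]
    return date(rng.choice(years), today.month, today.day)


print(on_this_day(date.today()))
```

The resulting date could then be used to limit a newspaper search to articles published on that day.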
The NAA recently changed field labels in RecordSearch, so that ‘Barcode’ is now ‘Item ID’. This required an update to my recordsearch_tools screen scraper. I also had to make a few changes in the RecordSearch section of the GLAM Workbench. #dhhacks
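One way to make code that consumes scraped data more tolerant of this kind of label change is to check for several possible labels. The sketch below is a generic illustration only and does not show how recordsearch_tools works: the function name and the dictionary of scraped fields are hypothetical.

```python
def get_item_id(details):
    """Return the item identifier from scraped fields, whichever label was used."""
    # Check the new label first, then fall back to older or variant labels.
    for label in ("Item ID", "ItemID", "Barcode"):
        if label in details:
            return details[label]
    return None


# Placeholder values for illustration only.
old_style = {"Title": "Example item", "Barcode": "0000001"}
new_style = {"Title": "Example item", "Item ID": "0000001"}

print(get_item_id(old_style) == get_item_id(new_style))
```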
After some recent investigations of the availability of open access versions of articles published in paywalled Australian history journals, I’ve started a Google doc to capture useful links and information for Australian historians wanting to make their research open access. Comments and additions are welcome. #dhhacks
In 2014 I pulled together a sample of web pages that included links back to digitised newspaper articles in Trove and created the ‘Trove Traces’ app. It was interesting, and sometimes disturbing, to see the diversity of sites that made use of Trove. Amongst the family and local history enthusiasts were climate change deniers and racists who found ‘evidence’ for their views in past newspapers. And of course, the sample only includes links in web pages, not social media sharing.
I’ve added an API Query Builder to the DigitalNZ section of the GLAM Workbench. You can use it to learn about the different parameters available from the search API, and experiment with different queries. Just get your API key from DigitalNZ, then try entering keywords and selecting options. Once you understand how the API works, you can start thinking about how you can make use of it in your own projects.
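To show roughly what a query assembled this way might look like in code, here is a minimal sketch that builds a search URL without sending a request. The endpoint and the parameter names (`api_key`, `text`, `per_page`, `page`, and the `and[category][]` filter) are assumptions based on my understanding of the DigitalNZ API and should be checked against the official documentation; `build_query` is a hypothetical helper, not part of the Query Builder notebook.

```python
from urllib.parse import urlencode

# Assumed endpoint; check the DigitalNZ API documentation before relying on it.
API_URL = "https://api.digitalnz.org/records.json"


def build_query(api_key, text, per_page=20, page=1, category=None):
    """Assemble a search URL from keywords and a few optional settings."""
    params = [
        ("api_key", api_key),
        ("text", text),
        ("per_page", per_page),
        ("page", page),
    ]
    if category:
        # Assumed filter syntax for limiting results to a single category.
        params.append(("and[category][]", category))
    return f"{API_URL}?{urlencode(params)}"


url = build_query("YOUR_API_KEY", "wellington harbour", category="Images")
print(url)
```

The URL could then be passed to an HTTP client such as `requests.get(url)` to retrieve results as JSON.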