Today version 1 of the Trove API was decommissioned. As I explained elsewhere, this meant that a number of Trove Twitter bots also died. The problem is that version 2 of the API provides no easy way to randomly select records. Bots, and other apps that share random content, require major reworking.
After a lot of experimentation, I’ve settled on a few methods for selecting random-ish results. They’re far from perfect, but they seem to work reliably.
tl;dr Version 1 of the Trove API will be discontinued soon so Trove Twitter bots need to be upgraded. Unfortunately, Version 2 of the Trove API doesn’t support the random selection of resources, so the current behaviour of many bots will change.
The problem: In January 2018, I created a series of templates on Glitch that made it easy for people to build their own Trove Twitter bots. And they did!
Over the last few weeks I’ve been exploring ways of recording dates for 70,000 digitised pages from Sydney Stock Exchange records in the @TheANUArchives. Here’s the progress so far…
Here’s my attempt to calculate NSW holidays from 1900 to 1950. It’s probably incomplete, but it’s a start… nbviewer.jupyter.org/github/wr…
A couple of years ago I gave a talk in which I tried to justify what I do as research. I was going to turn it into an article, but never did. So here’s ‘The multiplication of contexts’ as a blog post.
The @naagovau RecordSearch section of the #GLAMWorkbench has been updated with more notebooks to help you get Australian archives data in a usable form. glam-workbench.github.io/recordsea… Useful for #twitterstorians, #ozhist, & #govhack!
Want to save searches for items in @naagovau’s RecordSearch as CSVs for exploration & analysis? This notebook walks through the process of constructing, managing, and saving data harvests. #dhhacks
I’ve updated my harvest of OCRd text from digitised journals in @TroveAustralia. The complete dataset now includes 33,035 issues from 720 titles – about 8 GB of text to explore. Details in the #GLAMWorkbench: glam-workbench.github.io/trove-jou… #dhhacks
My app to browse & search @TroveAustralia’s digitised journals has been updated! Since 4 July, 112 new titles & 86,211 new articles have been added to Trove. Many of these new titles are parliamentary papers. Explore here: trove-titles.herokuapp.com #dhhacks
Another WIP notebook in need of additional documentation… This one explores the stats around volunteer correction of OCR errors in @TroveAustralia’s newspapers. More to come!
And this notebook uses TF-IDF to explore the OCRd text of a digitised journal from Trove. Get the top TF-IDF scores for each year across a journal’s life and see how they change. More documentation coming!
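For anyone wondering what ‘top TF-IDF scores for each year’ means in practice, here’s a minimal sketch. It isn’t the code from the notebook: it assumes each year’s text has been combined into a single document, uses the plain unsmoothed weighting (term frequency × log(N / document frequency)), and runs on invented sample text. The function name `top_tfidf_by_year` is hypothetical, and a library such as scikit-learn would weight and tokenise things a little differently.

```python
import math
from collections import Counter


def top_tfidf_by_year(texts_by_year, n=3):
    """Return the n highest-scoring TF-IDF terms for each year.

    Hypothetical helper for illustration; each year's text is treated
    as one document and tokenised by simple whitespace splitting.
    """
    counts = {
        year: Counter(text.lower().split())
        for year, text in texts_by_year.items()
    }
    num_docs = len(counts)

    # Document frequency: the number of years in which each term appears.
    doc_freq = Counter()
    for year_counts in counts.values():
        doc_freq.update(year_counts.keys())

    results = {}
    for year, year_counts in counts.items():
        total = sum(year_counts.values())
        scores = {
            term: (freq / total) * math.log(num_docs / doc_freq[term])
            for term, freq in year_counts.items()
        }
        # Sort by score (highest first), breaking ties alphabetically.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        results[year] = [term for term, _ in ranked[:n]]
    return results


# Invented sample text, not real Trove data.
sample = {
    1901: "federation parliament federation tariff",
    1915: "war gallipoli war parliament",
    1930: "depression unemployment depression parliament",
}
top_terms = top_tfidf_by_year(sample, n=1)
print(top_terms)
```

A word like ‘parliament’ that turns up in every year scores zero, which is the point: the top terms are the ones that are distinctive to a particular year rather than common across the journal’s whole life.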
This notebook lets you download the OCRd text of a digitised journal from @TroveAustralia (via CloudStor) and then explore word frequencies over time. More documentation coming soon!
A new notebook looking at the data about digitised journals on @TroveAustralia. #dhhacks
There’s a new section of the GLAM Workbench devoted to the National Museum of Australia collection API! Harvest @nma data, then explore it by time and place. #dhhacks
The second new notebook looks at @TroveAustralia’s newspapers as a whole, visualising both by time and by state. Along the way it looks at favourites such as the WWI effect and the copyright cliff of death. #dhhacks
Some brand new Jupyter notebooks for those interested in #ozhist & digital exploration of @TroveAustralia’s newspapers. The first walks through different ways of visualising newspaper searches over time. #dhhacks
I’ve updated the @invisibleaus data repository with latest transcriptions/markings from White Australia Policy records in @naagovau.
According to my last harvest, @TroveAustralia’s digitised journals comprise 31,216 separate issues. Here is the number of issues by year.