Micro.blog offers another alternative for people wanting more control over their socials. I’m posting this from my Micro account – it’ll be cross-posted to Mastodon & Twitter, saved to GitHub, syndicated via RSS, and accessible from updates.timsherratt.org

Tracking Trove changes over time

I’ve been doing a bit of cleaning up, trying to make some old datasets more easily available. In particular I’ve been pulling together harvests of the number of newspaper articles in Trove by year and state. My first harvests date all the way back to 2011, before there was even a Trove API. Unfortunately, I didn’t run the harvests as often as I should’ve and there are some big gaps. Nonetheless, if you’re interested in how Trove’s newspaper corpus has grown and changed over time, you might find them useful.

Continue reading →

Adventures in FOI – HASS RDC Scoping Studies

So my FOI request to release the scoping studies that informed investments in the current round of ARDC-managed HASS research infrastructure development was partially successful. As I’ve previously noted, reports from the ARDC and Academy of Humanities are now publicly available. There was a third document identified as relevant to my request that was finally released yesterday, but it isn’t what I was expecting. It’s a report of consultation relating to a discussion paper by Dandolo Partners, rather than the actual discussion paper itself!

Continue reading →

The GLAM Workbench wants you!

Over the past few months I’ve been doing a lot of behind-the-scenes work on the GLAM Workbench – automating, standardising, and documenting processes for developing and managing repositories. These sorts of things ease the maintenance burden on me and help make the GLAM Workbench sustainable, even as it continues to grow. But these changes are also aimed at making it easier for you to contribute to the GLAM Workbench! Perhaps you’re part of a GLAM organisation that wants to help researchers explore its collection data – why not create your own section of the GLAM Workbench?

Continue reading →

Omeka S Tools – new Python package

Over the last couple of years I've been fiddling with bits of Python code to work with the Omeka S REST API. The Omeka S API is powerful, but the documentation is patchy, and doing basic things like uploading images can seem quite confusing. My code was an attempt to simplify common tasks, like creating new items. In case it's of use to others, I've now shared my code as a Python package.

Continue reading →

Zotero support in Australian GLAMs

Last year I started compiling information about the level of Zotero integration provided by Australian GLAM organisations through their online collections. The basic test is: can Zotero capture useful, structured information about an item from the collection interface? The results are not great. Zotero extracts information from a web site using a variety of 'translators'. Some of these translators look for generic information embedded in a web page, such as <TITLE> and <META> tags.
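To give a feel for what 'generic information embedded in a web page' means, here is a minimal Python sketch that pulls the <TITLE> text and any named <META> tags out of a page. It is only an illustration of the idea, not how Zotero's translators are actually written, and the sample page, the `GenericMetadataParser` class, and the `extract_generic_metadata` function are all made up for this example.

```python
from html.parser import HTMLParser


class GenericMetadataParser(HTMLParser):
    """Collect the <title> text and any named <meta> tags from a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Metadata can be keyed by either `name` or `property`.
            key = attrs.get("name") or attrs.get("property")
            if key and attrs.get("content"):
                self.meta[key] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self.title += data


def extract_generic_metadata(html):
    """Return the page title and a dict of meta tag names to contents."""
    parser = GenericMetadataParser()
    parser.feed(html)
    return {"title": parser.title.strip(), "meta": parser.meta}


# Made-up example page, not taken from any real collection interface.
sample_page = """
<html><head>
<title>Letter from a made-up collection</title>
<meta name="DC.creator" content="Example, Author">
<meta name="DC.date" content="1901-01-01">
</head><body><p>Item details here.</p></body></html>
"""

result = extract_generic_metadata(sample_page)
print(result)
```

A page that exposes nothing beyond a bare title gives a tool like this very little to work with, which is why embedded metadata matters for capture.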

Continue reading →

Testing, testing...

I regularly update the Python packages used in the different sections of the GLAM Workbench, though probably not as often as I should. Part of the problem is that once I've updated the packages, I have to run all the notebooks to make sure I haven't inadvertently broken something -- and this takes time. And in those cases where the notebooks need an API key to run, I have to copy and paste the key in at the appropriate spots, then remember to delete them afterwards.

Continue reading →

Some big pictures of newspapers in Trove and DigitalNZ

One of the things I really like about Jupyter is the fact that I can share notebooks in a variety of different formats. Tools like QueryPic can run as simple web apps using Voila, static versions of notebooks can be viewed using NBViewer, and live versions can be spun up as required on Binder. It’s also possible to export notebooks as PDFs, slideshows, or just plain-old HTML pages. Just recently I realised I could export notebooks to HTML using the same template I use for Voila.

Continue reading →

Exploring GLAM data at ResBaz

The video of my key story presentation at ResBaz Queensland (simulcast via ResBaz Sydney) is now available on Vimeo. In it, I explore some of the possibilities of GLAM data by retracing my own journey through WWI service records, The Real Face of White Australia, #redactionart, and Trove – ending up at the GLAM Workbench, which brings together a lot of my tools and resources in a form that anyone can use.

Continue reading →

GLAM Workbench Nectar Cloud Application updated!

The newly-updated DigitalNZ and Te Papa sections of the GLAM Workbench have been added to the list of available repositories in the Nectar Research Cloud’s GLAM Workbench Application. This means you can create your very own version of these repositories running in the Nectar Cloud, simply by choosing them from the app’s dropdown list. See the Using Nectar help page for more information. I’ve also taken the opportunity to make use of the new container registry service developed by the ARDC as part of the ARCOS project.

Continue reading →

DigitalNZ & Te Papa sections of the GLAM Workbench updated!

In preparation for my talk at ResBaz Aotearoa, I updated the DigitalNZ and Te Papa sections of the GLAM Workbench. Most of the changes are related to management, maintenance, and integration of the repositories. Things like: setting up GitHub Actions to automatically generate Docker images when the repositories change, and to upload the images to the Quay.io container registry; automatic generation of an index.ipynb file based on README.md to act as a front page within Jupyter Lab; addition of a reclaim-manifest.
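As a rough idea of what generating an index.ipynb from README.md can involve, the sketch below wraps the README's Markdown in a notebook containing a single Markdown cell, using only the standard library. This is an assumption-laden illustration rather than the code actually used in the GLAM Workbench repositories: the function names `readme_to_notebook` and `write_index` are hypothetical, and it targets the nbformat 4.4 JSON layout.

```python
import json


def readme_to_notebook(readme_text):
    """Wrap Markdown text in a single-cell notebook (nbformat 4.4 layout)."""
    return {
        "cells": [
            {
                "cell_type": "markdown",
                "metadata": {},
                # Notebook files store cell source as a list of lines.
                "source": readme_text.splitlines(keepends=True),
            }
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


def write_index(readme_path="README.md", index_path="index.ipynb"):
    """Read a README file and save it as a notebook (hypothetical helper)."""
    with open(readme_path, encoding="utf-8") as f:
        notebook = readme_to_notebook(f.read())
    with open(index_path, "w", encoding="utf-8") as f:
        json.dump(notebook, f, indent=1)


# Made-up README content for demonstration.
notebook = readme_to_notebook("# Example section\n\nSome text.\n")
print(json.dumps(notebook, indent=1))
```

Calling `write_index()` in a repository's root folder would then produce an index.ipynb that Jupyter Lab can open as a front page.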

Continue reading →

A template for GLAM Workbench development

I’m hoping that the GLAM Workbench will encourage GLAM organisations and GLAM data nerds (like me) to create their own Jupyter notebooks. If they do, they can put a link to them in the list of GLAM Jupyter resources. But what if they want to add the notebooks to the GLAM Workbench itself? To make this easier, I’ve been working on a template repository for the GLAM Workbench. It generates a new skeleton repository with all the files you need to develop and manage your own section of the GLAM Workbench.

Continue reading →

More thoughts on the Trove researcher platform for advanced research

Previously on ‘What could we do with $2.3 million?’, the National Library of Australia produced a draft plan for an ‘Advanced Researcher Platform’ that was thoroughly inadequate. Rather than submit this plan to the ARDC for consideration as part of the HASS RDC process, the NLA wisely decided to make some fundamental changes. The redrafted draft is now available for re-feedback. This is where we pick up the story… So what has changed?

Continue reading →

Coming up! GLAM Workbench at ResBaz(s)

Want a bit of added GLAM with your digital research skills? You’re in luck, as I’ll be speaking at not one, but three ResBaz events in November. If you haven’t heard of it before, ResBaz (Research Bazaar) is ‘a worldwide festival promoting the digital literacy at the centre of modern research’. On Wednesday, 24 November I’ll be giving a key story presentation (like a keynote, but with more story!) entitled Exploring GLAM data for ResBaz Queensland.

Continue reading →

New video – using the Trove Newspaper & Gazette Harvester

The latest help video for the GLAM Workbench walks through the web app version of the Trove Newspaper & Gazette Harvester. Just paste in your search URL and Trove API key and you can harvest thousands of digitised newspaper articles in minutes!

Continue reading →

Harvest newspaper issues as PDFs

An inquiry on Twitter prompted me to put together a notebook that you can use to download all available issues of a newspaper as PDFs. It was really just a matter of copying code from other tools and making a few modifications. The first step harvests a list of available issues for a particular newspaper from Trove. You can then download the PDFs of those issues, supplying an optional date range.
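The optional date range step can be pictured with a small Python sketch like the one below. It is not the notebook's code: the `filter_issues` function, the `id` and `date` field names, and the issue records are all invented for illustration, and it assumes issue dates are ISO-formatted strings.

```python
from datetime import date


def filter_issues(issues, start=None, end=None):
    """Keep issues dated within the optional, inclusive start/end range."""
    selected = []
    for issue in issues:
        issue_date = date.fromisoformat(issue["date"])
        if start and issue_date < start:
            continue
        if end and issue_date > end:
            continue
        selected.append(issue)
    return selected


# Made-up issue records; the `id` and `date` field names are assumptions.
issues = [
    {"id": "issue-1", "date": "1914-08-01"},
    {"id": "issue-2", "date": "1914-08-08"},
    {"id": "issue-3", "date": "1914-08-15"},
]

# Leaving out `end` means only the start of the range is enforced.
selected = filter_issues(issues, start=date(1914, 8, 5))
print([issue["id"] for issue in selected])
```

Leaving out both `start` and `end` returns every issue, which matches the idea of the date range being optional.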

Continue reading →

GLAM Workbench now in the Nectar Research Cloud!

The GLAM Workbench isn’t dependent on one big piece of technological infrastructure. It’s basically a collection of Jupyter notebooks, and those notebooks can be used within a variety of different environments. This helps make the GLAM Workbench more sustainable – new components can be swapped in and out as required. It also makes it possible to create different pathways for users, depending on their digital skills, institutional support, and research needs.

Continue reading →

More GLAM Name Index updates from Queensland State Archives and SLWA

A new version of the GLAM Name Index Search is available. An additional 49 indexes have been added, bringing the total to 246. You can now search for names in more than 10.2 million records from 9 organisations. The new indexes come from Queensland State Archives and the State Library of WA. QSA announced on Friday that they’d added two new indexes to their site. When I went to harvest them, I realised there were another 25 indexes that I hadn’t previously picked up.

Continue reading →

Getting data about newspaper issues in Trove

When you search Trove’s newspapers, you find articles – these articles are grouped by page, and all the pages from a particular date make up an issue. But how do you find out what issues are available? How do you get a list of dates when newspapers were published? This notebook in the GLAM Workbench shows how you can get information about issues from the Trove API. Using the notebook, I’ve created a couple of datasets ready for download and use.

Continue reading →

GLAM Workbench at eResearch Australasia 2021

Way back in 2013, I went to the eResearch Australasia conference as the manager of Trove to talk about new research possibilities using the Trove API. Eight years later I was back, still spruiking the possibilities of Trove data. This time, however, I was discussing Trove in the broader context of GLAM data – all the exciting possibilities that have emerged as galleries, libraries, archives and museums make more of their collections available in machine-readable form.

Continue reading →