GLAM Workbench – preprint for 'Building User-Friendly Toolkits and Platforms for Digital Humanities'

Thursday, June 5, 2025

This is a preprint of my contribution to the publication ‘Building User-Friendly Toolkits and Platforms for Digital Humanities’. It provides a brief overview of the GLAM Workbench. I had to leave a lot out, but hopefully it provides a useful summary of what the GLAM Workbench is, and what I’d like it to be.

The GLAM Workbench is a collection of tools and resources created to help researchers use and explore the digital collections of GLAM organisations (galleries, libraries, archives, and museums).¹ It’s mainly focused on collections from Australia and New Zealand, but some sections venture across international boundaries to explore topics such as web archives and Wikidata.

GLAM organisations make a lot of rich cultural data available online, but getting that data in a machine-readable form that can be aggregated and analysed is often difficult. The GLAM Workbench tries to fill this gap by providing code examples and API documentation, but data access alone is not enough. Researchers need to understand the history, structure, and extent of the data – both its limits and its possibilities. By sharing snapshots, building overviews, and exploring patterns and inconsistencies, the GLAM Workbench also attempts to contextualise GLAM collections and open them to new types of questions.²

History and motivation

I created the GLAM Workbench in 2017, but it incorporates the latest versions of tools, such as the Trove Newspaper Harvester, which I’ve been maintaining for more than 15 years.³ One of my motivations was simply to bring together useful snippets, notes, and doodles from a variety of blog posts, web applications, and code repositories, and make them available in a form that could be more easily navigated and maintained.

I was also keen to explore the way that Jupyter notebooks combine code and narrative. I wanted to find ways to support researchers as they developed their digital skills and confidence, not just dump them at the command line or point them to an app.

The ongoing development of the GLAM Workbench is also part of my own research. I’m interested in the meaning of access within the context of GLAM collections. What changes when you can download data and explore collections beyond the limitations of the web interface?⁴

Contents and technologies

At its heart, the GLAM Workbench comprises at least 171 Jupyter notebooks and 59 datasets shared through more than 70 GitHub repositories.⁵ Added to this are a number of web apps, online databases, and guides to related resources. Code from some notebooks has also been spun off into independent Python packages. All of this is brought together within a single documentation site, built using MkDocs Material.

The contents are mostly organised by institution, reflecting the idiosyncrasies of the data. I’ve partially implemented tags to draw together similar resources across institutions, but this needs to be made more consistent, ideally using the TaDiRAH taxonomy.⁶ Many of the notebooks describe methods for accessing data and building datasets. Others demonstrate techniques for visualisation and analysis, suggest workarounds for limits imposed by collection interfaces, or provide example-driven documentation for APIs and datasets.

There is no single platform or server underlying the GLAM Workbench. Instead, it follows a pattern described in the ARDC Community Data Lab’s architecture principles as ‘infrastructure at rest’.⁷ Notebooks can be run as required in a variety of contexts from cloud services to local computers. This is made possible by standardised configuration files and automated processes that build virtual computing environments from each GitHub repository.⁸
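
As a rough illustration of what 'standardised configuration files' can mean in practice, the sketch below checks which environment files a repository folder contains. The file names follow common repo2docker conventions and are an assumption, not a description of the GLAM Workbench's actual setup; the function name `find_config_files` is hypothetical.

```python
from pathlib import Path

# Assumption: these names follow common repo2docker conventions.
# The GLAM Workbench's repositories may use a different set of files.
CONFIG_FILES = ("requirements.txt", "environment.yml", "runtime.txt", "Dockerfile")


def find_config_files(repo_dir):
    """Return the names of known environment config files found in repo_dir."""
    repo = Path(repo_dir)
    # Keep the order of CONFIG_FILES so results are predictable.
    return [name for name in CONFIG_FILES if (repo / name).is_file()]
```

A build service could use a check like this to decide how to construct the computing environment before launching any notebooks.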

Impact and engagement

The GLAM Workbench has helped to expand understanding of the research possibilities of GLAM collection data. The list of publications citing the GLAM Workbench or one of its embedded tools now includes more than 100 entries.⁹ Some of these relate to individual research projects, while others survey the practices of GLAM organisations and the needs of research infrastructure around the world.

My work on the GLAM Workbench has helped inspire organisations such as the National Library of Scotland to explore new ways of supporting digital research.¹⁰ A recent report from the UK’s ‘Towards a National Collection’ project mentions the GLAM Workbench alongside a number of national libraries in Europe and the USA for ‘encouraging innovative research and expanding public engagement with heritage resources’.¹¹

And yet, there are disappointments. Most of the Australian GLAM organisations whose collections are featured in the GLAM Workbench have shown little interest in sharing or engaging with its resources. This makes it difficult to get tools to the people who could benefit from them. There’s some irony in the fact that the websites of the National Library of Scotland, the British Library, the UK National Archives, the V&A Museum, and DigitalNZ all include links to the GLAM Workbench, but the National Library of Australia (NLA) and the National Archives of Australia (NAA) do not.

Maintenance and sustainability

While a number of individuals have contributed notebooks and additions to the GLAM Workbench, it remains essentially a one-man operation. Over the years, I’ve sought to ease the maintenance burden by automating processes, adding some basic testing frameworks, and generating machine-readable metadata that summarises the contents of each repository. For example, I created a GLAM Workbench repository template that makes it easy to start work on a new topic.¹²
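
To give a sense of what machine-readable metadata summarising a repository might look like, here is a minimal sketch that lists the notebooks in a folder as an RO-Crate-style JSON-LD document. The layout follows the general shape of RO-Crate 1.1, but the function name `summarise_repository` and the choice of fields are assumptions for illustration; the metadata actually generated for the GLAM Workbench is likely to be richer.

```python
import json
from pathlib import Path


def summarise_repository(repo_dir, name):
    """Build an RO-Crate-style summary of the notebooks in repo_dir."""
    # Assumption: notebooks sit at the top level of the repository.
    # File names are used as-is; names with spaces would need URL-encoding.
    notebooks = sorted(p.name for p in Path(repo_dir).glob("*.ipynb"))
    graph = [
        {
            # Descriptor entity pointing at the root dataset.
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        {
            # Root dataset: the repository itself.
            "@id": "./",
            "@type": "Dataset",
            "name": name,
            "hasPart": [{"@id": nb} for nb in notebooks],
        },
    ]
    # One File entity per notebook.
    graph.extend({"@id": nb, "@type": "File", "name": nb} for nb in notebooks)
    return {"@context": "https://w3id.org/ro/crate/1.1/context", "@graph": graph}


if __name__ == "__main__":
    print(json.dumps(summarise_repository(".", "Example repository"), indent=2))
```

Because the summary is plain JSON, it can be regenerated automatically whenever a repository changes and then aggregated across repositories.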

Development of the web archives section of the GLAM Workbench was made possible by a grant from the International Internet Preservation Consortium, and the section’s ongoing maintenance is supported by the British Library.¹³ I’m grateful too for my GitHub sponsors who help cover some of my cloud hosting bills, and to the ARDC for funding to integrate RO-Crate metadata.¹⁴ But beyond this, the GLAM Workbench has received no dedicated funding or institutional support. It has, nonetheless, outlived some well-funded digital infrastructure projects in the HASS sector.

Sustainability means more than money, though. The GLAM Workbench doesn’t have to continue in its current form to have a long-term impact. My focus is on ensuring that its contents are open to future reuse and modification. Everything is openly licensed, published through GitHub, and preserved in Zenodo. If tools are useful they can live on, independent of me.

The future

I’m writing this at a difficult time. Changes wrought by the NLA and NAA in early 2025 have made it impossible for me to continue work on the Trove and RecordSearch sections of the GLAM Workbench.¹⁵ In the Trove section alone, there are more than 70 notebooks.

The GLAM Workbench is not my job; no-one pays me. I work on it because I think it’s useful and important, and because I enjoy the process of solving problems and helping researchers. The NLA’s actions, in particular, have robbed me of that joy, and made me consider whether I want to continue. Research infrastructure is people.

On the other hand, there are many more GLAM collections for me to explore. I’m also hoping to find new ways of collaborating with individuals and institutions. I’m often inspired to create new tools and resources by gnarly questions from researchers. While such questions continue, so the GLAM Workbench will grow.

References

Ames, Sarah, and Lucy Havens. “Exploring National Library of Scotland Datasets with Jupyter Notebooks.” IFLA Journal, December 27, 2021. doi.org/10.1177/0….

Bailey, Rebecca, Javier Pereda, Chris Michaels, and Tom Callahan. “Unlocking the Potential of Digital Collections. A Call to Action.” Towards a National Collection, November 21, 2024. doi.org/10.5281/z….

Candela, Gustavo, Sally Chambers, and Tim Sherratt. “An Approach to Assess the Quality of Jupyter Projects Published by GLAM Institutions.” Journal of the Association for Information Science and Technology 74, no. 13 (2023): 1550–64. doi.org/10.1002/a….

“GLAM Workbench (GitHub Organisation).” Accessed June 5, 2025. github.com/GLAM-Work….

IIPC. “Asking Questions with Web Archives – Introductory Notebooks for Historians.” Accessed June 5, 2025. netpreserve.org/projects/….

Jackson, Andy. “GLAM Workbench Update.” UK Web Archive Blog. Accessed June 2, 2025. blogs.bl.uk/webarchiv….

Sefton, Peter, Tom Honeyman, Tim Sherratt, and Conal Tuohy. “The ARDC Community Data Lab Architecture: Research Software Deployment Principles and Patterns for Integrity, Reproducibility and Sustainability,” May 10, 2024. zenodo.org/records/1….

Sherratt, Tim. “Develop a New GLAM Workbench Repository.” GLAM Workbench. Accessed June 5, 2025. glam-workbench.net/get-invol….

———. “Farewell Trove.” Tim Sherratt – Sharing Recent Updates and Work-in-Progress, May 7, 2025. updates.timsherratt.org/2025/05/0….

———. “GLAM Workbench.” Zenodo, June 5, 2025. doi.org/10.5281/z….

———. “GLAM Workbench.” Accessed June 5, 2025. glam-workbench.net/..

———. “GLAM Workbench Citations.” GLAM Workbench. Accessed June 5, 2025. glam-workbench.net/citations….

———. “Hacking Heritage: Understanding the Limits of Online Access.” In The Routledge International Handbook of New Digital Practices in Galleries, Libraries, Archives, Museums and Heritage Sites, edited by H Lewi, W Smith, S Cooke, and D vom Lehn, 116–30. London & New York: Routledge, 2020. doi.org/10.5281/z….

———. “No More Harvesting Data from the National Archives of Australia.” Tim Sherratt – Sharing Recent Updates and Work-in-Progress, May 19, 2025. updates.timsherratt.org/2025/05/1….

———. “Some Important Updates for the Trove Newspaper & Gazette Harvester.” Tim Sherratt – Sharing Recent Updates and Work-in-Progress, August 31, 2023. updates.timsherratt.org/2023/08/3….

———. “Supporters.” GLAM Workbench. Accessed June 5, 2025. glam-workbench.net/get-invol….

———. “Trove Newspapers: Data Dashboard.” Accessed June 5, 2025. wragge.github.io/trove-new….

———. “Trove-Newspaper-Harvester.” Python, October 23, 2023. doi.org/10.5281/z….

Sherratt, Tim, Harry Keightley, Ben Foley, and Michael Niemann. “GLAM-Workbench/Glam-Workbench-Template.” Python. GLAM Workbench, August 24, 2023. github.com/GLAM-Work….

“TaDiRAH The Taxonomy of Digital Research Activities in the Humanities.” Accessed June 5, 2025. tadirah.info/..

Talboom, Leontien, and Mark Bell. “Keeping It under Lock and Keywords: Exploring New Ways to Open up the Web Archives with Notebooks.” Archival Science, July 4, 2022. doi.org/10.1007/s….

“Trove Historical Data.” Accessed June 5, 2025. zenodo.org/communiti….

Notes

¹ Sherratt, “GLAM Workbench.”
² See, for example: Sherratt, “Trove Newspapers: Data Dashboard.” and “Trove Historical Data.”
³ Sherratt, “Trove-Newspaper-Harvester.”
⁴ See, for example: Sherratt, “Hacking Heritage: Understanding the Limits of Online Access.”
⁵ “GLAM Workbench (GitHub Organisation).”
⁶ “TaDiRAH The Taxonomy of Digital Research Activities in the Humanities.”
⁷ Sefton et al., “The ARDC Community Data Lab Architecture.”
⁸ For more on best practices in sharing Jupyter projects, see: Candela, Chambers, and Sherratt, “An Approach to Assess the Quality of Jupyter Projects Published by GLAM Institutions.”
⁹ Sherratt, “GLAM Workbench Citations.”
¹⁰ Ames and Havens, “Exploring National Library of Scotland Datasets with Jupyter Notebooks.” For another example of the GLAM Workbench’s influence, see: Talboom and Bell, “Keeping It under Lock and Keywords.”
¹¹ Bailey et al., “Unlocking the Potential of Digital Collections. A Call to Action,” 58.
¹² Sherratt et al., “GLAM-Workbench/Glam-Workbench-Template.” For documentation see: Sherratt, “Develop a New Repository.”
¹³ “Asking Questions with Web Archives – Introductory Notebooks for Historians.”; Jackson, “GLAM Workbench Update.”
¹⁴ Sherratt, “Supporters.”; Sherratt, “Some Important Updates for the Trove Newspaper & Gazette Harvester.”
¹⁵ Sherratt, “Farewell Trove.”; Sherratt, “No More Harvesting Data from the National Archives of Australia.”
