glamworkbench

Explore Trove's digitised newspapers by place

Monday, September 5, 2022

I’ve updated my map displaying places where Trove digitised newspapers were published or distributed. You can view all the places on single map – zoom in for more markers, and click on a marker for title details and a link back to Trove. If you want to find newspapers from a particular area, just click on a location using this map to view the 10 closest titles. You can view or download the dataset used to construct the map.

Continue reading →

Making NSW Postal Directories (and other digitised directories) easier to search with the GLAM Workbench and Datasette

Thursday, September 1, 2022

As part of my work on the Everyday Heritage project I’m looking at how we can make better use of digitised collections to explore the everyday experiences woven around places such as Parramatta Road in Sydney. For example, the NSW Postal Directories from 1886 to 1908 and 1909 to 1950 have been digitised by the State Library of NSW and made available through Trove. The directories list residences and businesses by name and street location.

Continue reading →

Monday, August 29, 2022 →

Interested in Victorian shipwrecks? Kim Doyle and Mitchell Harrop have added a new notebook to the Heritage Council of Victoria section of the GLAM Workbench exploring shipwrecks in the Victorian Heritage Database: glam-workbench.net/heritage-…

Monday, August 29, 2022 →

Updates!

troveharvester Python package updated to v0.5.1: github.com/wragge/tr…
Trove Newspaper Harvester section of #GLAMWorkbench updated to v1.1.1 to use latest troveharvester: glam-workbench.net/trove-har…

Thursday, August 25, 2022 →

Minor update to RecordSearch Data Scraper – now captures ‘institution title’ for agencies if it is present. pypi.org/project/r…

Many thanks to the British Library – sponsors of the GLAM Workbench’s web archives section!

Tuesday, August 16, 2022

You might have noticed some changes to the web archives section of the GLAM Workbench. I’m very excited to announce that the British Library is now sponsoring the web archives section! Many thanks to the British Library and the UK Web Archive for their support – it really makes a difference. The web archives section was developed in 2020 with the support of the International Internet Preservation Consortium’s Discretionary Funding Programme, in collaboration with the British Library, the National Library of Australia, and the National Library of New Zealand.

Continue reading →

New GLAM data to search, visualise and explore using the GLAM Workbench!

Monday, August 15, 2022

There’s lots of GLAM data out there if you know where to look! For the past few years I’ve been harvesting a list of datasets published by Australian galleries, libraries, archives, and museums through open government data portals. I’ve just updated the harvest and there’s now 463 datasets containing 1,192 files. There’s a human-readable version of the list that you can browse. If you just want the data you can download it as a CSV.

Continue reading →

Zotero now saves links to digitised items in Trove from the NLA catalogue!

Tuesday, August 9, 2022

I’ve made a small change to the Zotero translator for the National Library of Australia’s catalogue. Now, if there’s a link to a digitised version of the work in Trove, that link will be saved in Zotero’s url field. This makes it quicker and easier to view digitised items – just click on the ‘URL’ label in Zotero to open the link. It’s also handy if you’re viewing a digitised work in Trove and want to capture the metadata about it.

Continue reading →

View embedded JSON metadata for Trove's digitised books and journals

Monday, August 1, 2022

The metadata for digitised books and journals in Trove can seem a bit sparse, but there’s quite a lot of useful metadata embedded within Trove’s web pages that isn’t displayed to users or made available through the Trove API. This notebook in the GLAM Workbench shows you how you can access it. To make it even easier, I’ve added a new endpoint to my Trove Proxy that returns the metadata in JSON format.

Continue reading →

Where did all those NSW articles go? Trove Newspapers Data Dashboard update!

Friday, July 29, 2022

I was looking at my Trove Newspapers Data Dashboard again last night trying to figure out why the number of newspaper articles from NSW seemed to have dropped by more than 700,000 since my harvesting began. It took me a while to figure out, but it seems that the search index was rebuilt on 31 May, and that caused some major shifts in the distribution of articles by state, as reported by the main result API.

Continue reading →

Catching up – some recent GLAM Workbench updates!

Thursday, July 28, 2022

There’s been lots of small updates to the GLAM Workbench over the last couple of months and I’ve fallen behind in sharing details. So here’s an omnibus list of everything I can remember… Data Weekly harvests of basic Trove newspaper data continue, there’s now about three months worth. You can view a summary of the harvested data through the brand new Trove Newspaper Data Dashboard. The Dashboard is generated from a Jupyter notebook and is updated whenever there’s a new data harvest.

Continue reading →

Thursday, July 14, 2022 →

Updated dataset! Harvests of Trove list metadata from 2018, 2020, and 2022 are now available on Zenodo: doi.org/10.5281/z… Another addition to the growing collection of historical Trove data. #GLAMWorkbench

Screen capture of version information from Zenodo showing that there are three available versions, v1.0, v1.1, and v1.2.

Saturday, July 9, 2022 →

Coz I love making work for myself, I’ve started pulling datasets out of #GLAMWorkbench code repos & creating new data repos for them. This way they’ll have their own version histories in Zenodo. Here’s the first: github.com/GLAM-Work…

Tuesday, June 28, 2022 →

Ahead of my session at #OzHA2022 tomorrow, I’ve updated the NAA section of the #GLAMWorkbench. Come along to find out how to harvest file details, digitsed images, and PDFs, from a search in RecordSearch! github.com/GLAM-Work…

Sunday, June 26, 2022 →

Noticed that QueryPic was having a problem with some date queries. Should be fixed in the latest release of the Trove Newspapers section of the #GLAMWorkbench: glam-workbench.net/trove-new… #maintenance #researchinfrastructure

Friday, June 24, 2022 →

The Trove Newspapers section of the #GLAMWorkbench has been updated! Voilá was causing a problem in QueryPic, stopping results from being downloaded. A package update did the trick! Everything now updated & tested. glam-workbench.net/trove-new…

Friday, June 24, 2022 →

Some more #GLAMWorkbench maintenance – this app to download a high-res page images from Trove newspapers now doesn’t require an API key if you have a url, & some display problems have been fixed. trove-newspaper-apps.herokuapp.com/voila/ren…

Screen shot of app -- Download a page image The Trove web interface doesn't provide a way of getting high-resolution page images from newspapers. This simple app lets you download page images as complete, high-resolution JPG files.

Thursday, June 23, 2022 →

The Trove Newspaper and Gazette Harvester section of the #GLAMWorkbench has been updated! No major changes to notebooks, just lots of background maintenance stuff such as updating packages, testing, linting notebooks etc. glam-workbench.net/trove-har…

Wednesday, June 1, 2022 →

Ordering some #GLAMWorkbench stickers…

Proof image of a hexagonal sticker. The sticker has white lettering on a blue blackground which reads GLAM Workbench. In the centre is a crossed hammer and wrench icon.

Using Datasette on Nectar

Thursday, May 26, 2022

If you have a dataset that you want to share as a searchable online database then check out Datasette – it’s a fabulous tool that provides an ever-growing range of options for exploring and publishing data. I particularly like how easy Datasette makes it to publish datasets on cloud services like Google’s Cloudrun and Heroku. A couple of weekends ago I migrated the TungWah Newspaper Index to Datasette. It’s now running on Heroku, and I can push updates to it in seconds.

Continue reading →

Convert your Trove newspaper searches to an API query with just one click!

Friday, May 20, 2022

I’m thinking about the Trove Researcher Platform discussions & ways of integrating Trove with other apps and platforms (like the GLAM Workbench). As a simple demo I modifed my Trove Proxy app to convert a newspaper search url from the Trove web interface into an API query (using the trove-query-parser package). The proxy app then redirects you to the Trove API Console so you can see the results of the API query without needing a key.

Continue reading →

My Trove researcher platform wishlist

Wednesday, May 11, 2022

The ARDC is collecting user requirements for the Trove researcher platform for advanced research. This is a chance to start from scratch, and think about the types of data, tools, or interface enhancements that would support innovative research in the humanities and social sciences. The ARDC will be holding two public roundtables, on 13 and 20 May, to gather ideas. I created a list of possible API improvements in my response to last year’s draft plan, and thought it might be useful to expand that a bit, and add in a few other annoyances, possibilities, and long-held dreams.

Continue reading →

Tuesday, May 10, 2022 →

Spending the evening updating the NAA section of the #GLAMWorkbench. Here’s a fresh harvest of the agency functions currently being used in RecordSearch… gist.github.com/wragge/d1…

Working with Trove data – a collection of tools and resources

Monday, May 2, 2022

The ARDC is organising a couple of public forums to help gather researcher requirements for the Trove component of the HASS RDC. One of the roundtables will look at ‘Existing tools that utilise Trove data and APIs’. Last year I wrote a summary of what the GLAM Workbench can contribute to the development of humanities research infrastructure, particularly in regard to Trove. I thought it might be useful to update that list to include recent additions to the GLAM Workbench, as well as a range of other datasets, software, tools, and interfaces that exist outside of the GLAM Workbench.

Continue reading →

Saturday, April 30, 2022 →

And so it starts… #GLAMWorkbench

Screenshot of GLAM Workbook welcome page. Text states: 'This is a companion to the GLAM Workbench. Here you'll documentation, tips, tutorials, and exercises to help you work with digital collections from galleries, libraries, archives, and museums (the GLAM sector).'

Thursday, April 28, 2022 →

Ok, I’ve created a new #GLAMWorkbench meta issue to try and bring together all the things I’m trying to do to improve & automate the code & documentation. This should help me keep track of things… github.com/GLAM-Work… #DayofDH2022

Thursday, April 28, 2022 →

A couple of hours of #DayofDH2022 left – feeling a bit uninspired, so I’m going to do some pruning & reorganising of the #GLAMWorkbench issues list: github.com/GLAM-Work…

Tracking Trove changes over time

Wednesday, April 20, 2022

I’ve been doing a bit of cleaning up, trying to make some old datasets more easily available. In particular I’ve been pulling together harvests of the number of newspaper articles in Trove by year and state. My first harvests date all the way back to 2011, before there was even a Trove API. Unfortunately, I didn’t run the harvests as often as I should’ve and there are some big gaps. Nonetheless, if you’re interested in how Trove’s newspaper corpus has grown and changed over time, you might find them useful.

Continue reading →

The GLAM Workbench wants you!

Wednesday, March 2, 2022

Over the past few months I’ve been doing a lot of behind-the-scenes work on the GLAM Workbench – automating, standardising, and documenting processes for developing and managing repositories. These sort of things ease the maintenance burden on me and help make the GLAM Workbench sustainable, even as it continues to grow. But these changes are also aimed at making it easier for you to contribute to the GLAM Workbench! Perhaps you’re part of a GLAM organisation that wants to help researchers explore its collection data – why not create your own section of the GLAM Workbench?

Continue reading →

Omeka S Tools – new Python package

Thursday, February 17, 2022

Over the last couple of years I've been fiddling with bits of Python code to work with the Omeka S REST API. The Omeka S API is powerful, but the documentation is patchy, and doing basic things like uploading images can seem quite confusing. My code was an attempt to simplify common tasks, like creating new items. In case it's of use to others, I've now shared my code as a Python package.

Continue reading →