Tim Sherratt

Sharing recent updates and work-in-progress

Jun 2025

New dataset – Trove links shared on Twitter, 2009 to 2020

A few years ago, I harvested the details of tweets that included links to Trove. The data has just been sitting on my computer, so I thought I should package it up and share, in case it’s of use to anyone.

The story is that back in 2021, I was working on the article ‘More than newspapers’ for a special section of History Australia focusing on Trove. I was thinking that I might include something about the way Trove newspaper articles were mobilised within online discussions about history – a topic I first explored in ‘Life on the outside: connections, contexts, and the wild, wild web’, my keynote for the Annual Conference of the Japanese Association of Digital Humanities in 2014. In the end, the article went in another direction, so I didn’t use the data.

I remembered this recently and thought I should do something with it. I’ve now created a dataset and shared it on Zenodo. I’m not working on Trove any more, but I’m hoping that someone else might find the data useful!

The dataset contains information about tweets from 2009 to 2020 that include links to Trove. The tweet data was compiled using Twarc in May 2021, under Twitter’s academic access program. The search queries used were:

  • url:nla.gov.au/nla.news
  • url:trove.nla.gov.au
  • url:newspapers.nla.gov.au

Many of the tweets were produced by bots. Fortunately, I’d been maintaining a list of Trove bots on Twitter, so I used the list to separate the tweets into two files, one for bots and one for ordinary users.
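
As a rough sketch of how this kind of separation might work (not the code actually used to build the dataset), the harvested records can be partitioned by checking each account name against the bot list. The `user` field and the account names below are hypothetical, invented for illustration.

```python
def split_by_bot_list(rows, bot_names, user_field="user"):
    """Split tweet records into (human_rows, bot_rows) using a known bot list.

    `user_field` is a hypothetical column name; the real harvest may differ.
    """
    # Compare case-insensitively, since Twitter handles ignore case.
    bots = {name.lower() for name in bot_names}
    human_rows, bot_rows = [], []
    for row in rows:
        target = bot_rows if row[user_field].lower() in bots else human_rows
        target.append(row)
    return human_rows, bot_rows


# Toy records; the account names are made up.
sample = [
    {"tweet_id": "1", "user": "ExampleTroveBot"},
    {"tweet_id": "2", "user": "a_historian"},
    {"tweet_id": "3", "user": "exampletrovebot"},
]
humans, bots = split_by_bot_list(sample, ["ExampleTroveBot"])
print(len(humans), len(bots))
```

Any bot missing from the list ends up in the human file, which is why the human file may still contain some unidentified bots.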

To respect user intentions and comply with the Twitter API terms of use, I removed all the tweet information except for tweet_id and tweet_date from the files. If it hasn’t been deleted, the full data for each tweet can probably be obtained from the X API using the tweet_id, though you might need a paid subscription.
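
If you do want to try retrieving the full tweets, a minimal sketch of the first step might look like this. It assumes the X API v2 tweet lookup endpoint, which accepts up to 100 IDs per request; the endpoint URL and batch size are my assumptions, not details from the dataset. Authentication and the HTTP requests themselves are left out.

```python
from urllib.parse import urlencode

# Assumed values: the v2 tweet lookup endpoint and its per-request ID limit.
LOOKUP_ENDPOINT = "https://api.x.com/2/tweets"
MAX_IDS_PER_REQUEST = 100


def batch_ids(tweet_ids, size=MAX_IDS_PER_REQUEST):
    """Yield successive lists of at most `size` tweet IDs."""
    ids = list(tweet_ids)
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def lookup_urls(tweet_ids):
    """Build one lookup URL per batch of IDs (no request is sent here)."""
    return [
        f"{LOOKUP_ENDPOINT}?{urlencode({'ids': ','.join(batch)})}"
        for batch in batch_ids(tweet_ids)
    ]


# Keep IDs as strings so that tools that treat them as floats can't round them.
example_ids = [str(n) for n in range(250)]
print(len(lookup_urls(example_ids)))
```

Deleted or protected tweets will simply be missing from the responses, so the number of tweets recovered is likely to be smaller than the number of IDs in the files.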

The two main files are:

  • trove_url_tweets.csv – links shared by human users (although it may include some unidentified bots)
  • trove_url_tweets_bots.csv – links shared by bots

I also created some additional data files:

  • trove_url_totals.csv – the number of times each Trove link was shared by users (not including bots)
  • active_users_per_year.csv – the number of unique users each year who shared a link to Trove
  • active_bots_per_year.csv – the number of active bots each year sharing links to Trove
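
The per-year counts are a simple aggregate: collect the distinct accounts seen in each year, then count them. Here is a sketch of that calculation, assuming each raw record is a (tweet_date, user) pair with an ISO-formatted date. Both the record shape and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def active_accounts_per_year(records):
    """Count the distinct accounts that tweeted in each calendar year.

    `records` is an iterable of (tweet_date, user) pairs, where tweet_date
    is an ISO 8601 string. This shape is an assumption for illustration.
    """
    seen = defaultdict(set)
    for tweet_date, user in records:
        year = datetime.fromisoformat(tweet_date).year
        seen[year].add(user)  # sets drop repeat tweets by the same account
    return {year: len(users) for year, users in sorted(seen.items())}


# Invented sample data.
sample = [
    ("2009-05-01T10:00:00", "alice"),
    ("2009-06-02T11:30:00", "alice"),
    ("2009-07-03T09:15:00", "bob"),
    ("2010-01-04T08:00:00", "alice"),
]
print(active_accounts_per_year(sample))  # {2009: 2, 2010: 1}
```

Note that an account active in several years is counted once in each of those years, so the yearly figures can't be added up to get the overall number of unique users.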

There’s more information about the structure and contents of the data files in the Zenodo record.

Overview

I haven’t explored the data in detail, but here are some quick summaries to give you a taste.

summary                                              count
number of unique users sharing Trove links           9,294
number of bots sharing Trove links                      43
number of tweets by humans containing Trove links   48,293
number of tweets by bots containing Trove links    318,797
number of unique links shared by humans             36,886
number of unique links shared by bots              270,501

What types of links were people sharing?

types of link shared by humans                                 count
newspaper article                                             34,548
other (search queries, home page etc)                          8,385
work (items other than newspapers – books, maps, photos etc)   4,856
newspaper page                                                 1,377
newspaper title                                                  400
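
Sorting links into these categories comes down to matching URL patterns. The sketch below is not the method used to build the dataset. The patterns are my assumptions, based on common Trove URL shapes such as `nla.gov.au/nla.news-article…`, `trove.nla.gov.au/newspaper/article/…`, and `trove.nla.gov.au/work/…`, and a real harvest would probably need more of them.

```python
import re
from collections import Counter

# Assumed URL shapes, checked in order; anything unmatched counts as "other".
PATTERNS = [
    ("newspaper article",
     re.compile(r"nla\.news-article\d+|/newspaper/article/\d+|/ndp/del/article/\d+")),
    ("newspaper page",
     re.compile(r"nla\.news-page\d+|/newspaper/page/\d+|/ndp/del/page/\d+")),
    ("newspaper title",
     re.compile(r"nla\.news-title\d+|/newspaper/title/\d+|/ndp/del/title/\d+")),
    ("work", re.compile(r"/work/\d+")),
]


def classify_trove_link(url):
    """Return a link-type label for a Trove URL, or "other" if none match."""
    for label, pattern in PATTERNS:
        if pattern.search(url):
            return label
    return "other"


# Example URLs; the work ID and search query are invented.
links = [
    "https://nla.gov.au/nla.news-article75869223",
    "https://trove.nla.gov.au/newspaper/page/502650",
    "https://trove.nla.gov.au/work/12345",
    "https://trove.nla.gov.au/search?keyword=weather",
]
print(Counter(classify_trove_link(url) for url in links))
```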

How did the number of links shared by humans vary across time?

Bar chart showing the number of Trove links shared on Twitter by year from 2009 to 2020. Colours indicate the type of Trove resource.

Which articles or pages were shared most often by humans? Here’s the top ten.

trove_id   trove_type  tweets  retweets  quotes  total times shared
75869223   article      1,232        61      34               1,327
1298497    article        141     1,028      53               1,222
102074798  article         74       693      77                 844
68141866   article        138       522      48                 708
41602327   article        633        30       0                 663
100645214  article        111       467      20                 598
502650     page             1       513      12                 526
60828173   article         48       444      19                 511
4173156    article         53       321      10                 384
79410604   article          2       303      69                 374

The most shared article reports that PM Menzies had described Hitler as a ‘great man’ at a meeting in July 1939. However, most of the tweets sharing this link came from a single user. A number of the other articles relate to the weather, a reflection of the fact that Trove’s newspaper articles have been mobilised on both sides of the climate change debate.

How many Twitter users were sharing links to Trove each year?

Bar chart showing the number of Twitter users sharing links to Trove each year from 2009 to 2020

I haven’t included any of the bot data in these summaries because I think I’ll write a second bot-themed post – coming soon!