Tim Sherratt

Sharing recent updates and work-in-progress

Nov 2022

Recent updates to trove-newspaper-harvester and trove-newspaper-images

Catching up on some software package updates from the last few months.

The trove-newspaper-harvester package is now at v0.6.5. Recent changes include:

  • Fix to handle articles with missing metadata
  • Don’t try to re-download existing text and PDF files on restart
  • Better error messages for the CLI
  • Better handling of exceptions
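
To illustrate the restart behaviour, here is a minimal sketch of the general "skip what's already on disk" pattern. It is not the harvester's actual code: the `save_if_missing` function and the `fetch` callable are hypothetical names invented for this example, and treating only non-empty files as complete is my assumption.

```python
from pathlib import Path
from typing import Callable


def save_if_missing(path: Path, fetch: Callable[[], bytes]) -> bool:
    """Save fetched content to `path` unless a non-empty file already exists.

    Hypothetical helper for illustration only.
    Returns True if content was fetched and saved, False if it was skipped.
    """
    # Assumption: a non-empty file means an earlier run finished this item.
    if path.exists() and path.stat().st_size > 0:
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary name first, so an interrupted run doesn't leave
    # a partial file that a later restart would mistake for a complete one.
    tmp_path = path.with_name(path.name + ".part")
    tmp_path.write_bytes(fetch())
    tmp_path.replace(path)
    return True
```

Passing the download step in as a callable means the network request is only made when the file is actually missing, so restarting a large harvest costs little more than a series of file checks.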

The trove-newspaper-images package is now at v0.2.1. Recent changes include:

  • Minor changes to make it easier to use this package within the trove-newspaper-harvester
  • Use argparse directly for the CLI, putting the initialisation within a function to avoid conflicts
  • Remove the messages printed to stdout
  • Update the repository and documentation to use nbdev v2
  • Don’t try to re-download existing images
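
The argparse change can be sketched roughly as follows. This is an illustration of the pattern rather than the package's real CLI: the program name, the `article_id` argument and the `--output-dir` option are all hypothetical. The idea, as I understand it, is that no parser is created or run when the module is imported by another package or a notebook.

```python
import argparse
from typing import Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    """Create the parser inside a function, so importing this module
    doesn't define or parse any command-line arguments."""
    # Hypothetical program name and arguments, for illustration only.
    parser = argparse.ArgumentParser(
        prog="example-images",
        description="Download images of a newspaper article.",
    )
    parser.add_argument("article_id", help="identifier of the article")
    parser.add_argument(
        "--output-dir", default="images", help="directory to save images in"
    )
    return parser


def main(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """CLI entry point. When argv is None, argparse reads sys.argv[1:]."""
    args = build_parser().parse_args(argv)
    # A real entry point would start the download here.
    return args
```

Because `main` accepts an explicit argument list, it can be called from other code or from tests, for example `main(["12345", "--output-dir", "out"])`, without touching the arguments of whatever process imported it.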