software . 2021

GLAM-Workbench/trove-newspapers

Sherratt, Tim
Open Access English
  • Published: 29 Oct 2021
  • Publisher: Zenodo
Abstract
Current version: v1.2.0

This repository contains Jupyter notebooks to work with data from Trove's newspapers zone. For more information see the Trove Newspapers section of the GLAM Workbench.

Notebook topics

Trove newspapers in context
  • Visualise the total number of newspaper articles in Trove by year and state – explore how Trove's newspaper articles are distributed over time, and by state
  • Analyse rates of OCR correction – explore patterns in OCR text correction; how many corrections are there and where have they been made?
  • Finding non-English newspapers in Trove – use automated language detection to identify non-English language newspapers in Trove
  • Beyond the copyright cliff of death – find newspapers with content published after 1954
  • Gathering historical data about the addition of newspaper titles to Trove – find when newspaper titles were added to Trove by extracting lists from web archives

Visualising searches
  • QueryPic – simple app to visualise newspaper searches over time, this is the latest version with many new features
  • QueryPic Deconstructed – an older version of QueryPic that lets you build queries using keywords, states, or newspapers
  • Visualise Trove newspaper searches over time – use facets to slice up newspaper search results and visualise over time
  • Map Trove newspaper results by state – create a choropleth map to visualise search results by state
  • Map Trove newspaper results by place of publication – links newspapers to their place of publication and maps the results
  • Map Trove newspaper results by place of publication over time – adds a time dimension to the example above

Harvesting data
See the Trove Newspaper and Gazette Harvester if you want to harvest all the articles from a search.
  • Harvest information about newspaper issues – get information about available issues for each newspaper from the Trove API
  • Harvest the issues of a newspaper as PDFs – harvest available issues of a newspaper as PDFs
  • Harvest Australian Women's Weekly covers (or the front pages of any newspaper) – harvest the front pages of any newspaper, including covers from the Australian Women's Weekly

Useful tools
  • Save a Trove newspaper article as an image – grabs the page on which an article was published, and then crops the page image to the boundaries of the article to create a complete, intact image of the article as it was originally published
  • Download a page image – a simple app that lets you download page images as complete, high-resolution JPG files
  • Generate an article thumbnail – generate a nice square thumbnail image for a newspaper article
  • Upload Trove newspaper articles to Omeka-S – steps through the process of uploading Trove newspaper articles to your own Omeka-S instance via the API

Tips and tricks
  • Today's news yesterday – uses the date index and the firstpageseq parameter to find articles from exactly 100 years ago that were published on the front page
  • Create a Trove OCR corrections ticker – uses the has:corrections parameter to get the total number of newspaper articles with OCR corrections
  • Get a list of Trove newspapers that doesn't include government gazettes – workaround for a problem with the newspaper/titles endpoint of the API
  • Get the page coordinates of a digitised newspaper article from Trove – demonstrates how to find the coordinates of a newspaper article on a digitised page

Get creative
  • Make composite images from lots of Trove newspaper thumbnails – creates thumbnails from a search and compiles them into a mega image
  • Create 'scissors and paste' messages from Trove newspaper articles – snip words out of page images and compile them into the message of your choice
  • Create large composite images from snipped words – harvest multiple versions of a list of words and compile them all into one big image

See the GLAM Workbench for more details.

Data files

CSV formatted lists of newspaper titles in Trove
  • trove_newspaper_titles_2009_2021.csv – complete dataset of captures and titles
  • trove_newspaper_titles_first_appearance_2009_2021.csv – filtered dataset, showing only the first appearance of each title / place / date range combination
There is also an alphabetical list of newspaper titles, showing approximately when they first appeared in Trove.

Other data files
  • CSV formatted list of Australian Women's Weekly issues, 1933 to 1982
  • Australian Women's Weekly front covers, 1933 to 1982 (2,566 images on Cloudstor)
  • Trove newspapers with non-English language content
  • Trove newspapers with articles published after 1954

For easy browsing, I've compiled the Australian Women's Weekly cover images into a set of PDF files, one for each decade, available from Dropbox:
  • 1933 to 1939
  • 1940 to 1949
  • 1950 to 1959
  • 1960 to 1969
  • 1970 to 1979
  • 1980 to 1982

Cite as
See the GLAM Workbench or Zenodo for up-to-date citation details.

This repository is part of the GLAM Workbench. If you think this project is worthwhile, you might like to sponsor me on GitHub.
Subjects
free text keywords: digital humanities, Trove, Jupyter, newspapers, GLAM Workbench
Communities
  • Digital Humanities and Cultural Heritage
  • Social Science and Humanities