Publication . 2022

Wattpad titles corpus

Pianzola, Federico; Rebora, Simone
Open Access
Published: 30 Aug 2022
Publisher: Open Science Framework
This is a collection of the titles of all stories published on Wattpad as of January 2018. The corpus contains around 30 million titles in more than 50 languages. It consists mainly of original fiction, with a small share of fan fiction (roughly 10%). The R Markdown files documenting the network analysis and sentiment analysis procedures can be found in the GitHub repository. We published an article based on this data.

books, classics, DH, digital humanities, digital literary studies, empirical literary studies, fiction, language recognition, literary modeling, literature, narrative, narratology, natural language processing, NLP, novels, reader response, readers, reading, sentiment analysis, social media, social reading, stories, teenagers, Wattpad, world literature

Related Organizations
Funded by
Reading Literature in a Digital Culture
  • Funder: European Commission (EC)
  • Project Code: 792849
  • Funding stream: H2020 | MSCA-IF-GF
Validated by funder
Related to Research communities
Digital Humanities and Cultural Heritage