research data . Dataset . 2021 . Embargo end date: 24 May 2021

Latvian user comment dataset 1.0

Shekhar, Ravi; Purver, Matthew; Pollak, Senja; Pelicon, Andraž; Krustok, Ivar;
Open Access
  • Published: 19 Apr 2021
  • Publisher: Ekspress Meedia Group
Abstract
The dataset is an archive of reader comments from the Delfi news site from 2014-2019, containing approximately 12M comments, mostly in the Latvian language, with some in Russian.

Description of the Datasets

There are 6 CSV files:

* ``lv-comments-2014.csv`` contains **2 753 655** comments from year 2014
* ``lv-comments-2015.csv`` contains **2 221 122** comments from year 2015
* ``lv-comments-2016.csv`` contains **1 897 669** comments from year 2016
* ``lv-comments-2017.csv`` contains **1 896 083** comments from year 2017
* ``lv-comments-2018.csv`` contains **2 222 051** comments from year 2018
* ``lv-comments-2019.csv`` contains **1 421 883** comments from year 2...
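The per-file counts quoted in the abstract can be checked against the stated total of approximately 12M comments. The following is a minimal Python sketch of that arithmetic, together with a hypothetical helper, `count_records`, for counting records in a downloaded file. The helper is not part of the dataset; it assumes the files are standard quoted CSV with one header row, which the description does not confirm, and it makes no assumption about column names.

```python
import csv
import io

# Comment counts per file, copied from the dataset description.
COUNTS = {
    "lv-comments-2014.csv": 2_753_655,
    "lv-comments-2015.csv": 2_221_122,
    "lv-comments-2016.csv": 1_897_669,
    "lv-comments-2017.csv": 1_896_083,
    "lv-comments-2018.csv": 2_222_051,
    "lv-comments-2019.csv": 1_421_883,
}


def total_comments(counts):
    """Sum the per-file comment counts."""
    return sum(counts.values())


def count_records(handle, has_header=True):
    """Count CSV records in an open text handle.

    Hypothetical helper: it counts parsed records rather than physical
    lines, because a quoted comment field may contain line breaks.
    `has_header=True` is an assumption about the file layout.
    """
    reader = csv.reader(handle)
    if has_header:
        next(reader, None)  # skip the assumed header row
    return sum(1 for _ in reader)


if __name__ == "__main__":
    print(total_comments(COUNTS))  # 12412463, i.e. roughly 12.4M
```

To compare a downloaded file with its listed count, the helper could be called as `count_records(open("lv-comments-2014.csv", newline="", encoding="utf-8"))`; the UTF-8 encoding is likewise an assumption.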
Funded by
EC | EMBEDDIA
Project
EMBEDDIA
Cross-Lingual Embeddings for Less-Represented Languages in European News Media
  • Funder: European Commission (EC)
  • Project Code: 825153
  • Funding stream: H2020 | RIA
Communities
Digital Humanities and Cultural Heritage