Research data. Dataset. 2020. Embargo end date: 02 Jul 2020

OAGL Paper Metadata Dataset

Çano, Erion
Open Access
  • Published: 30 Jun 2020
  • Publisher: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
OAGL is a paper metadata dataset of 17,528,680 records, each holding attributes of a scientific publication such as abstract, title, keywords, publication year, and venue. The last field of each record is the page length of the corresponding publication. Dataset records (samples) are stored as JSON lines, one record per line, in each text file. The data is derived from the OAG data collection, which was released under the ODC-BY license. This data (OAGL Paper Metadata Dataset) is released under the CC-BY license. If you use it, please cite the following paper: Çano Erion, Bojar Ondřej: How Many Pa...
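Because the records are JSON lines whose last field is the page length, reading them can be sketched roughly as follows. This is a minimal illustration, not the authors' loading code: the function name `iter_records` and the field names used in the sample lines (`title`, `year`, `n_pages`) are assumptions made up for the example, since the actual field names are not listed here. The sketch relies only on the stated property that the page length is the last field of each record.

```python
import json
from typing import Any, Dict, Iterable, Iterator, Tuple


def iter_records(lines: Iterable[str]) -> Iterator[Tuple[Dict[str, Any], Any]]:
    """Yield (record, last_field_value) for every non-empty JSON line.

    `lines` can be an open text file or any iterable of strings.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if not isinstance(record, dict) or not record:
            continue  # skip anything that is not a non-empty JSON object
        # json.loads preserves key order, so the last key is the last
        # field as written in the file (the page length, per the description).
        last_key = list(record)[-1]
        yield record, record[last_key]


# Hypothetical sample lines; the real field names may differ.
SAMPLE_LINES = [
    '{"title": "Example paper A", "year": 2019, "n_pages": 12}',
    '{"title": "Example paper B", "year": 2017, "n_pages": 8}',
]

page_lengths = [pages for _, pages in iter_records(SAMPLE_LINES)]
```

To read one of the dataset's text files instead of the sample list, pass an open file handle, for example `open(path, encoding="utf-8")`, where `path` points to a downloaded file. Iterating line by line keeps memory use flat, which matters for a collection of this size.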
Funded by
European Live Translator
  • Funder: European Commission (EC)
  • Project Code: 825460
  • Funding stream: H2020 | RIA
Digital Humanities and Cultural Heritage