Publication . Article . Other literature type . Preprint . 2021

Packaging research artefacts with RO-Crate

Stian Soiland-Reyes; Peter Sefton; Mercè Crosas; Leyla Jael Castro; Frederik Coppens; José M. Fernández; Daniel Garijo; +9 Authors
Open Access

An increasing number of researchers support reproducibility by including pointers to and descriptions of datasets, software and methods in their publications. However, scientific articles may be ambiguous, incomplete and difficult to process by automated systems. In this paper we introduce RO-Crate, an open, community-driven, and lightweight approach to packaging research artefacts along with their metadata in a machine-readable manner. RO-Crate is based on Schema.org annotations in JSON-LD, aiming to establish best practices to formally describe metadata in an accessible and practical way for their use in a wide variety of situations. An RO-Crate is a structured archive of all the items that contributed to a research outcome, including their identifiers, provenance, relations and annotations. As a general-purpose packaging approach for data and their metadata, RO-Crate is used across multiple areas, including bioinformatics, digital humanities and regulatory sciences. By applying "just enough" Linked Data standards, RO-Crate simplifies the process of making research outputs FAIR while also enhancing research reproducibility. An RO-Crate for this article is available at
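
To make the idea of Schema.org annotations in JSON-LD more concrete, the following is a minimal sketch of how a crate's metadata document might be assembled. It is not taken from the article: the function name `build_crate_metadata`, the file paths, and the example title are hypothetical, and the layout (a metadata descriptor pointing at a root `Dataset` that lists its files under `hasPart`) follows the RO-Crate 1.1 conventions as best understood here. A real crate would typically carry further properties, for example a license and a publication date.

```python
import json


def build_crate_metadata(name, description, files):
    """Return a dict shaped like a minimal ro-crate-metadata.json document.

    `files` is a list of (relative_path, label) pairs; both the helper and
    its arguments are illustrative, not part of any official library.
    """
    # One entity per packaged file, typed as a File.
    file_entities = [
        {"@id": path, "@type": "File", "name": label}
        for path, label in files
    ]
    graph = [
        # Metadata descriptor: says which entity is the root of the crate.
        {
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        # Root dataset: describes the crate and references its parts by @id.
        {
            "@id": "./",
            "@type": "Dataset",
            "name": name,
            "description": description,
            "hasPart": [{"@id": entity["@id"]} for entity in file_entities],
        },
    ]
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": graph + file_entities,
    }


metadata = build_crate_metadata(
    name="Example analysis crate",  # hypothetical title
    description="Input data and script for an illustrative analysis.",
    files=[("data/results.csv", "Results table"), ("analyse.py", "Analysis script")],
)
print(json.dumps(metadata, indent=2))
```

Because every entity sits in one flat `@graph` and refers to others by `@id`, the document can be read as plain JSON by simple tools while remaining valid Linked Data.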

Comment: 42 pages. Submitted to Data Science


Data publishing, Data packaging, FAIR, Linked Data, Metadata, Reproducibility, Research Object, Computer Science - Digital Libraries, H.1.1, H.3.2, General Medicine, Digital Libraries (cs.DL), FOS: Computer and information sciences, Biology and Life Sciences, Technology and Engineering, General Engineering


Funded by
Providing an open collaborative space for digital biology in Europe
  • Funder: European Commission (EC)
  • Project Code: 824087
  • Funding stream: H2020 | RIA
  • Funder: Social Sciences and Humanities Research Council (SSHRC)
Synthesis of systematic resources
  • Funder: European Commission (EC)
  • Project Code: 823827
  • Funding stream: H2020 | RIA
Industrial Biotechnology Innovation and Synthetic Biology Accelerator Preparatory Phase
  • Funder: European Commission (EC)
  • Project Code: 871118
  • Funding stream: H2020 | CSA
Related to Research communities
Digital Humanities and Cultural Heritage
Download from
Data Science
Article . 2022
Providers: NARCIS