software . 2021

Auto-CORPus

Posma, Joram M.
Python
  • Published: 19 Mar 2021
  • Publisher: bio.tools
Abstract
Auto-CORPus (Automated and Consistent Outputs from Research Publications) is an automated pipeline that cleans HTML files from biomedical literature. The output is a single JSON file that contains the text of each section, table data in a machine-readable format, and lists of the phenotypes and abbreviations found in the article.
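As a rough illustration of how a downstream script might consume such a single-file JSON output, the sketch below parses a small hand-written example and summarises its contents. The key names (`sections`, `tables`, `phenotypes`, `abbreviations`), the `summarise` helper and the sample values are all hypothetical placeholders chosen for this example; they are not taken from the Auto-CORPus documentation, and the tool's actual schema may differ.

```python
import json

# Hand-written stand-in for one article's output. All key names and values
# here are illustrative assumptions, not the tool's documented schema.
EXAMPLE_OUTPUT = """
{
  "sections": [
    {"heading": "Abstract", "text": "We performed a GWAS of body mass index (BMI)."},
    {"heading": "Methods", "text": "Genotypes were imputed before association testing."}
  ],
  "tables": [
    {"title": "Table 1", "columns": ["SNP", "P"], "rows": [["rsEXAMPLE", "placeholder"]]}
  ],
  "phenotypes": ["body mass index"],
  "abbreviations": {"BMI": "body mass index", "GWAS": "genome-wide association study"}
}
"""


def summarise(raw_json: str) -> dict:
    """Parse one output document and return a compact summary of its parts."""
    doc = json.loads(raw_json)
    return {
        # Section headings, in document order.
        "section_headings": [s["heading"] for s in doc.get("sections", [])],
        # Number of tables extracted from the article.
        "table_count": len(doc.get("tables", [])),
        # Phenotype terms and abbreviation short forms, sorted alphabetically.
        "phenotypes": sorted(doc.get("phenotypes", [])),
        "abbreviations": sorted(doc.get("abbreviations", {})),
    }


summary = summarise(EXAMPLE_OUTPUT)
print(summary)
```

Using `dict.get` with empty defaults keeps the summary step from failing on articles that have no tables or no detected phenotypes, which seems a sensible precaution when the exact schema is not known in advance.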
Subjects
ACM Computing Classification System: ComputingMethodologies_DOCUMENTANDTEXTPROCESSING
free text keywords: Natural language processing, Workflows, Genotype and phenotype, GWAS study
Communities
Digital Humanities and Cultural Heritage
Download from
bio.tools