Powered by OpenAIRE graph
Research data . Dataset . 2018

Annotated Corpus For Occitan

Bras, Myriam; Esher, Louise; Sibille, Jean; Vergez-Couret, Marianne
Open Access
Occitan (post 1500); Provençal
Published: 22 Jun 2018
Publisher: Zenodo
Abstract

This corpus contains a collection of texts in Occitan that were manually annotated with parts of speech and lemmas. The corpus was produced in the context of the RESTAURE project, funded by the French ANR. The current version of the corpus contains 28 documents and 12,425 tokens. The annotation process is detailed in the following article: http://hal.archives-ouvertes.fr/hal-01704806. The annotated versions are provided in a tab-separated (TSV) CoNLL-U format.
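
As a rough illustration of how files in a tab-separated CoNLL-U layout could be read, the sketch below counts part-of-speech tags. It assumes the standard CoNLL-U column order (ID, FORM, LEMMA, UPOS, ...), which may not match the corpus files exactly. The sample sentence and the names `Token`, `read_tokens`, and `SAMPLE` are invented for this example and are not taken from the corpus.

```python
from collections import Counter
from typing import Iterable, Iterator, NamedTuple


class Token(NamedTuple):
    form: str
    lemma: str
    upos: str


# Invented sample for illustration only; NOT taken from the corpus.
# Assumes the standard 10-column CoNLL-U layout.
SAMPLE = (
    "# text = Lo can dormís.\n"
    "1\tLo\tlo\tDET\t_\t_\t_\t_\t_\t_\n"
    "2\tcan\tcan\tNOUN\t_\t_\t_\t_\t_\t_\n"
    "3\tdormís\tdormir\tVERB\t_\t_\t_\t_\t_\t_\n"
    "4\t.\t.\tPUNCT\t_\t_\t_\t_\t_\t_\n"
    "\n"
)


def read_tokens(lines: Iterable[str]) -> Iterator[Token]:
    """Yield one Token per word line, skipping comments and blank lines."""
    for line in lines:
        line = line.rstrip("\n")
        # Comment lines start with '#'; blank lines separate sentences.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 4:
            continue
        # Skip multiword ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if not cols[0].isdigit():
            continue
        yield Token(form=cols[1], lemma=cols[2], upos=cols[3])


tokens = list(read_tokens(SAMPLE.splitlines()))
pos_counts = Counter(token.upos for token in tokens)
```

To process a downloaded file instead of the sample, an open file handle can be passed to `read_tokens` in place of `SAMPLE.splitlines()`, since the function only needs an iterable of lines.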

Subjects

Occitan, Corpus, Linguistics, FOS: Languages and literature, Part Of Speech, Natural Language Processing, Lemma

Related to Research communities
Digital Humanities and Cultural Heritage