Research data · Dataset · Other ORP type · 2017

A Part-of-Speech (POS) Lexicon of Classical Tibetan for NLP

Hill, Nathan W.; Garrett, Edward
Open Access
  • Published: 11 May 2017
  • Publisher: Zenodo
  • Country: United Kingdom
Abstract
This part-of-speech (POS) lexicon of Classical Tibetan was prepared during the research project 'Tibetan in Digital Communication' (2012-2015), hosted at SOAS, University of London, and funded by the UK's Arts and Humanities Research Council (grant code: AH/J00152X/1). The verb data come from a digitized version of A Lexicon of Tibetan Verb Stems as Reported by the Grammatical Tradition (Munich: Bayerische Akademie der Wissenschaften, 2010) by Nathan W. Hill. The remaining data come from the manually part-of-speech-tagged training data produced by the corpus, together with a few lexical items added by hand specifically to improve rule-based tagging.
Subjects
free text keywords: Tibetan language, Natural language processing, part-of-speech tagging, 8570, 8630, 2200, 3100
Communities
  • Digital Humanities and Cultural Heritage
Funded by
ARC| Robust speech recognition in realistic hostile environments
Project
  • Funder: Australian Research Council (ARC)
  • Project Code: DP1096348
  • Funding stream: Discovery Projects