Other research product . 2015

Using part-of-speech tags as deep-syntax indicators in determining short-text semantic similarity

Batanović Vuk; Bojić Dragan
Open Access
Published: 01 Jan 2015
Publisher: Computer Science and Information Systems
Country: Serbia

This paper presents POST STSS, a method of determining short-text semantic similarity in which part-of-speech tags serve as indicators of the deeper syntactic information usually extracted by more advanced tools such as parsers and semantic role labelers. Our model employs a part-of-speech weighting scheme and is based on a statistical bag-of-words approach. It requires neither hand-crafted knowledge bases nor advanced syntactic tools, which makes it easily applicable to languages with limited natural language processing resources. Using a paraphrase recognition test, we demonstrate that our system achieves higher accuracy than all existing statistical similarity algorithms and than solutions of a more structural kind. [Project of the Ministry of Science of the Republic of Serbia, no. TR 32047]


short-text semantic similarity, statistical similarity, corpus-based measures, part-of-speech tags, POS weighting, syntactic information, bag-of-words model, natural language processing
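
The general idea in the abstract, a bag-of-words comparison in which each word's contribution is scaled by its part-of-speech tag, can be illustrated with a minimal sketch. This is not the POST STSS algorithm itself: the weight values, the tag names (assumed here to follow a Universal-POS-style tag set), and the function names are all hypothetical, and plain cosine similarity over exact word matches stands in for the corpus-based word similarity the paper relies on. Inputs are assumed to be already tokenized and tagged.

```python
import math
from collections import defaultdict

# Hypothetical POS weights, chosen only for illustration.
# They are NOT the values used or learned in the paper.
POS_WEIGHTS = {"NOUN": 1.0, "VERB": 0.8, "ADJ": 0.6, "ADV": 0.4}
DEFAULT_WEIGHT = 0.1  # assumed weight for function words and unknown tags


def weighted_bag(tagged_tokens):
    """Build a bag-of-words vector whose entries are sums of POS weights."""
    bag = defaultdict(float)
    for word, tag in tagged_tokens:
        bag[word.lower()] += POS_WEIGHTS.get(tag, DEFAULT_WEIGHT)
    return dict(bag)


def cosine_similarity(bag_a, bag_b):
    """Cosine similarity between two sparse word-weight vectors."""
    dot = sum(weight * bag_b.get(word, 0.0) for word, weight in bag_a.items())
    norm_a = math.sqrt(sum(w * w for w in bag_a.values()))
    norm_b = math.sqrt(sum(w * w for w in bag_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty text is treated as sharing nothing
    return dot / (norm_a * norm_b)


def pos_weighted_similarity(tagged_a, tagged_b):
    """Similarity of two pre-tagged short texts, given as (word, tag) pairs."""
    return cosine_similarity(weighted_bag(tagged_a), weighted_bag(tagged_b))
```

With these assumed weights, two sentences that share a noun and a verb but differ in their determiner score close to 1, while two sentences that share only a determiner score close to 0. This shows how the tag weights let content words dominate the comparison without a parser or a knowledge base.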

Related to Research communities
Digital Humanities and Cultural Heritage