Publication . Article . 2015

Using part-of-speech tags as deep-syntax indicators in determining short-text semantic similarity

Vuk Batanovic; Dragan Bojic;
Open Access
Published: 01 Jan 2015. Journal: Computer Science and Information Systems, volume 12, pages 1-31 (ISSN: 1820-0214, eISSN: 2406-1018)
Publisher: National Library of Serbia

This paper presents POST STSS, a method of determining short-text semantic similarity in which part-of-speech tags are used as indicators of the deeper syntactic information usually extracted by more advanced tools such as parsers and semantic role labelers. Our model employs a part-of-speech weighting scheme and is based on a statistical bag-of-words approach. It requires neither hand-crafted knowledge bases nor advanced syntactic tools, which makes it easily applicable to languages with limited natural language processing resources. Using a paraphrase recognition test, we demonstrate that our system achieves a higher accuracy than all existing statistical similarity algorithms and solutions of a more structural kind. [Project of the Ministry of Science of the Republic of Serbia, no. TR 32047]
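The general idea of combining a bag-of-words representation with part-of-speech weighting can be illustrated with a minimal sketch. This is not the POST STSS algorithm itself, whose details are not given in the abstract: the weight values, the tag names, and the function names `weighted_vector` and `pos_weighted_similarity` are all hypothetical, and plain cosine similarity is used as a stand-in for whatever similarity measure the paper defines. The input is assumed to be already tokenized and tagged.

```python
import math
from collections import defaultdict

# Hypothetical per-tag weights, chosen only for illustration.
# They are NOT taken from the paper.
POS_WEIGHTS = {"NOUN": 1.0, "VERB": 0.8, "ADJ": 0.6, "ADV": 0.4}
DEFAULT_WEIGHT = 0.1  # assumed weight for every other tag


def weighted_vector(tagged_tokens):
    """Build a bag-of-words vector whose counts are scaled by POS weight."""
    vec = defaultdict(float)
    for word, tag in tagged_tokens:
        vec[word.lower()] += POS_WEIGHTS.get(tag, DEFAULT_WEIGHT)
    return vec


def pos_weighted_similarity(tagged_a, tagged_b):
    """Cosine similarity between two POS-weighted bag-of-words vectors."""
    va = weighted_vector(tagged_a)
    vb = weighted_vector(tagged_b)
    dot = sum(w * vb[t] for t, w in va.items() if t in vb)
    norm_a = math.sqrt(sum(w * w for w in va.values()))
    norm_b = math.sqrt(sum(w * w for w in vb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty text is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


s1 = [("The", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")]
s2 = [("A", "DET"), ("cat", "NOUN"), ("rests", "VERB")]
score = pos_weighted_similarity(s1, s2)
```

In this sketch the shared noun contributes more to the score than a shared determiner would, which is the intuition behind using tags as a cheap proxy for syntactic role. Only a part-of-speech tagger is needed to produce the input, with no parser or knowledge base.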

Subjects by Vocabulary

Microsoft Academic Graph classification: Part of speech, Bag-of-words model, Syntax, Information retrieval, Weighting, Paraphrase, Artificial intelligence, business.industry, business, Semantic similarity, Natural language processing, computer.software_genre, computer, Similarity (psychology), Parsing, Computer science


General Computer Science

Funded by
MESTD | Design and development of hardware, software and telecommunication infrastructure for turnover controllers
  • Funder: Ministry of Education, Science and Technological Development of Republic of Serbia (MESTD)
  • Project Code: 32047
  • Funding stream: Technological Development (TD or TR)
Related to Research communities
Digital Humanities and Cultural Heritage