publication . Article . 2015

Similar words analysis based on POS-CBOW language model

Dongru RUAN; Hongyan PAN; Kai GAO;
Open Access Chinese
  • Published: 01 Oct 2015 Journal: Journal of Hebei University of Science and Technology, volume 36, issue 5, pages 532-538 (ISSN: 1008-1542)
  • Publisher: Hebei University of Science and Technology
Abstract
Similar-word analysis is an important task in natural language processing, with research and application value in text classification, machine translation, and information recommendation. Focusing on the characteristics of short texts on Sina Weibo, this paper presents POS-CBOW, a continuous bag-of-words language model extended with a filtering layer and a part-of-speech tagging layer. The proposed approach adjusts the similarity between word vectors according to both their cosine similarity and a part-of-speech metric, and it filters the resulting set of similar words using a statistical analysis model. Experimental results show that similar-word analysis based on the proposed POS-CBOW language model outperforms the same analysis based on the traditional CBOW language model.
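The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea: combine cosine similarity with a part-of-speech agreement term, then filter the candidate set. The linear blend, the `pos_weight` and `min_score` parameters, the function names, and the toy vocabulary are all illustrative assumptions, and the fixed threshold merely stands in for the paper's statistical filtering model.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def pos_adjusted_similarity(vec_a, pos_a, vec_b, pos_b, pos_weight=0.2):
    """Blend cosine similarity with a POS agreement term.

    Assumption: a simple linear blend; the paper's actual metric may differ.
    """
    pos_match = 1.0 if pos_a == pos_b else 0.0
    return (1 - pos_weight) * cosine(vec_a, vec_b) + pos_weight * pos_match


def similar_words(query, vocab, top_k=3, min_score=0.5):
    """Rank the other words by adjusted similarity and drop low scorers.

    Assumption: a fixed threshold replaces the paper's statistical filter.
    """
    q_vec, q_pos = vocab[query]
    scored = [
        (word, pos_adjusted_similarity(q_vec, q_pos, vec, pos))
        for word, (vec, pos) in vocab.items()
        if word != query
    ]
    kept = [(word, score) for word, score in scored if score >= min_score]
    return sorted(kept, key=lambda item: item[1], reverse=True)[:top_k]


# Hypothetical toy vocabulary: word -> (vector, POS tag).
vocab = {
    "happy": ([1.0, 0.0], "adj"),
    "glad": ([0.9, 0.1], "adj"),
    "happiness": ([0.9, 0.1], "noun"),
    "table": ([0.0, 1.0], "noun"),
}

result = similar_words("happy", vocab)
```

In this toy setup, "glad" and "happiness" have identical vectors, so plain cosine similarity cannot separate them; the part-of-speech term ranks the adjective "glad" above the noun "happiness" for the adjective query, while "table" falls below the threshold and is filtered out.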
Subjects
free text keywords: natural language processing, language model, word vector, similar words, POS-CBOW, lcsh:Technology, lcsh:T
Communities
  • Digital Humanities and Cultural Heritage