research product . 2017

Representing Semantics of Text by Acquiring Its Canonical Form

Taiye, Mohammed Ahmed; Kamaruddin, Siti Sakira; Ahmad, Farzana Kabir
Open Access English
  • Published: 01 Jan 2017
  • Country: Indonesia
Abstract
Canonical form is the notion that related ideas should share the same meaning representation. It greatly simplifies processing by providing a single meaning representation for a wide range of expressions. The central issue in text representation is to devise a formal approach for capturing the meaning, or semantics, of sentences. Related issues include heterogeneity and inconsistency in text. Polysemous, synonymous and homonymous words, as well as morphemes, pose serious drawbacks when trying to capture senses in sentences. This calls for capturing and representing senses in order to resolve vagueness and improve the understanding of senses in documents for knowledge creation purposes. We introduce a simple and straightforward method to capture the canonical form of sentences. The proposed method first identifies the canonical forms using the Word Sense Disambiguation (WSD) technique and then applies the First Order Predicate Logic (FOPL) scheme to represent the identified canonical forms. We adopted two WSD algorithms, Lesk and Selectional Preference Restriction. These algorithms concentrate mainly on disambiguating senses in words, phrases and sentences. We also adopted the First Order Predicate Logic scheme to analyse the argument-predicate structure of sentences, employing the consequence logic theorem to test for satisfiability, validity and completeness of information in sentences.
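The two-stage idea in the abstract (disambiguate a word sense, then write the result as a predicate) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a simplified Lesk overlap count, and the sense inventory, glosses, stopword list, and the function names `simplified_lesk` and `to_predicate` are all hypothetical, chosen only for illustration. Selectional Preference Restriction and the logical tests mentioned in the abstract are not covered.

```python
# Toy sense inventory with made-up glosses (an assumption, not from the paper).
SENSES = {
    "bank": {
        "bank_financial": "an institution that accepts deposits and lends money",
        "bank_river": "sloping land beside a body of water such as a river",
    }
}

# Small hypothetical stopword list so function words do not count as overlap.
STOPWORDS = {"a", "an", "the", "of", "and", "that", "such", "as",
             "to", "at", "in", "on", "by", "is"}


def tokens(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def simplified_lesk(word, sentence):
    """Pick the sense whose gloss shares the most words with the sentence."""
    context = tokens(sentence) - {word}
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSES[word].items():
        overlap = len(context & tokens(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


def to_predicate(predicate, *args):
    """Render a predicate and its arguments in a first-order style string."""
    return f"{predicate}({', '.join(args)})"


# Two differently worded sentences resolve to the same sense label, so they
# end up with the same argument in the predicate representation.
s1 = simplified_lesk("bank", "he deposits money at the bank")
s2 = simplified_lesk("bank", "the bank lends money to farmers")
print(to_predicate("Visit", "he", s1))
print(s1 == s2)
```

Mapping both sentences to one sense label is the sense in which the representation is "canonical" here; a fuller treatment would draw glosses from a real lexical resource rather than a hand-written dictionary.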
Subjects
free text keywords: Indonesia, Word Sense Disambiguation, First Order Predicate Logic, Canonical Form, Natural Language Processing, Semantic
Communities
  • Digital Humanities and Cultural Heritage
Providers: Neliti