Other research product . 2021

Learning Robust Representations for Low-resource Information Extraction

Zhou, Yichao
Open Access
Published: 01 Jan 2021
Publisher: eScholarship, University of California
Country: United States

Information extraction (IE) plays a significant role in automating knowledge acquisition from unstructured or semi-structured textual sources. This thesis focuses on two major IE tasks: named entity recognition and relation extraction. Traditional IE systems rely on large, high-quality datasets to learn the semantic and structural relationships between observations and labels, but such datasets are rare, especially in low-resource language processing (e.g., figurative language processing and clinical narrative curation). This scarcity leads to inadequate supervision and model over-fitting. In this thesis, we work on algorithms and applications for low-resource IE. We believe that incorporating supervision from domain-specific auxiliary knowledge and learning transferable representations can mitigate the data deficiency of low-resource IE. Specifically, we explore pre-training domain-specific deep language models to acquire informative word and sentence embeddings for curating clinical narratives. We experiment with multi-modal learning techniques to recognize humor and to recommend keywords for advertisement designers. We also extract attributes of interest from semi-structured web data by building knowledge representations that transfer across websites. As a further application of low-resource IE, we build a COVID-19 surveillance system that inspects users' daily social media data. Extensive experiments show that our algorithms and systems outperform state-of-the-art approaches and offer strong interpretability.


Computer science, Information Extraction, Natural Language Processing, Text Mining

Related to Research communities
Digital Humanities and Cultural Heritage