- home
- Advanced Search
2 Research products, page 1 of 1
Loading
- Publication . Article . 2018Open Access EnglishAuthors:Camil Demetrescu; Andrea Ribichini; Marco Schaerf;Camil Demetrescu; Andrea Ribichini; Marco Schaerf;Publisher: Springer VerlagCountry: ItalyProject: EC | SecondHands (643950)
We investigate the accuracy of how author names are reported in bibliographic records excerpted from four prominent sources: WoS, Scopus, PubMed, and CrossRef. We take as a case study 44,549 publications stored in the internal database of Sapienza University of Rome, one of the largest universities in Europe. While our results indicate generally good accuracy for all bibliographic data sources considered, we highlight a number of issues that undermine the accuracy for certain classes of author names, including compound names and names with diacritics, which are common features to Italian and other Western languages.
Average popularityAverage popularity In bottom 99%Average influencePopularity: Citation-based measure reflecting the current impact.Average influence In bottom 99%Influence: Citation-based measure reflecting the total impact.add Add to ORCIDPlease grant OpenAIRE to access and update your ORCID works.This Research product is the result of merged Research products in OpenAIRE.
You have already added works in your ORCID record related to the merged Research product. - Publication . Article . 2018Open AccessAuthors:Jose Camacho-Collados; Claudio Delli Bovi; Alessandro Raganato; Roberto Navigli;Jose Camacho-Collados; Claudio Delli Bovi; Alessandro Raganato; Roberto Navigli;Publisher: Springer Science and Business Media LLCCountry: ItalyProject: EC | MOUSSE (726487)
Definitional knowledge has proved to be essential in various Natural Language Processing tasks and applications, especially when information at the level of word senses is exploited. However, the few sense-annotated corpora of textual definitions available to date are of limited size: this is mainly due to the expensive and time-consuming process of annotating a wide variety of word senses and entity mentions at a reasonably high scale. In this paper we present SenseDefs, a large-scale high-quality corpus of disambiguated definitions (or glosses) in multiple languages, comprising sense annotations of both concepts and named entities from a wide-coverage unified sense inventory. Our approach for the construction and disambiguation of this corpus builds upon the structure of a large multilingual semantic network and a state-of-the-art disambiguation system: first, we gather complementary information of equivalent definitions across different languages to provide context for disambiguation; then we refine the disambiguation output with a distributional approach based on semantic similarity. As a result, we obtain a multilingual corpus of textual definitions featuring over 38 million definitions in 263 languages, and we publicly release it to the research community. We assess the quality of SenseDefs’s sense annotations both intrinsically and extrinsically on Open Information Extraction and Sense Clustering tasks.
Average popularityAverage popularity In bottom 99%Average influencePopularity: Citation-based measure reflecting the current impact.Average influence In bottom 99%Influence: Citation-based measure reflecting the total impact.add Add to ORCIDPlease grant OpenAIRE to access and update your ORCID works.This Research product is the result of merged Research products in OpenAIRE.
You have already added works in your ORCID record related to the merged Research product.
2 Research products, page 1 of 1
Loading
- Publication . Article . 2018Open Access EnglishAuthors:Camil Demetrescu; Andrea Ribichini; Marco Schaerf;Camil Demetrescu; Andrea Ribichini; Marco Schaerf;Publisher: Springer VerlagCountry: ItalyProject: EC | SecondHands (643950)
We investigate the accuracy of how author names are reported in bibliographic records excerpted from four prominent sources: WoS, Scopus, PubMed, and CrossRef. We take as a case study 44,549 publications stored in the internal database of Sapienza University of Rome, one of the largest universities in Europe. While our results indicate generally good accuracy for all bibliographic data sources considered, we highlight a number of issues that undermine the accuracy for certain classes of author names, including compound names and names with diacritics, which are common features to Italian and other Western languages.
Average popularityAverage popularity In bottom 99%Average influencePopularity: Citation-based measure reflecting the current impact.Average influence In bottom 99%Influence: Citation-based measure reflecting the total impact.add Add to ORCIDPlease grant OpenAIRE to access and update your ORCID works.This Research product is the result of merged Research products in OpenAIRE.
You have already added works in your ORCID record related to the merged Research product. - Publication . Article . 2018Open AccessAuthors:Jose Camacho-Collados; Claudio Delli Bovi; Alessandro Raganato; Roberto Navigli;Jose Camacho-Collados; Claudio Delli Bovi; Alessandro Raganato; Roberto Navigli;Publisher: Springer Science and Business Media LLCCountry: ItalyProject: EC | MOUSSE (726487)
Definitional knowledge has proved to be essential in various Natural Language Processing tasks and applications, especially when information at the level of word senses is exploited. However, the few sense-annotated corpora of textual definitions available to date are of limited size: this is mainly due to the expensive and time-consuming process of annotating a wide variety of word senses and entity mentions at a reasonably high scale. In this paper we present SenseDefs, a large-scale high-quality corpus of disambiguated definitions (or glosses) in multiple languages, comprising sense annotations of both concepts and named entities from a wide-coverage unified sense inventory. Our approach for the construction and disambiguation of this corpus builds upon the structure of a large multilingual semantic network and a state-of-the-art disambiguation system: first, we gather complementary information of equivalent definitions across different languages to provide context for disambiguation; then we refine the disambiguation output with a distributional approach based on semantic similarity. As a result, we obtain a multilingual corpus of textual definitions featuring over 38 million definitions in 263 languages, and we publicly release it to the research community. We assess the quality of SenseDefs’s sense annotations both intrinsically and extrinsically on Open Information Extraction and Sense Clustering tasks.
Average popularityAverage popularity In bottom 99%Average influencePopularity: Citation-based measure reflecting the current impact.Average influence In bottom 99%Influence: Citation-based measure reflecting the total impact.add Add to ORCIDPlease grant OpenAIRE to access and update your ORCID works.This Research product is the result of merged Research products in OpenAIRE.
You have already added works in your ORCID record related to the merged Research product.