Creating links manually between large datasets is an extremely tedious task. Although linked data production is growing massively, the interlinking of datasets still needs improvement. This paper presents our work on detecting and extending links between Wikidata and COURAGE entities in the domain of cultural heritage data. The COURAGE project explored the methods of cultural opposition in the socialist era (c. 1950–1990), highlighting the variety of alternative cultural scenes that flourished in Eastern Europe before 1989. We describe our methods and results in discovering entities common to the two datasets, and our solution for automating this task. Furthermore, we show how we enriched the data in Wikidata and established new, bi-directional connections between COURAGE and Wikidata. As a result, users of both databases will have a more complete view of the matched entities.
We investigate how accurately author names are reported in bibliographic records excerpted from four prominent sources: Web of Science (WoS), Scopus, PubMed, and CrossRef. As a case study, we take 44,549 publications stored in the internal database of Sapienza University of Rome, one of the largest universities in Europe. While our results indicate generally good accuracy for all the bibliographic data sources considered, we highlight a number of issues that undermine accuracy for certain classes of author names, including compound names and names with diacritics, which are common in Italian and other Western languages.