Advanced search in Research products
The following results are related to Digital Humanities and Cultural Heritage. Are you interested to view more results? Visit OpenAIRE - Explore.
63 Research products, page 1 of 7

  • Digital Humanities and Cultural Heritage
  • Publications
  • Research data
  • Other research products
  • Article
  • 050905 science studies
  • Scientometrics
  • Social Science and Humanities

  • Open Access English
    Authors: 
    Zhiqi Wang; Ronald Rousseau;
    Country: Belgium

    The Yule-Simpson paradox refers to the fact that the outcome of a comparison between groups can be reversed when the groups are combined. Using country-level data from Essential Science Indicators, a part of InCites (Clarivate), it is shown that although the Yule-Simpson phenomenon is not common in citation analysis and research evaluation, it is not extremely rare either. The Yule-Simpson paradox is a phenomenon one should be aware of; otherwise, one may encounter unforeseen surprises in scientometric studies. Published in: Scientometrics, vol. 126, issue 4, pp. 3501-3511. Location: Switzerland. Status: published.
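
    The reversal described in this abstract can be illustrated with a small sketch. The countries, fields, and counts below are invented for illustration and are not taken from the paper; the function names are likewise hypothetical. Country A has more citations per paper than country B in each field, yet fewer once the fields are pooled, because most of A's papers sit in the low-citation field.

```python
from fractions import Fraction

# Hypothetical (papers, citations) counts per country and field,
# invented for illustration; they are not taken from the paper.
COUNTS = {
    "A": {"field_1": (10, 200), "field_2": (100, 300)},
    "B": {"field_1": (100, 1800), "field_2": (10, 20)},
}


def field_rate(counts, country, field):
    """Citations per paper for one country in one field."""
    papers, citations = counts[country][field]
    return Fraction(citations, papers)


def overall_rate(counts, country):
    """Citations per paper for one country with all fields pooled."""
    papers = sum(p for p, _ in counts[country].values())
    citations = sum(c for _, c in counts[country].values())
    return Fraction(citations, papers)


def shows_reversal(counts, first, second):
    """True if `first` leads `second` in every field yet trails it overall."""
    leads_everywhere = all(
        field_rate(counts, first, f) > field_rate(counts, second, f)
        for f in counts[first]
    )
    trails_overall = overall_rate(counts, first) < overall_rate(counts, second)
    return leads_everywhere and trails_overall


if __name__ == "__main__":
    for field in COUNTS["A"]:
        print(field,
              float(field_rate(COUNTS, "A", field)),
              float(field_rate(COUNTS, "B", field)))
    print("overall",
          float(overall_rate(COUNTS, "A")),
          float(overall_rate(COUNTS, "B")))
    print("reversal:", shows_reversal(COUNTS, "A", "B"))
```

    Exact fractions are used so that the comparisons are not affected by floating-point rounding.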

  • Open Access English
    Authors: 
    Mei Hsiu-Ching Ho; John S. Liu;
    Publisher: Springer Science and Business Media LLC

    Scholars all over the world have produced a large body of COVID-19 literature in an exceptionally short period after the outbreak of this rapidly spreading virus. An analysis of the literature accumulated in the first 150 days suggests that the rapid accumulation of knowledge in this early stage was expedited by a wide variety of journal platforms, a sense and pressure of national urgency, and inspiration from journal editorials.

  • Publication . Article . Preprint . Other literature type . 2018
    Open Access English
    Authors: 
    Giovanni Colavizza;
    Country: Switzerland
    Project: SNSF | Understanding Citations i... (168489), SNSF | Linked Books: Reconstruct... (159961)

    The humanities are often characterized by sociologists as having a low mutual dependence among scholars and high task uncertainty. According to Fuchs' theory of scientific change, this leads over time to intellectual and social fragmentation, as new scholarship accumulates in the absence of shared unifying theories. We consider here a set of specialisms in the discipline of history and measure the connectivity properties of their bibliographic coupling networks over time, in order to assess whether fragmentation is indeed occurring. We construct networks using both reference overlap and textual similarity. It is shown that the connectivity of reference overlap networks is gradually and steadily declining over time, whilst that of textual similarity networks is stable. Author bibliographic coupling networks also show signs of a decline in connectivity, in the absence of an increasing propensity for collaborations. We speculate that, despite the gradual weakening of ties among historians as mapped by references, new scholarship might be continually integrated through shared vocabularies and narratives. This would support our belief that citations are but one kind of bibliometric data to consider (perhaps even of secondary importance) when studying the humanities, while text should play a more prominent role.
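
    As a rough illustration of a reference-overlap (bibliographic coupling) network, the sketch below links two papers when they share at least one cited reference. The abstract does not state how the edges are weighted or which connectivity measures are used, so the Jaccard normalization, the density measure, the paper identifiers, and all function names here are assumptions made for illustration only.

```python
from itertools import combinations

# Hypothetical papers mapped to the sets of references they cite.
PAPERS = {
    "p1": {"a", "b", "c"},
    "p2": {"b", "c", "d"},
    "p3": {"e"},
}


def shared_references(refs_a, refs_b):
    """Coupling strength: number of references cited by both papers."""
    return len(set(refs_a) & set(refs_b))


def jaccard_overlap(refs_a, refs_b):
    """Shared references divided by all distinct references (assumed normalization)."""
    a, b = set(refs_a), set(refs_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def coupling_edges(papers):
    """List (paper, paper, strength) for every pair with at least one shared reference."""
    edges = []
    for id_a, id_b in combinations(sorted(papers), 2):
        strength = shared_references(papers[id_a], papers[id_b])
        if strength > 0:
            edges.append((id_a, id_b, strength))
    return edges


def density(papers):
    """Fraction of paper pairs that are coupled: one simple connectivity measure."""
    n = len(papers)
    possible = n * (n - 1) // 2
    return len(coupling_edges(papers)) / possible if possible else 0.0


if __name__ == "__main__":
    print(coupling_edges(PAPERS))
    print(jaccard_overlap(PAPERS["p1"], PAPERS["p2"]))
    print(density(PAPERS))
```

    Computing such a measure for successive time windows is one way to track whether connectivity declines over time.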

  • Closed Access
    Authors: 
    Sergio Jimenez; Youlin Avila; George Dueñas; Alexander Gelbukh;
    Publisher: Springer Science and Business Media LLC

    The decision whether to read a research paper is commonly made while reading its title and abstract. Although content and merit should lead to that decision, other factors such as writing style may intervene. Eventually, more readings could produce more citations. We investigated the stylistic factors in the title and abstract of research papers that affect their “citability”, and built a prediction model for citations at 5, 10, and 15 years. Since the number of citations is the preferred ranking function of several academic search engines, our “citability” function could alleviate the under-representation of recent not-yet-cited papers in query results. For this study, we collected a large dataset of around 750,000 titles and abstracts from articles in Scopus, intended to be representative of all of science. For each instance, we extracted a relatively large set of 3578 stylistic features at different linguistic levels, i.e. characters, syllables, tokens (i.e. words), sentences, stop/content words, and part-of-speech (POS) tags. In particular, we present a novel set of corpus-based stylistic features that we call Corpus Spectral Signatures (CSS). We found that a linear prediction model for citations (binned into quartiles) built with only the top-250 correlated features achieved a mean absolute error of 0.805 quartiles, and that on average, predictions were highly correlated with their real values (Spearman’s $\rho = 0.515$). CSS features were among the top correlated features, but POS features were the most predictive group of features in an ablation study.

  • Closed Access
    Authors: 
    Cholmyong Pak; Guang Yu; Weibin Wang;
    Publisher: Springer Science and Business Media LLC

    Citation impact indicators play a significant role in evaluating scientific research activity. Most citation impact indicators are based on the number of times a publication is cited as a reference in other publications, but they do not consider the differences between individual citations. Normally, the number of publications that cite a publication may represent the formal quality of that publication. Similarly, the number of times a publication is actually mentioned within a citing publication may represent the formal quality of the citation. We examined how many times each reference was actually mentioned within the citing publications and studied the citation situation within them. We verified that the distribution of references by mention frequency follows the Generalized Pareto distribution. The results showed that about 20% of all references were mentioned three or more times, and that about 50% of all citation mentions came from about 20% of the references in the citing publications.

  • Closed Access
    Authors: 
    Tingting Zhang; Baozhen Lee; Qinghua Zhu;
    Publisher: Springer Science and Business Media LLC

    Traditional plagiarism detection is based primarily on methods of character matching or topic similarity. Another promising methodology remains largely unexplored: employing deep mining to establish a contextual hierarchy among themes. This paper proposes a semantic approach to measuring the extent of plagiarism, based on a hierarchical graph model. The main innovations are as follows: (1) hierarchical extraction of topic feature terms and elucidation of a corresponding graph structure; (2) graph similarity calculation based on the maximum common subgraph. This semantic-measure method goes beyond semantic detection of topics to take into account the context of topic feature terms, as well as the hierarchical structure by which those topics are related. This contextual-hierarchical perspective should, in turn, improve the accuracy of plagiarism detection. In addition, by mining the implicit relationships between hierarchical feature terms, our method can detect plagiarized documents that have similar themes but use different topic words: a potential boon to plagiarism detection recall. In an experiment conducted on a dataset from the Chinese paper database CNKI, the semantic-measure method indeed demonstrates accuracy and recall superior to those achieved with current state-of-the-art methods.

  • Publication . Article . Preprint . 2018 . Embargo End Date: 01 Jan 2018
    Open Access
    Authors: 
    Jinseok Kim;
    Publisher: arXiv
    Project: NSF | Collaborative Research: S... (1535370)

    Author name ambiguity in a digital library may affect the findings of research that mines the library's authorship data. This study evaluates author name disambiguation in DBLP, a digital library that is widely used but whose disambiguation performance has been insufficiently evaluated. In doing so, this study takes a triangulation approach: author name disambiguation for a digital library can be better evaluated when its performance is assessed on multiple labeled datasets and compared to baselines. Tested on three types of labeled data containing 5,000 to 700K disambiguated names and 6M pairs of disambiguated names, DBLP is shown to assign author names quite accurately to distinct authors, resulting in pairwise precision, recall, and F1 measures around 0.90 or above overall. DBLP's author name disambiguation performs well even on large ambiguous name blocks but poorly at distinguishing authors with the same names. When compared to other disambiguation algorithms, DBLP's disambiguation performance is quite competitive, possibly due to its hybrid disambiguation approach combining algorithmic disambiguation and manual error correction. A discussion follows on the strengths and weaknesses of the labeled datasets used in this study, for future efforts to evaluate author name disambiguation on a digital library scale. Comment: Scientometrics (2018)
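
    The pairwise precision, recall, and F1 measures mentioned in this abstract are commonly computed over pairs of records assigned to the same author. The sketch below shows that standard definition on invented data; the record identifiers, cluster labels, and function names are hypothetical, and the study's own evaluation code may differ.

```python
from itertools import combinations


def same_cluster_pairs(labels):
    """Return all unordered record pairs that share a cluster label.

    `labels` maps a record id to the author cluster it is assigned to.
    """
    return {
        frozenset((a, b))
        for a, b in combinations(sorted(labels), 2)
        if labels[a] == labels[b]
    }


def pairwise_prf(predicted, gold):
    """Pairwise precision, recall, and F1 of a predicted clustering against gold labels."""
    pred_pairs = same_cluster_pairs(predicted)
    gold_pairs = same_cluster_pairs(gold)
    true_positive = len(pred_pairs & gold_pairs)
    precision = true_positive / len(pred_pairs) if pred_pairs else 0.0
    recall = true_positive / len(gold_pairs) if gold_pairs else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: five records, two true authors.
GOLD = {"r1": "X", "r2": "X", "r3": "X", "r4": "Y", "r5": "Y"}
# A predicted clustering that wrongly moves r3 to the second author.
PREDICTED = {"r1": "c1", "r2": "c1", "r3": "c2", "r4": "c2", "r5": "c2"}

if __name__ == "__main__":
    print(pairwise_prf(PREDICTED, GOLD))
```

    In this example, two of the four predicted pairs and two of the four gold pairs match, so all three measures equal 0.5.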

  • Closed Access
    Authors: 
    Bruno S. Frey; Anthony Gullo;
    Publisher: Springer Science and Business Media LLC

    Can famous economics scholars extend their prominence to the time after their deaths? This question is analyzed for the period 1925–2018 for Nobel Prize laureates. We find that Nobel Prize winners who die prematurely are more likely to experience a marked reduction of attention from their peers, as measured by citations. In contrast, death does not produce this effect for famous economists dying at old age. A few scholars who died prematurely are an exception to the downward trend in attention after death. Such exceptions include Clive Granger, Elinor Ostrom, and to some extent Leonid Kantorovich.

  • Publication . Article . 2018
    Closed Access
    Authors: 
    Baitong Chen; Ying Ding; Feicheng Ma;
    Publisher: Springer Science and Business Media LLC

    Understanding semantic word shifts in scientific domains is essential for facilitating interdisciplinary communication. Using a data set of published papers in the field of information retrieval (IR), this paper studies the semantic shifts of words in IR by mining per-word topic distributions over time. We propose that semantic word shifts occur not only over time but also across topics. The shifts are examined from two perspectives: the topic level and the context level. Based on the word-topic distribution over time, stable and unstable words are identified. The diverging and converging trends among unstable words reveal characteristics of the topic evolution process. The context-level shifts are further detected through similarities between word vectors. Our work associates semantic word shifts with the evolution of topics, which facilitates a better understanding of semantic word shifts from both topics and contexts.

  • Open Access
    Authors: 
    Wolfgang Glänzel; Lin Zhang;
    Publisher: Springer Science and Business Media LLC

    Proceeding from Moravcsik's paradigmatic ideas of how to build indigenous capability and sustainable science systems in developing countries, we attempted to focus further on the peculiarities of the twenty-first century and the new challenges of globalisation. In doing so, we selected three particular topics deemed relevant in this context: the increase of international visibility and reception by the international community, international collaboration, and participation in research in emerging fields. We analysed these issues using the example of 16 developing countries and emerging economies. We found that several countries achieve an impressive citation impact with a considerable share of highly cited papers. The high impact proved to be associated with international collaboration. We also found two extreme situations in international collaboration, both of which might pose challenges in building sustainable national science systems and research structures. Finally, research activity in emerging research topics showed the presence of developing countries in highly topical research and their capability to contribute to the newest research trends.
