Advanced search in Research products
The following results are related to Digital Humanities and Cultural Heritage. Are you interested to view more results? Visit OpenAIRE - Explore.
25 Research products, page 1 of 3

  • Digital Humanities and Cultural Heritage
  • Publications
  • Research software
  • Other research products
  • 2019-2023
  • Article
  • 0509 other social sciences
  • 0501 psychology and cognitive sciences
  • Scientometrics

  • Open Access English
    Authors: 
    Zhiqi Wang; Ronald Rousseau;
    Publisher: Springer Science and Business Media LLC
    Country: Belgium

    The Yule-Simpson paradox refers to the fact that outcomes of comparisons between groups are reversed when groups are combined. Using country-level data from Essential Science Indicators, a part of InCites (Clarivate), it is shown that although the Yule-Simpson phenomenon in citation analysis and research evaluation is not common, it is not extremely rare either. The Yule-Simpson paradox is a phenomenon one should be aware of; otherwise one may encounter unforeseen surprises in scientometric studies.
    Published in: Scientometrics, vol. 126, issue 4, pp. 3501-3511; location: Switzerland; status: published

  • Open Access
    Authors: 
    Mei Hsiu-Ching Ho; John S. Liu;
    Publisher: Springer Science and Business Media LLC

    Scholars all over the world have produced a large body of COVID-19 literature in an exceptionally short period after the outbreak of this rapidly-spreading virus. An analysis of the literature accumulated in the first 150 days hints that the rapid knowledge accumulation in its early-stage development was expedited through a wide variety of journal platforms, a sense and pressure of national urgency, and inspiration from journal editorials.

  • Closed Access
    Authors: 
    Sergio Jimenez; Youlin Avila; George Dueñas; Alexander Gelbukh;
    Publisher: Springer Science and Business Media LLC

    The decision whether to read a research paper is commonly made while reading its title and abstract. Although content and merit should lead to that decision, other factors such as writing style may intervene. Eventually, more readings could produce more citations. We investigated the stylistic factors in the title and abstract of research papers that affect their “citability”, and built a prediction model for citations at 5, 10, and 15 years. Since the number of citations is the preferred ranking function of several academic search engines, our “citability” function could alleviate the under-representation of recent not-yet-cited papers in query results. For this study, we collected a large dataset of around 750,000 titles and abstracts from articles in Scopus, intended to be representative of science as a whole. For each instance, we extracted a relatively large set of 3578 stylistic features at different linguistic levels, i.e. characters, syllables, tokens (i.e. words), sentences, stop/content words, and part-of-speech (POS) tags. In particular, we present a novel set of corpus-based stylistic features that we call Corpus Spectral Signatures (CSS). We found that a linear prediction model for citations (binned into quartiles) built with only the top-250 correlated features achieved a mean absolute error of 0.805 quartiles, and that on average, predictions were highly correlated with their real values (Spearman’s $\rho = 0.515$). CSS features were among the top correlated features, but POS features were the most predictive group of features in an ablation study.

  • Closed Access
    Authors: 
    Bruno S. Frey; Anthony Gullo;
    Publisher: Springer Science and Business Media LLC

    Can famous economics scholars extend their prominence to the time after their deaths? This question is analyzed for the period 1925–2018 for Nobel Prize laureates. We find that Nobel Prize winners who die prematurely are more likely to experience a marked reduction of attention from their peers, as measured by citations. In contrast, death does not produce this effect for famous economists dying at old age. A few scholars who died prematurely are an exception to the downward trend in attention after death. Such exceptions include Clive Granger, Elinor Ostrom, and to some extent Leonid Kantorovich.

  • Open Access English
    Authors: 
    Liang Meng; Haifeng Wang; Pengfei Han;
    Publisher: Springer International Publishing

    Intriguing unforced regularities in human behaviors have been reported in varied research domains, including scientometrics. In this study we examine the manuscript submission behavior of researchers, with a focus on its monthly pattern. With a large and reliable dataset which records the submission history of articles published in 10 multidisciplinary journals and 10 management journals over a five-year period (2013-2017), we observe a prominent turn-of-the-month submission effect for accepted papers in management journals but not in multidisciplinary journals. This effect is more pronounced in submissions to top-tier journals and when the first day of a month happens to be a Saturday or Sunday. A sense of ceremony is proposed as a likely explanation of this effect, since the first day of a month is a fundamental temporal landmark which has a 'fresh start effect' on researchers. To conclude, an original and interesting day-of-the-month effect in academia is reported in this study, which calls for more research attention.

  • Closed Access
    Authors: 
    Tingting Zhang; Baozhen Lee; Qinghua Zhu;
    Publisher: Springer Science and Business Media LLC

    Traditional plagiarism detection is based primarily on methods of character matching or topic similarity. Another promising methodology remains largely unexplored: employing deep mining to establish a contextual hierarchy among themes. This paper proposes a semantic approach to measuring the extent of plagiarism, based on a hierarchical graph model. The main innovations are as follows: (1) hierarchical extraction of topic feature terms and elucidation of a corresponding graph structure; (2) graph similarity calculation based on the maximum common subgraph. This semantic-measure method goes beyond semantic detection of topics to take into account the context of topic feature terms, as well as the hierarchical structure by which those topics are related. This contextual-hierarchical perspective should, in turn, improve the accuracy of plagiarism detection. In addition, by mining the implicit relationships between hierarchical feature terms, our method can detect plagiarized documents with similar themes but using different topic words: a potential boon to plagiarism detection recall. In an experiment conducted on a dataset from Chinese paper database CNKI, the semantic-measure method indeed demonstrates accuracy and recall superior to those achieved with current state-of-the-art methods.

  • Closed Access
    Authors: 
    Jianhua Hou; Xiucai Yang;
    Publisher: Springer Science and Business Media LLC

    Sleeping Beauties in Science have attracted a lot of attention in scientometrics and beyond. However, sleeping beauties also appear in patents. In this paper, we put forward the concept of patent sleeping beauties. Since the evolution trajectory of patents after public announcement includes citation, transformation and license, we have defined the evolution trajectories of patents through three indicators, including early sudden awakening (the “Flash in the pan”), early gradual awakening (the “Pea Princess”), delayed gradual awakening (the “Ugly Duckling”), delayed sudden awakening (the “sleeping beauty”) and sleeping patent. Furthermore, this paper constructs a quantitative model to identify patent sleeping beauties. Taking the graphene technology patents of China as an example, this paper identifies the patent sleeping beauties in graphene technology, and finds that sleeping beauty patents account for only 0.59% of all patents. Regarding the mode of awakening, patents with gradual awakening are mainly awakened by a combination of factors: being both cited and transferred, or both cited and licensed. By contrast, both the flash-in-the-pan and the sleeping beauty patents are mainly awakened by a single factor, either transfer or licensing. At the same time, we found that patent invalidation does not hinder patent awakening, and that patent awakening extends the effective life of patents. Finally, we provide policy implications for researchers and managers.

  • Open Access
    Authors: 
    Andreas Rehs;
    Publisher: Springer Science and Business Media LLC

    The detection of differences or similarities in large numbers of scientific publications is an open problem in scientometric research. In this paper we therefore develop and apply a machine learning approach based on structural topic modelling in combination with cosine similarity and a linear regression framework in order to identify differences in dissertation titles written at East and West German universities before and after German reunification. German reunification and its surrounding time period is used because it provides a structure with both minor and major differences in research topics that could be detected by our approach. Our dataset is based on dissertation titles in economics and business administration and chemistry from 1980 to 2010. We use university affiliation and year of the dissertation to train a structural topic model and then test the model on a set of unseen dissertation titles. Subsequently, we compare the resulting topic distribution of each title to every other title with cosine similarity. The cosine similarities and the regional and temporal origin of the dissertation titles they come from are then used in a linear regression approach. Our results on research topics in economics and business administration suggest substantial differences between East and West Germany before the reunification and a rapid conformation thereafter. In chemistry we observe minor differences between East and West before the reunification and a slightly increased similarity thereafter.

  • Open Access
    Authors: 
    Bakthavachalam Elango;
    Publisher: Springer Science and Business Media LLC

    The aim of the present study is to identify retracted articles in the biomedical literature (co-)authored by Indian authors and to examine the features of those articles. The PubMed database was searched to find the retracted articles. The search yielded 508 records, which were retrieved for detailed analysis of authorship and collaboration type, funding information, who retracts, journals and impact factors, and reasons for retraction. The results show that most of the retracted biomedical articles were published after 2010 and that the common reasons for retraction are plagiarism and fake data. More than half of the retracted articles were co-authored within a single institution, and there is no repeat offender. 25% of retracted articles were published in the top 15 journals and 33% were published in journals without an impact factor. The average time from publication to retraction is 2.86 years, and retractions due to fake data take the longest among the reasons. The majority of the funded research was retracted due to fake data, whereas for non-funded research the main reason is plagiarism.

  • Closed Access
    Authors: 
    Nasrin Asadi; Kambiz Badie; Maryam Mahmoudi;
    Publisher: Springer Science and Business Media LLC

    Zone identification is a topic in the area of text mining which helps researchers benefit from the content of scientific papers. The major aim of zone identification is to classify the sentences of scientific texts into predefined zone categories, which can be useful for summarization as well as information extraction. In this paper, we propose a two-level approach to zone identification in which the first level classifies the sentences in a given paper based on semantic and lexical features. To this end, several machine learning algorithms such as Simple Logistics, Logistic Model Trees and Sequential Minimal Optimization are applied. The second level applies fusion to the first-level classification results obtained for consecutive sentences in order to make the final decision. The proposed method is evaluated on the ART and DRI corpora, two well-known data sets. The accuracies of zone identification obtained for these corpora are 65.75% and 84.15%, respectively, which seem quite promising compared to those obtained by previous approaches.
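
As a rough illustration of the Yule-Simpson paradox described in the first result above, the following Python sketch compares two countries on citations per paper. The countries "A" and "B", the two fields, and all counts are hypothetical values invented for this example; they are not taken from Essential Science Indicators or from the paper. The sketch only shows how a country can lead in every field yet trail once the fields are pooled.

```python
from fractions import Fraction

# Hypothetical (citations, papers) counts per country and field.
# These numbers are invented for illustration only.
data = {
    "A": {"field_x": (100, 10), "field_y": (180, 90)},
    "B": {"field_x": (810, 90), "field_y": (10, 10)},
}


def rate(citations, papers):
    """Citations per paper as an exact fraction."""
    return Fraction(citations, papers)


def per_field_rates(country):
    """Citations per paper in each field for one country."""
    return {field: rate(c, p) for field, (c, p) in data[country].items()}


def pooled_rate(country):
    """Citations per paper after combining all fields."""
    citations = sum(c for c, _ in data[country].values())
    papers = sum(p for _, p in data[country].values())
    return rate(citations, papers)


def is_reversed(first, second):
    """True if `first` leads in every field but trails when pooled."""
    a, b = per_field_rates(first), per_field_rates(second)
    leads_everywhere = all(a[field] > b[field] for field in a)
    return leads_everywhere and pooled_rate(first) < pooled_rate(second)


if __name__ == "__main__":
    for country in data:
        print(country, per_field_rates(country), pooled_rate(country))
    print("Reversal when pooled:", is_reversed("A", "B"))
```

In this made-up data the reversal arises because country A publishes mostly in the low-citation field while country B publishes mostly in the high-citation field, so the pooled rates are weighted very differently.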
