Advanced search in Research products
The following results are related to Digital Humanities and Cultural Heritage.
229 Research products, page 1 of 23

  • Digital Humanities and Cultural Heritage
  • Publications
  • Research software
  • Other research products
  • Article
  • Scientometrics
  • Digital Humanities and Cultural Heritage

  • Closed Access
    Authors: 
    Gunnar Sivertsen; Birger Larsen;
    Publisher: Springer Science and Business Media LLC

    A well-designed and comprehensive citation index for the social sciences and humanities has many potential uses, but has yet to be realised. Significant parts of the scholarly production in these areas are not published in international journals, but in national scholarly journals, in book chapters or in monographs. The potential for covering these literatures more comprehensively can now be investigated empirically using a complete publication output data set from the higher education sector of an entire country (Norway). We find that while the international journals in the social sciences and humanities are rather small and more dispersed in specialties, representing a large but not unlimited number of outlets, the domestic journal publishing, as well as book publishing on both the international and domestic levels, show a concentration of many publications in few publication channels. These findings are promising for a more comprehensive coverage of the social sciences and humanities.

  • Closed Access
    Authors: 
    Mingyang Wang; Jiaqi Zhang; Shijia Jiao; Xiangrong Zhang; Na Zhu; Guangsheng Chen;
    Publisher: Springer Science and Business Media LLC

    Citations are not equally important. Researchers have presented different models and techniques to identify important citations. However, the features used in these works are relatively limited, so they cannot achieve good recognition performance. This paper proposes a new machine learning framework to distinguish important and non-important citations by examining the syntactic and contextual information of citations. Syntactic features reflect the statistical characteristics of citation behavior, such as the cited frequency and the citation position of the cited article in the citing ones. Contextual features reflect the semantic content of citations, such as their intent and polarity. Three feature selection algorithms (Pearson correlation coefficient, relief-F and the entropy weight method) were used to calculate the contribution of each index to distinguishing different kinds of citations. On this basis, the key features that better identify important citations were screened out. Three classifiers (support vector machine, KNN and random forest) were used to test the classification performance of these key features. The experiment was performed on two annotated benchmark datasets. It showed that the proposed framework can achieve better classification performance than contemporary state-of-the-art research. The syntactic and contextual features of citations are of great value in identifying important citations.

  • Open Access
    Authors: 
    Iman Tahamtan; Lutz Bornmann;
    Publisher: Springer Science and Business Media LLC

    The purpose of this paper is to update the review of Bornmann and Daniel (2008) presenting a narrative review of studies on citations in scientific documents. The current review covers 41 studies published between 2006 and 2018. Bornmann and Daniel (2008) focused on earlier years. The current review describes the (new) studies on citation content and context analyses as well as the studies that explore the citation motivation of scholars through surveys or interviews. One focus in this paper is on the technical developments in the last decade, such as the richer meta-data available and machine-readable formats of scientific papers. These developments have resulted in citation context analyses of large datasets in comprehensive studies (which was not possible previously). Many studies in recent years have used computational and machine learning techniques to determine citation functions and polarities, some of which have attempted to overcome the methodological weaknesses of previous studies. The automated recognition of citation functions seems to have the potential to greatly enhance citation indices and information retrieval capabilities. Our review of the empirical studies demonstrates that a paper may be cited for very different scientific and non-scientific reasons. This result accords with the finding by Bornmann and Daniel (2008). The current review also shows that to better understand the relationship between citing and cited documents, a variety of features should be analyzed, primarily the citation context, the semantics and linguistic patterns in citations, citation locations within the citing document, and citation polarity (negative, neutral, positive).
    Comment: 56 pages, 4 figures, 11 tables

  • Closed Access
    Authors: 
    Imran Ihsan; M. Abdul Qadir;
    Publisher: Springer Science and Business Media LLC

    In recent scientific advances, Artificial Intelligence and Natural Language Processing are the major contributors to classifying documents and extracting information. Classifying citations into different classes has gathered a lot of attention due to the large volume of citations available in different digital libraries. Typical citation classification uses sentiment analysis, where various techniques are applied to citation texts to classify them mainly into “Positive”, “Negative” and “Neutral” sentiments. However, there can be innumerable reasons why an author selects another work for citation. The Citations’ Context and Reasons Ontology (CCRO) uses a clear scientific method to articulate eight basic reasons for citing by using an iterative process of sentiment analysis, collaborative meanings, and experts' opinions. Using CCRO, this research paper adopts an ontology-based approach to extract citations' reasons and instantiate ontology classes and properties on two different corpora of citation sentences. One corpus of citation sentences is a publicly available dataset, while the other was manually curated by the authors. The process uses a two-step approach. The first part is an interface to manually annotate each citation text in the selected corpora on CCRO properties. A team of carefully selected annotators has annotated each citation to achieve a high inter-annotator agreement. The second part focuses on the automatic extraction of these reasons. Using Natural Language Processing, a Mapping Graph, and the Reporting Verb in a citation sentence, the citation's reason is extracted and mapped onto a CCRO property. The manual and automatic mappings are then compared, and accuracy is calculated for both the publicly available corpus and the authors' own corpus of citation sentences.

  • Publication . Article . 1998
    Closed Access
    Authors: 
    Alexander V. Nemtsov; Nikita A. Zorin;
    Publisher: Springer Science and Business Media LLC

    A comparative study was carried out to determine the trend in the use of statistical methods in papers published in the leading Russian, American and British psychiatric journals of the 1980s and 1990s. Within 10 years the share of papers with statistics increased considerably in the American and British journals (from 58.6% to 67.6%), especially in the Archives of General Psychiatry (88%). Qualitative changes were notable as well, tending towards the use of non-ordinary innovative methods. In the Russian psychiatric papers, the use of statistical methods was rare (21.8% in the 1980s) and did not change within 10 years.

  • Closed Access
    Authors: 
    Andrés Carvallo; Denis Parra; Hans Lobel; Alvaro Soto;
    Publisher: Springer Science and Business Media LLC

    Document screening is a fundamental task within Evidence-based Medicine (EBM), a practice that provides scientific evidence to support medical decisions. Several approaches have tried to reduce physicians’ workload of screening and labeling vast amounts of documents to answer clinical questions. Previous works tried to semi-automate document screening, reporting promising results, but their evaluation was conducted on small datasets, which hinders generalization. Moreover, recent works in natural language processing have introduced neural language models, but none have compared their performance in EBM. In this paper, we evaluate the impact of several document representations such as TF-IDF along with neural language models (BioBERT, BERT, Word2Vec, and GloVe) on an active learning-based setting for document screening in EBM. Our goal is to reduce the number of documents that physicians need to label to answer clinical questions. We evaluate these methods using both a small, challenging dataset (CLEF eHealth 2017) and a larger one that is easier to rank (Epistemonikos). Our results indicate that word as well as textual neural embeddings always outperform the traditional TF-IDF representation. When comparing among neural and textual embeddings, in the CLEF eHealth dataset the models BERT and BioBERT yielded the best results. On the larger dataset, Epistemonikos, Word2Vec and BERT were the most competitive, showing that BERT was the most consistent model across different corpora. In terms of active learning, an uncertainty sampling strategy combined with logistic regression achieved the best performance overall, above other methods under evaluation, and in fewer iterations. Finally, we compared the results of our best models, trained using active learning, with other authors' methods from CLEF eHealth, showing better results in terms of work saved for physicians in the document-screening task.
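
    The uncertainty sampling step mentioned in this abstract can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the function name `uncertainty_sample` and the probability values are hypothetical, and the probabilities are assumed to come from some binary relevance classifier such as a logistic regression.

```python
def uncertainty_sample(probabilities, batch_size):
    """Pick the documents whose predicted relevance is closest to 0.5.

    `probabilities` holds one predicted P(relevant) per unlabeled document
    (assumed to come from a binary classifier). Returns the indices of the
    `batch_size` most uncertain documents, most uncertain first.
    """
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: abs(probabilities[i] - 0.5),
    )
    return ranked[:batch_size]


# Hypothetical predicted probabilities for five unlabeled documents.
predicted = [0.95, 0.48, 0.10, 0.60, 0.55]
to_label = uncertainty_sample(predicted, batch_size=2)
print(to_label)
```

    In an active learning loop, the selected documents would be labeled by a physician, added to the training set, and the classifier retrained before the next round of selection.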

  • Open Access English
    Authors: 
    Zhiqi Wang; Ronald Rousseau;
    Publisher: Springer International Publishing
    Country: Belgium

    The Yule-Simpson paradox refers to the fact that outcomes of comparisons between groups are reversed when groups are combined. Using country-level data from Essential Science Indicators, a part of InCites (Clarivate), it is shown that although the Yule-Simpson phenomenon in citation analysis and research evaluation is not common, it isn't extremely rare either. The Yule-Simpson paradox is a phenomenon one should be aware of, otherwise one may encounter unforeseen surprises in scientometric studies.
    Published in: Scientometrics, vol. 126, issue 4, pp. 3501-3511. Location: Switzerland. Status: published.
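
    A small worked example may make the reversal concrete. The numbers below are invented for illustration and are not taken from the paper or from Essential Science Indicators; the country and field names are placeholders. Country A has more citations per paper than country B in each field, yet fewer when the fields are combined, because the two countries publish most of their papers in different fields.

```python
from fractions import Fraction

# Hypothetical (papers, citations) counts, invented for illustration only.
data = {
    "A": {"field_1": (10, 100), "field_2": (100, 200)},
    "B": {"field_1": (100, 900), "field_2": (10, 10)},
}


def per_field_rates(country):
    """Citations per paper for each field of one country."""
    return {
        field: Fraction(citations, papers)
        for field, (papers, citations) in data[country].items()
    }


def overall_rate(country):
    """Citations per paper with all fields pooled together."""
    papers = sum(p for p, _ in data[country].values())
    citations = sum(c for _, c in data[country].values())
    return Fraction(citations, papers)


def shows_reversal(x, y):
    """True if x beats y in every field but y beats x overall."""
    rates_x, rates_y = per_field_rates(x), per_field_rates(y)
    wins_every_field = all(rates_x[f] > rates_y[f] for f in rates_x)
    return wins_every_field and overall_rate(x) < overall_rate(y)


for country in data:
    print(country, per_field_rates(country), overall_rate(country))
print("Reversal:", shows_reversal("A", "B"))
```

    Exact fractions are used instead of floats so that the comparisons are not affected by rounding.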

  • Publication . Article . 1994
    Closed Access
    Authors: 
    András Schubert;
    Publisher: Springer Science and Business Media LLC

  • Publication . Preprint . Article . 2018
    Open Access
    Authors: 
    Moustafa, Khaled;
    Publisher: Center for Open Science

    Nature has recently published a Correspondence claiming the absence of fame biases in editorial choice. The topic is interesting and deserves a deeper analysis than was presented, because the reported brief analysis and its conclusion are somewhat biased for many reasons, some of which are discussed here. Since the editorial assessment is a form of peer review, the biases reported for external peer review would also apply to the editorial assessment. The biases would be proportional to the elitist level of a journal; the more elitist a journal, the more biased its decisions, unavoidably. The bias could be intentional or unintentional, conscious or subconscious, reflecting our imperfect human nature.

  • Closed Access
    Authors: 
    Feng Zou; Mingxing Wu; Kaili Wu;
    Publisher: Springer Science and Business Media LLC

    Bibliographic data on the ophthalmology, optometry and visual science (OOVS) literature of China, drawn from the SCI-Expanded database and covering the period 2000–2007 (961 publications), were analyzed to create a comprehensive overview of research output. Of the 961 articles, 480 were published in 2006 and 2007. The majority of researchers worked in university hospitals (53%). 21% of the publications included one or more international co-authors. The average number of authors per article was 4.96±2.73, increasing from 3.96 in 2000 to 5.36 in 2007. The most cited references came from Investigative Ophthalmology & Visual Science and Ophthalmology. The greatest number of studies focused on the retina.
