66 Research products, page 1 of 7
- Publication . Article . 2020 . Closed Access
Authors: Mingyang Wang; Jiaqi Zhang; Shijia Jiao; Xiangrong Zhang; Na Zhu; Guangsheng Chen
Publisher: Springer Science and Business Media LLC
Citations are not equally important. Researchers have presented different models and techniques to identify important citations. However, the features used in these works are relatively limited, so they cannot achieve good recognition performance. This paper proposed a new machine learning framework to distinguish important and non-important citations by examining the syntactic and contextual information of citations. Among them, syntactic features reflect the statistical characteristics of citation behavior, such as the cited frequency and citation position of the cited article in the citing ones. Contextual features reflect the semantic content characteristics of citations, such as the intent and polarity of citations. Three feature selection algorithms (Pearson correlation coefficient, relief-F and the entropy weight method) were used to calculate the contribution of each index to distinguishing different kinds of citations. On this basis, key features that can better identify the important citations were screened out. Three classifiers (support vector machine, KNN and random forest) were used to test the classification performance of these key features. The experiment was performed on two annotated benchmark datasets. It showed that the framework proposed in this paper can achieve better classification performance compared with contemporary state-of-the-art research. The syntactic and contextual features of citations are of great value in identifying important citations.
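As an illustration of one of the feature selection algorithms named in this abstract, the following is a minimal sketch of the textbook entropy weight method. It is not the authors' implementation: the function name `entropy_weights` and the toy feature matrix are hypothetical, and the sketch assumes strictly positive feature values with at least one non-constant feature.

```python
import math

def entropy_weights(rows):
    """Textbook entropy weight method: one weight per feature column.

    rows: list of samples, each a list of strictly positive feature values.
    """
    n = len(rows)
    m = len(rows[0])
    divergences = []
    for j in range(m):
        column = [row[j] for row in rows]
        total = sum(column)
        # Share of each sample in this feature's column total
        shares = [x / total for x in column]
        # Normalized Shannon entropy in [0, 1]; 1 means the feature is constant
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n)
        divergences.append(1.0 - entropy)
    norm = sum(divergences)
    # Features with lower entropy (more variation across samples) get more weight
    return [d / norm for d in divergences]

# Hypothetical toy data: feature 0 is constant, feature 1 varies across samples
weights = entropy_weights([[1, 1], [1, 3], [1, 5]])
print(weights)
```

In this toy case the constant feature receives a weight of approximately zero, which matches the intuition that a feature carrying no variation cannot help separate important from non-important citations.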
Popularity: Average (in bottom 99%) . Influence: Average (in bottom 99%)
- Publication . Article . Preprint . 2019 . Embargo End Date: 01 Jan 2019 . Open Access
Authors: Iman Tahamtan; Lutz Bornmann
Publisher: arXiv
The purpose of this paper is to update the review of Bornmann and Daniel (2008) presenting a narrative review of studies on citations in scientific documents. The current review covers 41 studies published between 2006 and 2018. Bornmann and Daniel (2008) focused on earlier years. The current review describes the (new) studies on citation content and context analyses as well as the studies that explore the citation motivation of scholars through surveys or interviews. One focus in this paper is on the technical developments in the last decade, such as the richer meta-data available and machine-readable formats of scientific papers. These developments have resulted in citation context analyses of large datasets in comprehensive studies (which was not possible previously). Many studies in recent years have used computational and machine learning techniques to determine citation functions and polarities, some of which have attempted to overcome the methodological weaknesses of previous studies. The automated recognition of citation functions seems to have the potential to greatly enhance citation indices and information retrieval capabilities. Our review of the empirical studies demonstrates that a paper may be cited for very different scientific and non-scientific reasons. This result accords with the finding by Bornmann and Daniel (2008). The current review also shows that to better understand the relationship between citing and cited documents, a variety of features should be analyzed, primarily the citation context, the semantics and linguistic patterns in citations, citation locations within the citing document, and citation polarity (negative, neutral, positive). Comment: 56 pages, 4 figures, 11 tables
Popularity: Substantial (in top 1%) . Influence: Average (in bottom 99%)
- Publication . Article . 2021 . Closed Access
Authors: Imran Ihsan; M. Abdul Qadir
Publisher: Springer Science and Business Media LLC
In recent scientific advances, Artificial Intelligence and Natural Language Processing are the major contributors to classifying documents and extracting information. Classifying citations into different classes has gathered a lot of attention due to the large volume of citations available in different digital libraries. Typical citation classification uses sentiment analysis, where various techniques are applied to citation texts, mainly to classify them into “Positive”, “Negative” and “Neutral” sentiments. However, there can be innumerable reasons why an author selects another piece of research for citation. The Citations’ Context and Reasons Ontology (CCRO) uses a clear scientific method to articulate eight basic reasons for citing by using an iterative process of sentiment analysis, collaborative meanings, and experts' opinions. Using CCRO, this research paper adopts an ontology-based approach to extract citations' reasons and instantiate ontology classes and properties on two different corpora of citation sentences. One corpus of citation sentences is a publicly available dataset, while the other is our own manually curated corpus. The process uses a two-step approach. The first part is an interface to manually annotate each citation text in the selected corpora on CCRO properties. A team of carefully selected annotators has annotated each citation to achieve a high inter-annotator agreement. The second part focuses on the automatic extraction of these reasons. Using Natural Language Processing, a Mapping Graph, and the Reporting Verb in a citation sentence, the citation's reason is extracted and mapped onto a CCRO property. Accuracy is then calculated by comparing the manual and automatic mappings for both the publicly available corpus and our own corpus of citation sentences.
Popularity: Average (in bottom 99%) . Influence: Average (in bottom 99%)
- Publication . Article . 2021 . Open Access . English
Authors: Zhiqi Wang; Ronald Rousseau
Publisher: Springer Science and Business Media LLC
Country: Belgium
The Yule-Simpson paradox refers to the fact that outcomes of comparisons between groups are reversed when groups are combined. Using country-level data from Essential Science Indicators, a part of InCites (Clarivate), it is shown that although the Yule-Simpson phenomenon in citation analysis and research evaluation is not common, it isn't extremely rare either. The Yule-Simpson paradox is a phenomenon one should be aware of; otherwise one may encounter unforeseen surprises in scientometric studies.
Published in: Scientometrics, vol. 126, issue 4, pp. 3501-3511 . Location: Switzerland . Status: published
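The reversal described in this abstract can be made concrete with a small worked example. The sketch below uses invented paper and citation counts for two hypothetical countries, "A" and "B", in two hypothetical fields; none of the numbers come from the study or from Essential Science Indicators. Country A has more citations per paper than B in each field, yet fewer once the fields are pooled, because most of B's papers sit in the highly cited field.

```python
# Hypothetical (papers, citations) counts per country and field; not real data.
counts = {
    "A": {"field_1": (10, 200), "field_2": (90, 450)},
    "B": {"field_1": (90, 1620), "field_2": (10, 40)},
}

def citations_per_paper(papers, citations):
    return citations / papers

def per_field_rates(country):
    """Citations per paper for one country, computed separately in each field."""
    return {field: citations_per_paper(*pc) for field, pc in counts[country].items()}

def pooled_rate(country):
    """Citations per paper for one country after combining all fields."""
    papers = sum(p for p, _ in counts[country].values())
    citations = sum(c for _, c in counts[country].values())
    return citations_per_paper(papers, citations)

def shows_reversal(x, y):
    """True if x beats y in every field but loses when the fields are pooled."""
    rates_x, rates_y = per_field_rates(x), per_field_rates(y)
    wins_each_field = all(rates_x[f] > rates_y[f] for f in rates_x)
    return wins_each_field and pooled_rate(x) < pooled_rate(y)

print(per_field_rates("A"), per_field_rates("B"))
print(pooled_rate("A"), pooled_rate("B"))
print(shows_reversal("A", "B"))
```

With these counts, A leads 20.0 to 18.0 in field_1 and 5.0 to 4.0 in field_2, while the pooled rates are 6.5 for A and about 16.6 for B.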
Popularity: Average (in bottom 99%) . Influence: Average (in bottom 99%)
- Publication . Article . 2020 . Closed Access
Authors: Andrés Carvallo; Denis Parra; Hans Lobel; Alvaro Soto
Publisher: Springer Science and Business Media LLC
Document screening is a fundamental task within Evidence-based Medicine (EBM), a practice that provides scientific evidence to support medical decisions. Several approaches have tried to reduce physicians’ workload of screening and labeling vast amounts of documents to answer clinical questions. Previous works tried to semi-automate document screening, reporting promising results, but their evaluation was conducted on small datasets, which hinders generalization. Moreover, recent works in natural language processing have introduced neural language models, but none have compared their performance in EBM. In this paper, we evaluate the impact of several document representations such as TF-IDF along with neural language models (BioBERT, BERT, Word2Vec, and GloVe) on an active learning-based setting for document screening in EBM. Our goal is to reduce the number of documents that physicians need to label to answer clinical questions. We evaluate these methods using both a small, challenging dataset (CLEF eHealth 2017) and a larger one that is easier to rank (Epistemonikos). Our results indicate that word as well as textual neural embeddings always outperform the traditional TF-IDF representation. When comparing among neural and textual embeddings, in the CLEF eHealth dataset the models BERT and BioBERT yielded the best results. On the larger dataset, Epistemonikos, Word2Vec and BERT were the most competitive, showing that BERT was the most consistent model across different corpora. In terms of active learning, an uncertainty sampling strategy combined with logistic regression achieved the best performance overall, above other methods under evaluation, and in fewer iterations. Finally, we compared the results of our best models, trained using active learning, with other authors' methods from CLEF eHealth, showing better results in terms of work saved for physicians in the document-screening task.
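The uncertainty sampling step mentioned in this abstract can be sketched as follows. This is a generic illustration rather than the authors' code: the function name `most_uncertain` and the example probabilities are hypothetical, and the sketch assumes a binary classifier (such as logistic regression) has already produced a predicted relevance probability for each unlabeled document.

```python
def most_uncertain(probabilities, batch_size):
    """Return indices of the documents whose predicted relevance is closest to 0.5.

    probabilities: predicted probability of relevance for each unlabeled document,
    assumed to come from an already trained binary classifier.
    """
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: abs(probabilities[i] - 0.5),
    )
    return ranked[:batch_size]

# Hypothetical predictions for five unlabeled documents
predicted = [0.95, 0.52, 0.10, 0.45, 0.70]
to_label = most_uncertain(predicted, batch_size=2)
print(to_label)
```

In an active learning loop, the selected documents would be sent to a physician for labeling, added to the training set, and the classifier retrained before the next selection round.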
Popularity: Average (in bottom 99%) . Influence: Average (in bottom 99%)
- Publication . Conference object . Other literature type . Article . 2020 . Open Access
Authors: Martin Wieland; Juan Gorraiz
handle: 11353/10.1218407
Country: Austria
From a historical point of view, Rome, and especially the University of La Sapienza, is closely linked to two geniuses of Baroque art: Bernini and Borromini. In this study, we analyze the rivalry between them from a scientometric perspective. This study also serves as a basis for exploring which data sources may be appropriate for broad impact assessment of individuals and/or celebrities. We pay special attention to encyclopaedias, library catalogues and other databases or types of publications that are not normally used for this purpose. The results show that some sources such as Wikipedia are not exploited according to the possibilities they offer, especially those related to different languages and cultures. Moreover, analyses are often reduced to a minimum number of data sources, which can distort the relevance of the outcome. Our results show that other sources normally not considered for this purpose, like JSTOR, PQDT, Google Scholar, Catalogue Holdings, etc., can provide more relevant or abundant information than the typically used Web of Science Core Collection and Scopus. Finally, we also contrast the opportunities and limitations of old and new (YouTube, Twitter) data sources, particularly the quality and accuracy of the search methods. Much room for improvement has been identified in order to use data sources more efficiently and with higher accuracy.
Popularity: Average (in bottom 99%) . Influence: Average (in bottom 99%)
- Publication . Article . 2020 . Closed Access
Authors: YiJun Liu; Li Zhang; Xiaoli Lian
Publisher: Springer Science and Business Media LLC
Keywords, serving as a dense summary of documents, are widely used in search engines and libraries for information retrieval, content classification, speech recognition and automated text summarization. However, a massive number of documents lack keywords, and the rapid generation of large amounts of content every day makes human annotation very time-consuming. Many studies show that network-based approaches have remarkable performance for extracting text keywords. Traditionally, words are connected based upon their occurrence in documents. One recent work shows the significant influence of sentences on keyword extraction, beyond the traditional methods that only consider words. In addition to words and sentences, chapters are essential parts that carry the higher-level semantic logic of documents. Inspired by this idea, we assume that chapters should contribute to keyword extraction too. We further add the chapter factor to build a three-layer network model and propose a Word-Sentence-Chapter network-based approach for keyword extraction. Two experiments, with Chinese and English documents respectively, indicate that our approach outperforms the state of the art.
Popularity: Average (in bottom 99%) . Influence: Average (in bottom 99%)
- Publication . Article . 2021 . Open Access
Authors: Mei Hsiu-Ching Ho; John S. Liu
Publisher: Springer Science and Business Media LLC
Scholars all over the world have produced a large body of COVID-19 literature in an exceptionally short period after the outbreak of this rapidly-spreading virus. An analysis of the literature accumulated in the first 150 days hints that the rapid knowledge accumulation in its early-stage development was expedited through a wide variety of journal platforms, a sense and pressure of national urgency, and inspiration from journal editorials.
Popularity: Average (in bottom 99%) . Influence: Average (in bottom 99%)
- Publication . Article . 2021 . Closed Access
Authors: Kiran Sharma
Publisher: Springer Science and Business Media LLC
The growth of retraction databases reveals a disturbing trend in science, and the rising number of citations to retracted papers is a serious concern. The objective of the study is to investigate the patterns of retractions through team size and retracted citations. The publication records of 12,231 retracted papers indexed by Web of Science (WoS) are analyzed to investigate (i) the patterns of retraction associated with collaboration and team size; and (ii) the impact of retracted papers on the papers that cite them (retracted citations). The study demonstrates the collaboration patterns of retracted publications, where 61.5% of authors have only one and 24.6% have two retracted papers; however, 2% of authors have more than retracted papers. Also, the temporal evolution of team size reveals that smaller teams have more retractions. The impact of citing retracted papers reveals that 55.2% of retracted papers have been cited at least once. One quarter of the citations to the retracted papers are self-citations that are themselves retractions. On average, 71.4% of citations are non-retracted citations and 28.6% are retracted citations, which are mostly self-citations. Last, the variation in average team size and average retracted citations in various research areas (having high retraction) is presented. Retracted publications in high-impact journals are highly cited.
Popularity: Average (in bottom 99%) . Influence: Average (in bottom 99%)
- Publication . Article . 2021 . Closed Access
Authors: Marek Kosmulski
Publisher: Springer Science and Business Media LLC
Popularity: Average (in bottom 99%) . Influence: Average (in bottom 99%)
66 Research products, page 1 of 7
Loading
- Publication . Article . 2020Closed AccessAuthors:Mingyang Wang; Jiaqi Zhang; Shijia Jiao; Xiangrong Zhang; Na Zhu; Guangsheng Chen;Mingyang Wang; Jiaqi Zhang; Shijia Jiao; Xiangrong Zhang; Na Zhu; Guangsheng Chen;Publisher: Springer Science and Business Media LLC
Citations are not equally important. Researchers presented different models and techniques to identify important citations. However, the features used in these work are relatively limited, so they cannot achieve good recognition performance. This paper proposed a new machine learning framework to distinguish important and non-important citations by examining the syntactic and contextual information of citations. Among them, syntactic features reflect the statistical perspective characteristics brought by citation behavior, such as the cited frequency and citation position of the cited article in the citing ones. Contextual features reflect the semantic content characteristics brought by citations, such as the intent and polarity of citations. Three feature selection algorithms, Pearson correlation coefficient, relief-F and entropy weight method, were used to calculate the contribution of each index on distinguishing different kinds of citations. On this basis, key features that can better identify the important citations were screened out. Three classifiers of support vector machine, KNN and random forest were used to test the classification performance of these key features. The experiment was performed on two annotated benchmark datasets. It showed that the framework proposed in this paper can achieve better classification performance compared with contemporary state-of-the-art research. The syntactic and contextual features of citation are of great value in identifying important citations.
Average popularityAverage popularity In bottom 99%Average influencePopularity: Citation-based measure reflecting the current impact.Average influence In bottom 99%Influence: Citation-based measure reflecting the total impact.add Add to ORCIDPlease grant OpenAIRE to access and update your ORCID works.This Research product is the result of merged Research products in OpenAIRE.
You have already added works in your ORCID record related to the merged Research product. - Publication . Article . Preprint . 2019 . Embargo End Date: 01 Jan 2019Open AccessAuthors:Iman Tahamtan; Lutz Bornmann;Iman Tahamtan; Lutz Bornmann;Publisher: arXiv
The purpose of this paper is to update the review of Bornmann and Daniel (2008) presenting a narrative review of studies on citations in scientific documents. The current review covers 41 studies published between 2006 and 2018. Bornmann and Daniel (2008) focused on earlier years. The current review describes the (new) studies on citation content and context analyses as well as the studies that explore the citation motivation of scholars through surveys or interviews. One focus in this paper is on the technical developments in the last decade, such as the richer meta-data available and machine-readable formats of scientific papers. These developments have resulted in citation context analyses of large datasets in comprehensive studies (which was not possible previously). Many studies in recent years have used computational and machine learning techniques to determine citation functions and polarities, some of which have attempted to overcome the methodological weaknesses of previous studies. The automated recognition of citation functions seems to have the potential to greatly enhance citation indices and information retrieval capabilities. Our review of the empirical studies demonstrates that a paper may be cited for very different scientific and non-scientific reasons. This result accords with the finding by Bornmann and Daniel (2008). The current review also shows that to better understand the relationship between citing and cited documents, a variety of features should be analyzed, primarily the citation context, the semantics and linguistic patterns in citations, citation locations within the citing document, and citation polarity (negative, neutral, positive). Comment: 56 pages, 4 figures, 11 tables
Substantial popularitySubstantial popularity In top 1%Average influencePopularity: Citation-based measure reflecting the current impact.Average influence In bottom 99%Influence: Citation-based measure reflecting the total impact.add Add to ORCIDPlease grant OpenAIRE to access and update your ORCID works.This Research product is the result of merged Research products in OpenAIRE.
You have already added works in your ORCID record related to the merged Research product. - Publication . Article . 2021Closed AccessAuthors:Imran Ihsan; M. Abdul Qadir;Imran Ihsan; M. Abdul Qadir;Publisher: Springer Science and Business Media LLC
In recent scientific advances, Artificial Intelligence and Natural Language Processing are the major contributors to classifying documents and extracting information. Classifying citations in different classes have gathered a lot of attention due to the large volume of citations available in different digital libraries. Typical citation classification uses sentiment analysis, where various techniques are applied to citations texts to mainly classify them in “Positive”, “Negative” and “Neutral” sentiments. However, there can be innumerable reasons why an author selects another research for citation. Citations’ Context and Reasons Ontology—CCRO uses a clear scientific method to articulate eight basic reasons for citing by using an iterative process of sentiment analysis, collaborative meanings, and experts' opinions. Using CCRO, this research paper adopts an ontology-based approach to extract citation's reasons and instantiate ontology classes and properties on two different corpora of citation sentences. One corpus of citation sentences is a publicly available dataset, while the other is our own manually curated. The process uses a two-step approach. The first part is an interface to manually annotate each citation text in the selected corpora on CCRO properties. A team of carefully selected annotators has annotated each citation to achieve a high inter-annotator agreement. The second part focuses on the automatic extraction of these reasons. Using Natural Language Processing, Mapping Graph, and Reporting Verb in a citation sentence, citation's reason is extracted and mapped onto a CCRO property. After comparing both manual and automatic mapping, accuracy is calculated. Based on experiments and results, accuracy is calculated for both publicly available and own corpora of citation sentences.
Average popularityAverage popularity In bottom 99%Average influencePopularity: Citation-based measure reflecting the current impact.Average influence In bottom 99%Influence: Citation-based measure reflecting the total impact.add Add to ORCIDPlease grant OpenAIRE to access and update your ORCID works.This Research product is the result of merged Research products in OpenAIRE.
You have already added works in your ORCID record related to the merged Research product. - Publication . Article . 2021Open Access EnglishAuthors:Zhiqi Wang; Ronald Rousseau;Zhiqi Wang; Ronald Rousseau;Publisher: Springer Science and Business Media LLCCountry: Belgium
The Yule-Simpson paradox refers to the fact that outcomes of comparisons between groups are reversed when groups are combined. Using Essential Sciences Indicators, a part of InCites (Clarivate), data for countries, it is shown that although the Yule-Simpson phenomenon in citation analysis and research evaluation is not common, it isn't extremely rare either. The Yule-Simpson paradox is a phenomenon one should be aware of, otherwise one may encounter unforeseen surprises in scientometric studies. ispartof: SCIENTOMETRICS vol:126 issue:4 pages:3501-3511 ispartof: location:Switzerland status: published
Average popularityAverage popularity In bottom 99%Average influencePopularity: Citation-based measure reflecting the current impact.Average influence In bottom 99%Influence: Citation-based measure reflecting the total impact.add Add to ORCIDPlease grant OpenAIRE to access and update your ORCID works.This Research product is the result of merged Research products in OpenAIRE.
You have already added works in your ORCID record related to the merged Research product. - Publication . Article . 2020Closed AccessAuthors:Andrés Carvallo; Denis Parra; Hans Lobel; Alvaro Soto;Andrés Carvallo; Denis Parra; Hans Lobel; Alvaro Soto;Publisher: Springer Science and Business Media LLC
Document screening is a fundamental task within Evidence-based Medicine (EBM), a practice that provides scientific evidence to support medical decisions. Several approaches have tried to reduce physicians’ workload of screening and labeling vast amounts of documents to answer clinical questions. Previous works tried to semi-automate document screening, reporting promising results, but their evaluation was conducted on small datasets, which hinders generalization. Moreover, recent works in natural language processing have introduced neural language models, but none have compared their performance in EBM. In this paper, we evaluate the impact of several document representations such as TF-IDF along with neural language models (BioBERT, BERT, Word2Vec, and GloVe) on an active learning-based setting for document screening in EBM. Our goal is to reduce the number of documents that physicians need to label to answer clinical questions. We evaluate these methods using both a small challenging dataset (CLEF eHealth 2017) as well as a larger one but easier to rank (Epistemonikos). Our results indicate that word as well as textual neural embeddings always outperform the traditional TF-IDF representation. When comparing among neural and textual embeddings, in the CLEF eHealth dataset the models BERT and BioBERT yielded the best results. On the larger dataset, Epistemonikos, Word2Vec and BERT were the most competitive, showing that BERT was the most consistent model across different corpuses. In terms of active learning, an uncertainty sampling strategy combined with a logistic regression achieved the best performance overall, above other methods under evaluation, and in fewer iterations. Finally, we compared the results of evaluating our best models, trained using active learning, with other authors methods from CLEF eHealth, showing better results in terms of work saved for physicians in the document-screening task.
Average popularityAverage popularity In bottom 99%Average influencePopularity: Citation-based measure reflecting the current impact.Average influence In bottom 99%Influence: Citation-based measure reflecting the total impact.add Add to ORCIDPlease grant OpenAIRE to access and update your ORCID works.This Research product is the result of merged Research products in OpenAIRE.
You have already added works in your ORCID record related to the merged Research product. - Publication . Conference object . Other literature type . Article . 2020Open AccessAuthors:Martin Wieland; Juan Gorraiz;Martin Wieland; Juan Gorraiz;
handle: 11353/10.1218407
Country: AustriaAbstractFrom a historical point of view, Rome and especially the University of La Sapienza, are closely linked to two geniuses of Baroque art: Bernini and Borromini. In this study, we analyze the rivalry between them from a scientometric perspective. This study also serves as a basis for exploring which data sources may be appropriate for broad impact assessment of individuals and/or celebrities. We pay special attention to encyclopaedias, library catalogues and other databases or types of publications that are not normally used for this purpose. The results show that some sources such as Wikipedia are not exploited according to the possibilities they offer, especially those related to different languages and cultures. Moreover, analyses are often reduced to a minimum number of data sources, which can distort the relevance of the outcome. Our results show that other sources normally not considered for this purpose, like JSTOR, PQDT, Google Scholar, Catalogue Holdings, etc. can provide more relevant or abundant information than the typically used Web of Science Core Collection and Scopus. Finally, we also contrast opportunities and limitation of old and new (YouTube, Twitter) data sources (particularly the aspects quality and accuracy of the search methods). Much room for improvement has been identified in order to use data sources more efficiently and with higher accuracy.
Average popularityAverage popularity In bottom 99%Average influencePopularity: Citation-based measure reflecting the current impact.Average influence In bottom 99%Influence: Citation-based measure reflecting the total impact.add Add to ORCIDPlease grant OpenAIRE to access and update your ORCID works.This Research product is the result of merged Research products in OpenAIRE.
You have already added works in your ORCID record related to the merged Research product. - Publication . Article . 2020Closed AccessAuthors:YiJun Liu; Li Zhang; Xiaoli Lian;YiJun Liu; Li Zhang; Xiaoli Lian;Publisher: Springer Science and Business Media LLC
Keywords serving a dense summary of documents, are widely used in search engine and library to do information retrieval, content classification, speech recognition and automated text summarization. However, massive documents are lack of keywords, and the rapid generation of the large amount of content every day makes the human annotation really time-consuming. Lots of researches show that network-based approaches have remarkable performance for extracting text keywords. Traditionally, words are connected based upon their occurrence in documents. One recent work shows the significant influence of sentences on keywords extraction beyond the traditional methods only considering words. While in addition to words and sentences, chapters are the essential parts that are organized as the higher level semantic logic of the documents. Inspired by this idea, we therefore assume that chapters should contribute to the keyword extraction too. We further add the chapter factor to build a three-layer network model and propose a Word-Sentence-Chapter network-based approach for keywords extraction. Two experiments with Chinese and English documents respectively indicate that our approach outperforms the state of arts.
- Publication . Article . 2021 . Open Access
Authors: Mei Hsiu-Ching Ho; John S. Liu
Publisher: Springer Science and Business Media LLC
Scholars all over the world have produced a large body of COVID-19 literature in an exceptionally short period after the outbreak of this rapidly-spreading virus. An analysis of the literature accumulated in the first 150 days hints that the rapid knowledge accumulation in its early-stage development was expedited through a wide variety of journal platforms, a sense and pressure of national urgency, and inspiration from journal editorials.
- Publication . Article . 2021 . Closed Access
Authors: Kiran Sharma
Publisher: Springer Science and Business Media LLC
The growth of retraction databases reveals a disturbing trend in science, and the rising number of citations to retracted papers is a serious concern. The objective of the study is to investigate the patterns of retractions through team size and retracted citations. The publication records of 12,231 retracted papers indexed by Web of Science (WoS) are analyzed to investigate (i) the patterns of retraction associated with collaboration and team size; and (ii) the impact of retracted papers on the papers that cite them (retracted citations). The study demonstrates the collaboration patterns of retracted publications, where 61.5% of authors have only one and 24.6% have two retracted papers; however, 2% of authors have more than retracted papers. Also, the temporal evolution of team size reveals that smaller teams have more retractions. The analysis of the impact of citing retracted papers reveals that 55.2% of retracted papers have been cited at least once. One quarter of the citations to retracted papers are self-citations that are themselves retractions. On average, 71.4% of citations are non-retracted citations and 28.6% are retracted citations, which are mostly self-citations. Last, the variation in average team size and average retracted citations across research areas with high retraction counts is presented. Retracted publications in high-impact journals are highly cited.
- Publication . Article . 2021 . Closed Access
Authors: Marek Kosmulski
Publisher: Springer Science and Business Media LLC