25 Research products, page 1 of 3
- Research data . 2021
Authors: Mulwafu, Watipaso; Xiao, Guangyi
doi: 10.21227/38e1-s785
Publisher: IEEE DataPort
This is a dataset containing 1,661 movie scripts. The movie scripts were extracted thanks to the RiTUAL Lab. It is a subset and variation of this dataset. On our part, we added age certificates and severity levels to it. These severity levels cover profanity, violence, and sexual content.
- Research data . 2021
Authors: Melendez Barros, Jose; De Bona, Glauber
doi: 10.21227/0ej1-br13
Publisher: IEEE DataPort
Aspect Sentiment Triplet Extraction (ASTE) is a subtask of Aspect-Based Sentiment Analysis (ABSA). It aims to extract aspect-opinion pairs from a sentence and identify the sentiment polarity associated with them. For instance, given the sentence "Large rooms and great breakfast", ASTE outputs the triplet set T = {(rooms, large, positive), (breakfast, great, positive)}. Although several approaches to ABSA have recently been proposed, those for Portuguese have been mostly limited to extracting only aspects without addressing ASTE tasks. This work aims to develop a framework based on Deep Learning to perform the Aspect Sentiment Triplet Extraction task in Portuguese. The framework uses BERT as a context-aware sentence encoder, multiple parallel non-linear layers to get aspect and opinion representations, and a Graph Attention layer along with a Biaffine scorer to determine the sentiment dependency between each aspect-opinion pair. The comparison results show that our proposed framework significantly outperforms the baselines in Portuguese and is competitive with its counterparts in English.
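The triplet format described in this abstract can be illustrated with a small Python sketch. It only represents the example output from the abstract as data; the `Triplet` type and the `terms_in_sentence` helper are hypothetical names introduced here, and nothing below reflects the dataset's actual file format or the authors' model.

```python
from typing import NamedTuple


class Triplet(NamedTuple):
    """One ASTE output item: (aspect term, opinion term, sentiment polarity)."""
    aspect: str
    opinion: str
    sentiment: str  # assumed label set: "positive", "negative", "neutral"


# Example sentence and expected triplets, taken from the abstract.
sentence = "Large rooms and great breakfast"
triplets = {
    Triplet("rooms", "large", "positive"),
    Triplet("breakfast", "great", "positive"),
}


def terms_in_sentence(sentence: str, triplets: set) -> bool:
    """Sanity check: every aspect and opinion term must occur in the sentence."""
    tokens = sentence.lower().split()
    return all(t.aspect in tokens and t.opinion in tokens for t in triplets)


if __name__ == "__main__":
    for t in sorted(triplets):
        print(t.aspect, t.opinion, t.sentiment)
    print(terms_in_sentence(sentence, triplets))
```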
- Research data . 2021
Authors: Ilyevsky, Thomas Victor; Johansen, Jared Sigurd; Siskind, Jeffrey Mark
doi: 10.21227/zxk7-ca24
Publisher: IEEE DataPort
Dataset associated with a paper at the 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems: "Talk the talk and walk the walk: Dialogue-driven navigation in unknown indoor environments". If you use this code or data, please cite the above paper.
- Research data . 2021
Authors: Aberkane, Abdel-Jaouad
doi: 10.21227/0mz8-ws69
Publisher: IEEE DataPort
The General Data Protection Regulation (GDPR), applicable since 2018, profoundly impacts information processing organizations as they must comply with this regulation. In this research, we consider GDPR compliance as a high-level goal in software development that should be addressed at the outset of software development, meaning during requirements engineering (RE). In this work, we hypothesize that Natural Language Processing (NLP) can offer a viable means to automate this process. We conducted a systematic mapping study to explore the existing literature on the intersection of GDPR, RE, and NLP. As a result, we identified 448 relevant studies, of which the majority (420) were related to NLP and RE. Research on the intersection of GDPR and NLP yielded nine studies, while 20 studies were related to GDPR and RE. Even though only one study was identified on the convergence of GDPR, NLP, and RE, the mapping results indicate opportunities for bridging the gap between these fields. In particular, we identified possibilities for introducing NLP techniques to automate manual RE tasks at the intersection of GDPR and RE, in addition to possibilities of using NLP-based machine learning techniques to achieve GDPR compliance in RE.
- Research data . 2021
A selection of the COVID-19 Open Research Dataset used for exploring the efficacy of the LDaRM text analytics technique.
- Research data . 2020 . Open Access
Authors: Chowdhury, Sawrav
Publisher: Data Archiving and Networked Services (DANS)
Depression is currently one of the most common problems in our society. Most of the time, people commit suicide because of depression. To date, there is no proper laboratory test for detecting depression. Generally, doctors detect depression by asking knowledge-based questions. On the other hand, a large number of people now use social media platforms, where they share their daily experiences, emotions, and other activities with their friends. Twitter is one of the most common social platforms and is also popular for data collection. I collected these datasets from Twitter based on a set of depression-related words. I hope that these Twitter datasets will help researchers detect depression more precisely.
- Research data . 2020
Wine has been popular with the public for centuries, and there is a wide variety of wines on the market to choose from. Among them, Bordeaux, France, is considered the most famous wine region in the world. In this paper, we try to understand Bordeaux wines made in the 21st century through a Wineinformatics study. We developed and studied two datasets: the first covers all Bordeaux wines from 2000 to 2016, and the second covers all wines listed in a famous collection of Bordeaux wines, the 1855 Bordeaux Wine Official Classification, from 2000 to 2016. A total of 14,349 wine reviews are collected in the first dataset, and 1,359 wine reviews in the second dataset. To understand the relation between wine quality and characteristics, a Naïve Bayes classifier is applied to predict the quality (90+/89−) of wines. A Support Vector Machine (SVM) classifier is also applied as a comparison. In the first dataset, the SVM classifier achieves the best accuracy of 86.97%; in the second dataset, the Naïve Bayes classifier achieves the best accuracy of 84.62%. Precision, recall, and F-score are also used as measures to describe the performance of our models. Meaningful features associated with high-quality 21st-century Bordeaux wines can be presented through this research.
- Research data . 2020
Authors: Chen, Bernard
Publisher: IEEE DataPort
Wine has been popular with the public for centuries, and there is a wide variety of wines on the market to choose from. Among them, Bordeaux, France, is considered the most famous wine region in the world. In this paper, we try to understand Bordeaux wines made in the 21st century through a Wineinformatics study. We developed and studied two datasets: the first covers all Bordeaux wines from 2000 to 2016, and the second covers all wines listed in a famous collection of Bordeaux wines, the 1855 Bordeaux Wine Official Classification, from 2000 to 2016. A total of 14,349 wine reviews are collected in the first dataset, and 1,359 wine reviews in the second dataset. To understand the relation between wine quality and characteristics, a Naïve Bayes classifier is applied to predict the quality (90+/89−) of wines. A Support Vector Machine (SVM) classifier is also applied as a comparison. In the first dataset, the SVM classifier achieves the best accuracy of 86.97%; in the second dataset, the Naïve Bayes classifier achieves the best accuracy of 84.62%. Precision, recall, and F-score are also used as measures to describe the performance of our models. Meaningful features associated with high-quality 21st-century Bordeaux wines can be presented through this research.
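As a rough illustration of the evaluation measures named in this abstract, the sketch below computes precision, recall, and F-score for a binary 90+/89− labelling in plain Python. The label lists are made-up placeholders rather than data from the wine datasets, the function name is hypothetical, and treating "90+" as the positive class is an assumption.

```python
def precision_recall_f1(y_true, y_pred, positive="90+"):
    """Return (precision, recall, F1) for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Placeholder labels for illustration only (not taken from the datasets).
y_true = ["90+", "90+", "89-", "89-", "90+"]
y_pred = ["90+", "89-", "89-", "90+", "90+"]

if __name__ == "__main__":
    p, r, f = precision_recall_f1(y_true, y_pred)
    print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Reporting these alongside accuracy matters when the two quality classes are imbalanced, since accuracy alone can hide poor performance on the smaller class.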
- Research data . 2020
Authors: Moneda, Luis; Yonekura, David; Guedes, Elloá
Publisher: IEEE DataPort
This dataset is a collection of images and their respective labels containing examples of multiple Brazilian coins. Its primary purpose is to support the development of Computer Vision techniques for automatic detection of such objects, i.e., localization and classification tasks. It contains coins of R$ 0.05, 0.10, 0.25, 0.50 and 1.00 in Brazilian currency from the 2nd family, as manufactured by Casa da Moeda (http://www.casadamoeda.gov.br) since 2010. The samples were collected with a mobile phone and contain multiple coins placed upon a flat white A4 sheet of paper. Labels were produced by a group of several individuals of both sexes and reviewed in detail. Each label has a circular or polygon shape and denotes the value in cents of the coin it relates to. This dataset improves on Brazilian Coins (available at https://www.kaggle.com/lgmoneda/br-coins) by adding location labels.
- Research data . 2020
Authors: Hyun, Young Geun; Ko, Jindeuk; Han, Jeong Hyeon
Publisher: IEEE DataPort
The age of Artificial Intelligence (AI) is coming. Since Natural Language Processing (NLP) is a core AI technology for communication between humans and devices, it is vital to understand its technological trends. Early research on NLP focused on syntactic processing such as information extraction and subject modeling but later developed into semantic-oriented analysis. To analyze technological trends concerning NLP, especially semantic analysis, patent data that contains objective and extensive information is analyzed. The analysis procedure uses text mining to collect patent information, followed by pre-processing and analysis of keyword frequency, keyword networks, and time series. The results reveal a difference in the direction of technological development, as the core keywords differ in frequency and centrality among countries. In addition, from the time series analysis of five intervals over 20 years, twelve keywords with rising or falling trends are observed in the US, seven in the EU, and five in Korea. The greater number of keywords implies that the US underwent further technological progress compared to other countries. Moreover, the technical linkage between the US and the EU is presumed to be stronger than that between the US and Korea, based on keyword similarity over time. The analysis results of this study can be used as valuable references for future technical predictions related to NLP.