Publication: Article, Conference object. 2019. Sweden.
Publisher: Incoma Ltd., Shoumen, Bulgaria
Authors: Vu, Xuan-Son; Vu, Thanh; Tran, Son N.; Jiang, Lili

Given the many recent advanced embedding models, selecting the pre-trained word embedding (a.k.a. word representation) models that best fit a specific downstream task is non-trivial. In this paper, we propose a systematic approach, called ETNLP, for extracting, evaluating, and visualizing multiple sets of pre-trained word embeddings to determine which embeddings should be used in a downstream task. For extraction, we provide a method to extract subsets of the embeddings to be used in the downstream task. For evaluation, we analyse the quality of pre-trained embeddings using an input word analogy list. Finally, we visualize the word representations in the embedding space to explore the embedded words interactively. We demonstrate the effectiveness of the proposed approach on our pre-trained Vietnamese word embedding models by selecting which models are suitable for a named entity recognition (NER) task. Specifically, we create a large Vietnamese word analogy list to evaluate and select the pre-trained embedding models for the task. We then use the selected embeddings for the NER task and achieve new state-of-the-art results on the task benchmark dataset. We also apply the approach to another downstream task, privacy-guaranteed embedding selection, and show that it helps users quickly select the most suitable embeddings. In addition, we create an open-source system using the proposed systematic approach to facilitate similar studies on other NLP tasks. The source code and data are available at https://github.com/vietnlp/etnlp.

Comment: 10 pages
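The evaluation step described in the abstract, scoring pre-trained embeddings against a word analogy list and picking the best set, can be illustrated with a minimal sketch. This is not the ETNLP implementation: it assumes the common vector-offset rule (b - a + c, ranked by cosine similarity), and the function names, the toy 2-d vectors, and the single English analogy are all invented for illustration.

```python
import math

def unit(vector):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector)) or 1.0
    return [x / norm for x in vector]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def analogy_accuracy(embeddings, analogies):
    """Fraction of analogies a:b :: c:d answered correctly by the offset rule.

    Analogies that contain an out-of-vocabulary word are skipped.
    """
    vecs = {word: unit(vec) for word, vec in embeddings.items()}
    correct = total = 0
    for a, b, c, d in analogies:
        if any(word not in vecs for word in (a, b, c, d)):
            continue
        query = [vb - va + vc for va, vb, vc in zip(vecs[a], vecs[b], vecs[c])]
        # The three question words are excluded from the candidate answers.
        candidates = (w for w in vecs if w not in (a, b, c))
        predicted = max(candidates, key=lambda w: dot(vecs[w], query), default=None)
        correct += predicted == d
        total += 1
    return correct / total if total else 0.0

# Hypothetical toy embedding sets (2-d, hand-made, not real model output).
good = {"king": [1.0, 1.0], "queen": [1.0, -1.0], "man": [0.5, 1.0],
        "woman": [0.5, -1.0], "apple": [-1.0, 0.0]}
bad = {"king": [1.0, 0.0], "queen": [-1.0, 0.0], "man": [0.0, 1.0],
       "woman": [0.0, -1.0], "apple": [1.0, -1.0]}
analogies = [("man", "woman", "king", "queen")]

models = {"good": good, "bad": bad}
scores = {name: analogy_accuracy(emb, analogies) for name, emb in models.items()}
best = max(scores, key=scores.get)  # embedding set with the highest analogy accuracy
```

With a real analogy list, the same loop would be run once per candidate embedding set, and the highest-scoring set would be carried forward to the downstream task.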
Sources:
- arXiv.org e-Print Archive: Other literature type, Preprint, 2019. Data sources: arXiv.org e-Print Archive
- https://doi.org/10.26615/978-9...: Conference object, 2019, Peer-reviewed
- Publikationer från Umeå universitet; Digitala Vetenskapliga Arkivet - Academic Archive On-line: Conference object, 2019, Peer-reviewed
- https://doi.org/10.48550/arxiv...: Article, 2019. License: arXiv Non-Exclusive Distribution. Data sources: Datacite

Access routes: Green, bronze. Citations: 12. Popularity: Top 10%. Influence: Top 10%. Impulse: Top 10%.

Publication: Conference object, Article. 2018. Sweden.
Publisher: Association for Computational Linguistics (ACL)
Authors: Vu, Thanh; Nguyen, Dat Quoc; Vu, Xuan-Son; Nguyen, Dai Quoc; Catt, Michael; Trenell, Michael

This paper describes our NIHRIO system for SemEval-2018 Task 3, "Irony detection in English tweets". We propose a simple neural network architecture, a Multilayer Perceptron, with several types of input features: lexical, syntactic, semantic, and polarity features. Our system achieves very high performance in both the binary and the multi-class irony detection subtasks. In particular, we rank third using the accuracy metric and fifth using the F1 metric. Our code is available at https://github.com/NIHRIO/IronyDetectionInTwitter.

Comment: In Proceedings of the 12th International Workshop on Semantic Evaluation, SemEval 2018, to appear (6 pages, 2 figures)
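The architecture named in the abstract, a Multilayer Perceptron fed with several feature groups, can be sketched as a forward pass over a concatenated feature vector. This is only an illustration under stated assumptions, not the NIHRIO system: the feature values, the single hidden layer of 4 units, the ReLU activation, and the random untrained weights are all hypothetical, so the output probabilities carry no meaning beyond showing the data flow.

```python
import math
import random

def concat_features(groups):
    """Join per-group feature vectors into one input vector, in sorted group order."""
    return [x for name in sorted(groups) for x in groups[name]]

def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    return [max(0.0, v) for v in values]

def softmax(values):
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def init_layer(rng, n_in, n_out):
    """Random weights and zero biases (assumed initialisation, no training here)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def mlp_forward(x, layers):
    """Apply ReLU hidden layers, then a softmax output layer."""
    *hidden, output = layers
    for weights, biases in hidden:
        x = relu(dense(x, weights, biases))
    weights, biases = output
    return softmax(dense(x, weights, biases))

# Hypothetical feature values for one tweet, one vector per feature group.
features = {
    "lexical": [1.0, 0.0, 1.0],
    "syntactic": [0.2, 0.8],
    "semantic": [0.1, -0.3, 0.5],
    "polarity": [0.9, -0.4],
}
x = concat_features(features)
rng = random.Random(0)
layers = [init_layer(rng, len(x), 4), init_layer(rng, 4, 2)]  # 2 classes: binary subtask
probs = mlp_forward(x, layers)
```

For the multi-class subtask, the only structural change in this sketch would be the size of the output layer.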
Sources:
- Publikationer från Umeå universitet; Digitala Vetenskapliga Arkivet - Academic Archive On-line: Conference object, 2018, Peer-reviewed
- arXiv.org e-Print Archive: Other literature type, Preprint, 2018. Data sources: arXiv.org e-Print Archive
- https://doi.org/10.48550/arxiv...: Article, 2018. License: arXiv Non-Exclusive Distribution. Data sources: Datacite

Access routes: Green, hybrid. Citations: 12. Popularity: Top 10%. Influence: Average. Impulse: Top 10%.

For further information contact us at helpdesk@openaire.eu