Article · 2022 · IEEE/ACM Transactions on Audio, Speech, and Language Processing · Peer-reviewed
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Authors: Zhuosheng Zhang; Haojie Yu; Hai Zhao; Masao Utiyama

Recent pre-trained language models (PrLMs) offer a new, performant approach to contextualized word representation by leveraging sequence-level context during modeling. Although PrLMs generally provide more effective contextualized word representations than non-contextualized models, they still rely solely on textual context, without the diverse cues that other modalities can provide. This paper therefore proposes a visual representation method that explicitly enhances conventional word embeddings with multiple-aspect senses derived from visual guidance. In detail, we build a small-scale word-image dictionary from a multimodal seed dataset, in which each word corresponds to diverse related images. Experiments on 12 natural language understanding and machine translation tasks verify the effectiveness and generalization capability of the proposed approach. Analysis shows that our method with visual guidance pays more attention to content words, improves representation diversity, and is potentially beneficial for improving disambiguation accuracy.

Citations: 0 (data source: Crossref; license: IEEE Copyright)
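As a rough illustration of the idea in the abstract above, enhancing a word's text embedding with features of its associated images, here is a minimal sketch. The plain-list vectors, the averaging scheme, and the mixing weight `alpha` are all assumptions for illustration, not the paper's actual model:

```python
def enhance_embedding(word_vec, image_vecs, alpha=0.5):
    """Mix a word embedding with the mean of its associated image features.

    word_vec   -- text-only embedding of the word (list of floats)
    image_vecs -- feature vectors of images linked to the word in a
                  word-image dictionary (may be empty)
    alpha      -- hypothetical mixing weight for the visual signal
    """
    if not image_vecs:
        return list(word_vec)  # no visual guidance: keep the text embedding
    dim = len(word_vec)
    # average the image features into one visual sense vector
    mean_img = [sum(v[i] for v in image_vecs) / len(image_vecs) for i in range(dim)]
    # interpolate text and visual signals
    return [(1 - alpha) * w + alpha * m for w, m in zip(word_vec, mean_img)]
```

In the paper the visual features would come from an image encoder over the word-image dictionary; here they are toy vectors.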
Article · 2022 · IEEE/ACM Transactions on Audio, Speech, and Language Processing · Peer-reviewed
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Authors: Ziyao Lu; Li Xiang; Yang Liu; Chulun Zhou; Jianwei Cui; Bin Wang; Min Zhang; Jinsong Su

Existing studies on multi-source neural machine translation (NMT) either model different source sentences separately or fall back on conventional single-source NMT by simply concatenating all source sentences. These approaches have two drawbacks. First, they ignore the explicit word-level semantic interactions between source sentences, which have been shown to be effective for embedding multilingual texts. Second, when multiple source sentences are encoded simultaneously by a single NMT model, the semantic information of each individual source sentence cannot be fully exploited. In this paper, we explore multi-stage information interactions for multi-source NMT. Specifically, we first propose a multi-source NMT model that performs information interactions at the encoding stage. Its encoder contains multiple semantic interaction layers, each of which consists of (1) a monolingual semantic interaction sub-layer, based on the self-attention mechanism and used to learn word-level monolingual contextual representations of the source sentences, and (2) a cross-lingual semantic interaction sub-layer, which leverages word alignments to perform fine-grained semantic transitions among the hidden states of different source sentences. Furthermore, at the training stage, we introduce a mutual-distillation-based training framework in which single-source models and ours exchange information. This framework fully exploits the semantic information of each source sentence to enhance our model. Extensive experimental results on the WMT14 English-German-French dataset show that our model achieves significant improvements over competitive baselines.

Citations: 1 (data source: Crossref; license: IEEE Copyright)
Article · 2021 · IEEE/ACM Transactions on Audio, Speech, and Language Processing · Peer-reviewed
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Authors: Guanlin Li; Lemao Liu; Conghui Zhu; Rui Wang; Tiejun Zhao; Shuming Shi

In machine translation evaluation, conventional practice measures a model's generalization ability in an average sense, for example with corpus-level BLEU. However, corpus-level BLEU statistics cannot provide a comprehensive understanding or fine-grained analysis of a model's generalization ability. As a remedy, this paper attempts to understand NMT at a fine-grained level by detecting contextual barriers within an unseen input sentence that cause degradation in the model's translation quality. It proposes a principled definition of source contextual barriers, together with a modified version that is computationally tractable and operates at the word level. Based on the modified definition, three simple methods are proposed for barrier detection via search-aware risk estimation through counterfactual generation. Extensive analyses of the detected contextual barrier words are conducted on both directions of the Zh⇔En NIST benchmarks. Potential uses motivated by barrier words are also discussed.

Citations: 0 (data source: Crossref; license: IEEE Copyright)
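The barrier-detection idea, scoring each source word by whether a counterfactual substitution would improve estimated translation quality, can be caricatured as follows. The `score_fn` interface, the substitute table, and the threshold are hypothetical stand-ins for the paper's search-aware risk estimation, not its actual procedure:

```python
def detect_barriers(sentence, score_fn, substitutes, threshold=0.1):
    """Flag words whose counterfactual replacement most improves quality.

    sentence    -- source sentence as a list of tokens
    score_fn    -- estimated translation quality of a source sentence
                   (hypothetical black box standing in for risk estimation)
    substitutes -- candidate replacements per word, used to generate
                   counterfactual sentences
    """
    base = score_fn(sentence)
    barriers = []
    for i, word in enumerate(sentence):
        # best achievable quality when word i is counterfactually replaced
        best = max(
            score_fn(sentence[:i] + [sub] + sentence[i + 1:])
            for sub in substitutes.get(word, [word])
        )
        # if some substitute clearly helps, word i was acting as a barrier
        if best - base > threshold:
            barriers.append(word)
    return barriers
```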
Article · 2021 · IEEE/ACM Transactions on Audio, Speech, and Language Processing · Peer-reviewed
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Authors: Xixin Wu; Yuewen Cao; Hui Lu; Songxiang Liu; Shiyin Kang; Zhiyong Wu; Xunying Liu; Helen Meng

Expressive text-to-speech (E-TTS) synthesis is important for enhancing the user experience of spoken communication with machines. One challenge in E-TTS, however, is the lack of a precise description of emotions. Categorical specifications can be insufficient for describing complex emotions, while dimensional specifications suffer from ambiguity in annotation. This work advocates a new approach: describing emotive speech acoustics through spoken exemplars. We investigate methods to extract emotion descriptions from an input exemplar of emotive speech. The extracted measures are combined into two descriptors, based on a capsule network (CapNet) and a residual error network (RENet). The former is designed to capture the spatial information in the input exemplary spectrogram, and the latter to capture the contrastive information between emotive acoustic expressions. Two different approaches convert the variable-length feature sequence into a fixed-size description vector: (1) dynamic routing groups similar capsules into the output description, and (2) a recurrent neural network's hidden states store the temporal information for the description. The two descriptors are integrated into a state-of-the-art sequence-to-sequence architecture, yielding an end-to-end system that is optimized as a whole toward the goal of generating correct emotive speech. Experimental results on a public audiobook dataset demonstrate that the two exemplar-based approaches achieve significant performance improvements over the baseline system in both emotion similarity and speech quality.

Citations: 3 · Popularity: Top 10% (data source: Crossref; license: IEEE Copyright)
Article · 2021 · IEEE/ACM Transactions on Audio, Speech, and Language Processing · Peer-reviewed
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Authors: Yi Zhou; Xiaoqing Zheng; Xuanjing Huang

Recently, many efforts have been devoted to generating responses that express a specific emotion or relate to a given topic in a controlled manner. However, limited attention has been paid to generating responses with a specified syntactic pattern, which would make it possible to imitate someone's way of speaking in dialogue. To fulfill this goal, we propose two models for generating syntax-aware responses: a gross-constraint model and a specific-constraint model. The former controls the syntactic patterns of generated responses at the sentence level, while the latter works on smaller language units, such as words or phrases, and can manipulate the syntactic structures of responses in a more subtle manner. Extensive experimental results on two different datasets show that both models not only generate meaningful responses with a specific, coherent structure but also improve the diversity of generated responses, with corresponding gains in readability, relevance, and diversity as measured by human judges.

Citations: 0 (data source: Crossref; license: IEEE Copyright)
Article · 2021 · IEEE/ACM Transactions on Audio, Speech, and Language Processing · Peer-reviewed
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Authors: Jiacheng Zhang; Huanbo Luan; Maosong Sun; Feifei Zhai; Jingfang Xu; Yang Liu

While neural machine translation has achieved state-of-the-art translation performance, it is unable to capture the alignment between the input and output during the translation process. The lack of alignment in neural machine translation models leads to three problems: it is hard to (1) interpret the translation process, (2) impose lexical constraints, and (3) impose structural constraints. These problems not only increase the difficulty of designing new architectures for neural machine translation, but also limit its applications in practice. To alleviate these problems, we propose to introduce explicit phrase alignment into the translation process of arbitrary neural machine translation models. The key idea is to build a search space similar to that of phrase-based statistical machine translation for neural machine translation, where phrase alignment is readily available. We design a new decoding algorithm that can easily impose lexical and structural constraints. Experiments show that our approach makes the translation process of neural machine translation more interpretable without sacrificing translation quality. In addition, our approach achieves significant improvements in lexically and structurally constrained translation tasks.

Citations: 3 · Also available as a 2019 arXiv preprint (open access: Green/bronze; data sources: Crossref, arXiv.org e-Print Archive, Datacite)
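A toy stand-in for lexically constrained decoding, one of the capabilities the abstract above highlights, looks like this. The greedy loop, the `step_fn` interface, and the flush-before-EOS rule are simplifying assumptions; the paper's actual algorithm searches over a phrase-alignment space:

```python
def constrained_greedy_decode(step_fn, constraints, max_len=30, eos="</s>"):
    """Greedy decoding that guarantees each constraint phrase appears.

    step_fn     -- maps the partial output to the model's next token
                   (hypothetical stand-in for an NMT decoder step)
    constraints -- phrases (lists of tokens) that must occur in the output
    """
    output, pending = [], [list(p) for p in constraints]
    while len(output) < max_len:
        tok = step_fn(output)
        # if the model wants to start a pending phrase, emit it whole
        hit = next((p for p in pending if p[0] == tok), None)
        if hit is not None:
            output.extend(hit)
            pending.remove(hit)
            continue
        if tok == eos:
            for p in pending:      # flush unmet constraints before closing
                output.extend(p)
            output.append(eos)
            break
        output.append(tok)
    return output
```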
Article · 2021 · IEEE/ACM Transactions on Audio, Speech, and Language Processing · Peer-reviewed
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Authors: Licheng Zhang; Zhendong Mao; Benfeng Xu; Quan Wang; Yongdong Zhang

With the notable success of pretrained language models, the pretraining-fine-tuning paradigm has become a dominant solution for natural language understanding (NLU) tasks. Typically, the training instances of a target NLU task are introduced in a completely random order and treated equally at the fine-tuning stage. However, these instances can vary greatly in difficulty, and similar to human learning procedures, language models can benefit from an easy-to-difficult curriculum. Based on this concept, we propose a curriculum learning (CL) framework. Our framework consists of two stages, Review and Arrange, targeting the two main challenges in curriculum learning: how to define the difficulty of instances, and how to arrange a curriculum based on that difficulty. In the first stage, we devise a cross-review (CR) method that first trains several teacher models and then reviews the training set in a crossed manner to distinguish easy instances from difficult ones. In the second stage, two sampling algorithms, a coarse-grained arrangement (CGA) and a fine-grained arrangement (FGA), are proposed to arrange a curriculum for language models in which the learning materials start from the easiest instances and more difficult instances are gradually added into the training procedure. Compared to previous heuristic CL methods, our framework can avoid the errors caused by a gap in difficulty between humans and machines, and has strong generalization ability. We conduct comprehensive experiments, and the results show that our curriculum learning framework, without any manual model architecture design or use of external data, obtains significant and universal performance improvements on a wide range of NLU tasks in different languages.

Citations: 1 (data source: Crossref; license: IEEE Copyright)
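The two-stage recipe above (score difficulty with several teachers, then arrange the data easy-to-hard) might be sketched as follows. The teacher interface, the error-count difficulty measure, and the bucket scheme are simplified assumptions, not the paper's cross-review training protocol or its exact CGA/FGA samplers:

```python
def cross_review_difficulty(instances, teachers):
    """Score each instance by how many teacher models get it wrong.

    teachers -- callables returning True iff the teacher handles the
                instance correctly (stand-ins for models trained on
                other folds, as in cross-review)
    """
    return {x: sum(0 if t(x) else 1 for t in teachers) for x in instances}

def coarse_grained_arrangement(instances, teachers, n_buckets=2):
    """Order training data from easy to hard, grouped into difficulty buckets."""
    diff = cross_review_difficulty(instances, teachers)
    ordered = sorted(instances, key=lambda x: diff[x])  # easiest first
    size = max(1, len(ordered) // n_buckets)
    # training would start on ordered[0:size] and add buckets gradually
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]
```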
Article · 2021 · IEEE/ACM Transactions on Audio, Speech, and Language Processing · Peer-reviewed
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Authors: Qing Liu; Lei Chen; Yuan Yuan; Huarui Wu

Recurrent neural network (RNN) based abstractive text summarization models have made great progress over the past few years, largely driven by the encoder-decoder architecture. However, there has been little work on improving the generation of relatively long summaries. In this paper, we concentrate on two prominent problems in long summary generation. First, although significant efforts have been made to help the encoder handle long sequences, the decoder still struggles with them owing to the limited storage capacity of RNNs. We propose a simple and effective approach called history reuse, which first mines critical information from the history summary sequence and then transmits that information to the decoder. Second, since encoder-decoder models are typically trained to produce exactly the same summary as the target, certain word-order deviations between the predicted and target summaries are punished excessively. Accordingly, we introduce a fully differentiable loss called the bag-of-words (BoW) loss, which exploits the fact that BoW representations discard word-order information and computes the difference between the two summaries in BoW space. Experiments on two benchmark datasets, CNN/Daily Mail and PubMed, demonstrate that our methods significantly improve over the baseline.

Citations: 1 (open access: bronze; data source: Crossref)
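The bag-of-words intuition, comparing two summaries with word order discarded, can be shown in a hard-count form. Note that the paper's BoW loss is differentiable over vocabulary distributions; this integer-count distance is only an analogue of the same idea:

```python
from collections import Counter

def bow_distance(predicted, target):
    """Order-insensitive distance between two token sequences.

    Counts per-word frequency mismatches, so reorderings of the same
    words cost nothing, unlike a position-wise comparison.
    """
    p, t = Counter(predicted), Counter(target)
    return sum(abs(p[w] - t[w]) for w in set(p) | set(t))
```

For example, a predicted summary that merely reorders the target's words has distance 0, whereas a position-wise loss would penalize every shifted token.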
Article · 2021 · IEEE/ACM Transactions on Audio, Speech, and Language Processing · Peer-reviewed
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Authors: Yusheng Su; Xu Han; Yankai Lin; Zhengyan Zhang; Zhiyuan Liu; Peng Li; Jie Zhou; Maosong Sun

Fine-tuning pre-trained language models (PLMs) has recently demonstrated its effectiveness on various downstream NLP tasks. However, in many scenarios with limited supervised data, conventional fine-tuning strategies cannot sufficiently capture the important semantic features for downstream tasks. To address this issue, we introduce a novel framework (named "CSS-LM") to improve the fine-tuning phase of PLMs via contrastive semi-supervised learning. Specifically, given a specific task, we retrieve positive and negative instances from large-scale unlabeled corpora according to their domain-level and class-level semantic relatedness to the task. We then perform contrastive semi-supervised learning on both the retrieved unlabeled instances and the original labeled instances to help PLMs capture crucial task-related semantic features. The experimental results show that CSS-LM achieves better results than the conventional fine-tuning strategy on a series of downstream tasks in few-shot settings by up to 7.8%, and outperforms the latest supervised contrastive fine-tuning strategy by up to 7.1%. Our datasets and source code will be available to provide more details.

Citations: 1 · Also available as a 2021 arXiv preprint (open access: Green/bronze; data sources: Crossref, arXiv.org e-Print Archive, Datacite)
Publication: Article, 2021. Publisher: Institute of Electrical and Electronics Engineers (IEEE).
Authors: Jipeng Qiang; Xinyu Lu; Yun Li; Yun-Hao Yuan; Xindong Wu.
Abstract: Lexical simplification, the process of replacing complex words in a given sentence with simpler alternatives of equivalent meaning, has attracted much attention in many languages. Although the richness of the Chinese vocabulary makes text very difficult to read for children and non-native speakers, there has been no research on the Chinese lexical simplification (CLS) task. To circumvent difficulties in acquiring annotations, we manually create the first benchmark dataset for CLS, which can be used to evaluate lexical simplification systems automatically. To enable a more thorough comparison, we present five different types of methods as baselines for generating substitute candidates for a complex word: a synonym-based approach, a word-embedding-based approach, a BERT-based approach, a sememe-based approach, and a hybrid approach. Finally, we design an experimental evaluation of these baselines and discuss their advantages and disadvantages. To the best of our knowledge, this is the first study of the CLS task.
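A BERT-based baseline of this kind typically masks the complex word and ranks candidate fillers by a language-model score. The sketch below mimics only the ranking step with a stubbed scorer: `toy_lm_score` and the frequency prior are stand-ins, not the paper's baseline, which would query an actual masked language model:

```python
def toy_lm_score(left, candidate, right, unigram_freq):
    # Stand-in for P(candidate | context): here only a frequency prior.
    # A real baseline would score the candidate in the masked slot with BERT.
    return unigram_freq.get(candidate, 0)

def rank_substitutes(sentence, complex_word, candidates, unigram_freq):
    """Rank substitute candidates for a complex word in a sentence."""
    left, _, right = sentence.partition(complex_word)
    return sorted(
        candidates,
        key=lambda c: toy_lm_score(left, c, right, unigram_freq),
        reverse=True,
    )

# Higher-frequency (simpler) words rank first under this toy scorer.
freq = {"buy": 120, "purchase": 30, "procure": 3}
ranking = rank_substitutes("I will procure a gift", "procure",
                           ["buy", "purchase"], freq)
```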
Source: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2021, peer-reviewed (License: IEEE Copyright; data source: Crossref). DOI: 10.1109/taslp.2021.3078361. Open access (bronze); 4 citations; top-10% influence.
Publication: Article, 2022. Publisher: Institute of Electrical and Electronics Engineers (IEEE).
Authors: Zhuosheng Zhang; Haojie Yu; Hai Zhao; Masao Utiyama.
Abstract: Recent pre-trained language models (PrLMs) offer an effective new way of producing contextualized word representations by modeling sequence-level context. Although PrLMs generally provide more effective contextualized word representations than non-contextualized models, they still rely on text context alone, without diverse hints from other modalities. This paper therefore proposes a visual representation method that explicitly enhances conventional word embeddings with multiple-aspect senses derived from visual guidance. In detail, we build a small-scale word-image dictionary from a multimodal seed dataset in which each word corresponds to diverse related images. Experiments on 12 natural language understanding and machine translation tasks further verify the effectiveness and generalization capability of the proposed approach. Analysis shows that our method with visual guidance pays more attention to content words, improves representation diversity, and is potentially beneficial for improving disambiguation accuracy.
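At its simplest, a word-image dictionary can enhance an embedding by pooling the image features associated with a word and mixing them into the text embedding. The mean pooling, the mixing weight `alpha`, and the function name below are illustrative assumptions, not the paper's actual fusion method:

```python
def visually_enhance(word_emb, image_embs, alpha=0.5):
    """Mix a word embedding with the mean of its related image embeddings.

    A minimal sketch of the word-image-dictionary idea: `alpha`, the mean
    pooling, and the linear mix are illustrative choices.
    """
    if not image_embs:  # word missing from the dictionary: keep text embedding
        return list(word_emb)
    dim = len(word_emb)
    mean_img = [sum(v[i] for v in image_embs) / len(image_embs)
                for i in range(dim)]
    return [(1 - alpha) * w + alpha * m for w, m in zip(word_emb, mean_img)]

# Two related images pull the word vector toward their shared visual sense.
enhanced = visually_enhance([1.0, 0.0], [[0.0, 1.0], [0.0, 3.0]])
```

Averaging over several diverse images is one simple way to capture the "multiple-aspect senses" the abstract mentions, since each image contributes one visual aspect of the word.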
Source: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2022, peer-reviewed (License: IEEE Copyright; data source: Crossref). DOI: 10.1109/taslp.2021.3130972. 0 citations.
Publication: Article, 2022. Publisher: Institute of Electrical and Electronics Engineers (IEEE).
Authors: Ziyao Lu; Li Xiang; Yang Liu; Chulun Zhou; Jianwei Cui; Bin Wang; Min Zhang; Jinsong Su.
Abstract: Existing studies of multi-source neural machine translation (NMT) either model different source sentences separately or fall back on conventional single-source NMT by simply concatenating all source sentences. These approaches have two drawbacks. First, they ignore explicit word-level semantic interactions between source sentences, which have been shown to be effective for embedding multilingual texts. Second, multiple source sentences are encoded simultaneously by one NMT model, which cannot fully exploit the semantic information of each source sentence. In this paper, we explore multi-stage information interactions for multi-source NMT. Specifically, we first propose a multi-source NMT model that performs information interactions at the encoding stage. Its encoder contains multiple semantic interaction layers, each of which sequentially consists of (1) a monolingual semantic interaction sub-layer, based on the self-attention mechanism and used to learn word-level monolingual contextual representations of source sentences, and (2) a cross-lingual semantic interaction sub-layer, which leverages word alignments to perform fine-grained semantic transitions among the hidden states of different source sentences. Furthermore, at the training stage, we introduce a mutual-distillation-based training framework in which single-source models and ours exchange information. This framework fully exploits the semantic information of each source sentence to enhance our model. Extensive experimental results on the WMT14 English-German-French dataset show that our model achieves significant improvements over competitive baselines.
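The mutual distillation between single-source models and the multi-source model can be sketched as a symmetric KL term between their output distributions. The symmetric form and the averaging over single-source models are illustrative choices, not necessarily the paper's exact training objective:

```python
import math

def kl(p, q):
    """KL divergence between two discrete distributions (lists of probs)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def mutual_distillation_loss(multi_dist, single_dists):
    """Symmetric KL between the multi-source model's next-token distribution
    and each single-source model's, averaged over the single-source models.
    Shapes and the symmetric form are illustrative assumptions."""
    loss = 0.0
    for s in single_dists:
        loss += kl(multi_dist, s) + kl(s, multi_dist)
    return loss / len(single_dists)

# Identical distributions incur no loss; disagreeing ones are penalized.
same = mutual_distillation_loss([0.5, 0.5], [[0.5, 0.5]])
diff = mutual_distillation_loss([0.9, 0.1], [[0.1, 0.9]])
```

During training, minimizing such a term pushes the multi-source model and each single-source model toward each other's predictions, which is how the information interaction at the training stage operates.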
Source: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2022, peer-reviewed (License: IEEE Copyright; data source: Crossref). DOI: 10.1109/taslp.2021.3120592. 1 citation.
Publication: Article, 2021. Publisher: Institute of Electrical and Electronics Engineers (IEEE).
Authors: Guanlin Li; Lemao Liu; Conghui Zhu; Rui Wang; Tiejun Zhao; Shuming Shi.
Abstract: In machine translation evaluation, traditional wisdom measures a model's generalization ability in an average sense, for example with corpus BLEU. However, corpus-BLEU statistics cannot provide a comprehensive understanding or fine-grained analysis of a model's generalization ability. As a remedy, this paper attempts to understand NMT at a fine-grained level by detecting contextual barriers within an unseen input sentence that cause degradation in the model's translation quality. It proposes a principled definition of source contextual barriers, as well as a modified version that is computationally tractable and operates at the word level. Based on the modified definition, three simple methods are proposed for barrier detection by search-aware risk estimation through counterfactual generation. Extensive analyses are conducted on the detected contextual barrier words on both Zh⇔En NIST benchmarks. Potential uses motivated by barrier words are also discussed.
Source: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2021, peer-reviewed (License: IEEE Copyright; data source: Crossref). DOI: 10.1109/taslp.2021.3085119. 0 citations.
Publication: Article, 2021. Publisher: Institute of Electrical and Electronics Engineers (IEEE).
Authors: Xixin Wu; Yuewen Cao; Hui Lu; Songxiang Liu; Shiyin Kang; Zhiyong Wu; Xunying Liu; Helen Meng.
Abstract: Expressive text-to-speech (E-TTS) synthesis is important for enhancing user experience in communication with machines through the speech modality. However, one of the challenges in E-TTS is the lack of a precise description of emotions. Previous categorical specifications may be insufficient for describing complex emotions, while dimensional specifications face the difficulty of ambiguous annotation. This work advocates a new approach of describing emotive speech acoustics using spoken exemplars. We investigate methods to extract emotion descriptions from an input exemplar of emotive speech. These measures are combined to form two descriptors, based on a capsule network (CapNet) and a residual error network (RENet), respectively. The first is designed to consider the spatial information in the input exemplary spectrogram, and the latter to capture the contrastive information between emotive acoustic expressions. Two different approaches are applied to convert the variable-length feature sequence into a fixed-size description vector: (1) dynamic routing groups similar capsules into the output description; and (2) a recurrent neural network's hidden states store the temporal information for the description. The two descriptors are integrated into a state-of-the-art sequence-to-sequence architecture to obtain an end-to-end system that is optimized as a whole toward the same goal of generating correct emotive speech. Experimental results on a public audiobook dataset demonstrate that the two exemplar-based approaches achieve significant performance improvements over the baseline system in both emotion similarity and speech quality.
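Dynamic routing, the first of the two conversion approaches, can be sketched in a few lines: each iteration re-weights the input frames by their agreement with the running consensus vector, so frames that agree end up dominating the fixed-size output. This is a heavily simplified stand-in for the paper's CapNet descriptor, not its implementation:

```python
import math

def routing_pool(frames, iters=3):
    """Collapse a variable-length feature sequence into one fixed-size vector
    by routing-by-agreement: frames aligned with the running consensus get
    larger weights each iteration. A simplified sketch only."""
    logits = [0.0] * len(frames)
    dim = len(frames[0])
    out, w = None, None
    for _ in range(iters):
        # Softmax over routing logits gives the current frame weights.
        exps = [math.exp(l) for l in logits]
        z = sum(exps)
        w = [e / z for e in exps]
        # Weighted mean of the frames is the current consensus vector.
        out = [sum(w[i] * frames[i][d] for i in range(len(frames)))
               for d in range(dim)]
        # Agreement update: frames aligned with `out` get boosted.
        logits = [l + sum(f[d] * out[d] for d in range(dim))
                  for l, f in zip(logits, frames)]
    return out, w

# Two agreeing frames plus one outlier: the outlier's weight shrinks.
frames = [[1.0, 0.0], [1.0, 0.1], [-1.0, 0.0]]
pooled, weights = routing_pool(frames)
```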
Source: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2021, peer-reviewed (License: IEEE Copyright; data source: Crossref). DOI: 10.1109/taslp.2021.3052688. 3 citations; top-10% influence.
Publication: Article, 2021. Publisher: Institute of Electrical and Electronics Engineers (IEEE).
Authors: Yi Zhou; Xiaoqing Zheng; Xuanjing Huang.
Abstract: Recently, many efforts have been devoted to generating responses that express a specific emotion or relate to a given topic in a controlled manner. However, limited attention has been given to generating responses with a specified syntactic pattern, which would make it possible to imitate someone's way of speaking in dialogue. To this end, we propose two models for generating syntax-aware responses: a gross-constraint model and a specific-constraint model. The former controls the syntactic patterns of generated responses at the sentence level, while the latter works at smaller language units, such as words or phrases, and can manipulate the syntactic structures of responses in a more subtle manner. Extensive experimental results on two different datasets show that both models not only generate meaningful responses with a specific and coherent structure but also improve the diversity of generated responses, with similar gains in readability, relevance, and diversity as measured by human judges.
Source: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2021, peer-reviewed (License: IEEE Copyright; data source: Crossref). DOI: 10.1109/taslp.2021.3110124. 0 citations.
Publication: Article, 2021. Publisher: Institute of Electrical and Electronics Engineers (IEEE).
Authors: Jiacheng Zhang; Huanbo Luan; Maosong Sun; Feifei Zhai; Jingfang Xu; Yang Liu.
Abstract: While neural machine translation has achieved state-of-the-art translation performance, it is unable to capture the alignment between input and output during the translation process. The lack of alignment in neural machine translation models leads to three problems: it is hard to (1) interpret the translation process, (2) impose lexical constraints, and (3) impose structural constraints. These problems not only increase the difficulty of designing new architectures for neural machine translation but also limit its practical applications. To alleviate these problems, we propose to introduce explicit phrase alignment into the translation process of arbitrary neural machine translation models. The key idea is to build a search space similar to that of phrase-based statistical machine translation, in which phrase alignment is readily available. We design a new decoding algorithm that can easily impose lexical and structural constraints. Experiments show that our approach makes the translation process of neural machine translation more interpretable without sacrificing translation quality. In addition, our approach achieves significant improvements on lexically and structurally constrained translation tasks.
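To see why an explicit search space makes lexical constraints easy to impose, consider a toy greedy decoder that guarantees a required token is emitted before the hypothesis runs out of steps. This deliberately naive scheme is only meant to illustrate constrained decoding in general, not the paper's phrase-alignment algorithm:

```python
def greedy_decode_with_constraint(step_scores, constraint):
    """Greedy decoding with a lexical constraint: pick the best-scoring token
    at each step, but once only just enough steps remain to emit the unmet
    constraint tokens, emit them instead. A toy sketch only."""
    out = []
    pending = list(constraint)  # constraint tokens not yet produced
    for t, scores in enumerate(step_scores):
        steps_left = len(step_scores) - t
        if pending and steps_left <= len(pending):
            out.append(pending.pop(0))  # force the constraint token
            continue
        best = max(scores, key=scores.get)
        out.append(best)
        if pending and best == pending[0]:
            pending.pop(0)  # constraint satisfied naturally
    return out

# Four decoding steps; the model on its own would never emit "Berlin".
steps = [{"the": 0.9, "Berlin": 0.1},
         {"cat": 0.8, "Berlin": 0.2},
         {"sat": 0.7, "Berlin": 0.3},
         {"down": 0.6, "Berlin": 0.4}]
hyp = greedy_decode_with_constraint(steps, ["Berlin"])
```

With phrase alignment available in the search space, such constraints can be applied per aligned phrase rather than per token, which is what makes structural constraints tractable as well.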
Source: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2021, peer-reviewed (License: IEEE Copyright; data source: Crossref). Also available as an arXiv preprint (2019). DOI: 10.1109/taslp.2021.3057831. Open access (green/bronze); 3 citations.
Publication: Article, 2021. Publisher: Institute of Electrical and Electronics Engineers (IEEE).
Authors: Licheng Zhang; Zhendong Mao; Benfeng Xu; Quan Wang; Yongdong Zhang.
Abstract: With the notable success of pretrained language models, the pretraining/fine-tuning paradigm has become the dominant solution for natural language understanding (NLU) tasks. Typically, the training instances of a target NLU task are introduced in a completely random order and treated equally at the fine-tuning stage. However, these instances can vary greatly in difficulty, and, much as in human learning, language models can benefit from an easy-to-difficult curriculum. Based on this idea, we propose a curriculum learning (CL) framework. Our framework consists of two stages, Review and Arrange, targeting the two main challenges in curriculum learning: how to define the difficulty of instances, and how to arrange a curriculum based on that difficulty. In the first stage, we devise a cross-review (CR) method that first trains several teacher models and then reviews the training set in a crossed manner to distinguish easy instances from difficult ones. In the second stage, two sampling algorithms, a coarse-grained arrangement (CGA) and a fine-grained arrangement (FGA), are proposed to arrange a curriculum in which the learning materials start from the easiest instances and more difficult instances are gradually added to the training procedure. Compared to previous heuristic CL methods, our framework avoids the errors caused by the difficulty gap between humans and machines and has strong generalization ability.
We conduct comprehensive experiments, and the results show that our curriculum learning framework, without any manual model architecture design or use of external data, obtains significant and universal performance improvements on a wide range of NLU tasks in different languages.
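The Review and Arrange stages can be sketched end to end: cross-review difficulty as the fraction of teacher models that get an instance wrong, then a coarse-grained arrangement into easy-to-hard buckets that training would consume incrementally. Function names, the stub teachers, and the bucketing details below are illustrative assumptions, not the paper's implementation:

```python
def cross_review_difficulty(instance, teachers):
    """Difficulty = fraction of teacher models that misclassify the instance."""
    wrong = sum(1 for t in teachers if t(instance["x"]) != instance["y"])
    return wrong / len(teachers)

def coarse_grained_arrangement(instances, teachers, n_buckets=2):
    """Sort instances by cross-review difficulty, then split into easy-to-hard
    buckets; training would start on bucket 0 and gradually add the rest."""
    ranked = sorted(instances,
                    key=lambda ins: cross_review_difficulty(ins, teachers))
    k, r = divmod(len(ranked), n_buckets)
    buckets, start = [], 0
    for b in range(n_buckets):
        size = k + (1 if b < r else 0)
        buckets.append(ranked[start:start + size])
        start += size
    return buckets

# Two stub "teachers" that both fail on the negative-valued instances.
teachers = [lambda x: x >= 0, lambda x: x > -1]
data = [{"x": 5, "y": True}, {"x": -3, "y": True},
        {"x": 2, "y": True}, {"x": -9, "y": True}]
easy, hard = coarse_grained_arrangement(data, teachers)
```

In the actual cross-review setup, each teacher reviews instances it was not trained on, so difficulty reflects held-out rather than memorized performance.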
Source: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2021, peer-reviewed (License: IEEE Copyright; data source: Crossref). DOI: 10.1109/taslp.2021.3121986. 1 citation.
Publication: Article, 2021. Publisher: Institute of Electrical and Electronics Engineers (IEEE).
Authors: Qing Liu; Lei Chen; Yuan Yuan; Huarui Wu.
Abstract: Recurrent neural network (RNN) based abstractive text summarization models have made great progress over the past few years, largely driven by the encoder-decoder architecture. However, there has been little work on improving the generation of relatively long summaries. In this paper, we concentrate on two prominent problems in long summary generation. First, although significant efforts have been made to help the encoder handle long sequences, the decoder still struggles with them owing to the limited storage capacity of RNNs. We propose a simple and effective approach called history reuse, which first mines critical information from the history summary sequence and then transmits that information to the decoder. Second, since encoder-decoder models are typically trained to produce exactly the same summary as the target, certain word-order deviations between the predicted and target summaries are excessively punished. Accordingly, we introduce a fully differentiable loss called the bag-of-words (BoW) loss, which takes advantage of the fact that BoW representations discard word-order information and computes the difference between the two summaries in BoW space. Experiments on two benchmark datasets, CNN/Daily Mail and PubMed, demonstrate that our methods significantly improve the baseline.
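The BoW loss is concrete enough to sketch: summing the per-step predicted token distributions gives an order-free expected count vector, which is compared with the target summary's count vector. The L1 distance used below is an illustrative choice; the paper's exact formulation may differ:

```python
def bow_loss(pred_probs, target_tokens, vocab):
    """Bag-of-words loss sketch: sum the per-step predicted distributions into
    an expected count vector and compare it with the target's counts.
    Differentiable in a tensor framework, since it only sums probabilities."""
    expected = {w: 0.0 for w in vocab}
    for dist in pred_probs:               # one distribution per output step
        for w, p in dist.items():
            expected[w] += p
    target = {w: 0.0 for w in vocab}
    for w in target_tokens:
        target[w] += 1.0
    # L1 distance between the two bag-of-words vectors (illustrative).
    return sum(abs(expected[w] - target[w]) for w in vocab)

vocab = ["cat", "sat", "mat"]
# Word order is reversed relative to the target, yet the BoW loss is zero:
# exactly the leniency toward order deviations the abstract describes.
pred = [{"sat": 1.0, "cat": 0.0, "mat": 0.0},
        {"cat": 1.0, "sat": 0.0, "mat": 0.0}]
same = bow_loss(pred, ["cat", "sat"], vocab)
worse = bow_loss(pred, ["mat", "sat"], vocab)
```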
Source: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2021, peer-reviewed (data source: Crossref). DOI: 10.1109/taslp.2021.3100281. Open access (bronze); 1 citation.
Article . 2021 . Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Authors: Yusheng Su; Xu Han; Yankai Lin; Zhengyan Zhang; Zhiyuan Liu; Peng Li; Jie Zhou; Maosong Sun
Fine-tuning pre-trained language models (PLMs) has recently demonstrated its effectiveness on various downstream NLP tasks. However, in many scenarios with limited supervised data, conventional fine-tuning strategies cannot sufficiently capture the semantic features that matter for downstream tasks. To address this issue, we introduce a novel framework (named “CSS-LM”) that improves the fine-tuning phase of PLMs via contrastive semi-supervised learning. Specifically, given a specific task, we retrieve positive and negative instances from large-scale unlabeled corpora according to their domain-level and class-level semantic relatedness to the task. We then perform contrastive semi-supervised learning on both the retrieved unlabeled instances and the original labeled instances to help PLMs capture crucial task-related semantic features. The experimental results show that CSS-LM outperforms the conventional fine-tuning strategy on a series of downstream tasks in few-shot settings by up to 7.8%, and outperforms the latest supervised contrastive fine-tuning strategy by up to 7.1%. Our datasets and source code will be available to provide more details.
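CSS-LM's exact retrieval and training objective are not reproduced here, but the kind of contrastive objective the abstract describes, pulling retrieved positives toward an anchor representation and pushing negatives away, can be sketched with a generic InfoNCE-style loss. The vectors, temperature value, and function names below are toy assumptions, not the paper's implementation:

```python
import math

def cosine(u, v):
    """Cosine similarity between two plain-list vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def contrastive_loss(anchor, positives, negatives, tau=0.1):
    """InfoNCE-style contrastive loss sketch: average negative
    log-likelihood of picking a positive instance over the pool of
    retrieved positives and negatives."""
    pos = [math.exp(cosine(anchor, p) / tau) for p in positives]
    neg = [math.exp(cosine(anchor, n) / tau) for n in negatives]
    denom = sum(pos) + sum(neg)
    return -sum(math.log(p / denom) for p in pos) / len(pos)

anchor = [1.0, 0.0]
# Loss is small when retrieved positives resemble the anchor...
good = contrastive_loss(anchor, [[0.9, 0.1]], [[-1.0, 0.0]])
# ...and large when the roles are swapped.
bad = contrastive_loss(anchor, [[-1.0, 0.0]], [[0.9, 0.1]])
```

Minimizing such a loss drives task-related instances to cluster in representation space, which is the mechanism the framework relies on to compensate for scarce labels.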
Available from: arXiv.org e-Print Archive (Other literature type . Preprint . 2021) . IEEE/ACM Transactions on Audio, Speech, and Language Processing (Article . 2021 . Peer-reviewed . License: IEEE Copyright . Data sources: Crossref) . https://doi.org/10.48550/arxiv... (Article . 2021 . License: arXiv Non-Exclusive Distribution . Data sources: Datacite)
Article . 2021 . Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Authors: Jipeng Qiang; Xinyu Lu; Yun Li; Yun-Hao Yuan; Xindong Wu
Lexical simplification, the process of replacing complex words in a given sentence with simpler alternatives of equivalent meaning, has attracted much attention in many languages. Although the richness of the Chinese vocabulary makes text very difficult to read for children and non-native speakers, there has been no research on the Chinese lexical simplification (CLS) task. To circumvent difficulties in acquiring annotations, we manually create the first benchmark dataset for CLS, which can be used to evaluate lexical simplification systems automatically. To enable a more thorough comparison, we present five types of methods as baselines for generating substitute candidates for a complex word: a synonym-based approach, a word-embedding-based approach, a BERT-based approach, a sememe-based approach, and a hybrid approach. Finally, we design an experimental evaluation of these baselines and discuss their advantages and disadvantages. To the best of our knowledge, this is the first study of the CLS task.
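As a toy illustration of the word-embedding-based baseline family mentioned above (candidates ranked by embedding similarity, then preferred by frequency as a proxy for simplicity), here is a self-contained sketch. The words, vectors, and frequency counts are invented for the example and are not from the paper's dataset:

```python
import math

def cosine(u, v):
    """Cosine similarity between two plain-list vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def substitute_candidates(word, embeddings, freq, top_k=3):
    """Embedding-based baseline sketch: rank vocabulary words by
    similarity to the complex word, then prefer the more frequent
    (hence presumably simpler) alternatives."""
    scored = [(w, cosine(embeddings[word], vec))
              for w, vec in embeddings.items() if w != word]
    scored.sort(key=lambda x: -x[1])
    cands = [w for w, _ in scored[:top_k]]
    # Re-rank by corpus frequency: a common simplicity proxy in
    # lexical simplification.
    cands.sort(key=lambda w: -freq.get(w, 0))
    return cands

# Invented toy data (English for readability; CLS would use Chinese).
emb = {"perplexed": [0.9, 0.1], "confused": [0.85, 0.15],
       "puzzled": [0.8, 0.2], "banana": [0.0, 1.0]}
freq = {"confused": 500, "puzzled": 300, "banana": 1000}
```

An unrelated but very frequent word ("banana") is filtered out by the similarity step, which is why the similarity ranking must precede the frequency re-ranking.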
Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing . Article . 2021 . Peer-reviewed . License: IEEE Copyright . Data sources: Crossref
For further information contact us at helpdesk@openaire.eu