The Unreasonable Volatility of Neural Machine Translation Models
Recent works have shown that Neural Machine Translation (NMT) models achieve impressive performance; however, questions about the behavior of these models remain largely unanswered. We investigate the unexpected volatility of NMT models where the input is semantically and syntactically correct. We discover that with trivial modifications of source sentences, we can identify cases where unexpected changes happen in the translation, in the worst case leading to mistranslations. This volatile behavior of translating extremely similar sentences in surprisingly different ways highlights the underlying generalization problem of current NMT models. We find that RNN and Transformer models display volatile behavior in 26% and 19% of sentence variations, respectively.
Comment: Accepted to Neural Generation and Translation Workshop (WNGT) at ACL 2020
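The setup described in the abstract can be illustrated with a small sketch: generate minimally different source sentences by varying one slot (e.g. a number), translate each, and flag a variant set as volatile when the translations differ in anything other than the substituted value. The function names (`make_variants`, `is_volatile`) and the slot-masking heuristic here are illustrative assumptions, not the paper's actual protocol or metric.

```python
def make_variants(template, slot_values):
    """Generate minimally different source sentences by filling one slot."""
    return [template.format(x=v) for v in slot_values]

def is_volatile(translations, slot_values):
    """Heuristic volatility check: mask the substituted value in each
    translation; if the masked translations are not all identical, the
    model translated near-identical inputs in different ways."""
    masked = [t.replace(str(v), "<slot>") for t, v in zip(translations, slot_values)]
    return len(set(masked)) > 1

# Example: three sources differing only in the number.
variants = make_variants("I bought {x} apples.", [2, 3, 4])

# Stable behavior: translations differ only in the number.
stable = ["Ich kaufte 2 Äpfel.", "Ich kaufte 3 Äpfel.", "Ich kaufte 4 Äpfel."]

# Volatile behavior: one variant flips to a different construction.
volatile = ["Ich kaufte 2 Äpfel.", "Ich habe 3 Äpfel gekauft.", "Ich kaufte 4 Äpfel."]
```

In a real experiment, `translations` would come from running an NMT model over `variants`; the counts reported in the abstract (26% and 19%) are the fraction of variant sets flagged this way for the RNN and Transformer models, respectively.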
- Universiteit van Amsterdam
- University of Tehran Iran (Islamic Republic of)
- Universiteit van Amsterdam
- Universiteit van Amsterdam
- Universiteit van Amsterdam
Microsoft Academic Graph classification: Machine translation, Computer science, Artificial intelligence, Sentence, Natural language processing, Transformer (machine learning model)
FOS: Computer and information sciences, Computer Science - Computation and Language, Computation and Language (cs.CL)