Publication · Conference paper · 2020

Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation

Biao Zhang, Philip Williams, Ivan Titov, Rico Sennrich
Open Access
English
Abstract
Massively multilingual models for neural machine translation (NMT) are theoretically attractive, but often underperform bilingual models and deliver poor zero-shot translations. In this paper, we explore ways to improve them. We argue that multilingual NMT requires stronger modeling capacity to support language pairs with varying typological characteristics, and overcome this bottleneck via language-specific components and deepening NMT architectures. We identify the off-target translation issue (i.e. translating into a wrong target language) as the major source of the inferior zero-shot performance, and propose random online backtranslation to enforce the translation of unseen training language pairs. Experiments on OPUS-100 (a novel multilingual dataset with 100 languages) show that our approach substantially narrows the performance gap with bilingual models in both one-to-many and many-to-many settings, and improves zero-shot performance by ~10 BLEU, approaching conventional pivot-based methods.
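The random online backtranslation idea from the abstract can be sketched in a few lines: for each training example, sample a random intermediate language and use the current model to back-translate the target sentence into it, creating a pseudo-parallel example for a language pair that may never occur in the training data. The sketch below is a toy illustration only; `toy_translate`, `robt_batch`, and the language inventory are hypothetical stand-ins, not the paper's actual implementation.

```python
import random

# Hypothetical language inventory; the paper's OPUS-100 setting has 100 languages.
LANGS = ["de", "fr", "zh", "en"]

def toy_translate(sentence, tgt_lang):
    # Stand-in for greedy decoding with the NMT model currently being
    # trained ("online"); here we just tag the sentence so the data
    # flow is visible.
    return f"<{tgt_lang}> {sentence}"

def robt_batch(batch, rng=random):
    """For each (src_lang, tgt_lang, src, tgt) example, sample a random
    intermediate language and back-translate the target side into it,
    yielding an extra pseudo-parallel example for the (possibly unseen)
    pair intermediate -> tgt_lang."""
    augmented = list(batch)
    for src_lang, tgt_lang, src, tgt in batch:
        inter = rng.choice([l for l in LANGS if l != tgt_lang])
        pseudo_src = toy_translate(tgt, inter)
        augmented.append((inter, tgt_lang, pseudo_src, tgt))
    return augmented

batch = [("en", "de", "hello", "hallo")]
print(robt_batch(batch))
```

Because the pseudo source is produced by the model itself during training, every zero-shot direction gets supervised signal toward the correct target language, which is what counters the off-target translation issue.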
Comment: ACL 2020
Subjects

Computer Science - Computation and Language, Computation and Language (cs.CL), FOS: Computer and information sciences, 000 Computer science, knowledge & systems, 410 Linguistics, 10105 Institute of Computational Linguistics


Funded by
EC | ELITR (European Live Translator)
  • Funder: European Commission (EC)
  • Project Code: 825460
  • Funding stream: H2020 | RIA
  • Validated by funder
EC | GoURMET (Global Under-Resourced MEdia Translation)
  • Funder: European Commission (EC)
  • Project Code: 825299
  • Funding stream: H2020 | RIA
  • Validated by funder
SNSF | Multi-Task Learning with Multilingual Resources for Better Natural Language Understanding
  • Funder: Swiss National Science Foundation (SNSF)
  • Project Code: PP00P1_176727
  • Funding stream: Careers | SNSF Professorships
Related to Research communities
Digital Humanities and Cultural Heritage