- Sapienza University of Rome Italy
- Federal University of Technology Owerri Nigeria
- University of Hannover Germany
- Technische Universität München Germany
- Jomo Kenyatta University of Agriculture and Technology Kenya
- Naver Labs Europe
- Federal University of Technology – Paraná Brazil
- InstaDeep Ltd
- Lancaster University United Kingdom
- Pompeu Fabra University Spain
- University of Porto Portugal
- University of Johannesburg South Africa
- Max Planck Institute for Informatics Germany
- Federal University of Technology Nigeria
- Technical University of Munich Germany
- University of Pretoria South Africa
- Stellenbosch University South Africa
- University of Pretoria
- University of Roma La Sapienza
- Council of Scientific and Industrial Research India
- University of Saarland
- Sapienza University of Rome
- Namibia University of Science and Technology Namibia
- African Institute for Mathematical Sciences South Africa
- Lancaster University (Security Lancaster Research Centre) United Kingdom
- Lancaster University
- University of Electronic Science and Technology of China China (People's Republic of)
- Stellenbosch University
- National University of Science and Technology Russian Federation
- UNIVERSITY OF PORTO
- University of California, Berkeley United States
- Jomo Kenyatta University of Agriculture and Technology
- Georgia Institute of Technology United States
- universidade Porto
- Naver (South Korea) Korea (Republic of)
- Federal University of Technology Minna
- Universidade Lusófona do Porto Portugal
- TECHNICAL UNIVERSITY OF MUNICH
- Carnegie Mellon University United States
- Google (United States) United States
- Translators Without Borders
- Technische Universität München
- SIL International United States
- Florida State University United States
- Technische Universität München
- Jacobs University Germany
- Technische Universität München Germany
- National University of Science and Technology Zimbabwe
- Max Planck Society Germany
- University of Waterloo Canada
- Federal University of Technology Minna Nigeria
- Technische Universität München (TUM)
- UNIVERSITA DEGLI STUDI DI ROMA LA SAPIENZA Italy
- Federal University of Technology
- Technical University of Munich (TUM)
- Bayero University
Research in NLP lacks geographic diversity, and the question of how NLP can be scaled to low-resourced languages has not yet been adequately solved. "Low-resourced"-ness is a complex problem that goes beyond data availability and reflects systemic problems in society. In this paper, we focus on the task of Machine Translation (MT), which plays a crucial role in information accessibility and communication worldwide. Despite immense improvements in MT over the past decade, MT remains centered around a few high-resourced languages. Because MT researchers cannot solve the problem of low-resourcedness alone, we propose participatory research as a means to involve all agents required in the MT development process. We demonstrate the feasibility and scalability of participatory research with a case study on MT for African languages. Its implementation leads to a collection of novel translation datasets and MT benchmarks for over 30 languages, with human evaluations for a third of them, and enables participants without formal training to make a unique scientific contribution. Benchmarks, models, data, code, and evaluation results are released at https://github.com/masakhane-io/masakhane-mt.
Comment: Findings of EMNLP 2020; updated benchmarks