publication . Report . Other literature type . Conference object . Article . 2017

Consistent Translation of Repeated Nouns using Syntactic and Semantic Cues

Xiao Pu; Laura Mascarell; Andrei Popescu-Belis
Open Access
  • Published: 01 Jan 2017
  • Publisher: Idiap
  • Country: Switzerland
We propose a method to decide whether two occurrences of the same noun in a source text should be translated consistently, i.e., using the same noun in the target text. We train and test classifiers that predict consistent translations based on lexical, syntactic, and semantic features. We first evaluate our classifiers intrinsically, in terms of the accuracy of their consistency predictions, over a subset of the UN Corpus. We then evaluate them in combination with phrase-based statistical MT systems for Chinese-to-English and German-to-English. We compare the automatic post-editing of noun translations with the re-ranking of the translati...
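The first two steps of the abstract, pairing repeated source nouns and scoring consistency predictions by accuracy, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the `NounOccurrence` record, the two features in `pair_features`, and the choice to pair each occurrence with the next one of the same noun are all assumptions made here for the example. The paper's actual lexical, syntactic, and semantic features are not listed in this record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NounOccurrence:
    """Hypothetical record for one noun occurrence in the source text."""
    lemma: str        # source-side noun
    sentence: int     # index of the sentence containing it
    translation: str  # how a baseline MT system translated it


def repeated_noun_pairs(occurrences):
    """Pair each occurrence of a noun with the next occurrence of the same noun."""
    by_lemma = {}
    for occ in occurrences:
        by_lemma.setdefault(occ.lemma, []).append(occ)
    pairs = []
    for occs in by_lemma.values():
        ordered = sorted(occs, key=lambda o: o.sentence)
        pairs.extend(zip(ordered, ordered[1:]))
    return pairs


def pair_features(first, second):
    """Two placeholder features (assumed for illustration, not from the paper)."""
    return {
        "sentence_distance": second.sentence - first.sentence,
        "baseline_agrees": first.translation == second.translation,
    }


def accuracy(predicted, gold):
    """Share of pairs whose consistent/inconsistent label was predicted correctly."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


if __name__ == "__main__":
    occurrences = [
        NounOccurrence("policy", 0, "policy"),
        NounOccurrence("policy", 2, "politics"),
        NounOccurrence("report", 1, "report"),
        NounOccurrence("policy", 5, "policy"),
    ]
    for first, second in repeated_noun_pairs(occurrences):
        print(first.lemma, pair_features(first, second))
```

In a fuller pipeline, the feature dictionaries would feed a trained classifier, and its predictions would decide whether to post-edit or re-rank the second translation. Those steps are left out here because the record does not describe them in enough detail.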
free text keywords: Institute of Computational Linguistics, 000 Computer science, knowledge & systems, 410 Linguistics, Source text, Computer science, Classifier (linguistics), Target text, Artificial intelligence, Oracle, Syntax, Phrase, Natural language processing, Noun
Funded by
Scalable Understanding of Multilingual Media
  • Funder: European Commission (EC)
  • Project Code: 688139
  • Funding stream: H2020 | RIA
SNSF| MODERN: Modeling discourse entities and relations for coherent machine translation
  • Funder: Swiss National Science Foundation (SNSF)
  • Project Code: CRSII2_147653
  • Funding stream: Programmes | Sinergia