Advanced search in Research products
The following results are related to Digital Humanities and Cultural Heritage. Are you interested to view more results? Visit OpenAIRE - Explore.
67 Research products, page 1 of 7

  • Digital Humanities and Cultural Heritage
  • Publications
  • Research data
  • Research software
  • European Commission
  • OpenAIRE
  • Scientometrics

  • Open Access Spanish; Castilian
    Authors: 
    Margarita Serna Vallejo;
    Publisher: Ediciones Universidad de Valladolid
    Country: Spain
    Project: EC | RESISTANCE (778076)

    ABSTRACT: Since the end of the Late Middle Ages and throughout the Modern Era, some of the fishermen's associations established in the corregimiento of the Four Villas of the Coast managed to get the Monarchy to recognize the privilege of enjoying a maritime jurisdiction in each brotherhood. The establishment of these jurisdictions displeased other institutions that saw their jurisdictional powers diminished. From this situation arose different conflicts in which the brotherhoods had to fight for the preservation of the maritime jurisdiction. This work was carried out within the framework of the research project Culturas urbanas en la España Moderna: policía, gobernanza e imaginarios (siglos XVI-XIX), reference HAR2015-64014-C3-1-R, funded by the Ministerio de Economía y Competitividad, and of the European project Rebellion and Resistance in the Iberian Empires, 16th-19th Centuries, which has received funding from the European Union's Horizon 2020 research and innovation programme under Marie Skłodowska-Curie grant agreement No 778076.

  • Open Access English
    Authors: 
    Matthews, Roger; Matthews, Wendy; Rasheed Raheem, Kamal; Richardson, Amy;
    Publisher: Zenodo
    Project: EC | MENTICA (787264)

    The Eastern Fertile Crescent region of western Iran and eastern Iraq hosted major developments in the transition from hunting and gathering to more sedentary agricultural lifestyles through the Early Neolithic period, 10,000-7000 BC. Within the scope of the Central Zagros Archaeological Project, excavations have been conducted at two Early Neolithic sites in the Kurdistan region of Iraq: Bestansur and Shimshara, as well as survey in the region of the Epipalaeolithic site of Zarzi since 2012. Bestansur represents an early stage in the transition to sedentary, agricultural life, where the inhabitants pursued a biodiverse strategy of hunting, gathering, herding and cultivating, maximising the new opportunities afforded by the warmer climate of the Early Holocene. They also constructed a substantial settlement of mudbrick, including a major building with a minimum of 78 human individuals buried under its floor in association with hundreds of beads. These buildings and human remains provide new insights into social relations, mortuary practices, demography, diet, health and disease during the early stages of sedentarisation. The material culture of Bestansur and Shimshara is rich in imported items such as obsidian, carnelian and sea-shells, indicating the extent to which Early Neolithic communities were networked across the Eastern Fertile Crescent and beyond along routes that later became the Silk Roads. This volume includes final reports by a large-scale interdisciplinary team on a wealth of new data from excavations at Bestansur and Shimshara, through application of state-of-the-art scientific techniques, integrated ecological and social approaches and sustainability studies. The net result is to re-emphasise the enormous significance of the Eastern Fertile Crescent in one of the most important episodes in human history: the Neolithic transition.

  • Publication . Article . Conference object . Preprint . 2016 . Embargo End Date: 01 Jan 2016
    Open Access
    Authors: 
    Guntis Barzdins; Didzis Gosko;
    Publisher: arXiv
    Project: EC | SUMMA (688139)

    Two extensions to the AMR smatch scoring script are presented. The first extension combines the smatch scoring script with the C6.0 rule-based classifier to produce a human-readable report on the error patterns frequency observed in the scored AMR graphs. This first extension results in 4% gain over the state-of-art CAMR baseline parser by adding to it a manually crafted wrapper fixing the identified CAMR parser errors. The second extension combines a per-sentence smatch with an ensemble method for selecting the best AMR graph among the set of AMR graphs for the same sentence. This second modification automatically yields further 0.4% gain when applied to outputs of two nondeterministic AMR parsers: a CAMR+wrapper parser and a novel character-level neural translation AMR parser. For AMR parsing task the character-level neural translation attains surprising 7% gain over the carefully optimized word-level neural translation. Overall, we achieve smatch F1=62% on the SemEval-2016 official scoring set and F1=67% on the LDC2015E86 test set. Comment: NAACL HLT 2016, SemEval-2016 Task 8 submission

  • Open Access
    Authors: 
    Najafabadipour, Marjan; Zanin, Massimiliano; Rodríguez-González, Alejandro; Torrente, Maria; Nuñez García, Beatriz; Cruz Bermudez, Juan Luis; Provencio, Mariano; Menasalvas, Ernestina;
    Publisher: Zenodo
    Project: EC | IASIS (727658)

    The automatic extraction of a patient’s natural history from Electronic Health Records (EHRs) is a critical step towards building intelligent systems that can reason about clinical variables and support decision making. Although EHRs contain a large amount of valuable information about the patient’s medical care, this information can only be fully understood when analyzed in a temporal context. Any intelligent system should then be able to extract medical concepts, date expressions, temporal relations and the temporal ordering of medical events from the free texts of EHRs; yet, this task is hard to tackle, due to the domain specific nature of EHRs, writing quality and lack of structure of these texts, and more generally the presence of redundant information. In this paper, we introduce a new Natural Language Processing (NLP) framework, capable of extracting the aforementioned elements from EHRs written in Spanish using rule-based methods. We focus on building medical timelines, which include disease diagnosis and its progression over time. By using a large dataset of EHRs comprising information about patients suffering from lung cancer, we show that our framework has an adequate level of performance by correctly building the timeline for 843 patients from a pool of 989 patients, achieving a correct result in 85% of instances.

  • Publication . Contribution for newspaper or weekly magazine . Conference object . Article . Preprint . 2016
    Open Access
    Authors: 
    Ahmed Ali; Najim Dehak; Patrick Cardinal; Sameer Khurana; Sree Harsha Yella; James Glass; Peter Bell; Steve Renals;
    Publisher: ISCA
    Country: United Kingdom
    Project: EC | SUMMA (688139)

    In this paper, we investigate different approaches for dialect identification in Arabic broadcast speech. These methods are based on phonetic and lexical features obtained from a speech recognition system, and bottleneck features using the i-vector framework. We studied both generative and discriminative classifiers, and we combined these features using a multi-class Support Vector Machine (SVM). We validated our results on an Arabic/English language identification task, with an accuracy of 100%. We also evaluated these features in a binary classifier to discriminate between Modern Standard Arabic (MSA) and Dialectal Arabic, with an accuracy of 100%. We further reported results using the proposed methods to discriminate between the five most widely used dialects of Arabic: namely Egyptian, Gulf, Levantine, North African, and MSA, with an accuracy of 59.2%. We discuss dialect identification errors in the context of dialect code-switching between Dialectal Arabic and MSA, and compare the error pattern between manually labeled data and the output from our classifier. All the data used in our experiments have been released to the public as a language identification corpus.

  • Open Access
    Authors: 
    Muñoz Ros, Salvador; González-Blanco, Elena;
    Publisher: Zenodo
    Project: EC | POSTDATA (679528)

    POSTDATA focused on poetry analysis, the publication of poetic resources and their exploration, applying Digital Humanities methods. This is a trans-domain project, as it combines traditional philological studies with digital humanities technologies and tools. It is focused on poetry analysis (rhythm, accent, typology), analyzing and comparing databases of poetic repertories, and analysing TEI-XML documents and databases that describe poetic resources. The project combines semantic web technologies to represent and publish the information as Linked Open Data in order to make the aforementioned resources interoperable, for a field, poetry, that has never been analyzed as a standardized and interoperable area. POSTDATA is conceived as a digital humanities project that uses the richest poetry collections from 16 languages and literary traditions in combination with the most up-to-date technologies: the construction of an ontology, Semantic Engineering Modelling, Natural Language Processing (NLP) and artificial intelligence, applying automation technologies and data mining to non-standard literary texts (i.e. Medieval Spanish). Poetry Lab combines the application of Artificial Intelligence and NLP technologies with the creation of poetic corpora in different languages, producing different tools and innovative applications, such as our geolocation tool for Spanish Medieval texts, HisMeTag. In addition, the project has made great progress in the automatic detection of enjambment, a difficult prosodic phenomenon to analyse and understand even from the literary point of view.

  • Publication . Preprint . Conference object . Contribution for newspaper or weekly magazine . Article . 2016
    Open Access English
    Authors: 
    Marcin Junczys-Dowmunt; Tomasz Dwojak; Rico Sennrich;
    Country: United Kingdom
    Project: EC | SUMMA (688139), EC | TraMOOC (644333)

    This paper describes the AMU-UEDIN submissions to the WMT 2016 shared task on news translation. We explore methods of decode-time integration of attention-based neural translation models with phrase-based statistical machine translation. Efficient batch-algorithms for GPU-querying are proposed and implemented. For English-Russian, our system stays behind the state-of-the-art pure neural models in terms of BLEU. Among restricted systems, manual evaluation places it in the first cluster tied with the pure neural model. For the Russian-English task, our submission achieves the top BLEU result, outperforming the best pure neural system by 1.1 BLEU points and our own phrase-based baseline by 1.6 BLEU. After manual evaluation, this system is the best restricted system in its own cluster. In follow-up experiments we improve results by additional 0.8 BLEU.

  • Publication . Conference object . Other literature type . Article . 2016
    Open Access
    Authors: 
    Ngoc Quang Luong; Andrei Popescu-Belis;
    Publisher: Association for Computational Linguistics
    Country: Switzerland
    Project: SNSF | MODERN: Modeling discours... (147653), EC | SUMMA (688139)
  • Open Access Spanish
    Authors: 
    Rio Riande, María Gimena del; González Blanco García, Elena; Martínez Cantón, Clara; Robles, Antonio;
    Publisher: Jagiellonian University & Pedagogical University (Cracovia)
    Project: EC | POSTDATA (679528)

    Although Digital Humanities have been defined from a discipline perspective in many ways, it is surely a field still looking for its own objects, practices and methodologies. Their development in the Spanish-speaking countries is no exception to this process and, even if it is complex to trace a unique genealogy to account for the evolving field in Spain and Latin America (Gonzalez-Blanco, 2013; Spence and Gonzalez-Blanco, 2014; Rio Riande 2014a, 2014b), the emergence of various associations in Mexico (RedDH), Spain (HDH) and Argentina (AAHD) that seek a constant dialogue (Galina, González-Blanco and Rio Riande, 2015), and academic lab and DH center initiatives such as LINHD (Spain and Argentina), GRINUGR (Spain), Medialab USAL, LABTEC (Argentina), TadeoLab (Colombia), Elabora HD (Mexico), among others, make it clear that research has become increasingly “global, multipolar and networked” (Llewellyn Smith, et al., 2011) and that the academic field is looking for a global outreach and aims to open spaces of shared virtual work. Virtual Research Communities (VRCs) are a consequence of these changes. (Paragraph excerpted as a summary)

  • Publication . Article . Preprint . Other literature type . Conference object . Contribution for newspaper or weekly magazine . 2020
    Open Access English
    Authors: 
    Biao Zhang; Philip Williams; Ivan Titov; Rico Sennrich;
    Countries: Switzerland, United Kingdom
    Project: EC | ELITR (825460), SNSF | Multi-Task Learning with ... (176727), EC | GoURMET (825299)

    Massively multilingual models for neural machine translation (NMT) are theoretically attractive, but often underperform bilingual models and deliver poor zero-shot translations. In this paper, we explore ways to improve them. We argue that multilingual NMT requires stronger modeling capacity to support language pairs with varying typological characteristics, and overcome this bottleneck via language-specific components and deepening NMT architectures. We identify the off-target translation issue (i.e. translating into a wrong target language) as the major source of the inferior zero-shot performance, and propose random online backtranslation to enforce the translation of unseen training language pairs. Experiments on OPUS-100 (a novel multilingual dataset with 100 languages) show that our approach substantially narrows the performance gap with bilingual models in both one-to-many and many-to-many settings, and improves zero-shot performance by ~10 BLEU, approaching conventional pivot-based methods. Comment: ACL2020
