Advanced search in Research products
2 Research products, page 1 of 1

  • Digital Humanities and Cultural Heritage
  • Research data
  • Dataset
  • European Commission
  • EC|H2020
  • MeMAD
  • EU
  • FI

  • Open Access English
    Authors: Tiedemann, Jörg; Scherrer, Yves
    Publisher: Zenodo
    Project: EC | MeMAD (780069), EC | FoTran (771113)

    This release contains data sets for experiments with document-level machine translation. The data sets have been used in previous studies and are provided here for replicability and comparison with other systems. They are taken from the English-German news translation task at WMT 2019 and from the English-German bitext in the OpenSubtitles collection v2016 from OPUS. All data sets are sentence-aligned, with corresponding lines on the two sides aligned to each other. Document boundaries are marked with empty lines (on both sides of the parallel corpus).

    The data set has been used in the following publication:

    @inproceedings{scherrer-tiedemann-loaiciga-2019,
        title = "Analysing concatenation approaches to document-level NMT in two different domains",
        author = {Scherrer, Yves and Tiedemann, J{\"o}rg and Lo{\'a}iciga, Sharid},
        booktitle = "Proceedings of the Third Workshop on Discourse in Machine Translation",
        month = nov,
        year = "2019",
        address = "Hong-Kong",
        publisher = "Association for Computational Linguistics",
    }

    Please cite that paper if you use the data set in your own work.

    References: Scherrer, Tiedemann and Loáiciga: "Analysing concatenation approaches to document-level NMT in two different domains", in Proceedings of DiscoMT2019 at EMNLP 2019, Hong-Kong.

  • Open Access English
    Authors: Tiedemann, Jörg; Scherrer, Yves
    Publisher: Zenodo
    Project: EC | MeMAD (780069), EC | FoTran (771113)

    This release contains data sets for experiments with document-level machine translation. The data sets have been used in previous studies and are provided here for replicability and comparison with other systems. They are taken from the English-German news translation task at WMT 2019 and from the English-German bitext in the OpenSubtitles collection v2016 from OPUS. All data sets are sentence-aligned, with corresponding lines on the two sides aligned to each other. Document boundaries are marked with empty lines (on both sides of the parallel corpus).

    The data set has been used in the following publication:

    @inproceedings{scherrer-tiedemann-loaiciga-2019,
        title = "Analysing concatenation approaches to document-level NMT in two different domains",
        author = {Scherrer, Yves and Tiedemann, J{\"o}rg and Lo{\'a}iciga, Sharid},
        booktitle = "Proceedings of the Third Workshop on Discourse in Machine Translation",
        month = nov,
        year = "2019",
        address = "Hong-Kong",
        publisher = "Association for Computational Linguistics",
    }

    Please cite that paper if you use the data set in your own work.
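
The layout described in the records above (one sentence per line, line-aligned across both sides, empty lines as document boundaries on both sides) can be read into per-document lists of sentence pairs. The following is a minimal sketch, not the authors' tooling: the function name `read_parallel_documents` is hypothetical, and treating an empty line on only one side as an error is an assumption based on the statement that boundaries appear on both sides.

```python
from itertools import zip_longest
from typing import Iterable, Iterator, List, Tuple

SentencePair = Tuple[str, str]


def read_parallel_documents(
    src_lines: Iterable[str], tgt_lines: Iterable[str]
) -> Iterator[List[SentencePair]]:
    """Yield one list of (source, target) sentence pairs per document.

    Hypothetical helper. Assumes both sides have the same number of lines
    and that an empty line on one side is matched by an empty line on the
    other side.
    """
    document: List[SentencePair] = []
    pairs = zip_longest(src_lines, tgt_lines)
    for line_no, (src, tgt) in enumerate(pairs, start=1):
        if src is None or tgt is None:
            raise ValueError(f"line {line_no}: the two sides differ in length")
        src, tgt = src.rstrip("\r\n"), tgt.rstrip("\r\n")
        src_empty, tgt_empty = not src.strip(), not tgt.strip()
        if src_empty != tgt_empty:
            # Assumption: boundaries must appear on both sides at once.
            raise ValueError(f"line {line_no}: boundary on one side only")
        if src_empty:
            # Empty line on both sides closes the current document.
            if document:
                yield document
                document = []
        else:
            document.append((src, tgt))
    if document:
        # Last document when the input does not end with an empty line.
        yield document
```

The function accepts any two iterables of lines, so it can be called with two open file objects (opened with `encoding="utf-8"`) for the English and German sides; the actual file names in the release are not given in the record and would need to be checked there.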

Powered by OpenAIRE graph