The following results are related to Digital Humanities and Cultural Heritage.
54 research outcomes, page 5 of 6
  • research data . 2015 . Embargo End Date: 16 May 2015
    Open Access
    Authors:
    Agirre, Eneko; Branco, António; Popel, Martin; Simov, Kiril;
    Publisher: University of the Basque Country, UPV/EHU
    Project: EC | QTLEAP (610516)

    This corpus is part of Deliverable 5.5 of the European Commission project QTLeap FP7-ICT-2013.4.1-610516 (http://qtleap.eu). The texts are sentences from the Europarl parallel corpus (Koehn, 2005). We selected the monolingual sentences from parallel corpora for the fol...

  • research data . 2014 . Embargo End Date: 28 Apr 2014
    Open Access
    Authors:
    Dušek, Ondřej; Hajič, Jan; Hlaváčová, Jaroslava; Pecina, Pavel; Tamchyna, Aleš; Urešová, Zdeňka;
    Publisher: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
    Project: EC | KHRESMOI (257528)

    This package contains data sets for development and testing of machine translation of sentences from summaries of medical articles between Czech, English, French, and German.

  • research data . 2014 . Embargo End Date: 27 Mar 2014
    Open Access
    Authors:
    Jawaid, Bushra; Kamran, Amir; Bojar, Ondřej;
    Publisher: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
    Project: EC | MOSESCORE (288487)

    We release a sizeable monolingual Urdu corpus automatically tagged with part-of-speech tags. We extend the work of Jawaid and Bojar (2012) who use three different taggers and then apply a voting scheme to disambiguate among the different choices suggested by each tagger...

  • research data . 2013 . Embargo End Date: 02 Apr 2014
    Open Access
    Authors:
    Pecina, Pavel; Dušek, Ondřej; Hajič, Jan; Urešová, Zdeňka;
    Publisher: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
    Project: EC | KHRESMOI (257528)

    This package contains data sets for development and testing of machine translation of short medical search queries between Czech, English, French, and German. The queries come from the general public and medical experts.

  • research data . 2013 . Embargo End Date: 10 Dec 2013
    Open Access
    Authors:
    Bojar, Ondřej; Macháček, Matouš; Tamchyna, Aleš; Zeman, Daniel;
    Publisher: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
    Project: EC | MOSESCORE (288487)

    This dataset contains the complete, very large set of Czech translations for 50 English source sentences from the WMT11 test set (http://www.statmt.org/wmt11). In total, there are 15431447 Czech sentences, i.e. 300k reference translations per source English sentence on av...

  • research data . 2012 . Embargo End Date: 13 Nov 2012
    Open Access
    Authors:
    Bojar, Ondřej; Zeman, Daniel; Dušek, Ondřej; Břečková, Jana; Farkačová, Hana; Grošpic, Pavel; Kačenová, Kristýna; Knechtová, Eva; Koubová, Anna; Lukavská, Jana; ...
    Publisher: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
    Project: EC | EUROMATRIXPLUS (231720)

    Three additional Czech reference translations of the whole WMT 2011 data set (http://www.statmt.org/wmt11/test.tgz), translated from the German originals. The original segmentation of the WMT 2011 data is preserved.

  • research data . 2012 . Embargo End Date: 28 Mar 2013
    Open Access
    Authors:
    Hajičová, Eva; Panevová, Jarmila; Sgall, Petr; Cinková, Silvie; Fučíková, Eva; Mikulová, Marie; Pajas, Petr; Popelka, Jan; Semecký, Jiří; Šindlerová, Jana; ...
    Publisher: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
    Project: EC | EUROMATRIXPLUS (231720)

    The Prague Czech-English Dependency Treebank 2.0 (PCEDT 2.0) is a major update of the Prague Czech-English Dependency Treebank 1.0 (LDC2004T25). It is a manually parsed Czech-English parallel corpus sized over 1.2 million running words in almost 50,000 sentences f...

  • research data . 2012 . Embargo End Date: 15 May 2012
    Open Access
    Authors:
    Galuščáková, Petra; Garabík, Radovan; Bojar, Ondřej;
    Publisher: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
    Project: EC | EUROMATRIXPLUS (231720)

    Czech-Slovak parallel corpus consisting of several freely available corpora (Acquis [1], Europarl [2], Official Journal of the European Union [3] and part of the OPUS corpus [4] – EMEA, EUConst, KDE4 and PHP) and the downloaded website of the European Commission [5]. Corpus is publ...

  • research data . 2012 . Embargo End Date: 15 May 2012
    Open Access
    Authors:
    Bojar, Ondřej; Galuščáková, Petra;
    Publisher: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
    Project: EC | EUROMATRIXPLUS (231720)

    Manually ranked outputs of Czech-Slovak translations. Three annotators manually ranked outputs of five MT systems (Česílko, Česílko2, Google Translate and two Moses setups) on three data sets (100 sentences randomly selected from books, 100 sentences randomly selected f...

  • research data . 2012 . Embargo End Date: 15 May 2012
    Open Access
    Authors:
    Galuščáková, Petra; Garabík, Radovan; Bojar, Ondřej;
    Publisher: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
    Project: EC | EUROMATRIXPLUS (231720)

    English-Slovak parallel corpus consisting of several freely available corpora (Acquis [1], Europarl [2], Official Journal of the European Union [3] and part of the OPUS corpus [4] – EMEA, EUConst, KDE4 and PHP) and the downloaded website of the European Commission [5]. Corpus is pu...
