38 research outcomes, page 2 of 4
  • research data . 2019 . Embargo End Date: 15 Jul 2019
    Open Access
    Authors:
    Macháček, Dominik; Kratochvíl, Jonáš; Vojtěchová, Tereza; Bojar, Ondřej;
    Publisher: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
    Project: EC | ELITR (825460)

    We present a test corpus of audio recordings and transcriptions of presentations of students' enterprises together with their slides and web-pages. The corpus is intended for evaluation of automatic speech recognition (ASR) systems, especially in conditions where the pr...

  • research data . 2019 . Embargo End Date: 08 Mar 2019
    Open Access
    Authors:
    Çano, Erion;
    Publisher: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
    Project: EC | ELITR (825460)

    OAGK is a keyword extraction/generation dataset consisting of 2.2 million abstracts, titles and keyword strings from scientific articles. Texts were lowercased and tokenized with the Stanford CoreNLP tokenizer. No other preprocessing steps were applied in this release versio...

  • research data . 2018 . Embargo End Date: 20 Feb 2018
    Open Access
    Authors:
    Hajič, Jan; Bejček, Eduard; Bémová, Alevtina; Buráňová, Eva; Hajičová, Eva; Havelka, Jiří; Homola, Petr; Kárník, Jiří; Kettnerová, Václava; Klyueva, Natalia; ...
    Publisher: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
    Project: EC | T4ME NET (249119)

    The Prague Dependency Treebank 3.5 is the 2018 edition of the core Prague Dependency Treebank (PDT). It contains all PDT annotation made at the Institute of Formal and Applied Linguistics under various projects between 1996 and 2018 on the original texts, i.e., all anno...

  • research data . 2017 . Embargo End Date: 03 Apr 2017
    Open Access
    Authors:
    Pecina, Pavel; Dušek, Ondřej; Hajič, Jan; Libovický, Jindřich; Urešová, Zdeňka;
    Publisher: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
    Project: EC | KHRESMOI (257528)

    This package contains data sets for development and testing of machine translation of medical queries between Czech, English, French, German, Hungarian, Polish, Spanish and Swedish. The queries come from the general public and medical experts. This is version 2.0 extending ...

  • research data . 2017 . Embargo End Date: 03 Apr 2017
    Open Access
    Authors:
    Dušek, Ondřej; Hajič, Jan; Hlaváčová, Jaroslava; Libovický, Jindřich; Pecina, Pavel; Tamchyna, Aleš; Urešová, Zdeňka;
    Publisher: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
    Project: EC | KHRESMOI (257528)

    This package contains data sets for development (Section dev) and testing (Section test) of machine translation of sentences from summaries of medical articles between Czech, English, French, German, Hungarian, Polish, Spanish and Swedish. Version 2.0 extends the previo...

  • research data . 2016 . Embargo End Date: 14 Jun 2016
    Open Access
    Authors:
    Cífka, Ondřej; Bojar, Ondřej;
    Publisher: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
    Project: EC | QT21 (645452)

    This small dataset contains 3 speech corpora collected using the Alex Translate telephone service (https://ufal.mff.cuni.cz/alex#alex-translate). The "part1" and "part2" corpora contain English speech with transcriptions and Czech translations. These recordings were col...

  • research data . 2016 . Embargo End Date: 01 Apr 2016
    Open Access
    Authors:
    Bojar, Ondřej; Děchtěrenko, Filip; Zelenina, Maria;
    Publisher: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
    Project: EC | QT21 (645452)

    This package contains the eye-tracker recordings of 8 subjects evaluating English-to-Czech machine translation quality using the WMT-style ranking of sentences. We provide the set of sentences evaluated, the exact screens presented to the annotators (including bounding ...

  • research data . 2016 . Embargo End Date: 30 Mar 2016
    Open Access
    Authors:
    Nedoluzhko, Anna; Novák, Michal; Cinková, Silvie; Mikulová, Marie; Mírovský, Jiří;
    Publisher: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
    Project: EC | QTLEAP (610516)

    The Prague Czech-English Dependency Treebank 2.0 Coref (PCEDT 2.0 Coref) is a parallel treebank building upon the original PCEDT 2.0 release and enriching it with the extended manual annotation of coreference, as well as with an improved automatic annotation of the core...

  • research data . 2016 . Embargo End Date: 22 Mar 2016
    Open Access
    Authors:
    Kamran, Amir; Jawaid, Bushra; Bojar, Ondřej; Stanojevic, Milos;
    Publisher: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
    Project: EC | QT21 (645452)

    This item contains models to tune for the WMT16 Tuning shared task for English-to-Czech. The CzEng 1.6pre (http://ufal.mff.cuni.cz/czeng/czeng16pre) corpus is used for the training of the translation models. The data is tokenized (using the Moses tokenizer), lowercased and sent...

  • research data . 2016 . Embargo End Date: 22 Mar 2016
    Open Access
    Authors:
    Kamran, Amir; Jawaid, Bushra; Bojar, Ondřej; Stanojevic, Milos;
    Publisher: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
    Project: EC | QT21 (645452)

    This item contains models to tune for the WMT16 Tuning shared task for Czech-to-English. The CzEng 1.6pre (http://ufal.mff.cuni.cz/czeng/czeng16pre) corpus is used for the training of the translation models. The data is tokenized (using the Moses tokenizer), lowercased and sente...
