Advanced search in Research products
The following results are related to Digital Humanities and Cultural Heritage.
175 Research products, page 1 of 18

  • Digital Humanities and Cultural Heritage
  • Publications
  • Research data
  • Research software
  • Other research products
  • European Commission
  • EU
  • IT
  • Archivio della ricerca- Università di Roma La Sapienza
  • OpenAIRE
  • Scientometrics

  • Open Access Spanish; Castilian
    Authors: 
    Margarita Serna Vallejo;
    Country: Spain
    Project: EC | RESISTANCE (778076)

    ABSTRACT: From the end of the Late Middle Ages and throughout the early modern period, some of the fishermen's confraternities (cofradías) established in the corregimiento of the Cuatro Villas de la Costa succeeded in having the Monarchy recognize their privilege of exercising a maritime jurisdiction within each corporation. The establishment of these jurisdictions displeased other institutions, which saw their own jurisdictional powers diminished. This situation gave rise to various conflicts in which the brotherhoods had to fight to preserve their maritime jurisdiction. This work was carried out within the framework of the research project Culturas urbanas en la España Moderna: policía, gobernanza e imaginarios (siglos XVI-XIX), reference HAR2015-64014-C3-1-R, funded by the Ministerio de Economía y Competitividad, and of the European project Rebellion and Resistance in the Iberian Empires, 16th-19th Centuries, which has received funding from the European Union's Horizon 2020 research and innovation programme under Marie Skłodowska-Curie grant agreement No 778076.

  • Open Access
    Authors: 
    Guntis Barzdins; Didzis Gosko;
    Publisher: Association for Computational Linguistics
    Project: EC | SUMMA (688139)

    Two extensions to the AMR smatch scoring script are presented. The first extension combines the smatch scoring script with the C6.0 rule-based classifier to produce a human-readable report on the error patterns frequency observed in the scored AMR graphs. This first extension results in 4% gain over the state-of-the-art CAMR baseline parser by adding to it a manually crafted wrapper fixing the identified CAMR parser errors. The second extension combines a per-sentence smatch with an ensemble method for selecting the best AMR graph among the set of AMR graphs for the same sentence. This second modification automatically yields further 0.4% gain when applied to outputs of two nondeterministic AMR parsers: a CAMR+wrapper parser and a novel character-level neural translation AMR parser. For AMR parsing task the character-level neural translation attains surprising 7% gain over the carefully optimized word-level neural translation. Overall, we achieve smatch F1=62% on the SemEval-2016 official scoring set and F1=67% on the LDC2015E86 test set. Comment: NAACL HLT 2016, SemEval-2016 Task 8 submission

  • Open Access English
    Authors: 
    Matthews, Roger; Matthews, Wendy; Rasheed Raheem, Kamal; Richardson, Amy;
    Publisher: Zenodo
    Project: EC | MENTICA (787264)

    The Eastern Fertile Crescent region of western Iran and eastern Iraq hosted major developments in the transition from hunting and gathering to more sedentary agricultural lifestyles through the Early Neolithic period, 10,000-7000 BC. Within the scope of the Central Zagros Archaeological Project, excavations have been conducted at two Early Neolithic sites in the Kurdistan region of Iraq: Bestansur and Shimshara, as well as survey in the region of the Epipalaeolithic site of Zarzi since 2012. Bestansur represents an early stage in the transition to sedentary, agricultural life, where the inhabitants pursued a biodiverse strategy of hunting, gathering, herding and cultivating, maximising the new opportunities afforded by the warmer climate of the Early Holocene. They also constructed a substantial settlement of mudbrick, including a major building with a minimum of 78 human individuals buried under its floor in association with hundreds of beads. These buildings and human remains provide new insights into social relations, mortuary practices, demography, diet, health and disease during the early stages of sedentarisation. The material culture of Bestansur and Shimshara is rich in imported items such as obsidian, carnelian and sea-shells, indicating the extent to which Early Neolithic communities were networked across the Eastern Fertile Crescent and beyond along routes that later became the Silk Roads. This volume includes final reports by a large-scale interdisciplinary team on a wealth of new data from excavations at Bestansur and Shimshara, through application of state-of-the-art scientific techniques, integrated ecological and social approaches and sustainability studies. The net result is to re-emphasise the enormous significance of the Eastern Fertile Crescent in one of the most important episodes in human history: the Neolithic transition.

  • Open Access English
    Authors: 
    Luigi Procopio; Rocco Tripodi; Roberto Navigli;
    Country: Italy
    Project: EC | MOUSSE (726487)

    Graph-based semantic parsing aims to represent textual meaning through directed graphs. As one of the most promising general-purpose meaning representations, these structures and their parsing have gained a significant interest momentum during recent years, with several diverse formalisms being proposed. Yet, owing to this very heterogeneity, most of the research effort has focused mainly on solutions specific to a given formalism. In this work, instead, we reframe semantic parsing towards multiple formalisms as Multilingual Neural Machine Translation (MNMT), and propose SGL, a many-to-many seq2seq architecture trained with an MNMT objective. Backed by several experiments, we show that this framework is indeed effective once the learning procedure is enhanced with large parallel corpora coming from Machine Translation: we report competitive performances on AMR and UCCA parsing, especially once paired with pre-trained architectures. Furthermore, we find that models trained under this configuration scale remarkably well to tasks such as cross-lingual AMR parsing: SGL outperforms all its competitors by a large margin without even explicitly seeing non-English to AMR examples at training time and, once these examples are included as well, sets an unprecedented state of the art in this task. We release our code and our models for research purposes at https://github.com/SapienzaNLP/sgl.

  • Open Access
    Authors: 
    Najafabadipour, Marjan; Zanin, Massimiliano; Rodríguez-González, Alejandro; Torrente, Maria; Nuñez García, Beatriz; Cruz Bermudez, Juan Luis; Provencio, Mariano; Menasalvas, Ernestina;
    Publisher: Zenodo
    Project: EC | IASIS (727658)

    The automatic extraction of a patient’s natural history from Electronic Health Records (EHRs) is a critical step towards building intelligent systems that can reason about clinical variables and support decision making. Although EHRs contain a large amount of valuable information about the patient’s medical care, this information can only be fully understood when analyzed in a temporal context. Any intelligent system should then be able to extract medical concepts, date expressions, temporal relations and the temporal ordering of medical events from the free texts of EHRs; yet, this task is hard to tackle, due to the domain specific nature of EHRs, writing quality and lack of structure of these texts, and more generally the presence of redundant information. In this paper, we introduce a new Natural Language Processing (NLP) framework, capable of extracting the aforementioned elements from EHRs written in Spanish using rule-based methods. We focus on building medical timelines, which include disease diagnosis and its progression over time. By using a large dataset of EHRs comprising information about patients suffering from lung cancer, we show that our framework has an adequate level of performance by correctly building the timeline for 843 patients from a pool of 989 patients, achieving a correct result in 85% of instances.

  • Open Access English
    Authors: 
    Bevilacqua, Michele; Blloshmi, Rexhina; Navigli, Roberto;
    Publisher: AAAI Press
    Country: Italy
    Project: EC | MOUSSE (726487), EC | ELEXIS (731015)

    In Text-to-AMR parsing, current state-of-the-art semantic parsers use cumbersome pipelines integrating several different modules or components, and exploit graph recategorization, i.e., a set of content-specific heuristics that are developed on the basis of the training set. However, the generalizability of graph recategorization in an out-of-distribution setting is unclear. In contrast, state-of-the-art AMR-to-Text generation, which can be seen as the inverse to parsing, is based on simpler seq2seq. In this paper, we cast Text-to-AMR and AMR-to-Text as a symmetric transduction task and show that by devising a careful graph linearization and extending a pretrained encoder-decoder model, it is possible to obtain state-of-the-art performances in both tasks using the very same seq2seq approach, i.e., SPRING (Symmetric PaRsIng aNd Generation). Our model does not require complex pipelines, nor heuristics built on heavy assumptions. In fact, we drop the need for graph recategorization, showing that this technique is actually harmful outside of the standard benchmark. Finally, we outperform the previous state of the art on the English AMR 2.0 dataset by a large margin: on Text-to-AMR we obtain an improvement of 3.6 Smatch points, while on AMR-to-Text we outperform the state of the art by 11.2 BLEU points. We release the software at github.com/SapienzaNLP/spring.

  • Publication . Conference object . Article . Preprint . Contribution for newspaper or weekly magazine . 2016
    Open Access
    Authors: 
    Ahmed Ali; Najim Dehak; Patrick Cardinal; Sameer Khurana; Sree Harsha Yella; James Glass; Peter Bell; Steve Renals;
    Country: United Kingdom
    Project: EC | SUMMA (688139)

    In this paper, we investigate different approaches for dialect identification in Arabic broadcast speech. These methods are based on phonetic and lexical features obtained from a speech recognition system, and bottleneck features using the i-vector framework. We studied both generative and discriminative classifiers, and we combined these features using a multi-class Support Vector Machine (SVM). We validated our results on an Arabic/English language identification task, with an accuracy of 100%. We also evaluated these features in a binary classifier to discriminate between Modern Standard Arabic (MSA) and Dialectal Arabic, with an accuracy of 100%. We further reported results using the proposed methods to discriminate between the five most widely used dialects of Arabic: namely Egyptian, Gulf, Levantine, North African, and MSA, with an accuracy of 59.2%. We discuss dialect identification errors in the context of dialect code-switching between Dialectal Arabic and MSA, and compare the error pattern between manually labeled data, and the output from our classifier. All the data used on our experiments have been released to the public as a language identification corpus.

  • Open Access English
    Authors: 
    Dusan Boric; Thomas Higham; Emanuela Cristiani; Vesna Dimitrijević; Olaf Nehlich; Seren Griffiths; Craig Alexander; Bojana Mihailović; Dragana Filipović; Ethel Allué; +1 more
    Countries: Italy, Serbia, United Kingdom
    Project: EC | HIDDEN FOODS (639286), EC | MESO-NEO TECHNOLOGY (273575)

    The archaeological site of Lepenski Vir is widely known after its remarkable stone art sculptures that represent a unique and unprecedented case of Holocene hunter-gatherer creativity. These artworks were found largely associated with equally unique trapezoidal limestone building floors around their centrally located rectangular stone-lined hearths. A debate has raged since the discovery of the site about the chronological place of various discovered features. While over years different views from that of the excavator about the stratigraphy and chronology of the site have been put forward, some major disagreements about the chronological position of the features that make this site a key point of reference in European Prehistory persist. Despite challenges of re-analyzing the site’s stratigraphy from the original excavation records, taphonomic problems, and issues of reservoir offsets when providing radiocarbon measurements on human and dog bones, our targeted AMS (Accelerator Mass Spectrometry) dating of various contexts from this site with the application of Bayesian statistical modelling allows us to propose with confidence a new and sound chronological framework and provide formal estimates for several key developments represented in the archaeological record of Lepenski Vir that help us in understanding the transition of last foragers to first farmers in southeast Europe as a whole.

  • Open Access English
    Authors: 
    Ambrosetti, Elena; Miccoli, Sara; Strangio, Donatella;
    Publisher: Bancaria editrice
    Country: Italy
    Project: EC | PERCEPTIONS (833870)

  • Open Access
    Authors: 
    Muñoz Ros, Salvador; González-Blanco, Elena;
    Publisher: Zenodo
    Project: EC | POSTDATA (679528)

    POSTDATA focused on poetry analysis, the publication of poetic resources and their exploration, applying Digital Humanities methods. This is a trans-domain project, as it combines traditional philological studies with digital humanities technologies and tools. It is focused on poetry analysis (rhythm, accent, typology), analysing and comparing databases of poetic repertories as well as TEI-XML documents and databases that describe poetic resources. The project combines semantic web technologies to represent and publish the information as Linked Open Data in order to make the aforementioned resources interoperable, in a field, poetry, that has never been treated as a standardized and interoperable area. POSTDATA is conceived as a digital humanities project that uses the richest poetry collections from 16 languages and literary traditions in combination with the most up-to-date technologies: the construction of an ontology, Semantic Engineering Modelling, Natural Language Processing (NLP) and artificial intelligence, applying automation technologies and data mining to non-standard literary texts (i.e. Medieval Spanish). Poetry Lab combines the application of Artificial Intelligence and NLP technologies with the creation of poetic corpora in different languages, producing different tools and innovative applications, such as HisMeTag, our geolocation tool for Spanish Medieval texts. In addition, the project has made great progress in the automatic detection of enjambment, a prosodic phenomenon that is difficult to analyse and understand even from the literary point of view.
