Advanced search in Research products
The following results are related to Digital Humanities and Cultural Heritage. Are you interested to view more results? Visit OpenAIRE - Explore.
13 Research products, page 1 of 2

  • Digital Humanities and Cultural Heritage
  • Publications
  • Research data
  • Other research products
  • 2017-2021
  • Article
  • European Commission
  • EU
  • IT
  • English
  • OpenAIRE
  • Scientometrics
  • Digital Humanities and Cultural Heritage

  • Open Access English
    Authors: 
    Kun Sun; Haitao Liu; Wenxin Xiong;
    Project: EC | WIDE (742545)

    Scientific writings, as one essential part of human culture, have evolved over centuries into their current form. Knowing how scientific writings evolved is particularly helpful in understanding how trends in scientific culture developed. It also allows us to better understand how scientific culture was interwoven with human culture generally. The availability of massive digitized texts and the progress in computational technologies today provide us with a convenient and credible way to discern the evolutionary patterns in scientific writings by examining the diachronic linguistic changes. The linguistic changes in scientific writings reflect the genre shifts that took place with historical changes in science and scientific writings. This study investigates a general evolutionary linguistic pattern in scientific writings. It does so by merging two credible computational methods: relative entropy, and word-embedding concreteness and imageability. It thus creates a novel quantitative methodology and applies this to the examination of diachronic changes in the Philosophical Transactions of the Royal Society (PTRS, 1665–1869). The data from the two computational approaches can be well mapped to support the argument that this journal followed the evolutionary trend of increasing professionalization and specialization. But it also shows that language use in this journal was greatly influenced by historical events and other socio-cultural factors. This study, as a “culturomic” approach, demonstrates that the linguistic evolutionary patterns in scientific discourse have been interrupted by external factors even though this scientific discourse would likely have cumulatively developed into a professional and specialized genre. The approaches proposed by this study can make a great contribution to full-text analysis in scientometrics.
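
    As a rough illustration of the relative entropy measure named in this abstract, the sketch below computes the Kullback-Leibler divergence between the word distributions of two text samples. The unigram model, the add-one smoothing, the function name `relative_entropy`, and the two toy samples are all assumptions made for illustration; the abstract does not describe the study's actual setup.

```python
import math
from collections import Counter


def relative_entropy(tokens_p, tokens_q, smoothing=1.0):
    """Return D(P || Q) in bits for two token lists.

    P and Q are unigram distributions estimated with additive smoothing
    over the joint vocabulary, so no probability is ever zero.
    """
    vocab = set(tokens_p) | set(tokens_q)
    counts_p, counts_q = Counter(tokens_p), Counter(tokens_q)
    total_p = len(tokens_p) + smoothing * len(vocab)
    total_q = len(tokens_q) + smoothing * len(vocab)

    divergence = 0.0
    for word in vocab:
        p = (counts_p[word] + smoothing) / total_p
        q = (counts_q[word] + smoothing) / total_q
        divergence += p * math.log2(p / q)
    return divergence


# Toy samples standing in for two periods of a journal (invented text).
early = "the air pump was tried and the vessel did break".split()
late = "the measured pressure of the gas was tabulated".split()
print(relative_entropy(early, late))
```

    A larger value means the earlier sample is harder to predict from the later sample's word frequencies. Note that the measure is asymmetric, so the direction of comparison matters.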

  • Open Access English
    Authors: 
    Anders Svensson; Dorthe Dahl-Jensen; Jørgen Peder Steffensen; Thomas Blunier; Sune Olander Rasmussen; Bo Møllesøe Vinther; Paul Vallelonga; Emilie Capron; Vasileios Gkinis; Eliza Cook; +16 more
    Countries: Switzerland, France, France, United Kingdom, France, Denmark
    Project: SNSF | EURODIVERSITY 2005 FP083-... (114216), NSF | Collaborative Research: I... (1142166), NSF | Collaborative Research: I... (0839093), EC | THERA (820047), EC | TiPES (820970)

    The last glacial period is characterized by a number of millennial climate events that have been identified in both Greenland and Antarctic ice cores and that are abrupt in Greenland climate records. The mechanisms governing this climate variability remain a puzzle that requires a precise synchronization of ice cores from the two hemispheres to be resolved. Previously, Greenland and Antarctic ice cores have been synchronized primarily via their common records of gas concentrations or isotopes from the trapped air and via cosmogenic isotopes measured on the ice. In this work, we apply ice core volcanic proxies and annual layer counting to identify large volcanic eruptions that have left a signature in both Greenland and Antarctica. Generally, no tephra is associated with those eruptions in the ice cores, so the source of the eruptions cannot be identified. Instead, we identify and match sequences of volcanic eruptions with bipolar distribution of sulfate, i.e. unique patterns of volcanic events separated by the same number of years at the two poles. Using this approach, we pinpoint 82 large bipolar volcanic eruptions throughout the second half of the last glacial period (12–60 ka). This improved ice core synchronization is applied to determine the bipolar phasing of abrupt climate change events at decadal-scale precision. In response to Greenland abrupt climatic transitions, we find a response in the Antarctic water isotope signals (δ18O and deuterium excess) that is both more immediate and more abrupt than that found with previous gas-based interpolar synchronizations, providing additional support for our volcanic framework. On average, the Antarctic bipolar seesaw climate response lags the midpoint of Greenland abrupt δ18O transitions by 122±24 years. 
    The time difference between Antarctic signals in deuterium excess and δ18O, which likewise informs the time needed to propagate the signal as described by the theory of the bipolar seesaw but is less sensitive to synchronization errors, suggests an Antarctic δ18O lag behind Greenland of 152±37 years. These estimates are shorter than the 200 years suggested by earlier gas-based synchronizations. As before, we find variations in the timing and duration between the response at different sites and for different events suggesting an interaction of oceanic and atmospheric teleconnection patterns as well as internal climate variability.

  • Publication . Article . Preprint . Other literature type . Conference object . Contribution for newspaper or weekly magazine . 2020
    Open Access English
    Authors: 
    Biao Zhang; Philip Williams; Ivan Titov; Rico Sennrich;
    Countries: Switzerland, United Kingdom
    Project: EC | ELITR (825460), SNSF | Multi-Task Learning with ... (176727), EC | GoURMET (825299)

    Massively multilingual models for neural machine translation (NMT) are theoretically attractive, but often underperform bilingual models and deliver poor zero-shot translations. In this paper, we explore ways to improve them. We argue that multilingual NMT requires stronger modeling capacity to support language pairs with varying typological characteristics, and overcome this bottleneck via language-specific components and deepening NMT architectures. We identify the off-target translation issue (i.e. translating into a wrong target language) as the major source of the inferior zero-shot performance, and propose random online backtranslation to enforce the translation of unseen training language pairs. Experiments on OPUS-100 (a novel multilingual dataset with 100 languages) show that our approach substantially narrows the performance gap with bilingual models in both one-to-many and many-to-many settings, and improves zero-shot performance by ~10 BLEU, approaching conventional pivot-based methods. Comment: ACL2020

  • Open Access English
    Authors: 
    Mercklé, Pierre; Zalc, Claire;
    Project: EC | LUBARTWORLD (818843)

    The aim of this article is to offer a detailed examination of the contributions and limits of modelling in history, taking the Shoah as a case. It draws on a study that reconstructed the "persecution trajectories" of the 992 Jews of Lens during the Second World War, of whom only 527 survived. 491 were arrested, 468 were deported and 449 were exterminated. The prosopographical data are used here to answer a simple question: is it possible to model persecution? In other words, is it possible to build a simplified but heuristic representation of the complex causal processes that determined the chances of survival in the face of Nazi persecution, from standardised data on a relatively large number of individuals? The article discusses the contributions and limits of a succession of quantitative methods: those belonging to what Andrew Abbott calls the "standard program" of the social sciences, as well as network analysis and sequence analysis. For each of them, particular attention is given to the ways of accounting for interactions between individuals, the historicity of behaviours, and the processes determining these chances of survival. Attempts at modelling from historical data thus bring genuine renewals of knowledge, particularly when they are carried out cumulatively on a single study. By moving from a logic of individual properties to a logic of interconnected trajectories, these approaches make it possible to better understand social and local interactions, and thus offer stimulating perspectives for the microhistory of the Holocaust.

  • Publication . Other literature type . Part of book or chapter of book . 2020
    Open Access English
    Authors: 
    Oswaldo Solarte-Pabon; Ernestina Menasalvas; Alejandro Rodríguez-González;
    Publisher: Zenodo
    Project: EC | IASIS (727658)

    Electronic health records contain valuable information written in narrative form. A relevant challenge in clinical narrative text is that concepts commonly appear negated. Several proposals have been developed to detect negation in clinical text written in Spanish. Many of these proposals have adapted the NegEx algorithm to Spanish, but the results obtained indicated lower performance than NegEx implementations in other languages. Moreover, in most of these proposals, the validation process could be improved by using a shared test corpus focused on negation in clinical text. This paper proposes Spa-neg, an approach to improve negation detection in clinical text written in Spanish. Spa-neg combines three elements: i) an exploratory data analysis of how negation is written in clinical text, ii) use of regular expressions better adapted to the way in which negation is expressed in Spanish, and iii) tests and validation using a shared annotated corpus focused on negation. The results obtained suggest that the combination of these elements improves the process of negation detection. The tests performed showed a 92% F-score using IULA Spanish, an annotated corpus for negation.
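
    To make the regular-expression idea in this abstract concrete, here is a minimal sketch of trigger-based negation detection in the general style of NegEx. It is not the Spa-neg implementation: the trigger list, the five-word window, and the function name `is_negated` are hypothetical choices made only for illustration.

```python
import re

# Hypothetical trigger phrases for illustration; not the Spa-neg lexicon.
NEGATION_TRIGGERS = ["no", "sin", "niega", "ausencia de", "se descarta"]

# One alternation of all triggers, matched as whole words, case-insensitively.
TRIGGER_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(t) for t in NEGATION_TRIGGERS) + r")\b",
    flags=re.IGNORECASE,
)


def is_negated(sentence, concept, window=5):
    """Return True if a trigger ends at most `window` words before `concept`."""
    match = re.search(re.escape(concept), sentence, flags=re.IGNORECASE)
    if match is None:
        return False  # the concept does not occur in this sentence

    preceding = sentence[: match.start()]
    for trigger in TRIGGER_PATTERN.finditer(preceding):
        # Count the words between the end of the trigger and the concept.
        words_between = preceding[trigger.end():].split()
        if len(words_between) <= window:
            return True
    return False


print(is_negated("Paciente sin fiebre ni tos.", "fiebre"))
print(is_negated("Paciente con fiebre alta.", "fiebre"))
```

    A rule this simple only looks backwards from the concept and ignores scope terminators, which is one reason the abstract stresses validation on a shared annotated corpus.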

  • Publication . Other literature type . Report . 2019
    Open Access English
    Authors: 
    Barbot, Laure; Moranville, Yoan; Fischer, Frank; Petitfils, Clara; Ďurčo, Matej; Illmayer, Klaus; Parkoła, Tomasz; Wieder, Philipp; Karampatakis, Sotiris;
    Publisher: Zenodo
    Project: EC | SSHOC (823782)

    This document delivers the results of Task 7.1 of the Social Sciences & Humanities Open Cloud project funded by the European Commission under Grant #823782. Its main purpose is the specification of the SSH Open Marketplace (SSHOC MP) in terms of service requirements, data model, and system architecture and design. The Social Sciences & Humanities communities urgently need a place to gather and exchange information about their tools, services, and datasets. Although plenty of project websites, service registries, and data repositories exist, the lack of a central place integrating these assets and offering domain-relevant means to enrich them and communicate is evident. This place is the SSHOC Marketplace. The approach towards the system specification is based on an extensive requirements engineering process. First and foremost, user requirements were gathered through questionnaires. The results were then prioritised based on the user feedback and the experience of the SSHOC project partners. Based on the requirements and a thorough state-of-the-art analysis, a data model and the system design were developed. In doing so, and by taking into account as much previous work from other European projects as possible, integration with the EOSC infrastructure has been a primary concern at every step taken. The system specification is now the starting point for the development of the SSHOC MP and also a communication instrument within the project and externally. Over the course of the agile development of the Marketplace, the system specification will also evolve and contribute to a growing number of SSHOC outcomes. This deliverable was accepted by the European Commission on 03 November 2020.

  • Open Access English
    Authors: 
    Jorge Miguel Viana Pedreira;
    Publisher: ISCTE-Instituto Universitário de Lisboa
    Project: EC | RESISTANCE (778076)

    In this interview, James Green, a prominent Brazilianist, tells us about his interest in Brazilian history, his life as a civic and political activist against authoritarianism in Brazil and for gay and lesbian rights, and his academic work and career. The purpose of the interview, besides bringing his work to a wider audience of European historians and social scientists, is to reflect on the relationship between academic work and political and ideological activism, and to discuss the problems of subjectivism and the use of individual testimonies in the making of contemporary history. We invited James Green to reflect on those matters, so he could share with us the views of someone who, because of the nature of his work, could not help but deal constantly with such questions.

  • Open Access English
    Authors: 
    Ros, Salvador;
    Publisher: Zenodo
    Project: EC | POSTDATA (679528)

    Presentation at EADH 2018: "Data in Digital Humanities", National University of Ireland, Galway, 7-9 December 2018.

  • Open Access English
    Authors: 
    Camil Demetrescu; Andrea Ribichini; Marco Schaerf;
    Publisher: Springer Verlag
    Country: Italy
    Project: EC | SecondHands (643950)

    We investigate the accuracy of how author names are reported in bibliographic records excerpted from four prominent sources: WoS, Scopus, PubMed, and CrossRef. We take as a case study 44,549 publications stored in the internal database of Sapienza University of Rome, one of the largest universities in Europe. While our results indicate generally good accuracy for all bibliographic data sources considered, we highlight a number of issues that undermine the accuracy for certain classes of author names, including compound names and names with diacritics, which are common features of Italian and other Western languages.
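
    The kinds of name variation mentioned in this abstract (diacritics and compound names) can be illustrated with a small normalisation sketch. This is not the method used in the study; the helper names `strip_diacritics` and `same_author` and the matching rule are assumptions chosen for illustration, built on Python's standard `unicodedata` module.

```python
import unicodedata


def strip_diacritics(name):
    """Remove combining marks after Unicode compatibility decomposition (NFKD).

    Letters without a decomposition, such as the Polish "ł", are left
    unchanged.
    """
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def name_key(name):
    """Build a comparison key: no diacritics, case-folded, hyphens as spaces."""
    return strip_diacritics(name).casefold().replace("-", " ").split()


def same_author(name_a, name_b):
    """Hypothetical loose match: equal keys after normalisation."""
    return name_key(name_a) == name_key(name_b)


print(strip_diacritics("Rodríguez-González"))
print(same_author("Rodríguez-González", "Rodriguez Gonzalez"))
```

    A loose key like this trades precision for recall: it merges spelling variants of one name, but it can also merge distinct authors whose names differ only by diacritics.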

  • Publication . Conference object . Other literature type . Article . Contribution for newspaper or weekly magazine . Preprint . 2018
    Open Access English
    Authors: 
    Rachel Bawden; Rico Sennrich; Alexandra Birch; Barry Haddow;
    Countries: United Kingdom, Switzerland, France
    Project: SNSF | Dating structural fabric ... (105212), SNSF | Rich Context in Neural Ma... (169888), EC | HimL (644402), EC | SUMMA (688139), EC | TraMOOC (644333)

    For machine translation to tackle discourse phenomena, models must have access to extra-sentential linguistic context. There has been recent interest in modelling context in neural machine translation (NMT), but models have been principally evaluated with standard automatic metrics, poorly adapted to evaluating discourse phenomena. In this article, we present hand-crafted, discourse test sets, designed to test the models' ability to exploit previous source and target sentences. We investigate the performance of recently proposed multi-encoder NMT models trained on subtitles for English to French. We also explore a novel way of exploiting context from the previous sentence. Despite gains using BLEU, multi-encoder models give limited improvement in the handling of discourse phenomena: 50% accuracy on our coreference test set and 53.5% for coherence/cohesion (compared to a non-contextual baseline of 50%). A simple strategy of decoding the concatenation of the previous and current sentence leads to good performance, and our novel strategy of multi-encoding and decoding of two sentences leads to the best performance (72.5% for coreference and 57% for coherence/cohesion), highlighting the importance of target-side context. Comment: Final version of paper to appear in Proceedings of NAACL 2018

Advanced search in Research products
Research products
arrow_drop_down
Searching FieldsTerms
Any field
arrow_drop_down
includes
arrow_drop_down
Include:
The following results are related to Digital Humanities and Cultural Heritage. Are you interested to view more results? Visit OpenAIRE - Explore.
13 Research products, page 1 of 2
  • Open Access English
    Authors: 
    Kun Sun; Haitao Liu; Wenxin Xiong;
    Project: EC | WIDE (742545)

    AbstractScientific writings, as one essential part of human culture, have evolved over centuries into their current form. Knowing how scientific writings evolved is particularly helpful in understanding how trends in scientific culture developed. It also allows us to better understand how scientific culture was interwoven with human culture generally. The availability of massive digitized texts and the progress in computational technologies today provide us with a convenient and credible way to discern the evolutionary patterns in scientific writings by examining the diachronic linguistic changes. The linguistic changes in scientific writings reflect the genre shifts that took place with historical changes in science and scientific writings. This study investigates a general evolutionary linguistic pattern in scientific writings. It does so by merging two credible computational methods: relative entropy; word-embedding concreteness and imageability. It thus creates a novel quantitative methodology and applies this to the examination of diachronic changes in the Philosophical Transactions of Royal Society (PTRS, 1665–1869). The data from two computational approaches can be well mapped to support the argument that this journal followed the evolutionary trend of increasing professionalization and specialization. But it also shows that language use in this journal was greatly influenced by historical events and other socio-cultural factors. This study, as a “culturomic” approach, demonstrates that the linguistic evolutionary patterns in scientific discourse have been interrupted by external factors even though this scientific discourse would likely have cumulatively developed into a professional and specialized genre. The approaches proposed by this study can make a great contribution to full-text analysis in scientometrics.

  • Open Access English
    Authors: 
    Anders Svensson; Dorthe Dahl-Jensen; Jørgen Peder Steffensen; Thomas Blunier; Sune Olander Rasmussen; Bo Møllesøe Vinther; Paul Vallelonga; Emilie Capron; Vasileios Gkinis; Eliza Cook; +16 more
    Countries: Switzerland, France, France, United Kingdom, France, Denmark
    Project: SNSF | EURODIVERSITY 2005 FP083-... (114216), NSF | Collaborative Research: I... (1142166), NSF | Collaborative Research: I... (0839093), EC | THERA (820047), EC | TiPES (820970)

    The last glacial period is characterized by a number of millennial climate events that have been identified in both Greenland and Antarctic ice cores and that are abrupt in Greenland climate records. The mechanisms governing this climate variability remain a puzzle that requires a precise synchronization of ice cores from the two hemispheres to be resolved. Previously, Greenland and Antarctic ice cores have been synchronized primarily via their common records of gas concentrations or isotopes from the trapped air and via cosmogenic isotopes measured on the ice. In this work, we apply ice core volcanic proxies and annual layer counting to identify large volcanic eruptions that have left a signature in both Greenland and Antarctica. Generally, no tephra is associated with those eruptions in the ice cores, so the source of the eruptions cannot be identified. Instead, we identify and match sequences of volcanic eruptions with bipolar distribution of sulfate, i.e. unique patterns of volcanic events separated by the same number of years at the two poles. Using this approach, we pinpoint 82 large bipolar volcanic eruptions throughout the second half of the last glacial period (12–60 ka). This improved ice core synchronization is applied to determine the bipolar phasing of abrupt climate change events at decadal-scale precision. In response to Greenland abrupt climatic transitions, we find a response in the Antarctic water isotope signals (δ18O and deuterium excess) that is both more immediate and more abrupt than that found with previous gas-based interpolar synchronizations, providing additional support for our volcanic framework. On average, the Antarctic bipolar seesaw climate response lags the midpoint of Greenland abrupt δ18O transitions by 122±24 years. 
The time difference between Antarctic signals in deuterium excess and δ18O, which likewise informs the time needed to propagate the signal as described by the theory of the bipolar seesaw but is less sensitive to synchronization errors, suggests an Antarctic δ18O lag behind Greenland of 152±37 years. These estimates are shorter than the 200 years suggested by earlier gas-based synchronizations. As before, we find variations in the timing and duration between the response at different sites and for different events suggesting an interaction of oceanic and atmospheric teleconnection patterns as well as internal climate variability.

  • Publication . Article . Preprint . Other literature type . Conference object . Contribution for newspaper or weekly magazine . 2020
    Open Access English
    Authors: 
    Biao Zhang; Philip Williams; Ivan Titov; Rico Sennrich;
    Countries: Switzerland, United Kingdom
    Project: EC | ELITR (825460), SNSF | Multi-Task Learning with ... (176727), EC | GoURMET (825299)

    Massively multilingual models for neural machine translation (NMT) are theoretically attractive, but often underperform bilingual models and deliver poor zero-shot translations. In this paper, we explore ways to improve them. We argue that multilingual NMT requires stronger modeling capacity to support language pairs with varying typological characteristics, and overcome this bottleneck via language-specific components and deepening NMT architectures. We identify the off-target translation issue (i.e. translating into a wrong target language) as the major source of the inferior zero-shot performance, and propose random online backtranslation to enforce the translation of unseen training language pairs. Experiments on OPUS-100 (a novel multilingual dataset with 100 languages) show that our approach substantially narrows the performance gap with bilingual models in both one-to-many and many-to-many settings, and improves zero-shot performance by ~10 BLEU, approaching conventional pivot-based methods. Comment: ACL2020

  • Open Access English
    Authors: 
    Mercklé, Pierre; Zalc, Claire;
    Project: EC | LUBARTWORLD (818843)

    RésumésL’objectif de cet article est de proposer un examen détaillé des apports et des limites de la modélisation en histoire à partir du cas de la Shoah. Il s’appuie sur une enquête qui a permis de reconstituer les « trajectoires de persécution » des 992 Juifs de Lens pendant la Seconde Guerre mondiale, dont 527 seulement ont survécu. 491 ont été arrêtés, 468 ont été déportés et 449 ont été exterminés. Les données prosopographiques sont utilisées ici pour répondre à une question simple : est-il possible de modéliser la persécution ? En d’autres termes, est-il possible de construire une représentation simplifiée mais heuristique des processus causaux complexes qui ont déterminé les chances de survie face à la persécution nazie à partir de données standardisées sur un nombre relativement important d’individus ? L’article discute les apports et les limites d’une succession de méthodes quantifiées : celles qui s’inscrivent dans ce qu’Andrew Abbott appelle le « programme standard » des sciences sociales, ainsi que l’analyse des réseaux et l’analyse séquentielle. Pour chacune d’entre elles, sont plus particulièrement discutées les manières de rendre compte des interactions entre les individus, de l’historicité des comportements et des processus déterminant ces chances de survie. Les tentatives de modélisation à partir de données historiennes apportent ainsi de véritables renouvellements de connaissances, notamment lorsqu’elles sont menées de manière cumulative sur une même enquête. En passant d’une logique de propriétés individuelles à une logique de trajectoires interconnectées, ces approches permettent de mieux comprendre les interactions sociales et locales, et offrent ainsi des perspectives stimulantes pour la microhistoire de l’Holocauste.

  • Publication . Other literature type . Part of book or chapter of book . 2020
    Open Access English
    Authors: 
    Oswaldo Solarte-Pabon; Ernestina Menasalvas; Alejandro Rodríguez-González;
    Publisher: Zenodo
    Project: EC | IASIS (727658)

    Electronic health records contain valuable information written in narrative form. A relevant challenge in clinical narrative text is that concepts commonly appear negated. Several proposals have been developed to detect negation in clinical text written in Spanish. Much of these proposals have adapted the Negex algorithm to Spanish, but obtained results indicated lower performance than Negex implementations in other languages. Moreover, in most of these proposals, the validation process could be improved using a shared test corpus focused on negation in clinical text. This paper proposes Spa-neg, an approach to improve negation detection in clinical text written in Spanish. Spa-neg combines three elements: i) an exploratory data analysis of how negation is written in the clinical text, ii) use of regular expressions best adapted to the way in which negation is expressed in Spanish, iii) tests, and validation using a shared annotated corpus focused on negation. Obtained results suggest that the combination of these elements improves the process of negation detection. The tests performed shown 92% F-Score using IULA Spanish, an annotated corpus for negation

  • Publication . Other literature type . Report . 2019
    Open Access English
    Authors: 
    Barbot, Laure; Moranville, Yoan; Fischer, Frank; Petitfils, Clara; Ďurčo, Matej; Illmayer, Klaus; Parkoła, Tomasz; Wieder, Philipp; Karampatakis, Sotiris;
    Publisher: Zenodo
    Project: EC | SSHOC (823782)

    {"references": ["Auer, S\u00f6ren 2018. Towards an Open Research Knowledge Graph (Version 1). Zenodo. http://doi.org/10.5281/zenodo.1157185", "Constantopoulos, Panos & Pertsas, Vayianos 2019. From publications to knowledge graphs. 13th International Workshop on Information Search, Integration, and Personalization, Heraklion, 9\u201310 May 2019.", "29, Issue 3, September 2014, Pages 326\u2013339, https://doi.org/10.1093/llc/fqu026", "Dombrowski, Quinn & Rockwell, Geoffrey. \"The Directory Paradox\". Forthcoming in Debates in Digital Humanities: Institutions, Infrastructures at the Interstices. Univ. of Minnesota Press. Eds. Anne McGrail et al. 2019.", "Representing Research Findings by Semantifying Survey Articles. 315\u2013327. https://doi.org/10.1007/978-3-319- 67008-9_25", "Jim\u00e9nez RC & Kuzak M & Alhamdoosh M et al. 2017. Four simple recommendations to encourage best practices in research software [version 1; peer review: 3 approved]. F1000Research 2017, 6:876 https://doi.org/10.12688/f1000research.11407.1", "Grant, Kaitlyn & Dombrowski, Quinn & Ranaweera, Kamal & Rodriquez-Arenas, Omar & Sinclair, Stefan & Rockwell, Geoffrey. \"Absorbing DiRT: Tool Discovery in the Digital Age.\" Digital Studies/le Champ Num\u00e9rique. Forthcoming.", "de Leeuw, Lisa & Admiraal, Femmy & \u010eur\u010do, Matej & Larousse, Nicolas & Mertens, Michael et al. 2017. D5.1 Report on Integrated Service!Needs: DARIAH (in kind) contributions \u2013 Concept and Procedures. [Other] DARIAH. 2017. https://hal.archives-ouvertes.fr/hal-01628733", "Raciti, Marco & Moranville, Yoann & Barthauer, Raisa & Buddenbohm, Stefan & Seillier, Dorian 2019. https://hal.archives-ouvertes.fr/hal-02088278"]} This document delivers the results of Task 7.1 of the Social Sciences & Humanities Open Cloud project funded by the European Commission under Grant #823782. 
Its main purpose is the specification of the SSH Open Marketplace (SSHOC MP) in terms of service requirements, data model, and system architecture and design. The Social Sciences & Humanities communities are in an urgent need for a place to gather and exchange information about their tools, services, and datasets. Although plenty of project websites, service registries, and data repositories exist, the lack of a central place integrating these assets and offering domain-relevant means to enrich them and communicate is evident. This place is the SSHOC Marketplace. The approach towards the system specification is based on an extensive requirements engineering process. First and foremost, user requirements have been gathered through questionnaires. The results have been then prioritised based on the user feedback and the experience of the SSHOC project partners. Based on the requirements and thorough state-of-the-art analysis, a data model and the system design have been developed. In order to do so, and by taking into account as much previous work from other European projects as possible, the integration with the EOSC infrastructure has been a primary concern at every step taken. The system specification is now the starting point for the development of the SSHOC MP and also a communication instrument within the project and externally. Over the course of the agile development of the Marketplace, the system specification will also be evolving and contributing to a growing number of SSHOC outcomes. This deliverable has been accepted by the European Commission on - 03 November 2020

  • Open Access English
    Authors: 
    Jorge Miguel Viana Pedreira;
    Publisher: ISCTE-Instituto Universitário de Lisboa
    Project: EC | RESISTANCE (778076)

    In this interview, James Green, a prominent Brazilianist, tells us about his interest in Brazilian history, his life as a civic and political activist against authoritarianism in Brazil and for gay and lesbian rights, and his academic work and career. The purpose of the interview, besides bringing his work to a wider audience of European historians and social scientists, is to reflect on the relationship between academic work and political and ideological activism, and to discuss the problems of subjectivism and the use of individual testimonies in the making of contemporary history. We invited James Green to reflect on those matters, so he could share with us the views of someone who, because of the nature of his work, could not help but confront such questions constantly.

  • Open Access English
    Authors: 
    Salvador Ros;
    Publisher: Zenodo
    Project: EC | POSTDATA (679528)

    Presentation at EADH 2018: "Data in Digital Humanities", National University of Ireland, Galway, 7-9 December 2018.

  • Open Access English
    Authors: 
    Camil Demetrescu; Andrea Ribichini; Marco Schaerf;
    Publisher: Springer Verlag
    Country: Italy
    Project: EC | SecondHands (643950)

    We investigate how accurately author names are reported in bibliographic records excerpted from four prominent sources: WoS, Scopus, PubMed, and CrossRef. We take as a case study 44,549 publications stored in the internal database of Sapienza University of Rome, one of the largest universities in Europe. While our results indicate generally good accuracy for all bibliographic data sources considered, we highlight a number of issues that undermine the accuracy for certain classes of author names, including compound names and names with diacritics, which are common features of Italian and other Western languages.

  • Publication . Conference object . Other literature type . Article . Contribution for newspaper or weekly magazine . Preprint . 2018
    Open Access English
    Authors: 
    Rachel Bawden; Rico Sennrich; Alexandra Birch; Barry Haddow;
    Countries: United Kingdom, Switzerland, France
    Project: SNSF | Dating structural fabric ... (105212), SNSF | Rich Context in Neural Ma... (169888), EC | HimL (644402), EC | SUMMA (688139), EC | TraMOOC (644333)

    For machine translation to tackle discourse phenomena, models must have access to extra-sentential linguistic context. There has been recent interest in modelling context in neural machine translation (NMT), but models have been principally evaluated with standard automatic metrics, which are poorly adapted to evaluating discourse phenomena. In this article, we present hand-crafted discourse test sets designed to test the models' ability to exploit previous source and target sentences. We investigate the performance of recently proposed multi-encoder NMT models trained on subtitles for English to French. We also explore a novel way of exploiting context from the previous sentence. Despite gains in BLEU, multi-encoder models give limited improvement in the handling of discourse phenomena: 50% accuracy on our coreference test set and 53.5% for coherence/cohesion (compared to a non-contextual baseline of 50%). A simple strategy of decoding the concatenation of the previous and current sentence leads to good performance, and our novel strategy of multi-encoding and decoding of two sentences leads to the best performance (72.5% for coreference and 57% for coherence/cohesion), highlighting the importance of target-side context. Comment: Final version of paper to appear in Proceedings of NAACL 2018