Advanced search in Research products
The following results are related to Digital Humanities and Cultural Heritage. Interested in viewing more results? Visit OpenAIRE - Explore.
1,961 Research products, page 1 of 197

  • Digital Humanities and Cultural Heritage
  • Research data
  • Research software
  • 2018-2022
  • ZENODO

  • Research data . 2018
    Open Access
    Authors: 
    Bonnie, Rick;
    Publisher: Zenodo

    This open dataset lists, describes, and provides the relevant bibliography for all known archaeological sites in Galilee with evidence for stone vessels. It forms part of the dataset used in the monograph Being Jewish in Galilee, 100–200 CE: An Archaeological Study (Brepols). The dataset is available in both PDF and CSV formats. The PDF file provides a detailed description of, and bibliography for, the evidence of stone vessels at each archaeological site. The CSV file contains the raw data, which can be easily imported into spreadsheets and databases.
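    As a minimal sketch of the "imported into spreadsheets and databases" step, loading the CSV into SQLite with only the standard library. The column names and sample row below are illustrative assumptions, not taken from the actual dataset:

```python
import csv
import io
import sqlite3

# Illustrative stand-in for the dataset's CSV file; the real column
# layout is described in the accompanying PDF.
sample = (
    "site,region,evidence\n"
    "Example Site,Galilee,stone vessel fragments\n"
)

rows = list(csv.reader(io.StringIO(sample)))
header, data = rows[0], rows[1:]

# Create an untyped table matching the CSV header and bulk-insert the rows.
con = sqlite3.connect(":memory:")
con.execute(f"CREATE TABLE sites ({', '.join(header)})")
con.executemany(
    f"INSERT INTO sites VALUES ({','.join('?' * len(header))})", data
)
(count,) = con.execute("SELECT COUNT(*) FROM sites").fetchone()
```

For the real file, replace `io.StringIO(sample)` with `open(...)` on the downloaded CSV.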

  • Open Access
    Authors: 
    Luis Naranjo-Zeledón;
    Publisher: Zenodo

    These files contain the calculations of phonological proximity and validations with users carried out on a subset of the Costa Rican Sign Language (LESCO, for its acronym in Spanish). The signs corresponding to the alphabet have been used.

  • Open Access
    Authors: 
    Mengual, Ximo; Bot, Sander; Chkhartishvili, Tinatin; Reimann, André; Thormann, Jana; von der Mark, Laura;
    Publisher: Zenodo

    Data type: molecular data

  • Open Access English
    Authors: 
    Giovanni Spitale;
    Publisher: Zenodo

    The COVID-19 pandemic generated (and keeps generating) a huge corpus of news articles, easily retrievable in Factiva with very targeted queries. This dataset, generated with an ad-hoc parser and NLP pipeline, analyzes the frequency of lemmas and named entities in news articles (in German, French, Italian, and English) regarding Switzerland and COVID-19. The analysis of large bodies of grey literature via text mining and computational linguistics is an increasingly frequent approach to understanding the large-scale trends of specific topics. We used Factiva, a news monitoring and search engine developed and owned by Dow Jones, to gather and download all the news articles published between January 2020 and May 2021 on COVID-19 and Switzerland. Due to Factiva's copyright policy, it is not possible to share the original dataset with the exports of the articles' text; however, we can share the results of our work on the corpus. All the information needed to reproduce the results is provided. Factiva allows a very granular definition of queries and, moreover, has access to the full text of articles published by the major media outlets of the world. The query was defined as follows (each clause followed by its explanation in parentheses):
    ((coronavirus or Wuhan virus or corvid19 or corvid 19 or covid19 or covid 19 or ncov or novel coronavirus or sars) and (atleast3 coronavirus or atleast3 wuhan or atleast3 corvid* or atleast3 covid* or atleast3 ncov or atleast3 novel or atleast3 corona*)) (keywords for COVID-19; must appear at least 3 times in the text)
    and ns=(gsars or gout) (subject is “novel coronaviruses” or “outbreaks and epidemics” and “general news”)
    and la=X (language is X: DE, FR, IT, EN)
    and rst=tmnb (restrict to TMNB, major news and business publications)
    and wc>300 (at least 300 words)
    and date from 20191001 to 20212005 (date interval)
    and re=SWITZ (region is Switzerland)
    It is important to specify some details that characterize the query.
    The query is not limited to articles published by Swiss media, but covers articles regarding Switzerland. The reason is simple: a Swiss user googling for “Schweiz Coronavirus” or for “Coronavirus Ticino” can easily find and read articles published by foreign (namely, German or Italian) media outlets on that topic. If the objective is capturing and describing the information trends to which people are exposed, this approach makes much more sense than limiting the analysis to articles published by Swiss media. Factiva’s field “NS” is a descriptor for the content of the article. “gsars” is defined in Factiva’s documentation as “All news on Severe Acute Respiratory Syndrome”, and “gout” as “The widespread occurrence of an infectious disease affecting many people or animals in a given population at the same time”; however, the way these descriptors are assigned to articles is not specified in the documentation. Finally, the query was restricted to major news and business publications of at least 300 words; duplicate checking is performed by Factiva. Given the incredibly large number of articles published on COVID-19, this (admittedly arbitrary) restriction allows retrieving a corpus that is both meaningful and manageable. metadata.xlsx contains information about the articles retrieved (strategy, amount). This work is part of the PubliCo research project, supported by the Swiss National Science Foundation (SNF), project no. 31CA30_195905.

  • Open Access
    Authors: 
    Barman, Raphaël; Ehrmann, Maud; Clematide, Simon; Ares Oliveira, Sofia;
    Publisher: Zenodo
    Country: Switzerland

    This record contains the datasets and models used and produced for the work reported in the paper "Combining Visual and Textual Features for Semantic Segmentation of Historical Newspapers" (link). Please cite this paper if you are using the models/datasets or find it relevant to your research:

        @article{barman_combining_2020,
          title   = {{Combining Visual and Textual Features for Semantic Segmentation of Historical Newspapers}},
          author  = {Raphaël Barman and Maud Ehrmann and Simon Clematide and Sofia Ares Oliveira and Frédéric Kaplan},
          journal = {Journal of Data Mining \& Digital Humanities},
          volume  = {HistoInformatics},
          DOI     = {10.5281/zenodo.4065271},
          year    = {2021},
          url     = {https://jdmdh.episciences.org/7097},
        }

    Please note that this record contains data under different licenses.

    1. DATA
    Annotations (JSON files): one file per newspaper, containing region annotations (label and coordinates) in VIA format. The following licenses apply: luxwort.json: these annotations are under a CC0 1.0 license; please refer to the rights statement specified for each image in the file. GDL.json, IMP.json, and JDG.json: these annotations are under a CC BY-SA 4.0 license.
    Image files: the archive images.zip contains the Swiss titles' image files (GDL, IMP, JDG) used for the experiments described in the paper. Those images are under copyright (property of the journal Le Temps and of ArcInfo) and can be used for academic research or educational purposes only. Redistribution, publication, or commercial use are not permitted. These terms of use are similar to the following rights statement: http://rightsstatements.org/vocab/InC-EDU/1.0/

    2. MODELS
    Some of the best models are released under a CC BY-SA 4.0 license (they are also available as assets of the current GitHub release).
    JDG_flair-FT: trained on JDG using French Flair and FastText embeddings. It is able to predict the four classes presented in the paper (Serial, Weather, Death notice, and Stocks).
    Luxwort_obituary_flair-bpemb: trained on Luxwort using multilingual Flair and byte-pair embeddings. It is able to predict the Death notice class.
    Luxwort_obituary_flair-FT_indomain: trained on Luxwort using in-domain Flair and FastText embeddings (trained on Luxwort data). It is also able to predict the Death notice class.
    These models can be used to predict probabilities on new images using the same code as in the original dhSegment repository. One needs to adjust three parameters of the predict function: 1) embeddings_path (the path to the embeddings list), 2) embeddings_map_path (the path to the compressed embedding map), and 3) embeddings_dim (the size of the embeddings). Please refer to the paper for further information or contact us.

    3. CODE
    https://github.com/dhlab-epfl/dhSegment-text

    4. ACKNOWLEDGEMENTS
    We warmly thank the journal Le Temps (owner of La Gazette de Lausanne and the Journal de Genève) and the group ArcInfo (owner of L'Impartial) for agreeing to share the related datasets for academic purposes. We also thank the National Library of Luxembourg for its support with all steps related to the Luxemburger Wort annotation release. This work was realized in the context of the impresso - Media Monitoring of the Past project, supported by the Swiss National Science Foundation under grant CR-SII5_173719.

    5. CONTACT
    Maud Ehrmann (EPFL-DHLAB)
    Simon Clematide (UZH)
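    As a rough sketch of how the three per-model parameters named above might be wired into a prediction call. Everything here except the three parameter names is an assumption for illustration: the function name `predict`, the file paths, and the embedding size are not taken from the repository, whose actual interface is documented at the dhSegment-text link in the record.

```python
# Bundle the three settings the record says must be adjusted per model.
def build_predict_kwargs(embeddings_path: str,
                         embeddings_map_path: str,
                         embeddings_dim: int) -> dict:
    return {
        "embeddings_path": embeddings_path,          # path to the embeddings list
        "embeddings_map_path": embeddings_map_path,  # path to the compressed embedding map
        "embeddings_dim": embeddings_dim,            # size of the embeddings
    }

kwargs = build_predict_kwargs(
    "embeddings/jdg_flair_ft_list.txt",  # hypothetical path
    "embeddings/jdg_flair_ft_map.npz",   # hypothetical path
    2048,                                # hypothetical size
)
# probabilities = predict(image, **kwargs)  # predict() as in dhSegment-text
```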

  • Open Access
    Authors: 
    Jamyang Dakpa; Tashi Dhondup; Yeshi Jigme Gangne; Garrett, Edward; Meelen, Marieke; Sonam Wangyal;
    Publisher: Zenodo

    This is a small hand-annotated partial treebank of Modern Tibetan, primarily in CoNLL-U format. Some texts were POS-tagged by machine, with dependency relations between verbs and their arguments then added by hand; other texts include only dependency relations and the relevant POS tags. A number of the texts have English translations that have been manually aligned to the Tibetan text. This work was created as part of the project Lexicography in Motion (PI Ulrich Pagel, 2017-2021), funded by the UK's Arts and Humanities Research Council (AHRC, grant code AH/P004644/1).
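    Since the treebank is primarily in CoNLL-U format (ten tab-separated columns per token, `#` comment lines, blank line between sentences), a minimal reader can be sketched as follows. The sample sentence is a generic English example, not taken from the dataset:

```python
def parse_conllu(text: str):
    """Yield sentences as lists of dicts keyed by the ten CoNLL-U columns."""
    fields = ("id", "form", "lemma", "upos", "xpos",
              "feats", "head", "deprel", "deps", "misc")
    sent = []
    for line in text.splitlines():
        if line.startswith("#"):      # sentence-level comment
            continue
        if not line.strip():          # blank line closes a sentence
            if sent:
                yield sent
                sent = []
            continue
        sent.append(dict(zip(fields, line.split("\t"))))
    if sent:                          # final sentence without trailing blank line
        yield sent

sample = (
    "# text = Books exist.\n"
    "1\tBooks\tbook\tNOUN\t_\t_\t2\tnsubj\t_\t_\n"
    "2\texist\texist\tVERB\t_\t_\t0\troot\t_\t_\n"
    "\n"
)
sentences = list(parse_conllu(sample))
```

Note that this sketch ignores multiword-token ranges (IDs like `1-2`), which a full CoNLL-U parser would also handle.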

  • Open Access
    Authors: 
    Wang, Yanan; Zhuang, Huifu; Shen, Yunguang; Wang, Yuhua; Wang, Zhonglang;
    Publisher: Zenodo

    Figure 3. Venn diagram of economic value types for cultivars.

  • Open Access English
    Authors: 
    Rossi, Matteo; Gittins, Mark; Mercuri, Giulia; Perles, Angel; Peiró, Andrea;
    Publisher: Zenodo
    Project: EC | CollectionCare (814624)

    This dataset contains environmental data (temperature, relative humidity, and, in some cases, light and ultraviolet radiation levels) from partner museums of the European Horizon 2020 CollectionCare project. The following museums provided data for this compilation and consolidation: Alava Arms Museum (Spain), Alava Fine Arts Museum (Spain), National Historical Museum (Greece), The Ethnographic Open-Air Museum of Latvia, and The Royal Danish Collection - Rosenborg (Denmark).

  • Open Access
    Authors: 
    Digital Humanities im deutschsprachigen Raum e.V.;
    Publisher: Zenodo

    This publication contains the logo of the association "Digital Humanities im deutschsprachigen Raum" e.V. in various formats.

  • Research data . Image . 2021
    Open Access
    Authors: 
    Catherine Anne Cassidy; Iain Oliver; David Caldwell; Ray Lafferty; Alan Miller;
    Publisher: Zenodo

    3D model of a game piece. This bone gaming piece probably dates from the 15th century. The front shows a zoomorphic shape, probably a unicorn (also interpreted as a stag), while the back is plain. It resembles a tableman, a piece used in playing backgammon. It was excavated on Eilean Mor in Loch Finlaggan. A number of items associated with board games were discovered in excavations at the former residence of the Lords of the Isles at Loch Finlaggan on Islay (in western Scotland). Islay had a long tradition of board games, influenced by both Norse and Gaelic culture. [Original object: 30mm x 30mm] Part of the Lords of the Isles 15th-century Finlaggan reconstruction, as well as part of the gallery of 3D objects associated with the Finlaggan Trust, viewable here: https://cineg.org/type-gallery-page/?itemid=273&type=Physical%20Object
