1,859 Research products, page 1 of 186
- Research data . Image . 2016 . Open Access . Authors: van Nieukerken, Erik; Doorenweerd, Camiel; Hoare, Robert; Davis, Donald . Publisher: Zenodo
Figures 25-34 - Diversity of Nepticulidae, all on same scale. 25 Enteucha acetosae, male, Austria 26 Stigmella mespilicola, male, Switzerland, holotype 27 Roscidotoga callicomae, female paratype, Australia, NSW 28 Menurella libera, male holotype, Australia, NSW 29 Pectinivalva caenodora, male holotype, Australia, NSW 30 Glaucolepis lituanica, male, Austria 31 Bohemannia auriciliella, male, The Netherlands 32 Trifurcula iberica, male paratype, Spain 33 Fomoria weaveri, female, Sweden 34 Parafomoria helianthemella, female, Czech Republic. Scale 1 mm. Watercolours by Roland Johansson; 25, 26, 31 and 33 published earlier by Johansson et al. (1990), 28 and 29 by Hoare et al. (1997). The left wings of 32 and 34 are digitally mirrored images of the right wings. These figures may be reproduced provided that their author Roland Johansson and the present publication are credited.
- Research data . 2016 . Embargo End Date: 23 Jul 2020 . Open Access . English . Authors: Nakao, Hisashi; Tamura, Kohei; Arimatsu, Yui; Nakagawa, Tomomi; Matsumoto, Naoko; Matsugi, Takehiko . Publisher: Dryad
Whether man is predisposed to lethal violence, ranging from homicide to warfare, and how that may have impacted human evolution, are among the most controversial topics of debate on human evolution. Although recent studies on the evolution of warfare have been based on various archaeological and ethnographic data, they have reported mixed results: it is unclear whether or not warfare among prehistoric hunter–gatherers was common enough to be a component of human nature and a selective pressure for the evolution of human behaviour. This paper reports the mortality attributable to violence, and the spatio-temporal pattern of violence thus shown among ancient hunter–gatherers using skeletal evidence in prehistoric Japan (the Jomon period: 13 000 cal BC–800 cal BC). Our results suggest that the mortality due to violence was low and spatio-temporally highly restricted in the Jomon period, which implies that violence including warfare in prehistoric Japan was not common.
ESM for Violence in Japanese prehistory: Definition of and sources of data for injured individuals in the Jomon period, and detailed information on all sites where skeletal remains have been recovered (supplement_corrected_final.docx).
- Research data . 2016 . Open Access . German . Authors: Hagmann, Dominik; Langendorf, Alarich; Steininger, Andreas . Publisher: Zenodo
A 3D model, made in 2015, of the interior of the so-called witch tower at the south-eastern corner of the outer fortifications of Ulmerfeld Castle. The model was made using 3D photogrammetry (image-based modeling) and mast aerial photography.
- Research data . Audiovisual . 2009 . Open Access . English . Authors: Ibekwe, Fidelia . Publisher: Zenodo
The object of this study is to develop methods for automatically annotating the argumentative role of sentences in scientific abstracts. Working from Medline abstracts, we classified sentences into four major argumentative roles: objective, method, result, conclusion. The idea is that if the role of each sentence can be marked up, then this metadata can be used during information retrieval to search for particular types of information, such as the novelty, conclusions, methodologies, or aims of a piece of scientific work.
- Research data . 2018 . Open Access . Authors: Bonnie, Rick . Publisher: Zenodo
This open dataset lists, describes, and provides relevant bibliography to all known archaeological sites in Galilee with evidence for stone vessels. It forms part of the dataset used in the monograph Being Jewish in Galilee, 100–200 CE: An Archaeological Study (Brepols). The dataset is available in both PDF and CSV formats. The PDF file provides a detailed description of and bibliography for the evidence of stone vessels at each archaeological site. The CSV file contains the raw data that can be easily imported into spreadsheets and databases.
- Research data . 2021 . Open Access . Authors: Luis Naranjo-Zeledón . Publisher: Zenodo
These files contain the calculations of phonological proximity and validations with users carried out on a subset of the Costa Rican Sign Language (LESCO, for its acronym in Spanish). The signs corresponding to the alphabet have been used.
- Open Access . Authors: Mengual, Ximo; Bot, Sander; Chkhartishvili, Tinatin; Reimann, André; Thormann, Jana; von der Mark, Laura . Publisher: Zenodo
Data type: molecular data
- Research data . 2020 . Open Access . English . Authors: Giovanni Spitale . Publisher: Zenodo
The COVID-19 pandemic generated (and keeps generating) a huge corpus of news articles, easily retrievable in Factiva with very targeted queries. This dataset, generated with an ad-hoc parser and NLP pipeline, analyzes the frequency of lemmas and named entities in news articles (in German, French, Italian and English) regarding Switzerland and COVID-19. The analysis of large bodies of grey literature via text mining and computational linguistics is an increasingly frequent approach to understanding the large-scale trends of specific topics.

We used Factiva, a news monitoring and search engine developed and owned by Dow Jones, to gather and download all the news articles published between January 2020 and May 2021 on COVID-19 and Switzerland. Due to Factiva's copyright policy, it is not possible to share the original dataset with the exports of the articles' text; however, we can share the results of our work on the corpus. All the information needed to reproduce the results is provided.

Factiva allows a very granular definition of queries and has access to full-text articles published by the major media outlets of the world. The query has been defined as follows (each clause is followed by its explanation):

((coronavirus or Wuhan virus or corvid19 or corvid 19 or covid19 or covid 19 or ncov or novel coronavirus or sars) and (atleast3 coronavirus or atleast3 wuhan or atleast3 corvid* or atleast3 covid* or atleast3 ncov or atleast3 novel or atleast3 corona*)): keywords for covid19; must appear at least 3 times in the text
and ns=(gsars or gout): subject is “novel coronaviruses” or “outbreaks and epidemics” and “general news”
and la=X: language is X (DE, FR, IT, EN)
and rst=tmnb: restrict to TMNB (major news and business publications)
and wc>300: at least 300 words
and date from 20191001 to 20212005: date interval
and re=SWITZ: region is Switzerland

It is important to specify some details that characterize the query. The query is not limited to articles published by Swiss media, but to articles regarding Switzerland. The reason is simple: a Swiss user googling for “Schweiz Coronavirus” or for “Coronavirus Ticino” can easily find and read articles published by foreign media outlets (namely, German or Italian) on that topic. If the objective is capturing and describing the information trends to which people are exposed, this approach makes much more sense than limiting the analysis to articles published by Swiss media.

Factiva’s field “NS” is a descriptor for the content of the article. “gsars” is defined in Factiva’s documentation as “All news on Severe Acute Respiratory Syndrome”, and “gout” as “The widespread occurrence of an infectious disease affecting many people or animals in a given population at the same time”; however, the way these descriptors are assigned to articles is not specified in the documentation.

Finally, the query has been restricted to major news and business publications of at least 300 words. Duplicate check is performed by Factiva. Given the very large number of articles published on COVID-19, this (absolutely arbitrary) restriction allows retrieving a corpus that is both meaningful and manageable.

metadata.xlsx contains information about the articles retrieved (strategy, amount).

This work is part of the PubliCo research project, supported by the Swiss National Science Foundation (SNF). Project no. 31CA30_195905
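Because the record's query is run once per language (la=X with X in DE, FR, IT, EN), it may help to see the clauses assembled programmatically. The following is a minimal sketch in Python that only builds the query strings from the clauses quoted in this record; it does not call Factiva. The names `KEYWORDS`, `LANGUAGES` and `build_query` are hypothetical and not part of the dataset, and the clause texts (including their original spellings and the date interval) are copied as they appear in the description.

```python
# Keyword clause copied verbatim from the record description (spellings unchanged).
KEYWORDS = (
    "((coronavirus or Wuhan virus or corvid19 or corvid 19 or covid19 or covid 19 "
    "or ncov or novel coronavirus or sars) and (atleast3 coronavirus or atleast3 wuhan "
    "or atleast3 corvid* or atleast3 covid* or atleast3 ncov or atleast3 novel "
    "or atleast3 corona*))"
)

# Language codes as listed in the record description.
LANGUAGES = ("DE", "FR", "IT", "EN")


def build_query(language: str) -> str:
    """Return the query text for one language (hypothetical helper)."""
    if language not in LANGUAGES:
        raise ValueError(f"unsupported language code: {language!r}")
    clauses = [
        KEYWORDS,                          # covid19 keywords, at least 3 occurrences
        "ns=(gsars or gout)",              # subject descriptors
        f"la={language}",                  # language
        "rst=tmnb",                        # major news and business publications
        "wc>300",                          # at least 300 words
        "date from 20191001 to 20212005",  # date interval as given in the record
        "re=SWITZ",                        # region is Switzerland
    ]
    return " and ".join(clauses)


# One query string per language.
queries = {lang: build_query(lang) for lang in LANGUAGES}
```

Keeping the clauses in a list makes it easy to see which restriction each part of the final string comes from; the exact syntax Factiva accepts should be checked against its own documentation.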
- Research data . 2015 . Open Access
In 2014-2015, Caddo vessels from the Tuck Carpenter (41CP5) collection were scanned at the Center for Regional Heritage Research. These scans were generated for use in a study of 3D geometric morphometrics and for public outreach. Many thanks to the Caddo Nation of Oklahoma and the Anthropology and Archaeology Laboratory for the requisite permissions and access.
- Research data . 2021 . Open Access . Authors: Barman, Raphaël; Ehrmann, Maud; Clematide, Simon; Ares Oliveira, Sofia . Publisher: Zenodo . Country: Switzerland
This record contains the datasets and models used and produced for the work reported in the paper "Combining Visual and Textual Features for Semantic Segmentation of Historical Newspapers" (link). Please cite this paper if you are using the models/datasets or find it relevant to your research:

@article{barman_combining_2020,
  title = {{Combining Visual and Textual Features for Semantic Segmentation of Historical Newspapers}},
  author = {Raphaël Barman and Maud Ehrmann and Simon Clematide and Sofia Ares Oliveira and Frédéric Kaplan},
  journal = {Journal of Data Mining \& Digital Humanities},
  volume = {HistoInformatics},
  DOI = {10.5281/zenodo.4065271},
  year = {2021},
  url = {https://jdmdh.episciences.org/7097},
}

Please note that this record contains data under different licenses.

1. DATA

Annotations (json files): JSON files contain image annotations, with one file per newspaper containing region annotations (label and coordinates) in VIA format. The following licenses apply:
luxwort.json: those annotations are under a CC0 1.0 license. Please refer to the rights statement specified for each image in the file.
GDL.json, IMP.json and JDG.json: those annotations are under a CC BY-SA 4.0 license.

Image files: The archive images.zip contains the Swiss titles image files (GDL, IMP, JDG) used for the experiments described in the paper. Those images are under copyright (property of the journal Le Temps and of ArcInfo) and can be used for academic research or educational purposes only. Redistribution, publication or commercial use are not permitted. These terms of use are similar to the following rights statement: http://rightsstatements.org/vocab/InC-EDU/1.0/

2. MODELS

Some of the best models are released under a CC BY-SA 4.0 license (they are also available as assets of the current GitHub release).
JDG_flair-FT: this model was trained on JDG using French Flair and FastText embeddings. It is able to predict the four classes presented in the paper (Serial, Weather, Death notice and Stocks).
Luxwort_obituary_flair-bpemb: this model was trained on Luxwort using multilingual Flair and Byte-pair embeddings. It is able to predict the Death notice class.
Luxwort_obituary_flair-FT_indomain: this model was trained on Luxwort using in-domain Flair and FastText embeddings (trained on Luxwort data). It is also able to predict the Death notice class.

Those models can be used to predict probabilities on new images using the same code as in the original dhSegment repository. One needs to adjust three parameters to the predict function: 1) embeddings_path (the path to the embeddings list), 2) embeddings_map_path (the path to the compressed embedding map), and 3) embeddings_dim (the size of the embeddings). Please refer to the paper for further information or contact us.

3. CODE

https://github.com/dhlab-epfl/dhSegment-text

4. ACKNOWLEDGEMENTS

We warmly thank the journal Le Temps (owner of La Gazette de Lausanne and the Journal de Genève) and the group ArcInfo (owner of L'Impartial) for agreeing to share the related datasets for academic purposes. We also thank the National Library of Luxembourg for its support with all steps related to the Luxemburger Wort annotation release. This work was realized in the context of the impresso - Media Monitoring of the Past project and supported by the Swiss National Science Foundation under grant CRSII5_173719.

5. CONTACT

Maud Ehrmann (EPFL-DHLAB)
Simon Clematide (UZH)
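The record describes the annotation files as region annotations (label and coordinates) in VIA format. As a rough illustration of how such a file might be read, here is a minimal Python sketch. It assumes the commonly used VIA JSON layout, in which each image entry carries a `regions` collection whose items hold `shape_attributes` and `region_attributes`; the attribute name `label`, the helper names, and the inline sample are hypothetical and are not taken from the released files, so the real key names should be checked against the data.

```python
from collections import Counter


def iter_regions(via_annotations):
    """Yield (filename, shape_attributes, region_attributes) for every region."""
    # VIA project files nest the per-image entries under "_via_img_metadata";
    # plain annotation exports are the mapping itself.
    images = via_annotations.get("_via_img_metadata", via_annotations)
    for entry in images.values():
        regions = entry.get("regions", [])
        # Older VIA versions store regions as a dict keyed by index, newer ones as a list.
        if isinstance(regions, dict):
            regions = list(regions.values())
        for region in regions:
            yield (
                entry.get("filename"),
                region.get("shape_attributes", {}),
                region.get("region_attributes", {}),
            )


def count_labels(via_annotations, label_key="label"):
    """Count regions per label; `label_key` is an assumed attribute name."""
    return Counter(
        attrs.get(label_key, "<unlabelled>")
        for _, _, attrs in iter_regions(via_annotations)
    )


# Tiny hand-written sample in the assumed layout (not real data from the record).
sample = {
    "page1.png12345": {
        "filename": "page1.png",
        "size": 12345,
        "regions": [
            {
                "shape_attributes": {"name": "rect", "x": 10, "y": 20, "width": 100, "height": 50},
                "region_attributes": {"label": "Weather"},
            },
            {
                "shape_attributes": {
                    "name": "polygon",
                    "all_points_x": [0, 10, 10],
                    "all_points_y": [0, 0, 10],
                },
                "region_attributes": {"label": "Death notice"},
            },
        ],
    },
    "page2.png678": {
        "filename": "page2.png",
        "size": 678,
        "regions": {
            "0": {
                "shape_attributes": {"name": "rect", "x": 5, "y": 5, "width": 40, "height": 30},
                "region_attributes": {"label": "Weather"},
            }
        },
    },
}

label_counts = count_labels(sample)
```

With a real file, the mapping would come from `json.load` on, for example, luxwort.json, and `label_key` would be set to whichever region attribute actually holds the class name.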