The following results are related to Digital Humanities and Cultural Heritage.
218 Research products, page 1 of 22

  • Digital Humanities and Cultural Heritage
  • Research data
  • Other research products
  • DE

  • Open Access English
    Authors: 
    Paladugu, Roshan; Richter, Kristine Korzow; Valente, Maria João; Gabriel, Sonia; Detry, Cleia; Warinner, Christina; Barrocas Dias, Cristina;
    Publisher: Zenodo
    Project: FCT | PTDC/HAR-ARQ/4909/2020 (PTDC/HAR-ARQ/4909/2020), EC | DAIRYCULTURES (804884), EC | ED-ARCHMAT (766311)

    MALDI-TOF-MS spectra of collagen extracted from modern reference and archaeological bone samples, used to develop markers for Zooarchaeology by Mass Spectrometry (ZooMS) that distinguish between Equus species. Each sample was digested separately with trypsin and with chymotrypsin. Information about the species of the samples can be found in the 'sample metadata.csv' file. Information on the extraction and digestion protocol can be found in the associated manuscript. The sequence data contain alignments of the proteins COL1A1 and COL1A2 for the available Equus collagen protein sequences. More information on these files can be found in the manuscript corresponding to this dataset.

  • Open Access
    Authors: 
    Kliegl, Reinhold; Czmiel, Alexander; Marciniak, Katja;
    Country: Germany

    A presentation of the goals achieved and targeted by the "Forschungsdatenmanagement" (research data management) initiative of the Berlin-Brandenburgische Akademie der Wissenschaften (Berlin-Brandenburg Academy of Sciences and Humanities), by Reinhold Kliegl, Katja Marciniak and Alexander Czmiel (as of 24 February 2022).

  • Other research product . Lecture . 2022
    Open Access German
    Authors: 
    Marciniak, Katja;
    Country: Germany

    As part of the Akademievorträge an brandenburgischen Schulen (Academy lectures at Brandenburg schools) 2021/22, the "Forschungsdatenmanagement" (research data management) initiative gave pupils an insight into the topics of data organisation and data backup. The amount of digital data on Earth grows every day, which makes managing one's own data all the more important for keeping track of it, in private life as well as in one's studies or career. In the world of research, the "Leitlinien zur Sicherung guter wissenschaftlicher Praxis" (Guidelines for Safeguarding Good Research Practice) even prescribe the careful handling of so-called research data. In the humanities and cultural studies, the term covers all sources, materials and results that are collected, generated, described and/or analysed in connection with a research question. What is the best way to handle such data, and which tips and tricks can be borrowed for organising one's own private data? The lecture raises awareness of the relevance of data management and offers a data-centred insight into (humanities) research processes.

  • English
    Authors: 
    Balabin, Helena; Hoyt, Charles Tapley; Gyori, Benjamin; Bachman, John; Tom Kodamullil, Alpha; Hofmann-Apitius, Martin; Domingo Fernández, Daniel;
    Country: Germany

    While most approaches individually exploit unstructured data from the biomedical literature or structured data from biomedical knowledge graphs, their union can better exploit the advantages of such approaches, ultimately improving representations of biology. Using multimodal transformers for such purposes can improve performance on context-dependent classification tasks, as demonstrated by our previous model, the Sophisticated Transformer Trained on Biomedical Text and Knowledge Graphs (STonKGs). In this work, we introduce ProtSTonKGs, a transformer aimed at learning all-encompassing representations of protein-protein interactions. ProtSTonKGs extends our previous work by adding textual protein descriptions and amino acid sequences (i.e., structural information) to the text- and knowledge graph-based input sequence used in STonKGs. We benchmark ProtSTonKGs against STonKGs and observe F1 scores improved by up to 0.066 (i.e., from 0.204 to 0.270) in several tasks, such as predicting protein interactions in several contexts. Our work demonstrates how multimodal transformers can be used to integrate heterogeneous sources of information, laying the foundation for future approaches that use multiple modalities for biomedical applications.

  • Open Access English
    Authors: 
    Gómez-Letona, Markel; Baumann, Moritz; González, Acorayda; Pérez Barrancos, Clàudia; Sebastian, Marta; Baños Cerón, Isabel; Montero, María Fernanda; Riebesell, Ulf; Arístegui, Javier;
    Publisher: PANGAEA - Data Publisher for Earth & Environmental Science
    Project: EC | TRIATLAS (817578), EC | Ocean artUp (695094)

    This dataset contains the dissolved organic matter (DOM) quantification and optical characterisation results from a KOSMOS mesocosm experiment carried out in the framework of the Ocean Artificial Upwelling project. The experiment was carried out in the autumn of 2018 in the oligotrophic waters of Gran Canaria. During the 39 days of the experiment, nutrient-rich deep water was added to the mesocosms in two modes (singular vs recurring additions), with four levels of intensity. Dissolved organic carbon, nitrogen and phosphorus were quantified with a Shimadzu TOC-5000 and a QuAAtro AutoAnalyzer. The absorption and fluorescence properties of DOM were determined using an Ocean Optics USB2000+UV-VIS-ES Spectrometer and a Jobin Yvon Horiba Fluoromax-4 spectrofluorometer, respectively. The aim of this dataset was to study the effect of artificial upwelling on the dissolved organic matter pool and its potential implications for carbon sequestration.

  • English
    Authors: 
    Huber, Marco; Terhörst, Philipp; Luu, Anh Thi; Damer, Naser; Kirchbuchner, Florian;
    Country: Germany

    Verifying the identity of a person (sitter) portrayed in a historical painting is often a challenging but critical task in art-historical research. In many cases, this information has been lost due to time or other circumstances, and today there are only speculations by art historians about which person it could be. Art historians often use subjective factors for this purpose and then infer, from the identity, information about the person depicted in terms of his or her life, status, and era. On the other hand, automated face recognition has achieved a high level of accuracy, especially on photographs, and considers objective factors to determine the identity or verify a suspected identity. The limited amount of data, as well as the domain-specific challenges, make the use of automated face recognition methods in the domain of historic paintings difficult. We propose a specialized, likelihood-based fusion method to enable deep learning-based face recognition on historic portrait paintings. We additionally propose a method to accurately determine the confidence of the decision made, to assist art historians in their research. For this purpose, we used a model trained on common photographs and adapted it to the domain of historical paintings through transfer learning. By using an underlying challenge dataset, we compute the likelihood for the assumed identity against reference images of the identity and fuse them to utilize as much information as possible. From the results of this likelihood fusion, we then derive a decision confidence that states the certainty of the model's decision. The experiments were carried out in a leave-one-out evaluation scenario on our created database, the largest authentic database of historic portrait paintings to date, consisting of over 760 portrait paintings of 210 different sitters by over 250 different artists. The experiments demonstrated that a) the proposed approach outperforms pure face recognition solutions, b) the fusion approach effectively combines the sitter information towards a higher verification accuracy, and c) the proposed confidence estimation approach is highly successful in capturing the estimated accuracy of the decision. The meta-information of the historic face images used can be found at https://github.com/marcohuber/HistoricalFaces.
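The abstract above does not spell out the fusion rule or the confidence estimate, so the following is only a minimal sketch of one common way to fuse comparison scores: summing log-likelihood ratios under an independence assumption. The Gaussian score models, their parameters, the prior, and all function names are hypothetical and are not taken from the paper.

```python
import math

def log_gaussian(x, mean, std):
    """Log density of a normal distribution, computed in log space."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2 * math.pi)

def log_likelihood_ratio(score, genuine=(0.6, 0.15), imposter=(0.2, 0.15)):
    """Log of P(score | same sitter) / P(score | different sitter).

    The (mean, std) pairs are made-up placeholders; in practice they would
    be estimated from comparison scores on a labelled dataset.
    """
    return log_gaussian(score, *genuine) - log_gaussian(score, *imposter)

def fuse(scores, prior=0.5):
    """Fuse the scores of one portrait against several reference images.

    Assumes the scores are conditionally independent, so their
    log-likelihood ratios add up. Returns the fused log-likelihood ratio
    and a posterior probability used here as a decision confidence.
    """
    total = sum(log_likelihood_ratio(s) for s in scores)
    log_odds = math.log(prior / (1 - prior)) + total
    confidence = 1 / (1 + math.exp(-log_odds))
    return total, confidence

if __name__ == "__main__":
    # Three hypothetical similarity scores against reference portraits.
    llr, conf = fuse([0.70, 0.65, 0.50])
    print(f"fused log-LR = {llr:.2f}, confidence = {conf:.3f}")
```

Summing log-likelihood ratios lets several weak comparisons reinforce one another, which is one plausible reading of "utilize as much information as possible"; the independence assumption is a simplification that correlated reference images would violate.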

  • Other research product . Other ORP type . 2022
    English
    Authors: 
    Müller, Almuth; Kuwertz, Achim;
    Country: Germany

    This paper presents a concept for a two-tier, semi-automated approach to business data entity resolution. Resolving entity names is generally relevant, e.g., in business intelligence. In practice, several difficulties have to be considered, such as name deviations for an organization. Here, two types of deviations can be distinguished. First, names can differ due to typos, native special characters or transformation errors. Second, an organization name can change due to outdated designations or being given in another language. A further aspect is data sovereignty. Analyzed data sources can be under direct control, e.g., in one's own data storage systems, and thus be kept clean. Yet other sources of relevant data may only be publicly available. Copying such data is generally not recommended because of, e.g., its volume and data duplication issues. The proposed two-tier approach to entity resolution thus considers not only different kinds of name deviations but also data sovereignty issues. Although still a work in progress, it has the potential to reduce the effort required compared to manual approaches and can possibly be applied in different areas where there is a significant need for harmonized data and externally curated systems are not feasible.
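The two kinds of name deviation described above can be illustrated with a small sketch. It is not the paper's method: the normalization steps, the alias table, the similarity measure (Python's `difflib.SequenceMatcher`) and both thresholds are assumptions chosen for illustration, and the "review" band is only one possible way to make the process semi-automated.

```python
import unicodedata
from difflib import SequenceMatcher

# Hypothetical alias table for the second kind of deviation (outdated or
# translated organization names). The entry is invented for illustration.
ALIASES = {
    "acme deutschland": "acme germany",
}

def normalize(name):
    """Lower-case, strip diacritics and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in no_marks.lower())
    return " ".join(cleaned.split())

def canonical(name):
    """Normalize a name, then map known aliases to one canonical form."""
    key = normalize(name)
    return ALIASES.get(key, key)

def similarity(a, b):
    """String similarity in [0, 1] on canonical forms (tolerates typos)."""
    return SequenceMatcher(None, canonical(a), canonical(b)).ratio()

def resolve(name, known, auto=0.92, review=0.75):
    """Return (decision, best_candidate, score) for a name.

    Scores at or above `auto` are accepted automatically, scores between
    `review` and `auto` are queued for a human, and anything lower is
    treated as a new entity. Both thresholds are assumed values.
    """
    best = max(known, key=lambda k: similarity(name, k))
    score = similarity(name, best)
    if score >= auto:
        return ("match", best, score)
    if score >= review:
        return ("review", best, score)
    return ("new", None, score)

if __name__ == "__main__":
    known = ["ACME Germany", "Globex Corporation"]
    for query in ["Globex Corporaton", "Acme Deutschland", "Zzz Qqq"]:
        print(query, "->", resolve(query, known))
```

Separating normalization and alias lookup from fuzzy matching keeps the typo-style deviations and the renaming-style deviations independently adjustable.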

  • English
    Authors: 
    Dorrn, Tobias; Dambier, Natalie; Müller, Almuth; Kuwertz, Achim;
    Country: Germany

    Modern data and information systems usually contain considerable amounts of data and documents and thus provide a large amount of information. The automatic extraction of domain-specific information is all the more important in order to improve work with such systems. If information is available only as free text, machine processing can prove to be a difficult technical hurdle. State-of-the-art approaches use modern Natural Language Processing (NLP) methods to solve such tasks. In this paper, we introduce a data-driven approach that applies an XML data model to an application-specific scenario, using different NLP methods which are combined into a multidimensional pipeline. It is important to understand how certain NLP methods can be used and what their limitations are. Individual modern NLP methods are often not sufficient and resilient enough to solve complex information extraction tasks. Therefore, it has to be examined how such problems can be alleviated or circumvented by a combination of different NLP methods. In contrast to categorical grammar models, all cases considered here should be available as free text. The approach presented in this paper is still a work in progress, yet first evaluation results will be given.

  • Open Access
    Authors: 
    Polla, Silvia; de Vos Raaijmakers, Mariette;
    Publisher: Freie Universität Berlin
    Country: Germany

    Radiocarbon dating of two bone samples from the archaeological excavation of the farm at Ain Wassel (High Tunisian Tell). The analysed material comes from the archaeological excavation at Ain Wassel carried out by the University of Trento (Italy) and the Institut National du Patrimoine of Tunisia, under the joint direction of Mustapha Khanoussi and Mariette de Vos Raaijmakers (1994–1996). See: http://rusafricum.org/en/thuggasurvey/home/ The samples, a pig bone (KIA-55574) and a chicken bone (KIA-55575), come from a fill layer that included very mixed, heterogeneous material dating to the Byzantine period, and adhered to the wall of a ceramic vessel (closed form in common ware). The analyses were carried out by the Leibniz Laboratory for Radiometric Dating and Stable Isotope Research (AMS 14C Lab). For methodology and references, see: Reimer, P., Austin, W., Bard, E., Bayliss, A., Blackwell, P., Bronk Ramsey, C., . . . Talamo, S. (2020). The IntCal20 Northern Hemisphere Radiocarbon Age Calibration Curve (0–55 cal kBP). Radiocarbon, 62(4), 725-757. doi:10.1017/RDC.2020.41; Ramsey, C., & Lee, S. (2013). Recent and Planned Developments of the Program OxCal. Radiocarbon, 55(2), 720-730. doi:10.1017/S0033822200057878; Longin, R. (1971). New method of collagen extraction for radiocarbon dating. Nature, 230(5291), 241-242. doi:10.1038/230241a0; Stuiver, M., & Polach, H. (1977). Discussion Reporting of 14C Data. Radiocarbon, 19(3), 355-363. doi:10.1017/S0033822200003672

  • English
    Authors: 
    Müller, Almuth; Kuwertz, Achim;
    Country: Germany

    As part of the unprecedented wealth of data available nowadays, semi-formal reports in the domain of remote sensing can convey information important for decision making in structured and unstructured text parts. For such reports, often kept in large data management systems, targeted information retrieval remains difficult, e.g., the extraction of text parts relevant to a question posed in natural language. The work presented in this paper therefore aims at finding the relevant documents in data management systems and extracting their relevant content parts based on natural language questions. For this purpose, an approach for semantic information retrieval based on Abstract Meaning Representation (AMR) is adapted, extended and evaluated for the considered domain of remote sensing and image exploitation. In detail, two different metrics used in AMR, Smatch and SemBleu, are compared for their suitability in an AMR-based search. The first results presented in this paper are promising. In addition, more detailed experiments regarding the performance of the metrics under differently formulated yet semantically equivalent questions reveal interesting insights into their ability for semantic comparison.
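To make the idea of comparing two AMR graphs concrete, here is a minimal sketch of a Smatch-style score: the F-score over matching triples under the best one-to-one variable mapping. It is an illustration, not the evaluation code used in the paper. The triple encoding, the function names and the example graphs are assumptions, and the exhaustive search over mappings only works for very small graphs (the published Smatch tool uses a heuristic search instead).

```python
from itertools import permutations

def variables(triples):
    """Variables are taken to be the sources of 'instance' triples."""
    return sorted({src for rel, src, _ in triples if rel == "instance"})

def rename(triples, mapping):
    """Rewrite candidate variables; concept names in 'instance' triples stay."""
    renamed = set()
    for rel, src, tgt in triples:
        new_tgt = tgt if rel == "instance" else mapping.get(tgt, tgt)
        renamed.add((rel, mapping.get(src, src), new_tgt))
    return renamed

def smatch_like(candidate, reference):
    """Return (precision, recall, f_score) under the best variable mapping."""
    if not candidate or not reference:
        return 0.0, 0.0, 0.0
    cand_vars = variables(candidate)
    ref_vars = variables(reference)
    # Pad with None so surplus candidate variables can stay unmapped.
    pool = ref_vars + [None] * max(0, len(cand_vars) - len(ref_vars))
    ref_set = set(reference)
    best = 0
    for perm in permutations(pool, len(cand_vars)):
        # Unmapped variables get a sentinel that cannot equal a reference name.
        mapping = {c: (r if r is not None else ("unmapped", c))
                   for c, r in zip(cand_vars, perm)}
        best = max(best, len(rename(candidate, mapping) & ref_set))
    precision = best / len(set(candidate))
    recall = best / len(ref_set)
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score

if __name__ == "__main__":
    # Hypothetical graphs: a short question versus a longer report sentence.
    question = [("instance", "a", "want-01"), ("instance", "b", "boy"),
                ("ARG0", "a", "b")]
    sentence = [("instance", "x", "want-01"), ("instance", "y", "boy"),
                ("instance", "z", "go-01"), ("ARG0", "x", "y"),
                ("ARG1", "x", "z"), ("ARG0", "z", "y")]
    print(smatch_like(question, sentence))
```

In this example every triple of the question is found in the sentence (precision 1.0), but only half of the sentence's triples are covered (recall 0.5), which hints at why a retrieval setting may weight precision and recall differently from a parser evaluation.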
