Advanced search in Research products
The following results are related to Digital Humanities and Cultural Heritage. Are you interested to view more results? Visit OpenAIRE - Explore.
169 Research products, page 1 of 17

  • Digital Humanities and Cultural Heritage
  • Publications
  • Research data
  • Research software
  • 2013-2022
  • Software
  • English
  • ZENODO

  • Open Access English
    Authors: 
    Sheridan A Stewart; Adam S Miner; Meghan C Halley; Laura K Nelson; Eleni Linos;
    Publisher: Zenodo

    Code for "Formally Comparing Topic Models and Human-Generated Qualitative Coding of Physician Mothers' Experiences of Workplace Discrimination". This repository contains the code used in "Formally Comparing Topic Models and Human-Generated Qualitative Coding of Physician Mothers' Experiences of Workplace Discrimination" by Adam S. Miner, Sheridan A. Stewart, Meghan C. Halley, Laura K. Nelson, and Eleni Linos. Please refer to the original paper in Big Data & Society. In this paper, we evaluate whether topic models identify themes similar to those found by human coders in a prior qualitative analysis of physician mothers' experiences of workplace discrimination. More detail is available at the main page for the repository.

  • Restricted English
    Authors: 
    GEN;
    Publisher: Zenodo

    This repository is created for sharing materials (e.g., sample data, trained models, and demo files) for our work. The demo files in the repository allow users to run our models on their own data or on sample data that we provide. The repository includes the following four components: (1) a code demonstration of review text preprocessing (ReviewPreprocess.zip); (2) the lexicon and a code demonstration of using the lexicon to generate input for the two lexicon-based classification models (LexiconModels.zip); (3) the trained Doc2Vec model and a code demonstration of obtaining Doc2Vec embeddings using this model (Doc2VecEmbeddings.zip); (4) trained base-learner classification models (M2, M3, M4), optimized weights for the ensemble model E2, and the trained ensemble model (E3), together with a code demonstration of classifying reviews using our proposed models (ClassificationModels.zip). The data used for building these models can be requested from the Global Emancipation Network for approved uses established in a data use agreement. This work was funded by the National Science Foundation under award #1936331.

  • Research software . 2022
    Open Access English
    Authors: 
    Mähr, Moritz;
    Publisher: Zenodo

    Full Changelog: https://github.com/maehr/the-corpus-as-a-network/commits/v0.1.0-alpha If you use this dataset, please cite it using the metadata from this file.

  • Research software . 2022
    Open Access English
    Authors: 
    Frosini, Luca;
    Publisher: Zenodo
    Project: EC | ARIADNEplus (823914), EC | ENVRI PLUS (654182), EC | PARTHENOS (654119), EC | D4SCIENCE (212488), EC | RISIS 2 (824091), EC | Blue Cloud (862409), EC | IMARINE (283644), EC | EUBRAZILOPENBIO (288754), EC | EGI-Engage (654142), EC | BlueBRIDGE (675680),...

    gCube Catalogue (gCat) API is a library containing classes shared across gcat* components. gCube is an open-source software toolkit used for building and operating Hybrid Data Infrastructures enabling the dynamic deployment of Virtual Research Environments, such as the D4Science Infrastructure, by favouring the realisation of reuse-oriented policies. gCube has been used to successfully build and operate infrastructures and virtual research environments for application domains ranging from biodiversity to environmental data management and cultural heritage. gCube offers components supporting typical data management workflows including data access, curation, processing, and visualisation on a large set of data typologies ranging from primary biodiversity data to geospatial and tabular data. D4Science is a Hybrid Data Infrastructure combining over 500 software components and integrating data from more than 50 different data providers into a coherent and managed system of hardware, software, and data resources. The D4Science infrastructure drastically reduces the cost of ownership, maintenance, and operation thanks to the exploitation of gCube. The source code of this software version is available at: https://code-repo.d4science.org/gCubeSystem/gcat-api/releases/tag/v2.0.0

  • Research software . 2022
    Open Access English
    Authors: 
    Frosini, Luca;
    Publisher: Zenodo
    Project: EC | ARIADNEplus (823914), EC | EUBRAZILOPENBIO (288754), EC | EGI-Engage (654142), EC | PARTHENOS (654119), EC | ENVRI PLUS (654182), EC | D4SCIENCE-II (239019), EC | RISIS 2 (824091), EC | Blue Cloud (862409), EC | BlueBRIDGE (675680), EC | SoBigData (654024),...

    gCube Catalogue (gCat) Client is a library designed to interact with the REST API exposed by the gCat Service. gCube is an open-source software toolkit used for building and operating Hybrid Data Infrastructures enabling the dynamic deployment of Virtual Research Environments, such as the D4Science Infrastructure, by favouring the realisation of reuse-oriented policies. gCube has been used to successfully build and operate infrastructures and virtual research environments for application domains ranging from biodiversity to environmental data management and cultural heritage. gCube offers components supporting typical data management workflows including data access, curation, processing, and visualisation on a large set of data typologies ranging from primary biodiversity data to geospatial and tabular data. D4Science is a Hybrid Data Infrastructure combining over 500 software components and integrating data from more than 50 different data providers into a coherent and managed system of hardware, software, and data resources. The D4Science infrastructure drastically reduces the cost of ownership, maintenance, and operation thanks to the exploitation of gCube. The source code of this software version is available at: https://code-repo.d4science.org/gCubeSystem/gcat-client/releases/tag/v2.4.0

  • Research software . 2022
    Open Access English
    Authors: 
    Frosini, Luca;
    Publisher: Zenodo
    Project: EC | ARIADNEplus (823914), EC | ENVRI PLUS (654182), EC | D4SCIENCE-II (239019), EC | EOSC-Pillar (857650), EC | SoBigData (654024), EC | D4SCIENCE (212488), EC | PerformFISH (727610), EC | AGINFRA PLUS (731001), EC | BlueBRIDGE (675680), EC | EUBRAZILOPENBIO (288754),...

    gCube Catalogue (gCat) Service allows any client to publish on the gCube Catalogue. gCube is an open-source software toolkit used for building and operating Hybrid Data Infrastructures enabling the dynamic deployment of Virtual Research Environments, such as the D4Science Infrastructure, by favouring the realisation of reuse-oriented policies. gCube has been used to successfully build and operate infrastructures and virtual research environments for application domains ranging from biodiversity to environmental data management and cultural heritage. gCube offers components supporting typical data management workflows including data access, curation, processing, and visualisation on a large set of data typologies ranging from primary biodiversity data to geospatial and tabular data. D4Science is a Hybrid Data Infrastructure combining over 500 software components and integrating data from more than 50 different data providers into a coherent and managed system of hardware, software, and data resources. The D4Science infrastructure drastically reduces the cost of ownership, maintenance, and operation thanks to the exploitation of gCube. The source code of this software version is available at: https://code-repo.d4science.org/gCubeSystem/gcat/releases/tag/v2.3.0

  • Open Access English
    Authors: 
    Sánchez, Javier; Salgado, Agustín; García, Alejandro;
    Publisher: Zenodo

    Code for the IDSEM dataset, first version. IDSEM is an acronym for "an Invoices Database of the Spanish Electricity Market". This database contains electricity bills related to energy consumption in Spanish households. The contents of bills are automatically generated using this code. The main purpose of the dataset is for training machine learning algorithms, especially for designing new methods for extracting information from invoices. There are 86 different labels, which are related to several topics, such as the customer and marketer, the contract, energy consumption, or billing. The code relies on a set of dictionaries and template documents for generating many training and test samples. The file format of invoices is PDF and the labels are stored in JSON files. More information can be found at https://idsem.ulpgc.es/ and in the following article: [1] Javier Sánchez, Agustín Salgado, Alejandro García, and Nelson Monzón, "IDSEM, an invoices database of the Spanish electricity market", Sci. Data, (2022). Full Changelog: https://github.com/jsanchezperez/idsem/commits/v1.0.0

  • Open Access English
    Authors: 
    Rose, Thomas; Girotto, Chiara G. M.;
    Publisher: Zenodo

    R package for an easy way to draw chronological charts from tables, aiming to include an intuitive environment for anyone new to R. Includes 'ggplot2' geoms and theme for chronological charts.
    Reference: Thomas Rose and Chiara Girotto (2022). chronochrt: Creating Chronological Charts with R. R package version 0.1.2. https://CRAN.R-project.org/package=chronochrt
    Changes to version 0.1.1:
    * Update to maintain compatibility with newer versions of packages ChronochRt depends on.
    * import_chron() now has default arguments for the column names.

  • Open Access English
    Authors: 
    Berge, PS;
    Publisher: Zenodo

    A hosted version of this notebook is available on Google Colaboratory. It requires no coding to use. If downloaded, this notebook must be run in Google Colab! Link: darcmode.org/scraper
    The Disboard Scraper and Analysis Notebook is a research toolkit for Internet scholars interested in examining networks of Discord servers. It includes tools for collecting and analyzing data from Disboard.org. This is a Google Colaboratory version of the Modified Disboard Scraper, a fork of DisboardScraper by DiscordFederation. It is written in Python and Markdown. The Disboard Scraper and Analysis Notebook is maintained by the Discord Academic Research Community, a collective of Internet scholars interested in Discord research.
    Reference: Heslep, D. G., & Berge, P. (2021). Mapping Discord's darkside: Distributed hate networks on Disboard. New Media & Society, 14614448211062548. https://doi.org/10.1177/14614448211062548

  • Open Access English

    Datasets for teaching quantitative approaches and modeling in archaeology and paleontology. This package provides several types of data related to broad topics (cultural evolution, radiocarbon dating, paleoenvironments, etc.), which can be used to illustrate statistical methods in the classroom (multivariate data analysis, compositional data analysis, diversity measurement, etc.). To cite package "folio" in publications use:
