Advanced search in Research products
The following results are related to Digital Humanities and Cultural Heritage.
8 Research products, page 1 of 1

  • Digital Humanities and Cultural Heritage
  • Research data
  • Research software
  • Other research products
  • 2013-2022
  • AR
  • English

  • Open Access English
    Authors: 
    Grill, Pablo; Claassen, Mathias; Rosá, Aiala; Correa, Hernán;
    Country: Argentina

    This paper presents a series of semi-supervised learning algorithms which were designed to classify words or expressions with temporal meanings. The algorithms use a set of pre-tagged temporal expressions and a set of semantic classes which were defined within a research project on the lexical coding of temporal meaning in Spanish. The algorithms in this article are mostly based on word embeddings, but they also make use of other methods. The results obtained strongly depend on the temporal classes considered, but, for some classes, results have reached 90% precision or above. Sociedad Argentina de Informática e Investigación Operativa

  • Open Access English
    Authors: 
    Rio Riande, María Gimena del; González Blanco García, Elena; Martínez Cantón, Clara; Curado Malta, Mariana;
    Country: Argentina

    This paper presents work in progress from the POSTDATA project, which aims to solve the interoperability issues that exist among digital poetry repertoires. These repertoires hold poetry metrics data that is locked in their own databases; it is not freely available for comparison or for use by intelligent machines that could infer over the data. The POSTDATA project will use Linked Open Data (LOD) technologies to overcome the interoperability problems. POSTDATA is developing a metadata application profile (MAP) for the digital poetry repertoires, a construct that enhances interoperability. This development follows the method for the development of MAP (Me4MAP). A MAP for the digital poetry repertoires will allow these repertoires to structure their data with a common model in order to publish it as Linked Open Data. This paper presents how this MAP has been developed so far. Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)

  • Open Access English
    Authors: 
    Garciarena Ucelay, María José; Villegas, María Paula; Cagnina, Leticia; Errecalde, Marcelo Luis;
    Country: Argentina

    Author profiling is the task of predicting characteristics of the author of a text, such as age, gender, personality, native language, etc. The task is of growing importance because of its potential applications in security, crime detection and marketing, among others. An interesting question is how robust a classifier is when it is trained on one dataset and tested on others with different characteristics; this is commonly called cross-domain experimentation. Although several cross-domain studies have been carried out on English datasets, such work has only recently begun for Spanish. In this context, this work presents a study of cross-domain classification for the author profiling task in Spanish. The experimental results showed that, using corpora with different levels of formality, robust classifiers can be obtained for the author profiling task in Spanish. Red de Universidades con Carreras en Informática (RedUNCI) XII Workshop Bases de Datos y Minería de Datos (WBDDM)

  • Other research product . 2016
    Open Access English
    Authors: 
    Argerich, Luis; Cano, Matías J.; Torre Zaffaroni, Joaquín;
    Country: Argentina

    In this paper we propose the application of feature hashing to create word embeddings for natural language processing. Feature hashing has been used successfully to create document vectors in related tasks like document classification. In this work we show that feature hashing can be applied to obtain word embeddings in time linear in the size of the data. The results show that this algorithm, which does not need training, is able to capture the semantic meaning of words. We compare the results against GloVe, showing that they are similar. As far as we know, this is the first application of feature hashing to the word embeddings problem, and the results indicate that this is a scalable technique with practical results for NLP applications. Sociedad Argentina de Informática e Investigación Operativa (SADIO)

  • Open Access English
    Authors: 
    Mechaca C., Ana L.; Marmanillo, Walter G.; Xamena, Eduardo; Ramirez-Orta, Juan; Maguitman, Ana Gabriela; Milios, Evangelos E.;
    Country: Argentina

    Digital Humanities researchers often make use of software that helps them find non-trivial relationships among characters in historical text. Usually, the source texts that contain such information come from OCR-acquired volumes, which carry a high number of errors. This work explains the development of a web platform for OCR post-processing and ground-truth generation. The platform employs machine learning to predict the correct texts accurately from noisy OCR strings. The method used for this task involves transformers for character-based denoising language models. An active learning workflow is proposed: users can feed their corrections to the platform, generating new annotated data for re-training the underlying machine learning correction models. Sociedad Argentina de Informática e Investigación Operativa

  • Open Access English
    Authors: 
    Cardellino, Cristian; Alonso i Alemany, Laura;
    Country: Argentina

    We present SuFLexQA, a system for Question Answering that integrates deep linguistic information from verbal lexica into Quepy, a generic framework for translating natural language questions into a query language. We are participating in the QALD-3 contest to assess the main achievements and shortcomings of the system. Sociedad Argentina de Informática e Investigación Operativa

  • Open Access English
    Authors: 
    Xamena, Eduardo; Marmanillo, Walter Gabriel; Mechaca, Ana Lidia;
    Country: Argentina

    Large amounts of ancient documents regarding Argentinian history have become available in recent years. This makes it possible to find interesting and useful aggregated information. This work proposes the application of Natural Language Processing, Text Mining and Visualization tools to Argentinian ancient document repositories. Conceptual maps and entity networks make up the first target of this preliminary paper. The first step is the normalization of OCR-acquired books of General Güemes. Exploratory analyses reveal the presence of numerous spelling errors, due to the OCR acquisition process of the volumes. We propose smart automatic ways of overcoming this issue in the process of normalization. In addition, a first topic landscape of a subset of volumes is obtained and analysed via Topic Modelling tools. Sociedad Argentina de Informática e Investigación Operativa

  • Open Access English
    Authors: 
    Carrillo, Facundo; Cecchi, Guillermo; Sigman, Mariano; Fernández Slezak, Diego;
    Country: Argentina

    Latent Semantic Analysis (LSA) is a natural language processing tool that allows estimating the semantic distance between terms. The success of LSA depends mainly on the choice of training corpus, which has been studied principally in English. This study focuses on LSA with a regional Spanish corpus and evaluates its performance by identifying synonyms. We found that performance was slightly better than chance, consistent with previous results. The standard LSA method cannot dynamically increase the training corpus. By using classifiers we combined multiple LSA models and showed that the use of automatic classifiers increases the performance. Sociedad Argentina de Informática e Investigación Operativa
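
The feature-hashing entry in the listing above (Argerich, Cano and Torre Zaffaroni) describes word embeddings obtained without training, in time linear in the size of the data. The sketch below illustrates one way such an idea could look. It is not the authors' algorithm: the context-window scheme, the signed-hash trick, and the names `bucket_and_sign`, `hashed_embeddings` and `cosine` are all assumptions made for illustration. MD5 is used only to get a hash that is stable across runs.

```python
import hashlib
import math
from collections import defaultdict


def bucket_and_sign(token, dim):
    """Map a token to a (bucket index, +1/-1 sign) pair using a stable hash."""
    digest = hashlib.md5(token.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % dim
    sign = 1.0 if digest[8] % 2 == 0 else -1.0
    return index, sign


def hashed_embeddings(sentences, dim=64, window=2):
    """Single pass over the corpus (assumed scheme, not the paper's):
    each word's vector accumulates the hashed, signed buckets of the
    words that appear within `window` positions of it."""
    vectors = defaultdict(lambda: [0.0] * dim)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j == i:
                    continue  # a word is not its own context
                index, sign = bucket_and_sign(tokens[j], dim)
                vectors[word][index] += sign
    return dict(vectors)


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy corpus: "cat" and "dog" share the context words "the" and "drinks".
corpus = [
    ["the", "cat", "drinks", "milk"],
    ["the", "dog", "drinks", "water"],
]
embeddings = hashed_embeddings(corpus, dim=16, window=2)
similarity = cosine(embeddings["cat"], embeddings["dog"])
```

Each token is visited once and touches at most `2 * window` buckets, so the cost grows linearly with corpus length under these assumptions. Words that occur in identical contexts receive identical vectors. Hash collisions add noise that a larger `dim` reduces.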