44 Research products, page 1 of 5

  • Digital Humanities and Cultural Heritage
  • Publications
  • Other research products
  • Preprint
  • Netherlands Organisation for Scientific Research (NWO)
  • NARCIS

  • Publication . Article . Conference object . Preprint . 2021
    Open Access English
    Authors: 
    Danny Merkx; Stefan L. Frank; Mirjam Ernestus;
    Country: Netherlands
    Project: NWO | Language in Interaction (2300176475)

    This study addresses the question of whether visually grounded speech recognition (VGS) models learn to capture sentence semantics without access to any prior linguistic knowledge. We produce synthetic and natural spoken versions of a well-known semantic textual similarity database and show that our VGS model produces embeddings that correlate well with human semantic similarity judgements. Our results show that a model trained on a small image-caption database outperforms two models trained on much larger databases, indicating that database size is not all that matters. We also investigate the importance of having multiple captions per image and find that this is indeed helpful even if the total number of images is lower, suggesting that paraphrasing is a valuable learning signal. While the general trend in the field is to create ever larger datasets to train models on, our findings indicate other characteristics of the database can be just as important. This paper has been accepted at Interspeech 2021 where it will be presented and appear in the conference proceedings in September 2021.

  • Publication . Article . Preprint . 2020
    Open Access English
    Authors: 
    Lisa Beinborn; Rochelle Choenni;
    Publisher: The MIT Press
    Country: Netherlands
    Project: NWO | Language in Interaction (2300176475)

    Multilingual representations have mostly been evaluated based on their performance on specific tasks. In this article, we look beyond engineering goals and analyze the relations between languages in computational representations. We introduce a methodology for comparing languages based on their organization of semantic concepts. We propose to conduct an adapted version of representational similarity analysis of a selected set of concepts in computational multilingual representations. Using this analysis method, we can reconstruct a phylogenetic tree that closely resembles those assumed by linguistic experts. These results indicate that multilingual distributional representations which are only trained on monolingual text and bilingual dictionaries preserve relations between languages without the need for any etymological information. In addition, we propose a measure to identify semantic drift between language families. We perform experiments on word-based and sentence-based multilingual models and provide both quantitative results and qualitative examples. Analyses of semantic drift in multilingual representations can serve two purposes: they can indicate unwanted characteristics of the computational models and they provide a quantitative means to study linguistic phenomena across languages. The code is available at https://github.com/beinborn/SemanticDrift. Comment: Almost final version. Paper will appear in the Computational Linguistics Journal, Volume 46, Issue 3

  • Open Access English
    Authors: 
    Van Der Boon, Annique; Kuiper, Klaudia F.; Van Der Ploeg, Robin; Cramwinckel, Margot J.; Honarmand, Maryam; Sluijs, Appy; Krijgsman, Wout; Paleomagnetism; Marine palynology and palaeoceanography; +1 more
    Country: Netherlands
    Project: EC | SPANC (771497), NWO | The evolution of Parateth... (2300163618)

    The Middle Eocene Climatic Optimum (MECO), a ∼500 kyr episode of global warming that initiated at ∼ 40.5 Ma, is postulated to be driven by a net increase in volcanic carbon input, but a direct source has not been identified. Here we show, based on new and previously published radiometric ages of volcanic rocks, that the interval spanning the MECO corresponds to a massive increase in continental arc volcanism in Iran and Azerbaijan. Ages of Eocene igneous rocks in all volcanic provinces of Iran cluster around 40 Ma, very close to the peak warming phase of the MECO. Based on the spatial extent and volume of the volcanic rocks as well as the carbonaceous lithology in which they are emplaced, we estimate the total amount of CO2 that could have been released at this time corresponds to between 1052 and 12 565 Pg carbon. This is compatible with the estimated carbon release during the MECO. Although the uncertainty in both individual ages, and the spread in the compilation of ages, is larger than the duration of the MECO, a flare-up in Neotethys subduction zone volcanism represents a plausible excess carbon source responsible for MECO warming.

  • Publication . Article . Preprint . Other literature type . 2020
    Open Access English
    Authors: 
    G.-J. A. Brummer; B. Metcalfe; W. Feldmeijer; M. A. Prins; J. van 't Hoff; G. M. Ganssen;
    Countries: France, Netherlands
    Project: ANR | L-IPSL (ANR-10-LABX-0018), NWO | SCAN-2: Scanning Sediment... (2300165588)

    Changeover from a glacial to an interglacial climate is considered as transitional between two stable modes. Palaeoceanographic reconstructions using the polar foraminifera Neogloboquadrina pachyderma highlight the retreat of the Polar Front during the last deglaciation in terms of both its decreasing abundance and stable oxygen isotope values (δ18O) in sediment cores. While conventional isotope analysis of pooled N. pachyderma and G. bulloides shells shows a warming trend concurrent with the retreating ice, new single-shell measurements reveal that this trend is composed of two isotopically different populations that are morphologically indistinguishable. Using modern time series as analogues for interpreting downcore data, glacial productivity in the mid-North Atlantic appears limited to a single maximum in late summer, followed by the melting of drifting icebergs and winter sea ice. Despite collapsing ice sheets and global warming during the deglaciation, a second “warm” population of N. pachyderma appears in a bimodal seasonal succession, separated by the subpolar G. bulloides. This represents a shift in the timing of the main plankton bloom from late to early summer in a “deglacial” intermediate mode that persisted from the glacial maximum until the start of the Holocene. When seawater temperatures exceeded the threshold values, first the “cold” (glacial) then the “warm” (deglacial) populations of N. pachyderma disappeared, whilst G. bulloides with a greater tolerance to higher temperatures persisted throughout the Holocene to the present day in the midlatitude North Atlantic. Single-specimen δ18O of polar N. pachyderma reveals a steeper rate of ocean warming during the last deglaciation than appears from conventional pooled δ18O average values.

  • Open Access English
    Authors: 
    N. Geerits; Steven R. Parnell; M.A. Thijs; Wim G. Bouwman; Jeroen Plomp;
    Country: Netherlands
    Project: NWO | LAMROR A Multipurpose Pol... (2300173430)

    A time-of-flight MIEZE spectrometer study is presented. The instrument uses solenoid radio frequency (RF) spin flippers with square pole shoes and a magnetic yoke. These flippers can achieve higher static fields than conventional resonant RF spin flippers, which employ an air core. High fields are crucial for the construction of a high-resolution, compact MIEZE spectrometer. Using both types of flippers, two MIEZE spectrometer configurations are constructed and compared on the same beam line. It was demonstrated that the pole shoe/solenoid coil RF flippers can achieve a MIEZE signal that is similar in quality to the conventional reference setup. The highest obtained modulation frequency was 100 kHz. Comment: PNCMI 2018 conference proceedings

  • Open Access English
    Authors: 
    Mourits, R. J.; van den Berg, N.; Rodríguez-Girondo, M.; Mandemakers, K.; Slagboom, P. E.; Beekman, M.; Janssens, A.A.P.O.; LS Economische Geschiedenis; OGKG - Sociaal-economische geschiedenis;
    Country: Netherlands
    Project: NWO | Galactofuranose biosynthe... (2300158215)

    Studies have shown that long-lived individuals seem to pass their survival advantage on to their offspring. Offspring of long-lived parents had a lifelong survival advantage over individuals without long-lived parents, making them more likely to become long-lived themselves. We test whether the survival advantage enjoyed by offspring of long-lived individuals is explained by environmental factors. 101,577 individuals from 16,905 families in the 1812-1886 Zeeland cohort were followed over time. To prevent certain families from being overrepresented in our data, disjoint family trees were selected. Offspring were included if the age at death of both parents was known. Our analyses show that multiple familial resources are associated with survival within the first 5 years of life, with stronger maternal than paternal effects. However, between ages 5 and 100 both parents contribute equally to offspring’s survival chances. After age 5, offspring of long-lived fathers and long-lived mothers had a 16-19% lower chance of dying at any given point in time than individuals without long-lived parents. This survival advantage is most likely genetic in nature, as it could not be explained by other, tested familial resources and is transmitted equally by fathers and mothers.

  • Open Access English
    Authors: 
    Jumelet, J.; Zuidema, W.; Hupkes, D.; Bansal, M.; Villavicencio, A.;
    Publisher: The Association for Computational Linguistics
    Country: Netherlands
    Project: NWO | Language in Interaction (2300176475)

    Extensive research has recently shown that recurrent neural language models are able to process a wide range of grammatical phenomena. How these models are able to perform these remarkable feats so well, however, is still an open question. To gain more insight into what information LSTMs base their decisions on, we propose a generalisation of Contextual Decomposition (GCD). In particular, this setup enables us to accurately distil which part of a prediction stems from semantic heuristics, which part truly emanates from syntactic cues and which part arises from the model biases themselves instead. We investigate this technique on tasks pertaining to syntactic agreement and co-reference resolution and discover that the model strongly relies on a default reasoning effect to perform these tasks. Comment: To appear at CoNLL2019

  • Publication . Article . Conference object . Preprint . 2019
    Open Access
    Authors: 
    Ulmer, D.; Hupkes, D.; Bruni, E.; Augenstein, I.; Gella, S.; Ruder, S.; Kann, K.; Can, B.; Welbl, J.; Conneau, A.; +2 more
    Publisher: Association for Computational Linguistics
    Country: Netherlands
    Project: EC | MAGIC (790369), NWO | Language in Interaction (2300176475)

    Since their inception, encoder-decoder models have successfully been applied to a wide array of problems in computational linguistics. The most recent successes are predominantly due to the use of different variations of attention mechanisms, but their cognitive plausibility is questionable. In particular, because past representations can be revisited at any point in time, attention-centric methods seem to lack an incentive to build up incrementally more informative representations of incoming sentences. This way of processing stands in stark contrast with the way in which humans are believed to process language: continuously and rapidly integrating new information as it is encountered. In this work, we propose three novel metrics to assess the behavior of RNNs with and without an attention mechanism and identify key differences in the way the different model types process sentences. Comment: Accepted at Repl4NLP, ACL

  • Open Access
    Authors: 
    Danny Merkx; Stefan L. Frank;
    Country: Netherlands
    Project: NWO | Language in Interaction (2300176475)

    Current approaches to learning semantic representations of sentences often use prior word-level knowledge. The current study aims to leverage visual information in order to capture sentence-level semantics without the need for word embeddings. We use a multimodal sentence encoder trained on a corpus of images with matching text captions to produce visually grounded sentence embeddings. Deep Neural Networks are trained to map the two modalities to a common embedding space such that for an image the corresponding caption can be retrieved and vice versa. We show that our model achieves results comparable to the current state of the art on two popular image-caption retrieval benchmark datasets: Microsoft Common Objects in Context (MSCOCO) and Flickr8k. We evaluate the semantic content of the resulting sentence embeddings using the data from the Semantic Textual Similarity (STS) benchmark task and show that the multimodal embeddings correlate well with human semantic similarity judgements. The system achieves state-of-the-art results on several of these benchmarks, which shows that a system trained solely on multimodal data, without assuming any word representations, is able to capture sentence-level semantics. Importantly, this result shows that we do not need prior knowledge of lexical-level semantics in order to model sentence-level semantics. These findings demonstrate the importance of visual information in semantics.

  • Publication . Article . Conference object . Preprint . 2018 . Embargo End Date: 01 Jan 2018
    Open Access
    Authors: 
    Fadaee, M.; Monz, C.; Riloff, E.; Chiang, D.; Hockenmaier, J.; Tsujii, J.;
    Publisher: arXiv
    Country: Netherlands
    Project: NWO | Surface Realization in St... (2300172930), NWO | Constraint-Based Language... (2300178358)

    Neural Machine Translation has achieved state-of-the-art performance for several language pairs using a combination of parallel and synthetic data. Synthetic data is often generated by back-translating sentences randomly sampled from monolingual data using a reverse translation model. While back-translation has been shown to be very effective in many cases, it is not entirely clear why. In this work, we explore different aspects of back-translation, and show that words with high prediction loss during training benefit most from the addition of synthetic data. We introduce several variations of sampling strategies targeting difficult-to-predict words using prediction losses and frequencies of words. In addition, we target the contexts of difficult words and sample sentences that are similar in context. Experimental results for the WMT news translation task show that our method improves translation quality by up to 1.7 and 1.2 BLEU points over back-translation using random sampling for German-English and English-German, respectively. Comment: 11 pages, 2 figures. Accepted at EMNLP 2018