Research data: Dataset, 2023, DataverseNO. Authors: De Latte, Fien. doi: 10.18710/cmoivt
Dataset abstract: The two data files in this dataset contain the annotated data used to conduct the Apparent-Time and micro-diachronic analysis presented in the paper "Vocativos contraculturales: cambios paradigmáticos y difusión hasta el español coloquial actual". The first data file contains 1107 tokens of a carefully selected set of vocatives, including the most productive Spanish countercultural "cheli slang" vocatives 'tío/-a', 'tronco/-a', 'chaval/-a', 'colega', 'socio/-a', 'pibe/-a' and 'titi', in addition to the more general ones 'chico/-a', 'guapo/-a' and 'macho/-a'. These were extracted from CORMA, a conversational corpus of peninsular Spanish, recorded between 2016 and 2019, in order to conduct the Apparent-Time analysis. The second data file contains 832 tokens of the same vocatives, retrieved from different corpora of conversational peninsular Spanish recorded between the 80s and the first decade of the 21st century, in order to conduct the complementary micro-diachronic analysis. The data from the first data file are annotated for (i) form and (ii) generation of the speaker, while the data from the second data file are annotated for (i) form, (ii) corpus, and (iii) decade.
Article abstract: The present study aims to explore the diffusion, and its underlying factors, of the most emblematic vocatives of the Spanish countercultural cheli slang in the decades after the countercultural boom. By adopting an empirical approach, we examine how the leading vocatives from the cheli paradigm (e.g. tío/-a ‘dude/girl’) have spread in colloquial Spanish over the last fifty years, in contrast with a selection of vocatives marked by a more general meaning and usage (e.g. chico/-a ‘boy/girl’). Special attention will be paid to changes in productivity of the analyzed forms. To this end, we analyze data from different corpora of spoken Spanish, namely CORLEC (90s), COLAm (2000s), and CORMA (2016-2019), which represent data from different decades. These corpus data are combined with documentary materials. Results suggest that the semantic features, as well as the expressive power of the vocatives under scrutiny, play a crucial role in their trajectory until recent years.

Research data: Dataset, 2023, DataverseNO. Authors: Sönning, Lukas. doi: 10.18710/euxsmw
This dataset contains corpus-based frequency data for an analysis of key verbs in published academic writing. The data are from the Corpus of Contemporary American English (COCA; Davies 2008-) and cover a period of 30 years (1990-2019). The section ‘academic’, which contains research articles from peer-reviewed journals, represents the target variety, and the reference variety is fictional writing as represented in the ‘fiction’ section (which contains short stories, plays, movie scripts, and the first chapter of novels). The total number of text files is 26,137 (academic) and 25,992 (fiction). To reduce computational expense for our methodological simulation study, we restrict our attention to verb lemmas whose whole-(sub)corpus normalized frequency exceeds 10 pmw in the academic section of COCA. The data therefore contain frequency information on only 700 verb lemmas.
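The 10 pmw inclusion criterion mentioned in this abstract can be illustrated with a minimal sketch. It assumes that "pmw" means occurrences per million words, computed as count divided by (sub)corpus size times one million; the function names, lemma counts, and corpus size below are hypothetical and are not taken from the dataset or from COCA.

```python
def per_million(count: int, corpus_size: int) -> float:
    """Normalized frequency in occurrences per million words (pmw)."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return count * 1_000_000 / corpus_size


def lemmas_above_threshold(
    counts: dict[str, int], corpus_size: int, threshold_pmw: float = 10.0
) -> list[str]:
    """Return, in alphabetical order, the lemmas whose pmw exceeds the threshold."""
    return sorted(
        lemma
        for lemma, count in counts.items()
        if per_million(count, corpus_size) > threshold_pmw
    )


# Hypothetical toy counts for a 10-million-word subcorpus (not real COCA figures).
toy_counts = {"suggest": 5000, "indicate": 1500, "whisper": 90}
toy_size = 10_000_000

print(lemmas_above_threshold(toy_counts, toy_size))
```

With these toy figures, 'whisper' has 9 pmw and is excluded, while the other two lemmas pass the threshold.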

Research data: Dataset, 2023, DataverseNO. Authors: Klavan, Jane. doi: 10.18710/kdszep
Manually annotated dataset of 3,000 uses of exterior locative constructions (specifically cases and postpositions) in present-day Estonian. The data is extracted from the Estonian National Corpus (ENC 2017; 1.1 billion words, mainly web-based texts). The data includes 500 uses of each of the following constructions: allative, adessive, ablative, peale, peal, pealt. The data sampling procedure and further details about the dataset are given in Klavan & Schützler (to appear in Cognitive Linguistics). The data is annotated for 9 variables: postpos (outcome variable: case, postposition), position (post, pre), complexity (simple, compound), length (length in syllables of landmark phrase), frequency (raw frequency of landmark form in association with the respective semantic relation), function (adverbial, modifier), verb_lemma (224 levels for lative, 279 levels for locative, 252 levels for separative), lm_lemma (592 levels for lative, 438 levels for locative, 528 levels for separative), sem_rel (lative, locative, separative). The dataset was collected by the PI of the project PUT1358 "The Making and Breaking of Models: Experimentally Validating Classification Models in Linguistics" (1.01.2017−31.12.2020) funded by the Estonian Research Council. Sketch Engine, 2.36.5

Research data: Dataset, 2023, DataverseNO. Authors: Janda, Laura A; Nesset, Tore. doi: 10.18710/4d2qii
Data and R code are provided for statistical analysis of approximately 39,000 corpus examples of predicate agreement in constructions with quantified subjects in Russian. The analysis indicates that these constructions constitute a network of constructions (“allostructions”) with various preferences for singular or plural agreement. Factors pull in different directions, and we observe a relatively stable situation in the face of variation. We present an analysis of a multidimensional network of allostructions in Russian, thus contributing to our understanding of allostructional relationships in Construction Grammar. With regard to historical linguistics, language stability is an understudied field. We illustrate an interplay of divergent factors that apparently resists language change. The syntax of numerals and other quantifiers represents a notoriously complex phenomenon of the Russian language. Our study sheds new light on the contributions of factors that favor singular or plural agreement in sentences with quantified subjects.

Research data: Dataset, 2023, DataverseNO. Authors: Arctic Indigenous Languages And Revitalization: An Online Educational Resource. doi: 10.18710/y21dh0
Dataset containing five GIS shapefiles which can be used to visualize a circumpolar overview map of geographical language speaker areas for Arctic Indigenous Peoples' languages, with additional attribute information about the languages. The language speaker areas generally show the maximum continuous areas where the Indigenous Peoples who spoke those languages lived in a historical context. The exact time range is defined specifically per region. Data from the languages and dialects shapefile was used to make the language family and language family branch shapefiles. There is also a separate shapefile with some examples of innovative language revitalization in the region, and a supporting shapefile of Arctic places to assist when visualizing the data. The five shapefiles can be used together or separately. All shapefiles are intended to be used as open resources for education and research.

Research data: Dataset, 2023, DataverseNO. Authors: Enghels, Renata; Roels, Linde. doi: 10.18710/9qlip6
This dataset contains two annotated datasets used to create the tables and graphs in the paper "The apparent-time construct as a proxy to spoken conversational data in the 20th century: a Spanish case study". A first dataset contains tokens of the pragmatic marker 'sabes' and related epistemic expressions that were retrieved from different corpora of conversational peninsular Spanish recorded between 1970 and the first decade of the 21st century. The second dataset contains tokens from the pragmatic marker 'sabes' and related epistemic expressions that were retrieved from CORMA, a conversational corpus of peninsular Spanish, recorded between 2016 and 2019. The data from the first dataset is annotated for (i) form, (ii) corpus, (iii) decade, (iv) position of the marker in the turn, (v) pragmatic function. The data from the second dataset is annotated for (i) form, (ii) generation of the speaker, (iii) position of the marker in the turn, (iv) pragmatic function.

Research data: Dataset, 2023, DataverseNO. Authors: De Cuypere, Ludovic; De Coster, Evelyn; Baten, Kristof. doi: 10.18710/tdufpi
Dataset abstract: The dataset contains the ratings for a 100-split task performed by Russian learners of English. 272 Russian learners were subdivided into two groups. One group rated 25 English sentences containing the dative alternation taken from the British National Corpus (BNC), the other group rated 25 Russian sentences (translations of the English ones) with the ditransitive alternation. The English sentences with the double object construction were translated into Russian sentences with the equivalent recipient/dative-theme/accusative order, while the English sentences with the to-dative construction were translated with the equivalent theme/accusative-recipient/dative order. The dataset contains information about the Participants (Age, Gender, Year of Study, Study domain), the test Sentences (the reference to the BNC, the sentences in both languages, the observed dative construction in the BNC, and its main verb), and the Ratings for each Sentence by each Participant. The replication R code is additionally shared together with the output as an html-file.
Article abstract: Ditransitive verbs include a “recipient” and a “theme” argument (in addition to the subject). The choice of putting one argument before the other (i.e., either recipient-theme, or theme-recipient) is associated with multiple discourse-pragmatic factors. Languages have different options to code the ditransitive construction. In English, a ditransitive verb can take two alternating patterns (“the dative alternation”): the Double Object Construction (DOC) (John gives Mary a book) and the to-dative construction (to-dative) (John gives a book to Mary). In Russian, theme and recipient are marked by accusative and dative, respectively. In addition, word order is flexible and either the accusative-marked theme (Pjotr dal knigu Marii), or the dative-marked recipient (Pjotr dal Marii knigu) can come first. This article reports on two sentence rating experiments (acceptability judgments) to test whether Russian learners of English transfer their preferences about the theme-recipient order in Russian to the ditransitive construction in English. A total of 272 Russian students were tested. Results for both tests showed a great variability in the ratings. A comparison of the ratings seems to suggest a small positive correlation, but no statistically significant relation was found between the order preferences in both languages. However, we found a small preference for the use of the to-dative, which we relate to the language acquisition process as proposed by Processability Theory.

Research data: Dataset, 2023, DataverseNO. Authors: Janda, Laura Alexis. doi: 10.18710/xkdblf
Description of dataset: This is a study of examples of Russian predicate adjectives in clauses with zero-copula present tense, where the adjective is a short form (SF) or a long form nominative (LF). The data was collected in 2022 from SynTagRus (https://universaldependencies.org/treebanks/ru_syntagrus/index.html), the syntactic subcorpus of the Russian National Corpus (https://ruscorpora.ru/new/). The data merges the results of several searches conducted to extract examples of sentences with long form and short form adjectives in predicate position, as identified by the corpus. The examples were imported to a spreadsheet and annotated manually, based on the syntactic analyses given in the corpus. For present tense sentences with no copula (Река спокойна or Река спокойная), it was necessary to search for an adjective as the top (root) node in the syntactic structure. The syntactic and morphological categories used in the corpus are explained here: https://ruscorpora.ru/page/instruction-syntax/. In order for the R code to run from these files, one needs to set up an R project with the data files in a folder named "data" and the R markdown files in a folder named "scripts".
Method: Logistic regression analysis of corpus data carried out in R (R version 4.2.3 (2023-03-15), "Shortstop Beagle", Copyright (C) 2023 The R Foundation for Statistical Computing) and documented in an .Rmd file.
Publication abstract: The present article presents an empirical investigation of the choice between so-called long (e.g., prostoj ‘simple’) and short forms (e.g., prost ‘simple’) of predicate adjectives in Russian based on data from the syntactic subcorpus of the Russian National Corpus. The data under scrutiny suggest that short forms represent the dominant option for predicate adjectives. It is proposed that long forms are descriptions of thematic participants in sentences with no complement, while short forms may take complements and describe both participants (thematic and rhematic) and situations. Within the “space of competition” where both long and short forms are well attested, it is argued that the choice of form to some extent depends on subject type, gender/number, and frequency. On the methodological level, the approach adopted in the present study may be extended to other cases of competition in morphosyntax. It is suggested that one should first “peel off” contexts where (nearly) categorical rules are at work, before one undertakes a statistical analysis of the “space of competition”.

Research data: Dataset, 2023, DataverseNO. Authors: Verbeke, Gil; Simon, Ellen. doi: 10.18710/8f0q0l
Dataset abstract: This dataset contains the results from 33 Flemish English as a Foreign Language (EFL) learners, who were exposed to eight native and non-native accents of English. These participants completed (i) a comprehensibility and accentedness rating task, followed by (ii) an orthographic transcription task. In the first task, listeners were asked to rate eight speakers of English on comprehensibility and accentedness on a nine-point scale (1 = easy to understand/no accent; 9 = hard to understand/strong accent). How Accentedness ratings and listeners' Familiarity with the different accents impacted on their Comprehensibility judgements was measured using a linear mixed-effects model. The orthographic transcription task, then, was used to verify how well listeners actually understood the different accents of English (i.e. intelligibility). To that end, participants' transcription Accuracy was measured as the number of correctly transcribed words and was estimated using a logistic mixed-effects model. Finally, the relation between listeners' self-reported ease of understanding the different speakers (comprehensibility) and their actual understanding of the speakers (intelligibility) was assessed using a linear mixed-effects regression. R code for the data analysis is provided.
Article abstract: This study investigates how well English as a Foreign Language (EFL) learners report understanding (i.e. comprehensibility) and actually understand (i.e. intelligibility) native and non-native accents of English, and how EFL learners’ self-reported ease of understanding and actual understanding of these accents are aligned. Thirty-three Dutch-speaking EFL learners performed a comprehensibility and accentedness judgement task, followed by an orthographic transcription task. The judgement task elicited listeners’ scalar ratings of authentic speech from eight speakers with traditional Inner, Outer and Expanding Circle accents. The transcription task assessed listeners’ actual understanding of 40 sentences produced by the same eight speakers. Speakers with Inner Circle accents were reported to be more comprehensible than speakers with non-Inner Circle accents, with Expanding Circle speakers being easier to understand than Outer Circle speakers. The strength of a speaker’s accent significantly affected listeners’ comprehensibility ratings. Most speakers were highly intelligible, with intelligibility scores ranging between 79% and 95%. Listeners’ self-reported ease of understanding the speakers in our study generally matched their actual understanding of those speakers, but no correlation between comprehensibility and intelligibility was detected. The study foregrounds the effect of native and non-native accents on comprehensibility and intelligibility, and highlights the importance of multidialectal listening skills.

Research data: Dataset, 2023, DataverseNO. Authors: Ponnet, Aaricia; De Cuypere, Ludovic. doi: 10.18710/3ywq8r
Dataset abstract: The dataset includes annotated corpus data of N = 1811 utterances based on a picture description task that elicited semi-spontaneous oral production data from 15 Dutch learners of Hindi, from four (cross-sectional) stages (Years) of the Hindi course trajectory. The corpus data is annotated for (i) Learner, (ii) Year of study of the learner, (iii) the use of ne as an ergative marker, (iv) correct usage of the ne-marker, (v) the use of ko as a Differential Object Marker, (vi) the use of ko as another marker, and multiple features associated with ne- and ko-marking, including: (vii) specificity of the Direct Object, (viii) animacy of the Direct Object, (ix) transitivity of the sentence Verb, (x) perfectivity of the sentence Verb, (xi) other uses of the ko-marker, (xii) the semantic role of these other uses of the ko-marker.
Article abstract: We investigated the acquisition of Hindi split ergativity (ne-marking) and Differential Object Marking (zero or ko-marking) by L1 speakers of Dutch. Both grammatical phenomena are conditioned by multiple syntactic and semantic features. On a descriptive level, the study aims to examine when and how Dutch learners acquire and apply the conditional features associated with ne- and ko-marking. A specific learner corpus was created based on a picture description task that elicited semi-spontaneous oral production data from 15 Dutch learners of Hindi, from four (cross-sectional) stages of the Hindi course trajectory. We annotated the corpus data for multiple features associated with ne- and ko-marking. Using a mixed-effects logistic regression analysis, we found an increase in the use and accuracy of each case marker over the different years of study, but individual learner profile analyses revealed considerable intersubject differences in learner behaviour. We show that it is possible to define developmental stages for the acquisition of ne- and ko-marking in line with Processability Theory.
Research data: Dataset, 2023, DataverseNO. Authors: De Latte, Fien. doi: 10.18710/cmoivt
Dataset abstract: The two data files in this dataset contain the annotated data used to conduct the Apparent-Time and micro-diachronic analysis presented in the paper "Vocativos contraculturales: cambios paradigmáticos y difusión hasta el español coloquial actual". The first data file contains 1107 tokens of a carefully selected set of vocatives, including the most productive Spanish countercultural "cheli slang" vocatives 'tío/-a', 'tronco/-a', 'chaval/-a', 'colega', 'socio/-a', 'pibe/-a' and 'titi', in addition to the more general ones 'chico/-a', 'guapo/-a' and 'macho/-a'. These were extracted from CORMA, a conversational corpus of peninsular Spanish, recorded between 2016 and 2019, in order to conduct the Apparent-Time analysis. The second data file contains 832 tokens of the same vocatives, retrieved from different corpora of conversational peninsular Spanish recorded between the 80s and the first decade of the 21st century, in order to conduct the complementary micro-diachronic analysis. The data from the first data file are annotated for (i) form and (ii) generation of the speaker, while the data from the second data file are annotated for (i) form, (ii) corpus, and (iii) decade. Article abstract: The present study aims to explore the diffusion, and its underlying factors, of the most emblematic vocatives of the Spanish countercultural cheli slang in the decades after the countercultural boom. By adopting an empirical approach, we examine how the leading vocatives from the cheli paradigm (e.g. tío/-a ‘dude/girl’) have spread in colloquial Spanish over the last fifty years, in contrast with a selection of vocatives marked by a more general meaning and usage (e.g. chico/-a ‘boy/girl’). Special attention will be paid to changes in productivity of the analyzed forms. To this end, we analyze data from different corpora of spoken Spanish, namely CORLEC (90s), COLAm (2000s), and CORMA (2016-2019), which represent data from different decades.
These corpus data are combined with documentary materials. Results suggest that the semantic features, as well as the expressive power of the vocatives under scrutiny, play a crucial role in their trajectory until recent years.
Research data: Dataset, 2023, DataverseNO. Authors: Sönning, Lukas. doi: 10.18710/euxsmw
This dataset contains corpus-based frequency data for an analysis of key verbs in published academic writing. The data are from the Corpus of Contemporary American English (COCA; Davies 2008-) and cover a period of 30 years (1990-2019). The section ‘academic’, which contains research articles from peer-reviewed journals, represents the target variety, and the reference variety is fictional writing as represented in the ‘fiction’ section (which contains short stories, plays, movie scripts, and the first chapter of novels). The total number of text files is 26,137 (academic) and 25,992 (fiction). To reduce computational expense for our methodological simulation study, we restrict our attention to verb lemmas whose whole-(sub)corpus normalized frequency exceeds 10 pmw in the academic section of COCA. The data therefore contain frequency information on only 700 verb lemmas.
Research data: Dataset, 2023, DataverseNO. Authors: Klavan, Jane. doi: 10.18710/kdszep
Manually annotated dataset of 3,000 uses of exterior locative constructions (specifically cases and postpositions) in present-day Estonian. The data are extracted from the Estonian National Corpus (ENC 2017; 1.1 billion words, mainly web-based texts). The data include 500 uses of each of the following constructions: allative, adessive, ablative, peale, peal, pealt. The data sampling procedure and more details about the dataset are given in Klavan & Schützler (to appear in Cognitive Linguistics). The data are annotated for 9 variables: postpos (outcome variable: case, postposition), position (post, pre), complexity (simple, compound), length (length in syllables of landmark phrase), frequency (raw frequency of landmark form in association with the respective semantic relation), function (adverbial, modifier), verb_lemma (224 levels for lative, 279 levels for locative, 252 levels for separative), lm_lemma (592 levels for lative, 438 levels for locative, 528 levels for separative), sem_rel (lative, locative, separative). The dataset was collected by the PI of the project PUT1358 "The Making and Breaking of Models: Experimentally Validating Classification Models in Linguistics" (1.01.2017−31.12.2020) funded by the Estonian Research Council. Sketch Engine, 2.36.5
Research data: Dataset, 2023, DataverseNO. Authors: Janda, Laura A; Nesset, Tore. doi: 10.18710/4d2qii
Data and R code are provided for statistical analysis of approximately 39,000 corpus examples of predicate agreement in constructions with quantified subjects in Russian. The analysis indicates that these constructions constitute a network of constructions (“allostructions”) with various preferences for singular or plural agreement. Factors pull in different directions, and we observe a relatively stable situation in the face of variation. We present an analysis of a multidimensional network of allostructions in Russian, thus contributing to our understanding of allostructional relationships in Construction Grammar. With regard to historical linguistics, language stability is an understudied field. We illustrate an interplay of divergent factors that apparently resists language change. The syntax of numerals and other quantifiers represents a notoriously complex phenomenon of the Russian language. Our study sheds new light on the contributions of factors that favor singular or plural agreement in sentences with quantified subjects.
Research data: Dataset, 2023, DataverseNO. Authors: Arctic Indigenous Languages And Revitalization: An Online Educational Resource. doi: 10.18710/y21dh0
Dataset containing five GIS shapefiles which can be used to visualize a circumpolar overview map of geographical language speaker areas for Arctic Indigenous Peoples' languages, with additional attribute information about the languages. The language speaker areas generally show the maximum continuous areas where the Indigenous Peoples who spoke those languages lived in a historical context. The exact time range is defined specifically per region. Data from the languages and dialects shapefile were used to make the language family and language family branch shapefiles. There is also a separate shapefile with some examples of innovative language revitalization in the region. There is a supporting shapefile of Arctic places to assist when visualizing the data. The five shapefiles can be used together or separately. All shapefiles are intended to be used as open resources for education and research.
Research data: Dataset, 2023, DataverseNO. Authors: Enghels, Renata; Roels, Linde. doi: 10.18710/9qlip6
This dataset contains two annotated datasets used to create the tables and graphs in the paper "The apparent-time construct as a proxy to spoken conversational data in the 20th century: a Spanish case study". The first dataset contains tokens of the pragmatic marker 'sabes' and related epistemic expressions that were retrieved from different corpora of conversational peninsular Spanish recorded between 1970 and the first decade of the 21st century. The second dataset contains tokens of the pragmatic marker 'sabes' and related epistemic expressions that were retrieved from CORMA, a conversational corpus of peninsular Spanish, recorded between 2016 and 2019. The data from the first dataset are annotated for (i) form, (ii) corpus, (iii) decade, (iv) position of the marker in the turn, (v) pragmatic function. The data from the second dataset are annotated for (i) form, (ii) generation of the speaker, (iii) position of the marker in the turn, (iv) pragmatic function.