Publication . Article . 2021 . Embargo end date: 07 Oct 2022

Using the relative entropy of linguistic complexity to assess L2 language proficiency development

Kun Sun; Rong Wang
Open Access
Published: 20 Aug 2021
Publisher: Universität Stuttgart
Country: Germany

This study applies relative entropy to a large-scale naturalistic corpus to quantify the differences among L2 (second language) learners at different proficiency levels. We chose lemmas, tokens, POS trigrams, and conjunctions to represent lexicon and grammar, and used relative entropy to detect patterns of language proficiency development across L2 groups. The results show that the discrimination in information distribution, with respect to lexical and grammatical differences, continues to increase from lower-level to higher-level L2 learners. This result is consistent with the assumption that, in the course of second language acquisition, L2 learners develop towards a more complex and diverse use of language. For comparison, this study also applies time series statistical methods to the L2 difference data that traditional frequency-based methods yield on the same L2 corpus. The results from the traditional methods rarely show regularity. Compared with the algorithms used in traditional approaches, relative entropy performs much better in detecting L2 proficiency development. In this sense, we have developed an effective and practical algorithm for stably detecting and predicting developments in L2 learners’ language proficiency.
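As a rough illustration of the measure named in the abstract, relative entropy (Kullback–Leibler divergence) between two learner groups can be computed from the frequency distributions of a chosen feature such as lemmas or POS trigrams. The sketch below is not the authors' implementation: the function names, the add-alpha smoothing, the base-2 logarithm, and the toy POS sequences are all assumptions made for illustration, since this record does not give the exact formula or preprocessing.

```python
import math
from collections import Counter


def ngrams(seq, n=3):
    """Return all contiguous n-grams of a sequence as tuples (e.g. POS trigrams)."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


def relative_entropy(p_items, q_items, alpha=1.0):
    """Relative entropy D(P || Q) in bits between two lists of features.

    Assumption: add-alpha smoothing over the joint vocabulary keeps every
    probability non-zero so the logarithm is always defined.
    """
    p_counts = Counter(p_items)
    q_counts = Counter(q_items)
    vocab = set(p_counts) | set(q_counts)
    p_total = len(p_items) + alpha * len(vocab)
    q_total = len(q_items) + alpha * len(vocab)
    divergence = 0.0
    for item in vocab:
        p = (p_counts[item] + alpha) / p_total
        q = (q_counts[item] + alpha) / q_total
        divergence += p * math.log2(p / q)
    return divergence


# Hypothetical toy POS sequences for two learner groups (invented data).
lower_level = ["DT", "NN", "VB", "DT", "NN", "VB", "DT", "NN"]
higher_level = ["DT", "JJ", "NN", "VB", "IN", "DT", "NN", "CC", "VB", "RB"]

score = relative_entropy(ngrams(higher_level), ngrams(lower_level))
```

Relative entropy is asymmetric, so the direction of comparison (which group serves as P and which as Q) has to be fixed in advance; the smoothing constant also affects the absolute values, so only scores computed with the same settings are comparable.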

H2020 European Research Council

Subjects by Vocabulary

Microsoft Academic Graph classification: Computer science, Kullback–Leibler divergence, Natural language processing, Lexicon, Information theory, Artificial intelligence, Linguistic sequence complexity, Language proficiency, Grammar, Lemma (mathematics), Second-language acquisition


400, Article, L2 learners, linguistic complexity, language proficiency development, information theory, time series, General Physics and Astronomy, Science, Q, Astrophysics, QB460-466, Physics, QC1-999

Related Organizations
Funded by
Wide Incremental learning with Discrimination nEtworks
  • Funder: European Commission (EC)
  • Project Code: 742545
  • Funding stream: H2020 | ERC | ERC-ADG
Related to Research communities
Digital Humanities and Cultural Heritage
Download from
Article . 2021
Providers: DOAJ-Articles