LEONIDE

LEONIDE is a collection of longitudinal learner data produced in the three languages Italian, German and English (Longitudinal lEarner cOrpus iN Italiano, Deutsch, English). All data was collected to document the development of plurilingual linguistic skills of middle-school-aged pupils and thus to obtain a global view of their individual linguistic repertoire.

The corpus contains around 2,500 texts from 163 pupils who participated in the project “One school, many languages”, conducted in eight schools in the officially multilingual Italian province of South Tyrol – Alto Adige.

Please watch the video below to learn more about LEONIDE.

The video was presented at the 14th Teaching and Language Corpora (TaLC) conference in July 2020, organized by the Université de Perpignan (France).

Corpus Information

sub-corpus    | year    | # tokens    | # texts | # writers | writers’ age | language
LEONIDE_DE    | 2015-18 | ca. 74,000  | 833     | 161       | 11-14 years  | German
LEONIDE_EN    | 2015-18 | ca. 70,000  | 835     | 159       | 11-14 years  | English
LEONIDE_IT    | 2015-18 | ca. 93,000  | 844     | 162       | 11-14 years  | Italian
LEONIDE TOTAL | 2015-18 | 237,000     | 2,510   | 163       | 11-14 years  | German, English, Italian

Corpus Access

The corpus can be queried via the ANNIS interface or downloaded from the Eurac Research Clarin Repository.

Reference Paper

Glaznieks, A., Frey, J.-C., Stopfner, M., Zanasi, L. & Nicolas, L. (2022): LEONIDE: A longitudinal trilingual corpus of young learners of Italian, German and English. International Journal of Learner Corpus Research 8:1, 97-120. https://doi.org/10.1075/ijlcr.21004.gla

Documentation

Writing Tasks

Picture story task

We used various picture stories for the picture story task. In the first year, we used three Father and Son stories by Erich Ohser (aka E. O. Plauen).

In the second year, we used different sections (without text) of the graphic novel “This One Summer” by Mariko Tamaki & Jillian Tamaki (2014; page 89 for the task in German as L2 and language of instruction, page 161 for Italian as L2 and language of instruction, page 162 for English as L3). In the third year, we used short extracts of Marjane Satrapi’s “Persepolis” (2007, p. 151, English as L3, without text), Vera Brosgol’s “Anya’s Ghost” (2011, p. 7, Italian as L2 and language of instruction) and Shaun Tan’s “The Arrival” (2006, p. 9, German as L2 and language of instruction).

Opinion text tasks

The topics for the opinion texts were presented in the target language. The topics varied by the role of the language (language of instruction vs. language taught as L2/L3). The topic of the first year was repeated in the third year; only the topic of the second year differed.

Related publications

Lopopolo, O. & Zanda, F. (2025): Assessing Inter- and Intra-Rater Reliability in Multi-Layer Annotations: a Study on Tense and Aspect in Learners of English as L3. In K. Ackerley & E. Castello (Eds.), Continuing Learner Corpus Research: Challenges and Opportunities, Presses Universitaires de Louvain, 55-88. [https://pul.uclouvain.be/book/?gcoi=29303100485770]

Bienati, A. & Frey, J.-C. (2025): Development of causal connectives in Italian L1 and L2 student writing: A comparison of argumentative texts from lower and upper secondary school. In K. Ackerley & E. Castello (Eds.), Continuing Learner Corpus Research: Challenges and Opportunities, Presses Universitaires de Louvain, 197-212. [https://pul.uclouvain.be/book/?gcoi=29303100485770]

Lopopolo, O., Bienati, A., Frey, J.-C., Glaznieks, A. & Spina, S. (2025): Categorizing Speakers’ Language Background: Theoretical Assumptions and Methodological Challenges for Learner Corpus Research. Special Issue of Research Methods in Applied Linguistics. [https://doi.org/10.1016/j.rmal.2024.100170]

Leone-Pizzighella, A. R., Bienati, A., & Frey, J.-C. (2024): Discourse markers in the curricularization of ‘academic language’. A mixed methods analysis of tipo and praticamente in Italian secondary schools. In L. Cirillo & R. Nodari (Eds.), Studi AItLA 18: Contesti, pratiche e risorse della comunicazione multimodale, 149–162. [https://hdl.handle.net/10863/42783]

Lopopolo, O. (2024): Acquisition of additional languages as emerging multilingual constructicons. A Construction Grammar approach to progressive aspectuality. Unpublished PhD thesis. Università per Stranieri di Perugia.

Glaznieks, A., Frey, J.-C. & Abel, A. (2023): Weil-Sätze bei Lernenden des Deutschen. Vergleich zwischen immersiv und nicht-immersiv Deutschlernenden in Südtirol. In M. Beißwenger, E. Gredel, L. Lemnitzer & R. Schneider (Eds.), Korpusgestützte Sprachanalyse. Grundlagen, Anwendungen und Analysen. Tübingen: Narr Francke Attempto, 401-423.

Schmalz, V. J., Frey, J.-C., & Stemle, E. W. (2021): Introducing a Gold Standard Corpus from Young Multilinguals for the Evaluation of Automatic UD-PoS Taggers for Italian. Proceedings of the Eighth Italian Conference on Computational Linguistics, Milan, Italy, June 29-July 1, 2022. [https://hdl.handle.net/10863/37103]

If you have used LEONIDE in your work and want to list your publications here, please email porta@eurac.edu!