LEONIDE (Longitudinal lEarner cOrpus iN Italiano, Deutsch, English) is a collection of longitudinal learner data produced in three languages: Italian, German and English. All data was collected to document how the plurilingual skills of middle-school-aged pupils develop and thus to obtain a global view of each pupil's individual linguistic repertoire.
The corpus contains around 2,500 texts from 163 pupils who participated in the project “One school, many languages”, conducted in eight schools in the officially multilingual Italian province of South Tyrol – Alto Adige.
Corpus Information
sub-corpus | year | # tokens | # texts | # writers | writers’ age | language |
---|---|---|---|---|---|---|
LEONIDE_DE | 2015-18 | ca. 74,000 | 833 | 161 | 11-14 years | German |
LEONIDE_EN | 2015-18 | ca. 70,000 | 835 | 159 | 11-14 years | English |
LEONIDE_IT | 2015-18 | ca. 93,000 | 844 | 162 | 11-14 years | Italian |
LEONIDE TOTAL | 2015-18 | ca. 237,000 | 2,510 | 163 | 11-14 years | German, English, Italian |
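As a rough orientation, the figures in the table allow an estimate of the average text length per sub-corpus. The Python sketch below is illustrative only: the token counts are the approximate values from the table, and the variable names are hypothetical rather than part of the corpus distribution.

```python
# Figures copied from the table above; token counts are approximate ("ca.").
SUBCORPORA = {
    "LEONIDE_DE": {"tokens": 74_000, "texts": 833},
    "LEONIDE_EN": {"tokens": 70_000, "texts": 835},
    "LEONIDE_IT": {"tokens": 93_000, "texts": 844},
}

# Approximate mean text length (tokens per text) for each sub-corpus.
mean_tokens_per_text = {
    name: figures["tokens"] / figures["texts"]
    for name, figures in SUBCORPORA.items()
}

# Sum of the approximate token counts across the three sub-corpora.
total_tokens = sum(figures["tokens"] for figures in SUBCORPORA.values())

for name, mean in mean_tokens_per_text.items():
    print(f"{name}: about {mean:.0f} tokens per text")
print(f"Total: about {total_tokens:,} tokens")
```

Under these figures, a text averages roughly 84 to 110 tokens, with the Italian texts being the longest.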
Corpus Access
The corpus can be queried via the ANNIS interface or downloaded from the Eurac Research CLARIN Repository.
Reference Paper
Glaznieks, A., Frey, J.-C., Stopfner, M., Zanasi, L. & Nicolas, L. (2022): LEONIDE: A longitudinal trilingual corpus of young learners of Italian, German and English. International Journal of Learner Corpus Research 8:1, 97-120. https://doi.org/10.1075/ijlcr.21004.gla
Documentation
Writing Tasks
Picture story task
We used various picture stories for the picture story task. In the first year, we used three Father and Son stories by Erich Ohser (aka E. O. Plauen):
- task in German (as L2 and language of instruction): “Die gute Gelegenheit”
- task in Italian (as L2 and language of instruction): “Der gelöschte Vater”
- task in English (as L3): “Der Schmöker”
In the second year, we used different sections (without text) of the graphic novel “This One Summer” by Mariko Tamaki & Jillian Tamaki (2014, page 89 for the task in German as L2 and language of instruction, page 161 for Italian as L2 and language of instruction, page 162 for English as L3). In the third year, we used short extracts of Marjane Satrapi’s “Persepolis” (2007, p. 151, English as L3, without text), Vera Brosgol’s “Anya’s Ghost” (2011, p. 7, Italian as L2 and language of instruction) and Shaun Tan’s “The Arrival” (2006, p. 9, German as L2 and language of instruction).
Opinion text tasks
The topics for the opinion texts were presented in the target language. They varied with the role of the language (main language of instruction vs. language taught as L2 or L3). In each case, the topic of the first year was repeated in the third year; only the topic of the second year differed:
- for the task in the language that is used as main language of instruction (Italian/German), the topic in the 1st and 3rd year was about languages that students should learn at lower secondary level;
- for the task in the language that is used as main language of instruction (Italian/German), the topic in the 2nd year was about refugees coming to Europe;
- for the task in the language that is taught as L2 (Italian/German), the topic in the 1st and 3rd year was about the students’ future plans regarding their profession;
- for the task in the language that is taught as L2 (Italian/German), the topic in the 2nd year was about successful strategies to learn a new language;
- for the task in the language that is taught as L3 (English), the topic in the 1st and 3rd year was about the students’ favorite subjects at school;
- for the task in the language that is taught as L3 (English), the topic in the 2nd year was about the students’ leisure activities.
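The task design described above can be summarised as a small lookup table. The following Python sketch is only an illustration: the role labels (`instruction`, `L2`, `L3`) and the function name `opinion_topic` are hypothetical and do not correspond to metadata fields in the corpus.

```python
# Opinion-text topics by language role and project year, as described above.
# Role labels and names are illustrative, not actual corpus metadata.
RECURRING_TOPICS = {  # used in years 1 and 3
    "instruction": "languages that students should learn at lower secondary level",
    "L2": "the students' future plans regarding their profession",
    "L3": "the students' favorite subjects at school",
}
SECOND_YEAR_TOPICS = {
    "instruction": "refugees coming to Europe",
    "L2": "successful strategies to learn a new language",
    "L3": "the students' leisure activities",
}


def opinion_topic(role: str, year: int) -> str:
    """Return the opinion-text topic for a language role and project year (1-3)."""
    if year not in (1, 2, 3):
        raise ValueError(f"year must be 1, 2 or 3, got {year}")
    topics = SECOND_YEAR_TOPICS if year == 2 else RECURRING_TOPICS
    return topics[role]


print(opinion_topic("L3", 2))
```

Because years 1 and 3 share a topic within each language role, texts from those two years can be compared on the same prompt, while the second-year texts respond to a different one.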
Related publications
Lopopolo, O. & Zanda, F. (2025): Assessing Inter- and Intra-Rater Reliability in Multi-Layer Annotations: a Study on Tense and Aspect in Learners of English as L3. In K. Ackerley & E. Castello (Eds.), Continuing Learner Corpus Research: Challenges and Opportunities, Presses Universitaires de Louvain, 55-88. [https://pul.uclouvain.be/book/?gcoi=29303100485770]
Bienati, A. & Frey, J.-C. (2025): Development of causal connectives in Italian L1 and L2 student writing: A comparison of argumentative texts from lower and upper secondary school. In K. Ackerley & E. Castello (Eds.), Continuing Learner Corpus Research: Challenges and Opportunities, Presses Universitaires de Louvain, 197-212. [https://pul.uclouvain.be/book/?gcoi=29303100485770]
Lopopolo, O., Bienati, A., Frey, J.-C., Glaznieks, A. & Spina, S. (2025): Categorizing Speakers’ Language Background: Theoretical Assumptions and Methodological Challenges for Learner Corpus Research. Special Issue of Research Methods in Applied Linguistics. [https://doi.org/10.1016/j.rmal.2024.100170]
Leone-Pizzighella, A. R., Bienati, A., & Frey, J.-C. (2024): Discourse markers in the curricularization of ‘academic language’. A mixed methods analysis of tipo and praticamente in Italian secondary schools. In L. Cirillo & R. Nodari (Eds.), Studi AItLA 18: Contesti, pratiche e risorse della comunicazione multimodale, 149–162. [https://hdl.handle.net/10863/42783]
Lopopolo, O. (2024): Acquisition of additional languages as emerging multilingual constructicons. A Construction Grammar approach to progressive aspectuality. Unpublished PhD thesis. Università per Stranieri di Perugia.
Glaznieks, A., Frey, J.-C. & Abel, A. (2023): Weil-Sätze bei Lernenden des Deutschen. Vergleich zwischen immersiv und nicht-immersiv Deutschlernenden in Südtirol. In M. Beißwenger, E. Gredel, L. Lemnitzer & R. Schneider (Eds.), Korpusgestützte Sprachanalyse. Grundlagen, Anwendungen und Analysen. Tübingen: Narr Francke Attempto, 401-423.
Schmalz, V. J., Frey, J.-C., & Stemle, E. W. (2021): Introducing a Gold Standard Corpus from Young Multilinguals for the Evaluation of Automatic UD-PoS Taggers for Italian. Proceedings of the Eighth Italian Conference on Computational Linguistics, Milan, Italy, June 29-July 1, 2022. [https://hdl.handle.net/10863/37103]
If you have used LEONIDE in your work and want to list your publications here, please email porta@eurac.edu!