LEONIDE (Longitudinal lEarner cOrpus iN Italiano, Deutsch, English) is a collection of longitudinal learner data produced in three languages: Italian, German and English. The data were collected to document how the plurilingual skills of middle-school-aged pupils develop and thus to obtain a global view of each pupil's linguistic repertoire.

The corpus contains around 2,500 texts from 163 pupils who participated in the project “One school, many languages”, conducted in eight schools in the officially multilingual Italian province of South Tyrol – Alto Adige.

Please watch the video below to learn more about LEONIDE.

The video was presented at the 14th Teaching and Language Corpora (TaLC) conference in July 2020, organized by the Université de Perpignan (France).

Corpus Information

| sub-corpus | year | # tokens | # texts | # writers | writers’ age | language |
|---|---|---|---|---|---|---|
| LEONIDE_DE | 2015-18 | ca. 74,000 | 833 | 161 | 11-14 years | German |
| LEONIDE_EN | 2015-18 | ca. 70,000 | 833 | 159 | 11-14 years | English |
| LEONIDE_IT | 2015-18 | ca. 93,000 | 844 | 162 | 11-14 years | Italian |
| LEONIDE TOTAL | 2015-18 | 237,000 | 2,510 | 163 | 11-14 years | German, English, Italian |
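The totals in the corpus information can be cross-checked with a short sketch. The dictionary layout and variable names below are illustrative assumptions, not part of the corpus distribution, and the token counts are the rounded "ca." figures, so the token total is approximate too.

```python
# Figures copied from the corpus information above.
# Token counts are the rounded "ca." values (an approximation).
sub_corpora = {
    "LEONIDE_DE": {"tokens": 74_000, "texts": 833, "writers": 161},
    "LEONIDE_EN": {"tokens": 70_000, "texts": 833, "writers": 159},
    "LEONIDE_IT": {"tokens": 93_000, "texts": 844, "writers": 162},
}

# Tokens and texts add up across sub-corpora.
total_tokens = sum(row["tokens"] for row in sub_corpora.values())
total_texts = sum(row["texts"] for row in sub_corpora.values())

# Writers do not add up: the same pupils wrote in all three languages,
# so only the largest per-language count is a meaningful lower bound.
max_writers = max(row["writers"] for row in sub_corpora.values())

print(total_tokens)  # 237000
print(total_texts)   # 2510
print(max_writers)   # 162
```

The writer counts are not summed because each pupil contributed texts in several languages. The overall figure of 163 writers is higher than any single sub-corpus count, which suggests that no sub-corpus contains texts from every pupil.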


Writing Tasks
Picture story task

We used various picture stories for the picture story task. In the first year, we used three “Father and Son” stories by Erich Ohser (aka E. O. Plauen).

In the second year, we used different sections (without text) of the graphic novel “This One Summer” by Mariko Tamaki & Jillian Tamaki (2014; page 89 for the task in German as L2 and language of instruction, page 161 for Italian as L2 and language of instruction, page 162 for English as L3). In the third year, we used short extracts of Marjane Satrapi’s “Persepolis” (2007, p. 151, English as L3, without text), Vera Brosgol’s “Anya’s Ghost” (2011, p. 7, Italian as L2 and language of instruction) and Shaun Tan’s “The Arrival” (2006, p. 9, German as L2 and language of instruction).

Opinion text tasks

The topics for the opinion texts were presented in the target language. The topics varied by language (language of instruction vs. language taught as L2/L3). The topic of the first year was repeated in the third year; only the topic of the second year differed.

Reference Paper

Glaznieks, A., Frey, J.-C., Stopfner, M., Zanasi, L. & Nicolas, L. (2022): LEONIDE: A longitudinal trilingual corpus of young learners of Italian, German and English. International Journal of Learner Corpus Research 8:1, 97-120. https://doi.org/10.1075/ijlcr.21004.gla

Corpus Access

The corpus can be queried via the ANNIS interface or downloaded from the Eurac Research Clarin Repository.