RKI-ITA-2000

Tania Triberio; Giorgia Pomarolli; Daniele Artoni
2024-01-01

Abstract

RKI-ITA-2000 is a bilingual (Russian-Italian) corpus of selected textbooks (coursebooks and reference grammars) for teaching and learning Russian as a Foreign Language (RFL; in Russian RKI, Russkij Kak Inostrannyj), drawn from manuals published in Italy since 2000 and specifically addressed to Italian-speaking students. The corpus is designed to investigate how certain metalinguistic notions are treated in the manuals used in the Italian university context (Beaudrie et al. 2021). While acquiring grammatical skills is essential to learning a foreign language, these skills are not always accompanied by adequate metalinguistic awareness, which enables students to develop “the capacity to use knowledge about language” (Bialystok 2001: 124) or “the ability to think about and reflect upon the nature and functions of language” (Kümmerling-Meibauer 1999: 158). Textbooks were identified by consulting websites, libraries and other online sources and, in particular, Russian teaching programmes at both university level (Bachelor's and Master's degrees) and secondary-school level, for a total of around 40 universities and over 90 schools throughout Italy. We then selected the textbooks that would form the initial corpus on the basis of a number of criteria (Tognini-Bonelli 2001): (i) time range (published from 2000 to the present day); (ii) level of language proficiency covered (A1-B2); (iii) type of textbook (courses and grammars in particular, as these best serve the scope of our analysis); (iv) exclusion of self-published works; (v) frequency of use both at university and at school. Once we had obtained the selected manuals in PDF format (in some cases by scanning them), we converted them into an editable format using various OCR programmes, in order to extract the text components relevant to our investigation: (a) grammatical explanations; (b) instructions on how to carry out the exercises.
These text components, once extracted and post-edited (either manually or with the help of AI tools), were then meta-tagged in an exe-xml spreadsheet (Kilgarriff 2013), so that they could be uploaded to the Sketch Engine software tool and processed for most types of qualitative and quantitative analysis. The corpus meets the requirements of balance and representativeness (Noseda et al. 2019): 7 courses for a total of 15 textbooks (covering levels A1-B2 and published between 2002 and 2022) and 6 grammars for a total of 7 textbooks (covering levels A1-B2 and published between 2004 and 2020). It includes a tagging system that records, for each manual, the following information: title, author(s), year of publication, typology (course or grammar), and level of linguistic competence. Further tags make it possible to distinguish the different text components of the manuals collected in the corpus. The tagging system also allows both qualitative and quantitative analyses to be conducted from a diachronic perspective. The creation and study of this corpus fall within the Project of Excellence in Inclusive Humanities (2023-2027) of the Department of Foreign Languages and Literature of the University of Verona, and specifically within Work Package 1 (WP 1.8-12), sub-project “Extraction of Metalinguistic Notions from Textbooks for Teaching Russian as a Foreign and Heritage Language”. Its main goal is to contribute to the development of inclusive educational strategies, with a particular focus on heritage students of Russian as a Foreign Language (RFL).
2024
RKI, RFL, Russian, Textbooks, OCR, Sketchengine, inclusive educational strategies
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11562/1159467