Grup de Recerca per a l'Estudi del Repertori Lingüístic (GRERLI)
Grup consolidat reconegut i finançat per la Generalitat de Catalunya (2005-2008)



< Spencer Corpus >

The Spencer corpus is made up of 4 subcorpora:

•  Spanish L1. Texts obtained in Cordoba (monolingual environment) and Barcelona (bilingual environment). Spencer Project: Developing Literacy in different contexts and in different languages

•  Catalan L1. Texts obtained in Barcelona. Projects Discourse processing and organization of expository texts, both oral and written (ref.: 1999-RED-5020-2A) and Linguistic depersonalization resources: crosslinguistic, developmental, and didactic perspectives (ref.: BSO2000-0676)

•  Spanish L2. Texts collected in Murcia and Madrid from subjects of Arab, Chinese, and Korean origin. Project The development of linguistic repertoire in non-native speakers of Spanish and Catalan (ref.: SEJ2006-11083)

•  Catalan L2. Texts collected in the Barcelona metropolitan area from subjects of Arab, Chinese, and Korean origin. Project The development of linguistic repertoire in non-native speakers of Spanish and Catalan (ref.: SEJ2006-11083)

These 4 subcorpora comprise texts produced by native and non-native speakers of Spanish and Catalan in two registers (narrative and expository) and two modalities (oral and written), all elicited under the same production conditions (Berman and Verhoeven, 2002; Aparici, Argerich, Perera, Rosado and Tolchinsky (eds.), 2000; Tolchinsky and Rosado, 2005).

The subjects fall into 4 groups, defined by age or educational level: 9-year-olds (4th year of elementary school), 12-year-olds (2nd year of middle school), 16-year-olds (2nd year of high school), and adults (university students).

The native-speaker subcorpora (Spanish L1 and Catalan L1) include the productions of 20 subjects per age group (800 texts in total), and the non-native-speaker subcorpora include the productions of an average of 10 subjects per group (450 texts in total).

Access to the Spencer Corpus (http://clic.ub.edu/es/spencer-es)

< CesCa Corpus >

Written Scholastic Catalan in Catalonia

The CesCa project aims to provide the educational community with a fundamental tool for understanding the language use of its students: a reference corpus of written scholastic Catalan in Catalonia, together with the derived data obtained from processing it.

The project has collected and processed 2,426 texts produced by children from the last year of early childhood education (P5) through the last year of compulsory education (4th of ESO), from 31 regional education centers in Catalonia.

The corpus contains vocabularies produced in 5 lexical fields:

Here you will find organized information about:

Access to CesCa