Grup d'Investigació en Lingüística Computacional - Universitat de Barcelona
The project has developed a set of generally usable software tools to manipulate and analyse text corpora, together with lexicons and multilingual corpora in seven European languages. It has established conventions for the encoding of corpora and harmonised specifications for computational lexicons, building on and contributing to the preliminary recommendations of the relevant international and European standardisation initiatives. All project results are freely and publicly available.
MULTEXT has developed the first set of publicly available large-scale resources and tools for use in corpus-based language engineering applications. The project's specific achievements fall into three areas:
:
:
:
Start date: January 1994
Duration: 26 months
Total effort: ca. 350 person-months
Text-oriented methods and software tools have come to be of primary interest to the NLP community. It is therefore expected that the availability of basic multilingual tools and data will improve and extend R&D across a wide range of disciplines, including not only the various areas of language engineering, but also fields such as speech technology, language learning, lexicography and lexicology, and information retrieval. The project's methodologies and results are being used in a related project under the Copernicus programme, MULTEXT-EAST, thus extending the application to thirteen western and eastern European languages. Extensions to regional and non-European languages are also under way.
| Organisation | Role | Country |
|---|---|---|
| Laboratoire Parole et Langage-CNRS | C | FR |
| Universitat Autonoma de Barcelona, Fundación Bosch Gimpera | A | ES |
| Universitat de Barcelona | A | ES |
| University of Umea | A | SE |
| Institut Dalle Molle pour les Etudes Sémantiques et Cognitives, Geneva | P | CH |
| ILC - CNR, Pisa | P | IT |
| University of Edinburgh, HCRC-LTG | P | UK |
| Universiteit Utrecht, Stichting Taaltechnologie | A | NL |
| Universität Münster | A | DE |
| INCYTA | P | ES |
| Digital Equipment BV | P | NL |
| SITE EUROLANG-Sonovision Itep Technologies | P | FR |
| Rank Xerox Research Centre | A | FR |

E-mail: veronis@univ-aix.fr

