OLAC Record

Title:Collaborative corpus building for minorized languages using wiki-technology. Documenting the Asturian language
Bibliographic Citation:Larusson, Johann, Saurí, Roser, Viejo, Xulio; 2009-04-08T01:35:16Z; Kaipuleohone University of Hawai'i Digital Language Archive;http://hdl.handle.net/10125/4984.
Creator:Larusson, Johann
Saurí, Roser
Viejo, Xulio
Description:Eslema is the first project devoted to building a corpus for Asturian. Asturian (or Asturian-Leonese) is the Romance language autochthonous to most of the territory in Asturias, Leon and Zamora provinces (Spain), and the district of Miranda do Douro (Portugal). Its community of speakers is estimated to be around 300,000 people, corresponding to approximately a third of the population of the area where Asturian is spoken. These figures bode ill for the future of the language since Asturian competence is notably reduced among young people, a fact that seriously threatens its generational transmission (Llera Ramo, 2002). As the corpus of a minorized language, Eslema has two main goals: (a) documenting Asturian in a systematic way, and (b) helping set the foundation for codifying and fully normalizing it as the language of use in any possible social context. As such, the project is conceived as a general framework for developing several subcorpora, including documents of a varied typology and from different historical periods, representing both written and oral discourse (Author, 2008a). Eslema’s scarcity of funding has prompted an alternative search for much-needed resources. As with many Western minorized languages, Asturian speakers feel a degree of commitment to the language and its survival. Using this to our advantage, we have developed a wiki-based environment that enables the entire Asturian community to collaboratively collect and annotate texts online, enlarging Eslema at minimal cost. Wikis are ideally suited for this kind of activity. A wiki is essentially a website that enables non-collocated users to co-edit and share documents easily and asynchronously. Wikis are very loosely structured and do not favor a particular type of content or a “tech-savvy” method of manipulating the content.
Previous research has developed a platform called the WikiDesignPlatform (WDP) to support different kinds of wiki-based collaborative learning activities (Author, 2008b). The WDP provides a suite of awareness, navigational, and communicative components that can be easily layered on top of, or coupled with, standard wiki features. Using the WDP, we are able to quickly engineer an online workspace tailored to the needs of the community. Users can easily suggest documents for classification, collectively classify texts, and communicate their work. Using the WDP’s awareness features, users can keep current on the progress of their work and the advancement of individual documents. This paper presents the collaborative WDP-based environment we have built, its application, and its results in compiling the Asturian corpus. References: Author (2008a). Eslema. Towards a Corpus for Asturian. In Collaboration: interoperability between people in the creation of language resources for less-resourced languages. A SALTMIL workshop. LREC 2008. Marrakesh. Author (2008b). Supporting and Tracking Collective Cognition in Wikis. In Proceedings of ICLS 2008: International Conference for the Learning Sciences: Vol. 3 (pp. 330-337). The International Society of the Learning Sciences. Llera Ramo, F. (2002). II Estudiu sociollingüísticu d’Asturies. Avance de datos. In Lletres Asturianes, 89, 181–197.
Identifier (URI):http://hdl.handle.net/10125/4984
Language (ISO639):eng
Table Of Contents:4984.pdf
Type:Conference Paper
Type (DCMI):Text


Archive:  Language Documentation and Conservation
Description:  http://www.language-archives.org/archive/ldc.scholarspace.manoa.hawaii.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:scholarspace.manoa.hawaii.edu:10125/4984
DateStamp:  2009-04-09
GetRecord:  OAI-PMH request for simple DC format
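
The GetRecord entries on this page correspond to OAI-PMH requests for the identifier listed above. As a rough sketch of what such a request looks like, the following builds a GetRecord URL from the standard OAI-PMH query parameters (verb, identifier, metadataPrefix). The base URL is a placeholder, since this record does not state the repository's OAI-PMH endpoint, and the function name `get_record_url` is purely illustrative.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Assumption: placeholder endpoint. The record does not give the real base URL.
BASE_URL = "https://example.org/oai/request"


def get_record_url(identifier, metadata_prefix="oai_dc", base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,  # "oai_dc" is simple Dublin Core
    })
    return f"{base_url}?{query}"


# Identifier taken from the OaiIdentifier field of this record.
url = get_record_url("oai:scholarspace.manoa.hawaii.edu:10125/4984")
params = parse_qs(urlparse(url).query)
print(url)
```

Passing `metadata_prefix="olac"` instead would request the OLAC format, assuming the repository advertises that prefix.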

Search Info

Citation: Larusson, Johann; Saurí, Roser; Viejo, Xulio. 2009-04-08T01:35:16Z. Language Documentation and Conservation.
Terms: area_Europe country_GB dcmi_Text iso639_eng

Up-to-date as of: Sun Mar 1 15:44:19 EST 2020