Title:Developmental corpus of Slovene (without language corrections) Šolar-Clear
Bibliographic Citation:http://hdl.handle.net/11356/1150
Creator:Rozman, Tadeja
Stritar Kučuk, Mojca
Kosem, Iztok
Krek, Simon
Krapš Vodopivec, Irena
Arhar Holdt, Špela
Stabej, Marko
Laskowski, Cyprian
Klemenc, Bojan
Date (W3CDTF):2018-11-21T16:52:08Z
Date Available:2018-11-21T16:52:08Z
Description:Šolar-Clear is an adapted version of the Šolar 1.0 corpus (cf. http://hdl.handle.net/11356/1036). The Šolar(-Clear) corpus consists of texts written by students in Slovene primary and secondary schools. School essays form the majority of the corpus (64.2%); the remaining material consists of texts created during lessons, such as text recapitulations or descriptions, examples of formal applications, etc. Unlike the original Šolar corpus, Šolar-Clear includes only the student texts; language corrections and other types of teacher feedback are omitted. The corpus can thus be used for processing tasks where the inclusion of corrections hinders or complicates the procedures (e.g. comparative data extraction or training of language models).
Identifier (URI):http://hdl.handle.net/11356/1150
Language (ISO639):slv
Publisher:Trojina, Institute for Applied Slovene Studies
Rights:Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
Subject:student writing
developmental corpus
Type (DCMI):Text
Type (OLAC):primary_text


Citation: Rozman, Tadeja; Stritar Kučuk, Mojca; Kosem, Iztok; Krek, Simon; Krapš Vodopivec, Irena; Arhar Holdt, Špela; Stabej, Marko; Laskowski, Cyprian; Klemenc, Bojan. 2018. Trojina, Institute for Applied Slovene Studies.
Terms: area_Europe country_SI dcmi_Text iso639_slv olac_primary_text

