OLAC Record

Title:CMC shortening corpus Janes-Kratko 1.0
Bibliographic Citation:http://hdl.handle.net/11356/1087
Creator:Goli, Teja; Osrajnik, Eneja; Fišer, Darja; Erjavec, Tomaž
Date (W3CDTF):2017-01-20T14:05:33Z
Date Available:2017-01-20T14:05:33Z
Description:Janes-Kratko is a corpus of Slovene tweets manually annotated for shortening phenomena according to the supplied typology, which covers spelling, lexical and syntactic shortenings. The corpus was sampled from the Janes-Norm corpus (http://hdl.handle.net/11356/1084), which was manually annotated for tokenisation, sentence segmentation and word normalisation of non-standard Slovene, and automatically annotated with morphosyntactic descriptions and lemmas. The corpus is further described in: GOLI, Teja, OSRAJNIK, Eneja, FIŠER, Darja. Analiza krajšanja slovenskih sporočil na družbenem omrežju Twitter [Analysis of shortening in Slovene messages on the social network Twitter]. Proceedings of the Conference on Language Technologies & Digital Humanities, Ljubljana, Slovenia. 2016, pp. 77-82. http://www.sdjt.si/wp/dogodki/konference/jtdh-2016/zbornik/
Identifier (URI):http://hdl.handle.net/11356/1087
Language (ISO639):slv
Publisher:Jožef Stefan Institute
Rights:Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
Subject:computer-mediated communication; shortening phenomena; manual annotation
Type (DCMI):Text
Type (OLAC):primary_text


Archive:  Slovenian language resource repository CLARIN.SI
Description:  http://www.language-archives.org/archive/clarin.si
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.clarin.si:11356/1087
DateStamp:  2018-12-04
GetRecord:  OAI-PMH request for simple DC format
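
The GetRecord entries above correspond to OAI-PMH requests for this record's OaiIdentifier. A minimal sketch of how such a request URL could be assembled is given below. The base URL is a hypothetical placeholder, since the repository's actual OAI-PMH endpoint is not stated in this record, and the `olac` and `oai_dc` metadata prefixes are assumed from common OAI-PMH and OLAC usage. The helper name `get_record_url` is illustrative only.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: the real OAI-PMH endpoint is not given in this record.
BASE_URL = "https://example.org/oai"

# Identifier copied from the OaiIdentifier field of this record.
OAI_ID = "oai:www.clarin.si:11356/1087"


def get_record_url(identifier: str, metadata_prefix: str,
                   base_url: str = BASE_URL) -> str:
    """Build an OAI-PMH GetRecord request URL (illustrative helper)."""
    query = urlencode({
        "verb": "GetRecord",                # standard OAI-PMH verb
        "identifier": identifier,           # record to fetch
        "metadataPrefix": metadata_prefix,  # requested metadata format
    })
    return f"{base_url}?{query}"


# Simple Dublin Core ("oai_dc") and, by assumption, OLAC ("olac") requests.
dc_url = get_record_url(OAI_ID, "oai_dc")
olac_url = get_record_url(OAI_ID, "olac")
print(dc_url)
print(olac_url)
```

Substituting the repository's real endpoint for `BASE_URL` would give the two request forms listed in this record; the identifier is percent-encoded by `urlencode`, which OAI-PMH requires for reserved characters such as `:` and `/`.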

Search Info

Citation: Goli, Teja; Osrajnik, Eneja; Fišer, Darja; Erjavec, Tomaž. 2017. CMC shortening corpus Janes-Kratko 1.0. Jožef Stefan Institute.
Terms: area_Europe country_SI dcmi_Text iso639_slv olac_primary_text

Up-to-date as of: Thu Dec 5 9:50:11 EST 2019