OLAC Record

Title: Tweet comma corpus Janes-Vejica 1.0
Bibliographic Citation: http://hdl.handle.net/11356/1088
Creator: Popič, Damjan; Zupan, Katja; Logar, Polona; Kavčič, Teja; Erjavec, Tomaž; Fišer, Darja
Date (W3CDTF): 2017-02-16T12:28:26Z
Date Available: 2017-02-16T12:28:26Z
Description: Janes-Vejica is a corpus of Slovene tweets where commas are annotated with the reason for their (in)correct use, according to the supplied typology. The corpus was sampled from the Janes-Norm corpus (http://hdl.handle.net/11356/1084), which was manually annotated for tokenisation, sentence segmentation, and word normalisation, and automatically for morphosyntactic descriptions and lemmas. The corpus is further described in: POPIČ, Damjan, FIŠER, Darja, ZUPAN, Katja, LOGAR, Polona. Raba vejice v uporabniških spletnih vsebinah. Proceedings of the Conference on Language Technologies & Digital Humanities, Ljubljana, Slovenia. 2016, pp. 149-153. http://www.sdjt.si/wp/dogodki/konference/jtdh-2016/zbornik/
Identifier (URI): http://hdl.handle.net/11356/1088
Language (ISO639): slv
Publisher: Jožef Stefan Institute
Rights: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
Subject: computer-mediated communication; comma placement; manual annotation
Type (DCMI): Text
Type (OLAC): primary_text


Archive:  Slovenian language resource repository CLARIN.SI
Description:  http://www.language-archives.org/archive/clarin.si
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.clarin.si:11356/1088
DateStamp:  2018-12-04
GetRecord:  OAI-PMH request for simple DC format
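
The GetRecord entries above refer to OAI-PMH requests for this record in OLAC and simple DC formats. As a minimal sketch of how such a request URL is typically assembled: the record does not state the repository's OAI-PMH base URL, so `BASE_URL` below is a placeholder, and the helper name `build_getrecord_url` is hypothetical. The `oai_dc` prefix is the one the OAI-PMH specification requires every repository to support; using `olac` as the prefix for the OLAC format is an assumption based on common practice.

```python
from urllib.parse import urlencode


def build_getrecord_url(base_url: str, identifier: str, metadata_prefix: str) -> str:
    """Assemble an OAI-PMH GetRecord request URL (hypothetical helper)."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


# Placeholder: the real base URL is not given in this record.
BASE_URL = "https://example.org/oai"
# OaiIdentifier as listed in the OAI Info section.
OAI_ID = "oai:www.clarin.si:11356/1088"

olac_url = build_getrecord_url(BASE_URL, OAI_ID, "olac")    # OLAC format (assumed prefix)
dc_url = build_getrecord_url(BASE_URL, OAI_ID, "oai_dc")    # simple DC format

print(olac_url)
print(dc_url)
```

The identifier is percent-encoded by `urlencode`, so the colons and the slash in the OaiIdentifier are safe to pass as a query parameter.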

Search Info

Citation: Popič, Damjan; Zupan, Katja; Logar, Polona; Kavčič, Teja; Erjavec, Tomaž; Fišer, Darja. 2017. Jožef Stefan Institute.
Terms: area_Europe country_SI dcmi_Text iso639_slv olac_primary_text

Up-to-date as of: Tue Aug 20 10:27:05 EDT 2019