OLAC Record
oai:www.clarin.si:11356/1240

Metadata
Title:Serbian Twitter training corpus ReLDI-NormTagNER-sr 2.1
Bibliographic Citation:http://hdl.handle.net/11356/1240
Creator:Ljubešić, Nikola
Erjavec, Tomaž
Batanović, Vuk
Miličević, Maja
Samardžić, Tanja
Date (W3CDTF):2019-09-11T15:40:03Z
Date Available:2019-09-11T15:40:03Z
Description:ReLDI-NormTagNER-sr 2.1 is a manually annotated corpus of Serbian tweets. It is meant as a gold-standard training and testing dataset for tokenisation, sentence segmentation, word normalisation, morphosyntactic tagging, lemmatisation and named entity recognition of non-standard Serbian. Each tweet also carries automatically assigned standardness levels (T = technical standardness, L = linguistic standardness). As an update to version 2.0, version 2.1 corrects some annotation errors and adds morphosyntactic annotations in the Universal Dependencies formalism in addition to the MULTEXT-East morphosyntactic descriptions. The corpus is now also available in CoNLL-U format.
Identifier (URI):http://hdl.handle.net/11356/1240
Language:Serbian
Language (ISO639):srp
Publisher:Jožef Stefan Institute
Replaces (URI):http://hdl.handle.net/11356/1171
Rights:Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
https://creativecommons.org/licenses/by-sa/4.0/
Subject:computer-mediated communication
tokenisation
word normalisation
part-of-speech tagging
lemmatisation
named entities
manual annotation
TEI
Type:corpus
Type (DCMI):Text
Type (OLAC):primary_text

OLAC Info

Archive:  Slovenian language resource repository CLARIN.SI
Description:  http://www.language-archives.org/archive/clarin.si
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.clarin.si:11356/1240
DateStamp:  2019-10-10
GetRecord:  OAI-PMH request for simple DC format
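
The GetRecord entries above refer to OAI-PMH requests for this record. As a minimal sketch, such a request URL can be assembled from the OaiIdentifier and a metadata prefix. The base URL below is a hypothetical placeholder, since this record does not state the repository's OAI-PMH endpoint, and the prefixes "oai_dc" (simple DC) and "olac" (OLAC format) are assumed to be the ones the repository exposes.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the record does not give the repository's OAI-PMH base URL.
BASE_URL = "https://example.org/oai"


def get_record_url(identifier: str, metadata_prefix: str, base_url: str = BASE_URL) -> str:
    """Build an OAI-PMH GetRecord request URL for a single record."""
    query = urlencode({
        "verb": "GetRecord",                # OAI-PMH verb for fetching one record
        "identifier": identifier,           # the record's OAI identifier
        "metadataPrefix": metadata_prefix,  # requested metadata format
    })
    return f"{base_url}?{query}"


oai_id = "oai:www.clarin.si:11356/1240"
print(get_record_url(oai_id, "oai_dc"))  # simple DC format (assumed prefix)
print(get_record_url(oai_id, "olac"))    # OLAC format (assumed prefix)
```

The identifier is percent-encoded by urlencode, so the colons and slash in it are safe to pass as a query parameter.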

Search Info

Citation: Ljubešić, Nikola; Erjavec, Tomaž; Batanović, Vuk; Miličević, Maja; Samardžić, Tanja. 2019. Jožef Stefan Institute.
Terms: area_Europe country_RS dcmi_Text iso639_srp olac_primary_text


http://www.language-archives.org/item.php/oai:www.clarin.si:11356/1240
Up-to-date as of: Fri Jan 10 9:22:59 EST 2020