OLAC Record oai:www.clarin.si:11356/1140 |
Metadata | ||
Title: | News comment corpus Janes-News 1.0 | |
Bibliographic Citation: | http://hdl.handle.net/11356/1140 | |
Creator: | Erjavec, Tomaž | |
Ljubešić, Nikola | ||
Fišer, Darja | ||
Date (W3CDTF): | 2017-08-31T07:19:08Z | |
Date Available: | 2017-08-31T07:19:08Z | |
Description: | Janes-News is an annotated corpus of comments on online news articles from websites rtvslo.si, mladina.si, and reporter.si from the period 2007-03 to 2015-01. The corpus is structured into individual texts containing the comments on a news article, together with their metadata. The texts in the corpus are tokenised, sentence segmented, word normalised, morphosyntactically tagged, lemmatised and annotated with named entities. Due to protection of privacy, usernames are not included in the metadata and 'person' as well as 'person derivative' named entities have been removed from the texts. | |
Identifier (URI): | http://hdl.handle.net/11356/1140 | |
Language: | Slovenian | |
Language (ISO639): | slv | |
Publisher: | Jožef Stefan Institute | |
Rights: | Creative Commons - Attribution 4.0 International (CC BY 4.0) | |
https://creativecommons.org/licenses/by/4.0/ | ||
Subject: | computer-mediated communication | |
news comments | ||
word normalisation | ||
named entities | ||
TEI | ||
Type: | corpus | |
Type (DCMI): | Text | |
Type (OLAC): | primary_text | |
OLAC Info | ||
Archive: | Slovenian language resource repository CLARIN.SI | |
Description: | http://www.language-archives.org/archive/clarin.si | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info | ||
OaiIdentifier: | oai:www.clarin.si:11356/1140 | |
DateStamp: | 2019-10-10 | |
GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
Citation: | Erjavec, Tomaž; Ljubešić, Nikola; Fišer, Darja. 2017. Jožef Stefan Institute. | |
Terms: | area_Europe country_SI dcmi_Text iso639_slv olac_primary_text |
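
The GetRecord entries above refer to OAI-PMH requests for this record in the OLAC and simple DC formats. As a minimal sketch of how such a request URL is formed, the Python below builds the query string from the record's OaiIdentifier. The base URL is a hypothetical placeholder, since the record does not state the endpoint, and the `olac` metadata prefix is assumed from common OLAC usage (`oai_dc` is the prefix the OAI-PMH protocol defines for simple Dublin Core).

```python
from urllib.parse import urlencode

# Hypothetical endpoint: this record does not state the OAI-PMH base URL.
BASE_URL = "https://example.org/oai"

# Identifier taken from the OaiIdentifier field of this record.
OAI_IDENTIFIER = "oai:www.clarin.si:11356/1140"


def get_record_url(identifier, metadata_prefix, base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL for a single record."""
    query = urlencode({
        "verb": "GetRecord",                # OAI-PMH verb for fetching one record
        "identifier": identifier,           # percent-encoded by urlencode
        "metadataPrefix": metadata_prefix,  # requested metadata format
    })
    return f"{base_url}?{query}"


# "olac" is an assumed prefix for the OLAC format; "oai_dc" is simple Dublin Core.
olac_url = get_record_url(OAI_IDENTIFIER, "olac")
dc_url = get_record_url(OAI_IDENTIFIER, "oai_dc")

print(olac_url)
print(dc_url)
```

Replacing `BASE_URL` with the actual endpoint of the archive or aggregator would yield requests equivalent to the two GetRecord links listed in the record.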