OLAC Record
oai:www.clarin.si:11356/1169

Metadata
Title:  cSMTiser: word standardisation
Bibliographic Citation:  http://hdl.handle.net/11356/1169
Creator:  Ljubešić, Nikola; Perovšek, Matic; Erjavec, Tomaž
Date (W3CDTF):  2017-11-27T15:14:31Z
Date Available:  2017-11-27T15:14:31Z
Description:  Word standardisation of non-standard language as found in user-generated content, using cSMTiser (https://github.com/clarinsi/csmtiser), a tool for text normalisation via character-level machine translation. The tool has been trained on the Janes-Norm dataset (http://hdl.handle.net/11356/1084) and background resources.
Identifier (URI):  http://hdl.handle.net/11356/1169
Language:  Slovenian
Language (ISO639):  slv
Publisher:  Jožef Stefan Institute
Subject:  word normalisation
Type:  toolService
Type (DCMI):  Software
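
The description above refers to normalisation via character-level machine translation. As a rough illustration of the general idea only, the Python sketch below shows how text can be split into character tokens before translation and joined again afterwards. It is not taken from cSMTiser: the function names and the "_" word-boundary marker are assumptions made for this example, and the input string is an arbitrary placeholder.

```python
# Minimal sketch of character-level tokenisation, assuming "_" marks word
# boundaries. BOUNDARY, to_char_tokens and from_char_tokens are hypothetical
# names for this illustration, not part of cSMTiser.
BOUNDARY = "_"


def to_char_tokens(text):
    """Turn 'ab cd' into 'a b _ c d' so each character is one token."""
    return " ".join(BOUNDARY if ch == " " else ch for ch in text)


def from_char_tokens(tokens):
    """Invert to_char_tokens: join characters and restore spaces."""
    return "".join(" " if tok == BOUNDARY else tok for tok in tokens.split(" "))


if __name__ == "__main__":
    source = "ab cd"  # placeholder input
    encoded = to_char_tokens(source)
    print(encoded)
    print(from_char_tokens(encoded))
```

A translation system trained on such sequences would map non-standard character sequences to standard ones. Note that this simple marker scheme is ambiguous if the input itself contains "_".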

OLAC Info

Archive:  Slovenian language resource repository CLARIN.SI
Description:  http://www.language-archives.org/archive/clarin.si
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.clarin.si:11356/1169
DateStamp:  2018-01-22
GetRecord:  OAI-PMH request for simple DC format
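
The GetRecord entries above correspond to OAI-PMH requests for this identifier. The Python sketch below shows how such a request URL could be assembled from the standard OAI-PMH parameters (verb, identifier, metadataPrefix). The base URL is a placeholder assumption, since the repository endpoint is not given in this record; "oai_dc" is the standard prefix for simple DC, and "olac" is assumed here as the prefix for the OLAC format.

```python
from urllib.parse import urlencode

# Placeholder endpoint: the real OAI-PMH base URL is not stated in this record.
BASE_URL = "https://example.org/oai"


def build_get_record_url(base_url, identifier, metadata_prefix):
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


if __name__ == "__main__":
    record_id = "oai:www.clarin.si:11356/1169"
    # Simple Dublin Core, then the (assumed) OLAC prefix.
    print(build_get_record_url(BASE_URL, record_id, "oai_dc"))
    print(build_get_record_url(BASE_URL, record_id, "olac"))
```

The identifier is percent-encoded by urlencode, so the colons and the slash in the OAI identifier are safe to pass as a query value.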

Search Info

Citation: Ljubešić, Nikola; Perovšek, Matic; Erjavec, Tomaž. 2017. Jožef Stefan Institute.
Terms: area_Europe country_SI dcmi_Software iso639_slv


http://www.language-archives.org/item.php/oai:www.clarin.si:11356/1169
Up-to-date as of: Wed Jul 17 9:50:49 EDT 2019