OLAC Record oai:lindat.mff.cuni.cz:11234/1-2586 |
Metadata | ||
Title: | Indonesian web corpus (idWac) | |
Bibliographic Citation: | http://hdl.handle.net/11234/1-2586 | |
Creator: | Medveď, Marek; Suchomel, Vít | |
Date (W3CDTF): | 2018-01-09T15:57:37Z | |
Date Available: | 2018-01-09T15:57:37Z | |
Description: | Indonesian text corpus from the web. Crawled by SpiderLing in 2017. Filtered by JusText and Onion (see http://corpus.tools/ for details). Tagged and lemmatized by MorphInd (http://septinalarasati.com/morphind/). | |
Identifier (URI): | http://hdl.handle.net/11234/1-2586 | |
Language: | Indonesian | |
Language (ISO639): | ind | |
Publisher: | Natural Language Processing Centre, Faculty of Informatics, Masaryk University | |
Rights: | NLP Centre Web Corpus License (https://lindat.mff.cuni.cz/repository/xmlui/page/license-NLPC-WeC) | |
Subject: | corpus; lemmatization; PoS tagging | |
Type: | corpus | |
Type (DCMI): | Text | |
Type (OLAC): | primary_text | |
OLAC Info | ||
Archive: | LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University | |
Description: | http://www.language-archives.org/archive/lindat.mff.cuni.cz | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info | ||
OaiIdentifier: | oai:lindat.mff.cuni.cz:11234/1-2586 | |
DateStamp: | 2021-06-29 | |
GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
Citation: | Medveď, Marek; Suchomel, Vít. 2018. Indonesian web corpus (idWac). Natural Language Processing Centre, Faculty of Informatics, Masaryk University. | |
Terms: | area_Asia country_ID dcmi_Text iso639_ind olac_primary_text |