OLAC Record
oai:lindat.mff.cuni.cz:11234/1-3195

Metadata
Title:Large-Scale Colloquial Persian 0.5
Bibliographic Citation:http://hdl.handle.net/11234/1-3195
Creator:Abdi Khojasteh, Hadi
Ansari, Ebrahim
Bohlouli, Mahdi
Date (W3CDTF):2020-03-18T10:44:43Z
Date Available:2020-03-18T10:44:43Z
Description:"Large Scale Colloquial Persian Dataset" (LSCP) is hierarchically organized in a semantic taxonomy that focuses on multi-task informal Persian language understanding as a comprehensive problem. LSCP includes 120M sentences from 27M casual Persian tweets, annotated with syntactic dependency relations, part-of-speech tags, and sentiment polarity, together with automatic translations of the original Persian sentences into five languages (EN, CS, DE, IT, HI).
Identifier (URI):http://hdl.handle.net/11234/1-3195
Language:Persian
English
German
Czech
Italian
Hindi
Language (ISO639):fas
eng
deu
ces
ita
hin
Publisher:Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Institute for Advanced Studies in Basic Sciences (IASBS)
Rights:Creative Commons - Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
http://creativecommons.org/licenses/by-nc-nd/4.0/
Subject:PoS tagging
corpus
annotated corpus
multilingual
derivation
dependency parser
machine translation
informal language
spoken language
monolingual corpus
bilingual corpus annotation
Type:corpus
Type (DCMI):Text
Type (OLAC):primary_text

OLAC Info

Archive:  LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University
Description:  http://www.language-archives.org/archive/lindat.mff.cuni.cz
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:lindat.mff.cuni.cz:11234/1-3195
DateStamp:  2020-03-18
GetRecord:  OAI-PMH request for simple DC format
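
The GetRecord links in this record correspond to OAI-PMH requests that differ only in the metadataPrefix argument: "olac" for the OLAC format and "oai_dc" for simple DC. The following is a minimal sketch of how such request URLs could be built. The base URL is a placeholder, since the record does not state the archive's OAI-PMH endpoint, and the helper name get_record_url is hypothetical.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption): the record does not give the archive's
# real OAI-PMH base URL, so substitute the actual one before use.
BASE_URL = "https://example.org/oai"

# Identifier taken from the OaiIdentifier field of this record.
OAI_ID = "oai:lindat.mff.cuni.cz:11234/1-3195"


def get_record_url(base_url: str, identifier: str, metadata_prefix: str) -> str:
    """Build an OAI-PMH GetRecord request URL (hypothetical helper)."""
    query = urlencode({
        "verb": "GetRecord",            # OAI-PMH verb for fetching one record
        "identifier": identifier,       # OAI identifier of the record
        "metadataPrefix": metadata_prefix,  # requested metadata format
    })
    return f"{base_url}?{query}"


# One request per metadata format listed in the record.
olac_url = get_record_url(BASE_URL, OAI_ID, "olac")
dc_url = get_record_url(BASE_URL, OAI_ID, "oai_dc")

print(olac_url)
print(dc_url)
```

Only the URL is constructed here; fetching and parsing the XML response is left out because it depends on the actual endpoint.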

Search Info

Citation: Abdi Khojasteh, Hadi; Ansari, Ebrahim; Bohlouli, Mahdi. 2020. Large-Scale Colloquial Persian 0.5. Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL).
Terms: area_Asia area_Europe country_CZ country_DE country_GB country_IN country_IT dcmi_Text iso639_ces iso639_deu iso639_eng iso639_fas iso639_hin iso639_ita olac_primary_text


http://www.language-archives.org/item.php/oai:lindat.mff.cuni.cz:11234/1-3195
Up-to-date as of: Sun Apr 26 14:01:42 EDT 2020