OLAC Record oai:catalogue.elra.info:ELRA-W0092 |
Metadata | ||
Title: | TRAD Pashto Monolingual text Corpus | |
Access Rights: | Rights available for: nonCommercialUse, commercialUse | |
Date Available (W3CDTF): | 2016-04-06 | |
Date Issued (W3CDTF): | 2016-04-06 | |
Date Modified (W3CDTF): | 2016-04-06 | |
Description: | This is a monolingual text corpus in Pashto. The corpus contains about 112,000,000 tokens collected from 46 different blogs and websites. Sources that were either identified and negotiated or freely available were crawled in 2012, cleaned, and formatted as XML. Pashto is an Indo-Iranian language spoken by the Pashtun people, mainly in Pakistan and Afghanistan. This corpus was produced by ELDA within the PEA TRAD project supported by the French Ministry of Defence (DGA). | |
Identifier: | ELRA-W0092 | |
ISLRN: 394-903-293-388-0 | ||
Identifier (URI): | https://catalog.elra.info/en-us/repository/browse/ELRA-W0092/ | |
Language: | Pushto; Pashto | |
Language (ISO639): | pus | |
Medium: | Not specified | |
Publisher: | ELRA (European Language Resources Association) | |
Type (DCMI): | Text | |
Type (OLAC): | primary_text | |
OLAC Info |
||
Archive: | ELRA Catalogue of Language Resources | |
Description: | http://www.language-archives.org/archive/catalogue.elra.info | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info |
||
OaiIdentifier: | oai:catalogue.elra.info:ELRA-W0092 | |
DateStamp: | 2016-04-06 | |
GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
Citation: | n.a. 2016. ELRA (European Language Resources Association). | |
Terms: | dcmi_Text iso639_pus olac_primary_text |
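
The GetRecord entries above refer to OAI-PMH requests for this record in OLAC and simple DC format. As a minimal sketch, such a request URL could be assembled as follows. The `verb`, `identifier`, and `metadataPrefix` parameters are defined by OAI-PMH, and `oai_dc` is the prefix the protocol requires for simple Dublin Core. The base URL is a hypothetical placeholder, since the record does not state the endpoint, and the `olac` prefix is an assumption based on common OLAC practice.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the record above does not state the OAI-PMH base URL.
BASE_URL = "https://example.org/oai"

# Identifier taken from the OaiIdentifier field of the record.
OAI_IDENTIFIER = "oai:catalogue.elra.info:ELRA-W0092"


def get_record_url(identifier: str, metadata_prefix: str,
                   base_url: str = BASE_URL) -> str:
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# "olac" is assumed to be the prefix this archive uses for OLAC format.
olac_url = get_record_url(OAI_IDENTIFIER, "olac")
# "oai_dc" is the simple Dublin Core prefix required by OAI-PMH.
dc_url = get_record_url(OAI_IDENTIFIER, "oai_dc")

print(olac_url)
print(dc_url)
```

Replacing `BASE_URL` with the archive's actual endpoint would be needed before the URLs can be fetched.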