OLAC Record oai:lindat.mff.cuni.cz:11234/1-1662 |
| Metadata | ||
| Title: | Deltacorpus | |
| Bibliographic Citation: | http://hdl.handle.net/11234/1-1662 | |
| Creator: | Mareček, David | |
| Yu, Zhiwei | ||
| Zeman, Daniel | ||
| Žabokrtský, Zdeněk | ||
| Date (W3CDTF): | 2016-03-22T16:44:19Z | |
| Date Available: | 2016-03-22T16:44:19Z | |
| Description: | Texts in 107 languages from the W2C corpus (http://hdl.handle.net/11858/00-097C-0000-0022-6133-9), first 1,000,000 tokens per language, tagged by the delexicalized tagger described in Yu et al. (2016, LREC, Portorož, Slovenia). | |
| Identifier (URI): | http://hdl.handle.net/11234/1-1662 | |
| Is Replaced By (URI): | http://hdl.handle.net/11234/1-1743 | |
| Language: | Belarusian | |
| Bosnian | ||
| Bulgarian | ||
| Czech | ||
| Serbo-Croatian | ||
| Croatian | ||
| Upper Sorbian | ||
| Macedonian | ||
| Polish | ||
| Russian | ||
| Slovak | ||
| Slovenian | ||
| Serbian | ||
| Ukrainian | ||
| Latvian | ||
| Lithuanian | ||
| Afrikaans | ||
| Danish | ||
| German | ||
| English | ||
| Faroese | ||
| Western Frisian | ||
| Swiss German | ||
| Icelandic | ||
| Limburgan | ||
| Luxembourgish | ||
| Low German | ||
| Dutch | ||
| Norwegian Nynorsk | ||
| Norwegian | ||
| Scots | ||
| Swedish | ||
| Yiddish | ||
| Aragonese | ||
| Asturian | ||
| Catalan | ||
| French | ||
| Galician | ||
| Haitian | ||
| Italian | ||
| Latin | ||
| Lombard | ||
| Neapolitan | ||
| Piemontese | ||
| Portuguese | ||
| Romanian | ||
| Spanish | ||
| Venetian | ||
| Walloon | ||
| Breton | ||
| Welsh | ||
| Scottish Gaelic | ||
| Irish | ||
| Modern Greek (1453-) | ||
| Armenian | ||
| Albanian | ||
| Dimli (individual language) | ||
| Persian | ||
| Gilaki | ||
| Kurdish | ||
| Tajik | ||
| Bengali | ||
| Bishnupriya | ||
| Gujarati | ||
| Fiji Hindi | ||
| Hindi | ||
| Marathi | ||
| Nepali (macrolanguage) | ||
| Urdu | ||
| Amharic | ||
| Arabic | ||
| Egyptian Arabic | ||
| Hebrew | ||
| Estonian | ||
| Finnish | ||
| Hungarian | ||
| Basque | ||
| Georgian | ||
| Chuvash | ||
| Azerbaijani | ||
| Turkish | ||
| Uzbek | ||
| Kazakh | ||
| Tatar | ||
| Yakut | ||
| Korean | ||
| Mongolian | ||
| Telugu | ||
| Kannada | ||
| Malayalam | ||
| Tamil | ||
| Newari | ||
| Vietnamese | ||
| Indonesian | ||
| Javanese | ||
| Malagasy | ||
| Maori | ||
| Malay (macrolanguage) | ||
| Pampanga | ||
| Sundanese | ||
| Tagalog | ||
| Waray (Philippines) | ||
| Swahili (macrolanguage) | ||
| Esperanto | ||
| Ido | ||
| Interlingua (International Auxiliary Language Association) | ||
| Volapük | ||
| Language (ISO639): | bel | |
| bos | ||
| bul | ||
| ces | ||
| hbs | ||
| hrv | ||
| hsb | ||
| mkd | ||
| pol | ||
| rus | ||
| slk | ||
| slv | ||
| srp | ||
| ukr | ||
| lav | ||
| lit | ||
| afr | ||
| dan | ||
| deu | ||
| eng | ||
| fao | ||
| fry | ||
| gsw | ||
| isl | ||
| lim | ||
| ltz | ||
| nds | ||
| nld | ||
| nno | ||
| nor | ||
| sco | ||
| swe | ||
| yid | ||
| arg | ||
| ast | ||
| cat | ||
| fra | ||
| glg | ||
| hat | ||
| ita | ||
| lat | ||
| lmo | ||
| nap | ||
| pms | ||
| por | ||
| ron | ||
| spa | ||
| vec | ||
| wln | ||
| bre | ||
| cym | ||
| gla | ||
| gle | ||
| ell | ||
| hye | ||
| sqi | ||
| diq | ||
| fas | ||
| glk | ||
| kur | ||
| tgk | ||
| ben | ||
| bpy | ||
| guj | ||
| hif | ||
| hin | ||
| mar | ||
| nep | ||
| urd | ||
| amh | ||
| ara | ||
| arz | ||
| heb | ||
| est | ||
| fin | ||
| hun | ||
| eus | ||
| kat | ||
| chv | ||
| aze | ||
| tur | ||
| uzb | ||
| kaz | ||
| tat | ||
| sah | ||
| kor | ||
| mon | ||
| tel | ||
| kan | ||
| mal | ||
| tam | ||
| new | ||
| vie | ||
| ind | ||
| jav | ||
| mlg | ||
| mri | ||
| msa | ||
| pam | ||
| sun | ||
| tgl | ||
| war | ||
| swa | ||
| epo | ||
| ido | ||
| ina | ||
| vol | ||
| Publisher: | Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) | |
| Rights: | Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) | |
| http://creativecommons.org/licenses/by-sa/4.0/ | ||
| Subject: | part of speech | |
| tagging | ||
| semi-supervised | ||
| cross-language | ||
| Type: | corpus | |
| Type (DCMI): | Text | |
| Type (OLAC): | primary_text | |
OLAC Info |
| Archive: | LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University | |
| Description: | http://www.language-archives.org/archive/lindat.mff.cuni.cz | |
| GetRecord: | OAI-PMH request for OLAC format | |
| GetRecord: | Pre-generated XML file | |
OAI Info |
| OaiIdentifier: | oai:lindat.mff.cuni.cz:11234/1-1662 | |
| DateStamp: | 2021-06-29 | |
| GetRecord: | OAI-PMH request for simple DC format | |
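The GetRecord entries above refer to OAI-PMH requests whose link targets did not survive on this page. As a rough sketch, such a request can be assembled from the record's OaiIdentifier and a metadata prefix (`oai_dc` for simple DC, `olac` for the OLAC format). The base URL below is a placeholder, not the archive's actual endpoint, and the function name `get_record_url` is purely illustrative.

```python
from urllib.parse import urlencode


def get_record_url(base_url: str, identifier: str, metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode({
        "verb": "GetRecord",               # OAI-PMH verb for fetching a single record
        "identifier": identifier,          # the record's OAI identifier
        "metadataPrefix": metadata_prefix, # requested metadata format
    })
    return f"{base_url}?{query}"


# ASSUMPTION: placeholder endpoint; substitute the repository's real OAI-PMH base URL.
BASE_URL = "https://example.org/oai"
OAI_ID = "oai:lindat.mff.cuni.cz:11234/1-1662"  # OaiIdentifier from this record

dc_url = get_record_url(BASE_URL, OAI_ID)            # simple DC format
olac_url = get_record_url(BASE_URL, OAI_ID, "olac")  # OLAC format
print(dc_url)
print(olac_url)
```

Using `urlencode` percent-encodes the colons and slash in the identifier, so the request stays valid regardless of which characters the identifier contains.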
Search Info | ||
| Citation: | Mareček, David; Yu, Zhiwei; Zeman, Daniel; Žabokrtský, Zdeněk. 2016. Deltacorpus. Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL). | |
| Terms: | area_Africa area_Americas area_Asia area_Europe area_Pacific country_AM country_BA country_BD country_BE country_BG country_BY country_CH country_CZ country_DE country_DK country_EG country_ES country_ET country_FI country_FJ country_FR country_GB country_GE country_GR country_HR country_HT country_HU country_ID country_IE country_IL country_IN country_IR country_IS country_IT country_KR country_KZ country_LT country_LU country_MK country_NL country_NO country_NP country_NZ country_PH country_PK country_PL country_PT country_RO country_RS country_RU country_SE country_SI country_SK country_TJ country_TR country_UA country_VA country_VN country_ZA dcmi_Text iso639_afr iso639_amh iso639_ara iso639_arg iso639_arz iso639_ast iso639_aze iso639_bel iso639_ben iso639_bos iso639_bpy iso639_bre iso639_bul iso639_cat iso639_ces iso639_chv iso639_cym iso639_dan iso639_deu iso639_diq iso639_ell iso639_eng iso639_epo iso639_est iso639_eus iso639_fao iso639_fas iso639_fin iso639_fra iso639_fry iso639_gla iso639_gle iso639_glg iso639_glk iso639_gsw iso639_guj iso639_hat iso639_hbs iso639_heb iso639_hif iso639_hin iso639_hrv iso639_hsb iso639_hun iso639_hye iso639_ido iso639_ina iso639_ind iso639_isl iso639_ita iso639_jav iso639_kan iso639_kat iso639_kaz iso639_kor iso639_kur iso639_lat iso639_lav iso639_lim iso639_lit iso639_lmo iso639_ltz iso639_mal iso639_mar iso639_mkd iso639_mlg iso639_mon iso639_mri iso639_msa iso639_nap iso639_nds iso639_nep iso639_new iso639_nld iso639_nno iso639_nor iso639_pam iso639_pms iso639_pol iso639_por iso639_ron iso639_rus iso639_sah iso639_sco iso639_slk iso639_slv iso639_spa iso639_sqi iso639_srp iso639_sun iso639_swa iso639_swe iso639_tam iso639_tat iso639_tel iso639_tgk iso639_tgl iso639_tur iso639_ukr iso639_urd iso639_uzb iso639_vec iso639_vie iso639_vol iso639_war iso639_wln iso639_yid olac_primary_text | |