OLAC Record oai:catalogue.elra.info:ELRA-W0124 |
Metadata | ||
Title: | English-Vietnamese Parallel Corpus | |
Access Rights: | Rights available for: nonCommercialUse, commercialUse | |
Date Available (W3CDTF): | 2018-01-17 | |
Date Issued (W3CDTF): | 2018-01-17 | |
Date Modified (W3CDTF): | 2018-01-17 | |
Description: | This is a corpus of 500,000 English-Vietnamese sentence pairs, built to develop SMT (Statistical Machine Translation) systems. The parallel corpus contains English documents translated into Vietnamese by professional translators. The source texts include books, dictionaries, newspapers, and online news, collected between 2000 and 2007. All Vietnamese sentences have been word-segmented and morphologically analyzed. The texts are provided in TEI format. | |
Identifier: | ELRA-W0124 | |
ISLRN: 838-483-738-912-8 | ||
Identifier (URI): | https://catalog.elra.info/en-us/repository/browse/ELRA-W0124/ | |
Language: | English | |
Vietnamese | ||
Language (ISO639): | eng | |
vie | ||
Medium: | Not specified | |
Publisher: | ELRA (European Language Resources Association) | |
Type (DCMI): | Text | |
Type (OLAC): | primary_text | |
OLAC Info | ||
Archive: | ELRA Catalogue of Language Resources | |
Description: | http://www.language-archives.org/archive/catalogue.elra.info | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info | ||
OaiIdentifier: | oai:catalogue.elra.info:ELRA-W0124 | |
DateStamp: | 2018-01-17 | |
GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
Citation: | n.a. 2018. ELRA (European Language Resources Association). | |
Terms: | area_Asia area_Europe country_GB country_VN dcmi_Text iso639_eng iso639_vie olac_primary_text |