OLAC Record

Title:EnTam: An English-Tamil Parallel Corpus (EnTam v2.0)
Bibliographic Citation:http://hdl.handle.net/11234/1-1454
Creator:Ramasamy, Loganathan
Bojar, Ondřej
Žabokrtský, Zdeněk
Date (W3CDTF):2014-10-31T23:07:27Z
Date Available:2014-10-31T23:07:27Z
Description:EnTam is a sentence-aligned English-Tamil bilingual corpus that we collected from publicly available websites for NLP research involving Tamil. A standard set of processing steps was applied to the raw web data before it was released as a sentence-aligned English-Tamil parallel corpus suitable for various NLP tasks. The parallel corpus includes texts from the Bible, cinema, and news domains.
Identifier (URI):http://hdl.handle.net/11234/1-1454
Language (ISO639):eng
Publisher:Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Rights:Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0)
Subject:parallel corpus
Type (DCMI):Text
Type (OLAC):primary_text


Archive:  LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University
Description:  http://www.language-archives.org/archive/lindat.mff.cuni.cz
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:lindat.mff.cuni.cz:11234/1-1454
DateStamp:  2021-06-29
GetRecord:  OAI-PMH request for simple DC format
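
The GetRecord entries above refer to OAI-PMH requests for this record. As a rough sketch only, such a request URL could be assembled as follows. The base URL is a hypothetical placeholder, since this record does not state the repository's OAI-PMH endpoint, and the helper name get_record_url is likewise illustrative. The verb, identifier, and metadataPrefix parameters come from the OAI-PMH protocol; "olac" and "oai_dc" are the prefixes assumed here for the OLAC and simple DC formats.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the record does not give the real OAI-PMH base URL.
BASE_URL = "https://example.org/oai/request"

# Identifier taken from the OaiIdentifier field of this record.
OAI_IDENTIFIER = "oai:lindat.mff.cuni.cz:11234/1-1454"


def get_record_url(identifier: str, metadata_prefix: str,
                   base_url: str = BASE_URL) -> str:
    """Build an OAI-PMH GetRecord request URL (illustrative helper)."""
    params = {
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    }
    # urlencode percent-encodes reserved characters such as ':' and '/'.
    return f"{base_url}?{urlencode(params)}"


# One URL per format named in the record: OLAC and simple Dublin Core.
olac_url = get_record_url(OAI_IDENTIFIER, "olac")
dc_url = get_record_url(OAI_IDENTIFIER, "oai_dc")

if __name__ == "__main__":
    print(olac_url)
    print(dc_url)
```

Replacing BASE_URL with the archive's actual endpoint would be needed before sending any request; the sketch only constructs the URLs and performs no network access.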

Search Info

Citation: Ramasamy, Loganathan; Bojar, Ondřej; Žabokrtský, Zdeněk. 2014. Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL).
Terms: area_Asia area_Europe country_GB country_IN dcmi_Text iso639_eng iso639_tam olac_primary_text

Up-to-date as of: Thu Oct 5 0:40:21 EDT 2023