Bibliographic Citation:http://hdl.handle.net/11372/LRT-1413
Date (W3CDTF):2014-07-30T21:35:35Z
Date Available:2014-07-30T21:35:35Z
Description:Text preprocessing service. The input must be plain text (a .txt file) encoded in UTF-8. It carries out: (i) segmentation of the text into smaller structural units (titles, paragraphs, sentences, etc.); (ii) detection of entities not found in dictionaries (numbers, abbreviations, URLs, emails, proper nouns, etc.); and (iii) grouping of sequences of two or more words into a single block (dates, phrases, proper nouns, etc.).
Identifier (URI):http://hdl.handle.net/11372/LRT-1413
Language:No linguistic content
Language (ISO639):zxx
Publisher:Institut Universitari de Lingüística Aplicada, Universitat Pompeu Fabra
Type (DCMI):Software
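
Illustration: the three steps named in the Description field can be sketched roughly as follows. This is a minimal sketch, not the service's actual implementation; the regular expressions, the function names split_sentences and tag_tokens, and the tag labels are all assumptions made for illustration, and only English dates and a few entity types are covered.

```python
import re

# Hypothetical patterns; the real service's rules are not documented in this record.
MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")
PATTERNS = [
    ("URL", r"https?://\S+"),
    ("EMAIL", r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    # (iii) a multi-word sequence (a date) kept as one block
    ("DATE", r"\d{1,2} (?:" + MONTHS + r") \d{4}"),
    ("NUMBER", r"\d+(?:[.,]\d+)*"),
    ("WORD", r"\w+"),
    ("PUNCT", r"[^\w\s]"),
]
TOKEN = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in PATTERNS))


def split_sentences(text):
    """(i) Naive segmentation: split after . ! ? when an uppercase letter follows."""
    parts = re.split(r"(?<=[.!?])\s+(?=[A-Z])", text)
    return [p.strip() for p in parts if p.strip()]


def tag_tokens(sentence):
    """(ii) and (iii): return (text, tag) pairs, trying the most specific patterns first."""
    return [(m.group(), m.lastgroup) for m in TOKEN.finditer(sentence)]


if __name__ == "__main__":
    sample = ("The meeting is on 30 July 2014. "
              "Write to info@example.org or see http://example.org/a now.")
    for sentence in split_sentences(sample):
        print(tag_tokens(sentence))
```

Ordering the patterns from most to least specific is what lets a date or an email address survive as a single block instead of being split into separate word and number tokens.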


Archive:  LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University
Description:  http://www.language-archives.org/archive/lindat.mff.cuni.cz
OaiIdentifier:  oai:lindat.mff.cuni.cz:11372/LRT-1413
DateStamp:  2021-06-29
Citation: n.a. 2014. Institut Universitari de Lingüística Aplicada, Universitat Pompeu Fabra.
