OLAC Record oai:lindat.mff.cuni.cz:11858/00-097C-0000-0023-68D9-0 |
Metadata | ||
Title: | English Models (Morphium + WSJ) for MorphoDiTa | |
Bibliographic Citation: | http://hdl.handle.net/11858/00-097C-0000-0023-68D9-0 | |
Creator: | Straka, Milan; Straková, Jana | |
Date (W3CDTF): | 2014-04-03T15:58:02Z | |
Date Available: | 2014-04-03T15:58:02Z | |
Description: | English models for MorphoDiTa, providing morphological analysis, morphological generation and part-of-speech tagging. The morphological dictionary is created from Morphium and SCOWL (Spell Checker Oriented Word Lists); the PoS tagger is trained on WSJ (Wall Street Journal). | |
This work uses language resources developed, stored and/or distributed by the LINDAT/CLARIN project of the Ministry of Education of the Czech Republic (project LM2010013). Development of the morphological POS analyzer was supported by grant No. LC536 "Center for Computational Linguistics" of the Ministry of Education, Youth and Sports of the Czech Republic. The research on the morphological POS analyzer was performed by Johanka Spoustová (Spoustová 2008; the Treex::Tool::EnglishMorpho::Analysis Perl module). The lemmatizer was implemented by Martin Popel (Popel 2009; the Treex::Tool::EnglishMorpho::Lemmatizer Perl module). The lemmatizer is based on morpha, which was released under the LGPL licence as part of the RASP system (http://ilexir.co.uk/applications/rasp). The research on the tagger algorithm and feature set was supported by projects MSM0021620838 and LC536 of the Ministry of Education, Youth and Sports of the Czech Republic, GA405/09/0278 of the Grant Agency of the Czech Republic and 1ET101120503 of the Academy of Sciences of the Czech Republic. The research was performed by Drahomíra "johanka" Spoustová, Jan Hajič, Jan Raab and Miroslav Spousta. | ||
Identifier (URI): | http://hdl.handle.net/11858/00-097C-0000-0023-68D9-0 | |
Language: | English | |
Language (ISO639): | eng | |
Publisher: | Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) | |
Rights: | Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0); http://creativecommons.org/licenses/by-nc-sa/3.0/ | |
Subject: | MorphoDiTa; English; morphological analysis; morphological generation; PoS tagging; English language | |
Subject (ISO639): | eng | |
Type: | languageDescription | |
Type (DCMI): | Text | |
Type (OLAC): | language_description | |
OLAC Info | ||
Archive: | LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University | |
Description: | http://www.language-archives.org/archive/lindat.mff.cuni.cz | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info | ||
OaiIdentifier: | oai:lindat.mff.cuni.cz:11858/00-097C-0000-0023-68D9-0 | |
DateStamp: | 2021-06-29 | |
GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
Citation: | Straka, Milan; Straková, Jana. 2014. Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL). | |
Terms: | area_Europe country_GB dcmi_Text iso639_eng olac_language_description | |
Inferred Metadata | ||
Country: | United Kingdom | |
Area: | Europe |