OLAC Record
oai:lindat.mff.cuni.cz:11372/LRT-5102

Metadata
Title:CALEM (Comprehensive Arabic LEMmas)
Bibliographic Citation:http://hdl.handle.net/11372/LRT-5102
Creator:Namly, Driss
Bouzoubaa, Karim
El Jihad, Abdelhamid
Date (W3CDTF):2023-03-27T14:29:53Z
Date Available:2023-03-27T14:29:53Z
Description:Comprehensive Arabic LEMmas (CALEM) is a lexicon covering a large list of Arabic lemmas and their corresponding inflected word forms (stems), with details (POS + root). Each lexical entry represents a lemma followed by all its possible stems, and each stem is enriched with its morphological features, in particular the root and the POS. It is composed of 164,845 lemmas representing 7,200,918 stems, detailed as follows: 757 Arabic particles; 2,464,631 verbal stems; 4,735,587 nominal stems. The lexicon is provided as an LMF-conformant XML file in UTF-8 encoding, which represents about 1.22 GB of data. Citation: Namly Driss, Karim Bouzoubaa, Abdelhamid El Jihad, and Si Lhoussain Aouragh. “Improving Arabic Lemmatization Through a Lemmas Database and a Machine-Learning Technique.” In Recent Advances in NLP: The Case of Arabic Language, pp. 81-100. Springer, Cham, 2020.
Identifier (URI):http://hdl.handle.net/11372/LRT-5102
Language:Arabic
Language (ISO639):ara
Publisher:ALELM
Rights:Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
http://creativecommons.org/licenses/by-nc-sa/4.0/
Subject:lexicon
lemmatization
stemming
Arabic language
Subject (ISO639):ara
Type:lexicalConceptualResource
Type (DCMI):Text
Type (OLAC):lexicon
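
Because the description gives the lexicon as a single LMF-conformant XML file of roughly 1.22 GB, loading it whole into memory is likely impractical. The following is a minimal sketch of streaming it entry by entry with Python's standard library. The element and attribute names (LexicalEntry, Lemma, WordForm, feat with att/val) follow the common LMF XML layout and are an assumption here; the actual CALEM file may use different names. The sample document and its values are invented for illustration and are not taken from the lexicon.

```python
import io
import xml.etree.ElementTree as ET


def feats(elem):
    """Collect the direct <feat att="..." val="..."/> children of elem into a dict."""
    return {f.get("att"): f.get("val") for f in elem.findall("feat")}


def iter_entries(source):
    """Yield (lemma_features, [word_form_features, ...]) for each LexicalEntry.

    Element names are assumed from the usual LMF layout, not confirmed
    against the CALEM file itself.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "LexicalEntry":
            lemma = elem.find("Lemma")
            lemma_feats = feats(lemma) if lemma is not None else {}
            forms = [feats(wf) for wf in elem.findall("WordForm")]
            yield lemma_feats, forms
            # Release the entry's subtree once it has been consumed.
            elem.clear()


# Toy sample with made-up placeholder values (not CALEM data).
SAMPLE = b"""<LexicalResource><Lexicon>
<LexicalEntry>
  <Lemma><feat att="writtenForm" val="lemma1"/></Lemma>
  <WordForm><feat att="writtenForm" val="stem1"/><feat att="root" val="root1"/></WordForm>
  <WordForm><feat att="writtenForm" val="stem2"/><feat att="root" val="root1"/></WordForm>
</LexicalEntry>
</Lexicon></LexicalResource>"""

entries = list(iter_entries(io.BytesIO(SAMPLE)))
```

For the real file, `iter_entries` could be given a path or an open binary file object instead of the in-memory sample.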

OLAC Info

Archive:  LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University
Description:  http://www.language-archives.org/archive/lindat.mff.cuni.cz
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:lindat.mff.cuni.cz:11372/LRT-5102
DateStamp:  2023-03-27
GetRecord:  OAI-PMH request for simple DC format
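
The GetRecord entries above refer to OAI-PMH requests for this record. As a rough sketch, such a request is an HTTP GET whose query string carries the verb, the record identifier, and a metadata prefix. The base URL below is a placeholder, not the archive's real endpoint, and the helper name `get_record_url` is hypothetical; `oai_dc` is the simple Dublin Core prefix defined by OAI-PMH, and `olac` is assumed to be the prefix for the OLAC format.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Placeholder endpoint (assumption): substitute the archive's actual OAI-PMH base URL.
BASE_URL = "https://example.org/oai/request"

# Identifier taken from the OaiIdentifier field of this record.
OAI_IDENTIFIER = "oai:lindat.mff.cuni.cz:11372/LRT-5102"


def get_record_url(base_url, identifier, metadata_prefix="oai_dc"):
    """Build an OAI-PMH GetRecord request URL (hypothetical helper)."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


dc_url = get_record_url(BASE_URL, OAI_IDENTIFIER)            # simple DC format
olac_url = get_record_url(BASE_URL, OAI_IDENTIFIER, "olac")  # OLAC format (assumed prefix)
```

The identifier contains `:` and `/`, so it is percent-encoded by `urlencode` rather than pasted into the URL by hand.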

Search Info

Citation: Namly, Driss; Bouzoubaa, Karim; El Jihad, Abdelhamid. 2023. ALELM.
Terms: dcmi_Text iso639_ara olac_lexicon

Inferred Metadata

Country: 
Area: 


http://www.language-archives.org/item.php/oai:lindat.mff.cuni.cz:11372/LRT-5102
Up-to-date as of: Thu Oct 5 0:43:34 EDT 2023