OLAC Record oai:www.ldc.upenn.edu:LDC2023L01 |
Metadata | ||
Title: | Moroccan Arabic - English Lexical Database | |
Access Rights: | Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining | |
Bibliographic Citation: | Maamouri, Mohamed, and David Graff. Moroccan Arabic - English Lexical Database LDC2023L01. Web Download. Philadelphia: Linguistic Data Consortium, 2023 | |
Contributor: | Maamouri, Mohamed | |
Graff, David | ||
Date (W3CDTF): | 2023 | |
Date Issued (W3CDTF): | 2023-06-15 | |
Description: | *Introduction* Moroccan Arabic - English Lexical Database was developed by the Linguistic Data Consortium (LDC). It comprises a set of five interrelated tables presenting each Moroccan Arabic word as an orthographic form in Arabic script and a pronunciation form in International Phonetic Alphabet (IPA) format. This release contains over 21,000 Moroccan Arabic words in Arabic script and IPA notation and more than 33,000 English tokens. This lexical database is the result of a collaboration with Georgetown University Press (GUP) to enhance and update three dialectal Arabic dictionaries -- Iraqi, Moroccan and Syrian -- originally published in paper form in the 1960s by GUP. LDC also undertook to develop a lexical database for each dialect. The Georgetown Dictionary of Moroccan Arabic was published in 2019; this work was based on, and expanded, A Dictionary of Moroccan Arabic. The enhancements developed by LDC included facilitating comparisons across Arabic dialects and Modern Standard Arabic by providing Arabic script spellings and IPA pronunciations for Moroccan words and phrases; promoting ease of use by language learners and researchers by developing reasonable orthographic conventions for applying the Arabic alphabet to the dialect; and facilitating a user's understanding of morphological and lexical relations by adding information on the linguistic structures of Moroccan Arabic. *Data* The number of entries in each table is as follows: * Roots 3,567 * Lemmas 14,255 * Wordforms 19,927 * Definitions 24,911 * Phrases 4,418 Each table is presented as a tab-delimited, plain-text file with Unicode UTF-8 character encoding and UNIX/Linux-style line terminations (line-feed character only, no carriage-return). *Acknowledgments* This work was supported by the U.S. Department of Education International Research Studies Program (#P017A0800441) with additional support from GUP and LDC. 
*Samples* Please view these samples: * Roots * Lemmas * Wordforms * Definitions * Phrases *Updates* None at this time. | |
Extent: | Corpus size: 3313 KB | |
Identifier: | LDC2023L01 | |
https://catalog.ldc.upenn.edu/LDC2023L01 | ||
ISLRN: 107-292-828-045-8 | ||
DOI: 10.35111/8fz8-r860 | ||
Language: | Moroccan Arabic | |
English | ||
Language (ISO639): | ary | |
eng | ||
License: | Moroccan Arabic - English Lexical Database Agreement: https://catalog.ldc.upenn.edu/license/moroccan-arabic-english-lexical-database-agreement.pdf | |
Medium: | Distribution: Web Download | |
Publisher: | Linguistic Data Consortium | |
Publisher (URI): | https://www.ldc.upenn.edu | |
Relation (URI): | https://catalog.ldc.upenn.edu/docs/LDC2023L01 | |
Rights Holder: | Portions © 2023 Georgetown University Press, © 2023 Trustees of the University of Pennsylvania | |
Type (DCMI): | Text | |
Type (OLAC): | lexicon | |
OLAC Info |
||
Archive: | The LDC Corpus Catalog | |
Description: | http://www.language-archives.org/archive/www.ldc.upenn.edu | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info |
||
OaiIdentifier: | oai:www.ldc.upenn.edu:LDC2023L01 | |
DateStamp: | 2024-01-01 | |
GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
Citation: | Maamouri, Mohamed; Graff, David. 2023. Linguistic Data Consortium. | |
Terms: | area_Africa area_Europe country_GB country_MA dcmi_Text iso639_ary iso639_eng olac_lexicon |
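The Description above states that each table is a tab-delimited, plain-text file in UTF-8 with line-feed line endings. As a minimal sketch of how such a file might be loaded, the Python function below reads one table into a list of rows. The function name `read_table` and the file name in the usage note are hypothetical, and the sketch makes no assumption about a header row or about column names, since the record does not describe them.

```python
import csv
from pathlib import Path


def read_table(path):
    """Read one tab-delimited UTF-8 file into a list of rows (lists of strings).

    Assumption: fields are separated by single tab characters and are not
    quoted, so quote characters are kept as literal text.
    """
    # newline="" leaves line-ending handling to the csv module
    with Path(path).open(encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return [row for row in reader]
```

A call such as `rows = read_table("lemmas.tsv")` (hypothetical file name) would return every line as a list of column values; the corpus documentation at the Relation URI should be consulted for the actual file names and column layout.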