OLAC Record
oai:catalogue.elra.info:ELRA-S0388

Metadata
Title: GlobalPhone Bulgarian Pronunciation Dictionary 260k entries (extended version)
Access Rights: Rights available for: nonCommercialUse, commercialUse
Date Available (W3CDTF): 2017-04-06
Date Issued (W3CDTF): 2017-04-06
Date Modified (W3CDTF): 2017-04-06
Description: This extended version of the Bulgarian Pronunciation Dictionary, called Bulgarian-Dict260k, contains pronunciations of more than 260,000 word forms. Its phone set and format match those of the original GlobalPhone Bulgarian Pronunciation Dictionary (see ELRA-S0351), which covers 20,000 word forms. Bulgarian-Dict260k was built from an extension of the Bulgarian GlobalPhone text database, with the aim of improving language modeling and reducing the high out-of-vocabulary rate that results from the rich morphology of Bulgarian. For this purpose, roughly 9 million word tokens of national, international, and economic news were collected from the websites of the online newspapers "Banker" (http://www.banker.bg/), "Kesh" (http://www.cash.bg), and "Sega" (http://www.segabg.com/). After text cleaning and normalization, all word forms were extracted. Pronunciations were generated automatically using hand-crafted grapheme-to-phoneme rules and then manually cross-checked by native speakers to correct errors in the automatic output.
Identifier: ELRA-S0388
ISLRN: 799-402-906-876-5
Identifier (URI): https://catalog.elra.info/en-us/repository/browse/ELRA-S0388/
Language: Bulgarian
Language (ISO639): bul
Medium: Not specified
Publisher: ELRA (European Language Resources Association)
Type (DCMI): Text
Type (DCMI): Sound
Type (OLAC): lexicon

OLAC Info

Archive:  ELRA Catalogue of Language Resources
Description:  http://www.language-archives.org/archive/catalogue.elra.info
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:catalogue.elra.info:ELRA-S0388
DateStamp:  2017-04-06
GetRecord:  OAI-PMH request for simple DC format

Search Info

Citation: n.a. 2017. ELRA (European Language Resources Association).
Terms: area_Europe country_BG dcmi_Sound dcmi_Text iso639_bul olac_lexicon


http://www.language-archives.org/item.php/oai:catalogue.elra.info:ELRA-S0388
Up-to-date as of: Fri Mar 8 7:24:30 EST 2024