OLAC Record

Title:MASRI Synthetic
Access Rights:Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:Hernández Mena, Carlos Daniel, et al. MASRI Synthetic LDC2022S08. Web Download. Philadelphia: Linguistic Data Consortium, 2022
Contributor:Hernández Mena, Carlos Daniel
Gatt, Albert
Borg, Claudia
DeMarco, Andrea
van der Plas, Lonneke
Date (W3CDTF):2022
Date Issued (W3CDTF):2022-09-15
Description:*Introduction* MASRI (Maltese Automatic Speech Recognition I) Synthetic was developed by the MASRI team at the University of Malta and consists of approximately 99 hours of synthesized Maltese speech.
*Data* Source sentences were extracted from the Maltese Language Resource Server (MLRS) corpus, which comprises written or transcribed Maltese covering various genres, including parliamentary debates, news, law, opinion, sports, culture, academic, literature and religious texts. The text was processed through the CrimsonWing text-to-speech system to generate speech files. Synthesized speech was created with 210 voices (105 male and 105 female). Audio files are presented as 16 kHz, 16-bit, single-channel FLAC files; when decompressed, they yield PCM WAV files. Transcripts are contained in a single plain text file encoded as UTF-8.
*Samples* Please view the following samples: Female Audio (FLAC); Female Transcript (TXT); Male Audio (FLAC); Male Transcript (TXT).
*Updates* None at this time.
Extent:Corpus size: 7039442 KB
Format:Sampling Rate: 16000
Sampling Format: flac
ISBN: 1-58563-995-8
ISLRN: 518-019-551-096-3
DOI: 10.35111/wc8h-h752
Language (ISO639):mlt
License:MASRI Synthetic Agreement: https://catalog.ldc.upenn.edu/license/masri-synthetic-agreement.pdf
Medium:Distribution: Web Download
Publisher:Linguistic Data Consortium
Publisher (URI):https://www.ldc.upenn.edu
Relation (URI):https://catalog.ldc.upenn.edu/docs/LDC2022S08
Rights Holder:Portions © 2022 University of Malta, © 2022 Trustees of the University of Pennsylvania
Type (DCMI):Sound
Type (OLAC):primary_text
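
As a rough sanity check on the audio format given in the Description (16 kHz, 16-bit, single channel), the uncompressed PCM payload can be estimated from the sampling rate, sample width, and channel count. The sketch below is illustrative only: the function name is made up, the 99-hour duration is the approximate figure from the Description, WAV header overhead is ignored, and the Extent value is assumed to use 1 KB = 1024 bytes.

```python
def pcm_bytes(hours, sample_rate_hz=16_000, bits_per_sample=16, channels=1):
    """Estimate the raw PCM size in bytes for a given duration in hours."""
    seconds = hours * 3600
    bytes_per_second = sample_rate_hz * (bits_per_sample // 8) * channels
    return int(seconds * bytes_per_second)


# Approximate duration taken from the Description field.
approx_hours = 99
uncompressed = pcm_bytes(approx_hours)

# Extent field: 7039442 KB (assumption: 1 KB = 1024 bytes).
catalog_bytes = 7_039_442 * 1024

print(f"Estimated uncompressed PCM: {uncompressed / 1e9:.1f} GB")
print(f"Catalog extent:             {catalog_bytes / 1e9:.1f} GB")
print(f"Extent / PCM estimate:      {catalog_bytes / uncompressed:.2f}")
```

The Extent figure covers the whole distribution (FLAC audio plus transcripts and any documentation), so the ratio is only a loose indication of how much the FLAC encoding saves over raw PCM.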


Archive:  The LDC Corpus Catalog
Description:  http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.ldc.upenn.edu:LDC2022S08
DateStamp:  2023-01-01
GetRecord:  OAI-PMH request for simple DC format
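
The two GetRecord entries correspond to standard OAI-PMH requests for this identifier in different metadata formats. The following is a minimal sketch of how such request URLs could be built. The base URL is a placeholder, since this record does not state the repository endpoint, and the `olac` metadata prefix is assumed to be the one the repository exposes for OLAC format (`oai_dc` is the prefix OAI-PMH requires for simple Dublin Core).

```python
from urllib.parse import urlencode

# Placeholder endpoint: the record does not give the OAI-PMH base URL.
BASE_URL = "https://example.org/oai"

# Identifier copied from the OaiIdentifier field of this record.
OAI_ID = "oai:www.ldc.upenn.edu:LDC2022S08"


def get_record_url(identifier, metadata_prefix, base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL for one record and one format."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


olac_url = get_record_url(OAI_ID, "olac")   # OLAC format (assumed prefix)
dc_url = get_record_url(OAI_ID, "oai_dc")   # simple Dublin Core

print(olac_url)
print(dc_url)
```

Fetching either URL from a real endpoint would return an XML response wrapping the record; parsing it is left out here because the response depends on the repository.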

Search Info

Citation: Hernández Mena, Carlos Daniel; Gatt, Albert; Borg, Claudia; DeMarco, Andrea; van der Plas, Lonneke. 2022. MASRI Synthetic. Linguistic Data Consortium.
Terms: area_Europe country_MT dcmi_Sound dcmi_Text iso639_mlt olac_primary_text

Up-to-date as of: Sun Jun 16 7:35:11 EDT 2024