OLAC Record
oai:www.ldc.upenn.edu:LDC2006S46

Metadata
Title:Arabic Broadcast News Speech
Access Rights:Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:Maamouri, Mohamed, David Graff, and Christopher Cieri. Arabic Broadcast News Speech LDC2006S46. Web Download. Philadelphia: Linguistic Data Consortium, 2006
Contributor:Maamouri, Mohamed
Graff, David
Cieri, Christopher
Date (W3CDTF):2006
Date Issued (W3CDTF):2006-12-19
Description:*Introduction* Arabic Broadcast News Speech was developed by the Linguistic Data Consortium (LDC) and contains eight audio files totalling 10 hours of Arabic broadcast speech. The data was recorded by LDC from Voice of America (VOA) satellite radio news broadcasts in Arabic transmitted between June 2000 and January 2001. The corresponding transcripts for these speech files are available in Arabic Broadcast News Transcripts (LDC2006T20). This work was undertaken in the Networking Data Centers (NetDC) project (MLIS-5017, NSF IIS-9982201) in conjunction with the European Language Resources Association (ELRA). ELRA collected 22.5 hours of Arabic broadcast data from Radio Orient (France) that is available in NetDC Arabic Broadcast News Speech Corpus (ELRA-S0157). The goal of the NetDC project was to improve the infrastructure for language resources by designing and implementing new modes of cooperation between LDC and ELRA.
*Data* The recordings were captured from a dedicated satellite receiver and stored as 16-bit PCM, 16-kHz, single-channel, in NIST SPHERE format. The duration of each recording is either 60 minutes or 120 minutes, depending on the VOA broadcast schedule. The date (YYYYMMDD), start-time, and end-time (HHMM EST) for each recording are indicated in the file names. The sample data are not compressed.
*Samples* For an example of the speech in this corpus, please listen to this sample (WAV).
*Updates* None at this time.
Extent:Corpus size: 1153433 KB
Format:Sampling Rate: 16000
Sampling Format: pcm
Identifier:LDC2006S46
https://catalog.ldc.upenn.edu/LDC2006S46
ISBN: 1-58563-419-0
ISLRN: 537-141-493-555-6
DOI: 10.35111/njz5-b969
Language:Standard Arabic
Language (ISO639):arb
License:LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf
Medium:Distribution: Web Download
Publisher:Linguistic Data Consortium
Publisher (URI):https://www.ldc.upenn.edu
Relation (URI):https://catalog.ldc.upenn.edu/docs/LDC2006S46
Rights Holder:Portions © 2000, 2001, 2002, 2005, 2006 Trustees of the University of Pennsylvania
Type (DCMI):Sound
Type (OLAC):primary_text
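
As a rough consistency check on the figures above, the following sketch computes the raw audio size implied by the stated format (16-bit PCM, 16 kHz, single channel, about 10 hours, uncompressed) and compares it with the listed Extent. The record does not say whether "KB" means 1000 or 1024 bytes, and "10 hours" is presumably rounded, so both readings are shown and the result is only an approximate sanity check. The function and constant names are illustrative and do not come from the record.

```python
# Figures copied from the Description, Format and Extent fields of this record.
SAMPLE_RATE_HZ = 16_000      # "Sampling Rate: 16000"
BYTES_PER_SAMPLE = 2         # 16-bit PCM
CHANNELS = 1                 # single-channel
TOTAL_HOURS = 10             # "totalling 10 hours" (presumably rounded)
LISTED_SIZE_KB = 1_153_433   # "Corpus size: 1153433 KB"


def raw_audio_bytes(hours, rate=SAMPLE_RATE_HZ,
                    width=BYTES_PER_SAMPLE, channels=CHANNELS):
    """Size in bytes of uncompressed PCM audio, excluding file headers."""
    seconds = hours * 3600
    return int(seconds * rate * width * channels)


expected_bytes = raw_audio_bytes(TOTAL_HOURS)  # 1_152_000_000 bytes

# Assumption: the unit of "KB" is unspecified, so try both conventions.
for label, bytes_per_kb in (("KB = 1000 bytes", 1000), ("KB = 1024 bytes", 1024)):
    listed_bytes = LISTED_SIZE_KB * bytes_per_kb
    ratio = listed_bytes / expected_bytes
    print(f"{label}: listed/expected = {ratio:.4f}")
```

Under either reading the listed size is within a few percent of the computed value, which is what one would expect for uncompressed audio plus SPHERE headers and any accompanying files.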

OLAC Info

Archive:  The LDC Corpus Catalog
Description:  http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.ldc.upenn.edu:LDC2006S46
DateStamp:  2021-04-16
GetRecord:  OAI-PMH request for simple DC format
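
The GetRecord entries above refer to OAI-PMH requests for this record. A minimal sketch of how such a request URL is typically assembled is shown below. The base URL is a hypothetical placeholder, since the record does not state the repository endpoint. `oai_dc` is the metadata prefix that OAI-PMH defines for simple Dublin Core; `olac` is assumed here to be the prefix for the OLAC format. The sketch only builds the URLs and does not contact any server.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: the real OAI-PMH endpoint is not given in this record.
BASE_URL = "https://example.org/oai"

# Identifier taken from the OaiIdentifier field of this record.
OAI_IDENTIFIER = "oai:www.ldc.upenn.edu:LDC2006S46"


def get_record_url(identifier, metadata_prefix, base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# Simple Dublin Core request (prefix defined by the OAI-PMH specification).
dc_url = get_record_url(OAI_IDENTIFIER, "oai_dc")

# OLAC-format request (prefix name is an assumption, see lead-in).
olac_url = get_record_url(OAI_IDENTIFIER, "olac")

print(dc_url)
print(olac_url)
```

Using `urlencode` rather than string concatenation ensures that the colons in the identifier are percent-encoded in the query string.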

Search Info

Citation: Maamouri, Mohamed; Graff, David; Cieri, Christopher. 2006. Linguistic Data Consortium.
Terms: area_Asia country_SA dcmi_Sound iso639_arb olac_primary_text


http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC2006S46
Up-to-date as of: Thu Oct 24 7:30:17 EDT 2024