OLAC Record
oai:www.ldc.upenn.edu:LDC93S4B

Metadata
Title:ATIS0 Pilot
Access Rights:Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:Hemphill, Charles T., et al. ATIS0 Pilot LDC93S4B. Web Download. Philadelphia: Linguistic Data Consortium, 1993
Contributor:Hemphill, Charles T.
Godfrey, John J.
Doddington, George R.
Garofolo, John S.
Fiscus, Jonathan G.
Dahlgren, Nancy
Fisher, William
Tjaden, Brett
Pallett, David
Date (W3CDTF):1993
Description:LDC93S4A - Complete ATIS0 corpus
LDC93S4B - ATIS0 Pilot
LDC93S4B-2 - ATIS0 Read
LDC93S4B-3 - ATIS0 SD-Read
The ATIS0 Corpus comprises six parts: one with spontaneous data from 36 speakers; one with read versions of the data from 20 of those speakers, along with some adaptation material; and four with extensive speaker-dependent material from the ATIS domain, read by ten of the same speakers. All ATIS speech data was recorded at a 16 kHz sample rate with 16-bit quantization, from two different microphones: a close-talking model (Sennheiser HMD414) and a desk-top model (Crown PCC-160).
The first disc (ATIS0 Pilot) contains spontaneous utterances elicited in a "Wizard-of-Oz" simulation, along with the relational database containing the travel information (excluding connecting flights). 36 speakers produced a total of 912 utterances.
The second disc (ATIS0 Read) contains "read" versions of the spontaneous utterances for 20 of the 36 speakers above, for a total of 478 productions. This is supplemented by a set of 40 "adaptation" sentences read by each of the 20 speakers.
The third through sixth discs (ATIS0 SD-Read) contain "read" speech in the ATIS domain for ten of the speakers on the first disc. They read a total of 3,171 utterances, or approximately 317 utterances per speaker. This data was collected for the purpose of training speaker-dependent speech recognition systems for the ATIS0 domain. Two of these four discs contain the close-talking (Sennheiser HMD414) microphone data and the other two contain the corresponding data for the desk-top (Crown PCC-160) microphone. Thus there are 6,342 waveform files on the four discs.
Format:Sampling Rate: 16000
Sampling Format: 1-channel pcm
Identifier:LDC93S4B
https://catalog.ldc.upenn.edu/LDC93S4B
ISBN: 1-58563-002-0
ISLRN: 477-521-980-972-9
DOI: 10.35111/4t8c-r397
Language:English
Language (ISO639):eng
License:LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf
Medium:Distribution: Web Download
Publisher:Linguistic Data Consortium
Publisher (URI):https://www.ldc.upenn.edu
Relation (URI):https://catalog.ldc.upenn.edu/docs/LDC93S4B
Rights Holder:Portions © 1993 Trustees of the University of Pennsylvania
Type (DCMI):Sound
Type (OLAC):primary_text
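
The counts and format values quoted in the Description and Format fields can be cross-checked with a little arithmetic. The sketch below is illustrative only: the function and constant names are hypothetical, and the byte-rate figure assumes raw PCM samples with no file header overhead, which this record does not specify.

```python
# Figures copied from the Description and Format fields of this record.
SD_READ_UTTERANCES = 3171   # utterances read by the ten SD-Read speakers
SD_READ_SPEAKERS = 10
MICROPHONES = 2             # Sennheiser HMD414 and Crown PCC-160
SAMPLE_RATE_HZ = 16000
BITS_PER_SAMPLE = 16
CHANNELS = 1                # "1-channel pcm"


def waveform_files(utterances: int, microphones: int) -> int:
    """One waveform file per utterance per microphone."""
    return utterances * microphones


def utterances_per_speaker(utterances: int, speakers: int) -> float:
    """Average number of utterances read by each speaker."""
    return utterances / speakers


def pcm_bytes_per_second(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Raw PCM data rate, ignoring any file header (an assumption)."""
    return sample_rate_hz * (bits_per_sample // 8) * channels


print(waveform_files(SD_READ_UTTERANCES, MICROPHONES))             # 6342
print(utterances_per_speaker(SD_READ_UTTERANCES, SD_READ_SPEAKERS))  # 317.1
print(pcm_bytes_per_second(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS))  # 32000
```

The first result matches the 6,342 waveform files stated for the four SD-Read discs, and the second matches the "approximately 317 utterances per speaker" figure.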

OLAC Info

Archive:  The LDC Corpus Catalog
Description:  http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.ldc.upenn.edu:LDC93S4B
DateStamp:  2020-11-30
GetRecord:  OAI-PMH request for simple DC format
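
The GetRecord entries in this record refer to OAI-PMH requests. As a rough sketch of how such a request URL is assembled, the code below builds one from the OaiIdentifier shown here. The base URL is a hypothetical placeholder, since the endpoint address is not shown in this record. `oai_dc` is the metadata prefix OAI-PMH defines for simple Dublin Core, while `olac` is assumed here to be the prefix for the OLAC format.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: the real OAI-PMH endpoint is not given in this record.
BASE_URL = "https://example.org/oai"

# Taken from the OaiIdentifier field of this record.
OAI_IDENTIFIER = "oai:www.ldc.upenn.edu:LDC93S4B"


def build_getrecord_url(base_url: str, identifier: str, metadata_prefix: str) -> str:
    """Build an OAI-PMH GetRecord request URL with percent-encoded parameters."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


# "olac" is an assumed prefix for the OLAC format; "oai_dc" is simple Dublin Core.
print(build_getrecord_url(BASE_URL, OAI_IDENTIFIER, "olac"))
print(build_getrecord_url(BASE_URL, OAI_IDENTIFIER, "oai_dc"))
```

The colons in the identifier are percent-encoded as `%3A` by `urlencode`, which keeps the query string unambiguous.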

Search Info

Citation: Hemphill, Charles T.; Godfrey, John J.; Doddington, George R.; Garofolo, John S.; Fiscus, Jonathan G.; Dahlgren, Nancy; Fisher, William; Tjaden, Brett; Pallett, David. 1993. Linguistic Data Consortium.
Terms: area_Europe country_GB dcmi_Sound iso639_eng olac_primary_text


http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC93S4B
Up-to-date as of: Mon Mar 25 7:19:49 EDT 2024