OLAC Record oai:www.ldc.upenn.edu:LDC93S4B |
Metadata | ||
Title: | ATIS0 Pilot | |
Access Rights: | Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining | |
Bibliographic Citation: | Hemphill, Charles T., et al. ATIS0 Pilot LDC93S4B. Web Download. Philadelphia: Linguistic Data Consortium, 1993 | |
Contributor: | Hemphill, Charles T. | |
Godfrey, John J. | ||
Doddington, George R. | ||
Garofolo, John S. | ||
Fiscus, Jonathan G. | ||
Dahlgren, Nancy | ||
Fisher, William | ||
Tjaden, Brett | ||
Pallett, David | ||
Date (W3CDTF): | 1993 | |
Description: | LDC93S4A - Complete ATIS0 corpus; LDC93S4B - ATIS0 Pilot; LDC93S4B-2 - ATIS0 Read; LDC93S4B-3 - ATIS0 SD-Read. The ATIS0 Corpus comprises six parts: one with spontaneous data from 36 speakers; one with read versions of the data from 20 of those speakers, along with some adaptation material; and four with extensive speaker-dependent material from the ATIS domain, read by ten of the same speakers. All ATIS speech data was recorded at a 16 kHz sample rate with 16-bit quantization, from two different microphones: a close-talking model (Sennheiser HMD414) and a desk-top model (Crown PCC-160). The first disc (ATIS0 Pilot) contains spontaneous utterances elicited in a "Wizard-of-Oz" simulation, along with the relational database containing the travel information (excluding connecting flights). 36 speakers produced a total of 912 utterances. The second disc (ATIS0 Read) contains "read" versions of the spontaneous utterances for 20 of the 36 speakers above, for a total of 478 productions. This is supplemented by a set of 40 "adaptation" sentences read by each of the 20 speakers. The third through sixth discs (ATIS0 SD-Read) contain "read" speech in the ATIS domain for ten of the speakers on the first disc. They read a total of 3,171 utterances, or approximately 317 utterances per speaker. This data was collected for the purpose of training speaker-dependent speech recognition systems for the ATIS0 domain. Two of these four discs contain the close-talking (Sennheiser) microphone data and the other two contain the corresponding data for the desk-top (Crown PCC-160) microphone. Thus there are 6,342 waveform files on the four discs (3,171 utterances for each of the two microphones). | |
Format: | Sampling Rate: 16000 | |
Sampling Format: 1-channel pcm | ||
Identifier: | LDC93S4B | |
https://catalog.ldc.upenn.edu/LDC93S4B | ||
ISBN: 1-58563-002-0 | ||
ISLRN: 477-521-980-972-9 | ||
DOI: 10.35111/4t8c-r397 | ||
Language: | English | |
Language (ISO639): | eng | |
License: | LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf | |
Medium: | Distribution: Web Download | |
Publisher: | Linguistic Data Consortium | |
Publisher (URI): | https://www.ldc.upenn.edu | |
Relation (URI): | https://catalog.ldc.upenn.edu/docs/LDC93S4B | |
Rights Holder: | Portions © 1993 Trustees of the University of Pennsylvania | |
Type (DCMI): | Sound | |
Type (OLAC): | primary_text | |
OLAC Info |
||
Archive: | The LDC Corpus Catalog | |
Description: | http://www.language-archives.org/archive/www.ldc.upenn.edu | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info |
||
OaiIdentifier: | oai:www.ldc.upenn.edu:LDC93S4B | |
DateStamp: | 2020-11-30 | |
GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
Citation: | Hemphill, Charles T.; Godfrey, John J.; Doddington, George R.; Garofolo, John S.; Fiscus, Jonathan G.; Dahlgren, Nancy; Fisher, William; Tjaden, Brett; Pallett, David. 1993. Linguistic Data Consortium. | |
Terms: | area_Europe country_GB dcmi_Sound iso639_eng olac_primary_text |
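
The GetRecord entries above refer to OAI-PMH requests for this record. As a rough sketch of how such a request URL is typically assembled, the Python below combines the three standard GetRecord arguments (verb, identifier, metadataPrefix). The base URL is a hypothetical placeholder, since the record does not state the repository endpoint, and the prefix `olac` for the OLAC format is an assumption; `oai_dc` is the prefix OAI-PMH defines for simple Dublin Core.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the record does not give the OAI-PMH base URL.
BASE_URL = "https://example.org/oai"

# Identifier taken from the OaiIdentifier field of this record.
OAI_IDENTIFIER = "oai:www.ldc.upenn.edu:LDC93S4B"


def get_record_url(identifier: str, metadata_prefix: str, base_url: str = BASE_URL) -> str:
    """Build an OAI-PMH GetRecord request URL without sending it."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


# Simple DC format ("oai_dc" is defined by the OAI-PMH specification).
dc_url = get_record_url(OAI_IDENTIFIER, "oai_dc")

# OLAC format (the prefix "olac" is assumed here).
olac_url = get_record_url(OAI_IDENTIFIER, "olac")

if __name__ == "__main__":
    print(dc_url)
    print(olac_url)
```

The sketch only builds the URLs; fetching them would require the repository's actual base URL in place of the placeholder.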