OLAC Record oai:www.ldc.upenn.edu:LDC2004S05 |
Metadata | ||
Title: | ISL Meeting Speech Part 1 | |
Access Rights: | Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining | |
Bibliographic Citation: | Burger, Susanne, Victoria MacLaren, and Alex Waibel. ISL Meeting Speech Part 1 LDC2004S05. Web Download. Philadelphia: Linguistic Data Consortium, 2004 | |
Contributor: | Burger, Susanne | |
MacLaren, Victoria | ||
Waibel, Alex | ||
Date (W3CDTF): | 2004 | |
Date Issued (W3CDTF): | 2004-05-21 | |
Description: | *Introduction* ISL Meeting Speech Part 1 was produced by the Linguistic Data Consortium (LDC) and contains 10 hours of recorded English meeting speech. This is the first subset of the ISL Meeting Corpus (112 meetings). It contains 18 meetings collected at the Interactive Systems Laboratories (ISL) at Carnegie Mellon University in Pittsburgh, PA, during the years 2000-2001. The recorded meetings were either natural meetings, where participants needed to meet in the real world, or artificial meetings, which were designed explicitly for the purposes of data collection but still had real topics and tasks. The duration of the meetings in this corpus ranges from eight to 64 minutes and averages 34 minutes. The associated word-level orthographic transcriptions for these speech files are available as ISL Meeting Transcripts Part 1 (LDC2004T10). *Data* During meeting recordings, each speaker wore an individual lapel microphone and was recorded via an Alesis 8-channel mix board and an ECHO Layla 8-channel sound card. This setup was designed to obtain a consumer- or application-style sound quality. All meetings were recorded in the same instrumented meeting area. The speech for each meeting consists of one wave file per channel and a wave file containing a mix of all channels. In total, there are 105 audio files totaling 54 hours of audio, which represent 10 hours of meeting speech. The audio was collected at a 16 kHz sample rate. Audio files for each meeting are provided as separate time-synchronous recordings for each channel, encoded as 16-bit (little-endian) wave files. There are a total of 31 unique speakers in the corpus. Meetings involved anywhere from three to nine participants, averaging five. The corpus contains a significant proportion of non-native English speakers, varying in fluency. *Samples* For an example of the data in this corpus, please listen to this sample (WAV). 
*Sponsorship* The collection and preparation of this corpus were made possible in large part through funding from DARPA, both through the GENOA project and through ROAR. *Updates* None at this time. | |
Extent: | Corpus size: 6081740 KB | |
Format: | Sampling Rate: 16000 | |
Sampling Format: pcm | ||
Identifier: | LDC2004S05 | |
https://catalog.ldc.upenn.edu/LDC2004S05 | ||
ISBN: 1-58563-294-5 | ||
ISLRN: 459-840-211-562-6 | ||
DOI: 10.35111/64zw-4k57 | ||
Language: | English | |
Language (ISO639): | eng | |
License: | LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf | |
Medium: | Distribution: Web Download | |
Publisher: | Linguistic Data Consortium | |
Publisher (URI): | https://www.ldc.upenn.edu | |
Relation (URI): | https://catalog.ldc.upenn.edu/docs/LDC2004S05 | |
Rights Holder: | Portions © 2000-2003 Interactive Systems Laboratories, Carnegie Mellon University, Pittsburgh, © 2004 Trustees of the University of Pennsylvania | |
Type (DCMI): | Sound | |
Type (OLAC): | primary_text | |
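
The figures in the record above (54 hours of audio, 16 kHz sampling, 16-bit PCM, 6081740 KB listed extent) can be cross-checked with a little arithmetic. The sketch below is only a rough consistency check, not part of the record. It assumes that every file holds a single channel, that "KB" means 1024 bytes, and that the 54-hour total is rounded; the function name `pcm_size_kb` is made up for illustration.

```python
# Rough consistency check of the record's audio figures.
# Assumptions (not stated in the record): one channel per file,
# 1 KB = 1024 bytes, and WAV header overhead is negligible.

SAMPLE_RATE_HZ = 16_000      # "Sampling Rate: 16000"
BYTES_PER_SAMPLE = 2         # 16-bit PCM
TOTAL_AUDIO_HOURS = 54       # "105 audio files totaling 54 hours"
LISTED_SIZE_KB = 6_081_740   # "Corpus size: 6081740 KB"


def pcm_size_kb(hours, sample_rate_hz=SAMPLE_RATE_HZ,
                bytes_per_sample=BYTES_PER_SAMPLE):
    """Size in KB of uncompressed single-channel PCM audio of the given length."""
    seconds = hours * 3600
    return seconds * sample_rate_hz * bytes_per_sample / 1024


estimated_kb = pcm_size_kb(TOTAL_AUDIO_HOURS)
relative_gap = abs(LISTED_SIZE_KB - estimated_kb) / LISTED_SIZE_KB

print(f"estimated: {estimated_kb:.0f} KB, listed: {LISTED_SIZE_KB} KB")
print(f"relative difference: {relative_gap:.2%}")
```

Under these assumptions the estimate lands within about one percent of the listed extent, which suggests the audio is stored as uncompressed PCM and that the stated duration and sampling parameters are mutually consistent.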
OLAC Info |
||
Archive: | The LDC Corpus Catalog | |
Description: | http://www.language-archives.org/archive/www.ldc.upenn.edu | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info |
||
OaiIdentifier: | oai:www.ldc.upenn.edu:LDC2004S05 | |
DateStamp: | 2024-03-29 | |
GetRecord: | OAI-PMH request for simple DC format | |
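
The GetRecord entries above refer to OAI-PMH requests for this record in OLAC and simple Dublin Core formats. As a minimal sketch of how such a request URL is put together under the OAI-PMH protocol, the snippet below combines the `GetRecord` verb, the record's OAI identifier, and a metadata prefix. The base URL is a hypothetical placeholder, since the record does not show the actual endpoint, and `build_getrecord_url` is an illustrative helper rather than part of any library.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the record above does not state the real base URL.
BASE_URL = "https://example.org/oai"

# Identifier taken from the "OaiIdentifier" field of the record.
OAI_IDENTIFIER = "oai:www.ldc.upenn.edu:LDC2004S05"


def build_getrecord_url(base_url, identifier, metadata_prefix):
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# "olac" is the prefix used for OLAC metadata; "oai_dc" is the simple
# Dublin Core prefix that every OAI-PMH repository must support.
olac_url = build_getrecord_url(BASE_URL, OAI_IDENTIFIER, "olac")
dc_url = build_getrecord_url(BASE_URL, OAI_IDENTIFIER, "oai_dc")

print(olac_url)
print(dc_url)
```

The colons in the identifier are percent-encoded by `urlencode`, which keeps the query string valid regardless of which characters an identifier contains.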
Search Info | ||
Citation: | Burger, Susanne; MacLaren, Victoria; Waibel, Alex. 2004. Linguistic Data Consortium. | |
Terms: | area_Europe country_GB dcmi_Sound iso639_eng olac_primary_text |