OLAC Record oai:www.ldc.upenn.edu:LDC2023S09 |
Metadata | ||
Title: | REMIX Telephone Collection | |
Access Rights: | Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining | |
Bibliographic Citation: | Graff, David, et al. REMIX Telephone Collection LDC2023S09. Web Download. Philadelphia: Linguistic Data Consortium, 2023 | |
Contributor: | Graff, David | |
Jones, Karen | ||
Strassel, Stephanie | ||
Walker, Kevin | ||
Date (W3CDTF): | 2023 | |
Date Issued (W3CDTF): | 2023-11-15 | |
Description: | *Introduction* REMIX Telephone Collection was developed by the Linguistic Data Consortium (LDC) and contains 320 hours of English conversational telephone speech from 358 speakers who had completed all tasks in one of the previous LDC Mixer collections, specifically, Mixers 4-7. The data was collected in 2012; recordings in this corpus were used to support the NIST 2012 Speaker Recognition Evaluation. *Data* The audio recordings were generated using LDC's computer telephony system, which collects speech from the telephone network. Recruited speakers were connected through a robot operator to carry on casual conversations of up to 10 minutes on suggested topics. Subjects were asked to complete 12 calls, half of them in a "noisy" environment. Examples of proposed noisy environments included using a speakerphone, calling from a busy street, noisy store or office, or calling from a room with loud background noise. The documentation for this release includes call topics, the number of calls per subject, the number of noisy calls, and certain speaker demographic information (e.g., year of birth, education level, occupation). The REMIX collection contains 1917 telephone recordings. The files are formatted as 2-channel, 8-bit, mu-law encoded sample data recorded at 8000 samples/second, with a NIST SPHERE-format header on each file. *Samples* SPH file *Updates* None at this time. | |
Extent: | Corpus size: 18057658 KB | |
Format: | Sampling Rate: 8000 | |
Sampling Format: mulaw | ||
Identifier: | LDC2023S09 | |
https://catalog.ldc.upenn.edu/LDC2023S09 | ||
ISLRN: 602-562-840-191-7 | ||
DOI: 10.35111/600z-f268 | ||
Language: | English | |
Language (ISO639): | eng | |
License: | LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf | |
Medium: | Distribution: Web Download | |
Publisher: | Linguistic Data Consortium | |
Publisher (URI): | https://www.ldc.upenn.edu | |
Relation (URI): | https://catalog.ldc.upenn.edu/docs/LDC2023S09 | |
Rights Holder: | Portions © 2012, 2023 Trustees of the University of Pennsylvania | |
Type (DCMI): | Sound | |
Type (OLAC): | primary_text | |
OLAC Info |
||
Archive: | The LDC Corpus Catalog | |
Description: | http://www.language-archives.org/archive/www.ldc.upenn.edu | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info |
||
OaiIdentifier: | oai:www.ldc.upenn.edu:LDC2023S09 | |
DateStamp: | 2024-01-01 | |
GetRecord: | OAI-PMH request for simple DC format | |
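The GetRecord entries above refer to OAI-PMH requests for this record in two metadata formats. As a rough sketch of how such a request URL is put together, the Python below builds the query string from the OaiIdentifier shown in this record. The base URL is a placeholder (this record does not state the repository endpoint), the helper name `get_record_url` is hypothetical, and the metadata prefixes `olac` and `oai_dc` are assumed to be the ones the repository exposes.

```python
from urllib.parse import urlencode

# Placeholder endpoint: the real OAI-PMH base URL is not given in this record.
BASE_URL = "https://example.org/oai"

# Identifier copied from the OaiIdentifier field of this record.
OAI_IDENTIFIER = "oai:www.ldc.upenn.edu:LDC2023S09"


def get_record_url(base_url: str, identifier: str, metadata_prefix: str) -> str:
    """Build an OAI-PMH GetRecord request URL (hypothetical helper)."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


# Assumed prefixes: "olac" for the OLAC format, "oai_dc" for simple Dublin Core.
olac_url = get_record_url(BASE_URL, OAI_IDENTIFIER, "olac")
dc_url = get_record_url(BASE_URL, OAI_IDENTIFIER, "oai_dc")
```

Fetching either URL with an HTTP client should return an XML document for the record, provided the placeholder is replaced with the repository's actual endpoint.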
Search Info | ||
Citation: | Graff, David; Jones, Karen; Strassel, Stephanie; Walker, Kevin. 2023. Linguistic Data Consortium. | |
Terms: | area_Europe country_GB dcmi_Sound iso639_eng olac_primary_text |
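The Description and Format fields state that the audio is 2-channel, 8-bit mu-law sampled at 8000 samples/second. The following Python sketch illustrates what that implies for file size and how one mu-law byte maps to a linear sample. It is not LDC tooling: the function names are hypothetical, the decoder follows the standard G.711 mu-law expansion, the channel split assumes interleaved samples, and the SPHERE header is assumed to have been stripped already.

```python
SAMPLE_RATE = 8000      # samples per second, from the Format field
CHANNELS = 2            # 2-channel files, from the Description
BYTES_PER_SAMPLE = 1    # 8-bit mu-law


def payload_bytes(seconds: float) -> int:
    """Approximate audio payload size, excluding the SPHERE header."""
    return int(seconds * SAMPLE_RATE * CHANNELS * BYTES_PER_SAMPLE)


def mulaw_decode(byte: int) -> int:
    """Expand one G.711 mu-law byte to a signed linear sample."""
    u = ~byte & 0xFF                  # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def split_channels(payload: bytes) -> tuple[bytes, bytes]:
    """Split interleaved 2-channel data (assumption: samples alternate A, B)."""
    return payload[0::2], payload[1::2]


# A 10-minute call is about 9.6 MB of sample data.
ten_minute_call = payload_bytes(10 * 60)

# 320 hours at this rate comes to 18,000,000 KB when 1 KB = 1024 bytes.
corpus_kb = payload_bytes(320 * 3600) // 1024
```

Under these assumptions the 320-hour estimate is close to the listed Extent of 18057658 KB, which suggests the stated duration refers to two-channel recording time rather than per-speaker time.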