OLAC Record oai:www.ldc.upenn.edu:LDC2020S06 |
Metadata | ||
Title: | CALLFRIEND Mandarin Chinese-Taiwan Dialect Second Edition | |
Access Rights: | Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining | |
Bibliographic Citation: | Canavan, Alexandra, George Zipperlen, and John Bartlett. CALLFRIEND Mandarin Chinese-Taiwan Dialect Second Edition LDC2020S06. Web Download. Philadelphia: Linguistic Data Consortium, 2020 | |
Contributor: | Canavan, Alexandra | |
Zipperlen, George | ||
Bartlett, John | ||
Date (W3CDTF): | 2020 | |
Date Issued (W3CDTF): | 2020-06-22 | |
Description: | *Introduction* CALLFRIEND Mandarin Chinese-Taiwan Dialect Second Edition was developed by the Linguistic Data Consortium (LDC) and consists of approximately 27 hours of unscripted telephone conversations between native speakers of the Taiwan dialect of Mandarin Chinese. This second edition updates the audio files to WAV format, simplifies the directory structure, and adds documentation and metadata. The first edition is available as CALLFRIEND Mandarin Chinese-Taiwan Dialect (LDC96S56). The CALLFRIEND series is a collection of telephone conversations in several languages conducted by LDC in support of language identification technology development. Languages covered in the collection include American English, Canadian French, Egyptian Arabic, Farsi, German, Hindi, Japanese, Korean, Mandarin Chinese, Spanish, Tamil and Vietnamese. *Data* All data was collected before July 1997. Participants could speak with a person of their choice on any topic; most called family members and friends. All calls originated in North America. The recorded conversations last up to 30 minutes. The data was recorded as 8 kHz u-law encoded stereo SPH files, with one end of the phone call on each channel. In this release, files were converted to WAV format, and information from the original SPH headers is described in the documentation. SPH files are not included in this second edition. The audio files were originally split into train, dev and test folders of 20 recordings each, but they are combined in this release. Completed calls passed through a human auditing process to verify that the target language was spoken by the participants, to check the quality of the recordings, and to record information about dialect, noise and distortion. *Samples* Please listen to this audio sample (WAV). *Updates* None at this time. | |
Extent: | Corpus size: 2338111 KB | |
Format: | Sampling Rate: 8000 | |
Sampling Format: ulaw | ||
Identifier: | LDC2020S06 | |
https://catalog.ldc.upenn.edu/LDC2020S06 | ||
ISBN: 1-58563-930-3 | ||
ISLRN: 310-870-449-901-7 | ||
DOI: 10.35111/a3mk-ae14 | ||
Language: | Mandarin Chinese | |
Language (ISO639): | cmn | |
License: | LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf | |
Medium: | Distribution: Web Download | |
Publisher: | Linguistic Data Consortium | |
Publisher (URI): | https://www.ldc.upenn.edu | |
Relation (URI): | https://catalog.ldc.upenn.edu/docs/LDC2020S06 | |
Rights Holder: | Portions © 1996, 1997, 2020 Trustees of the University of Pennsylvania | |
Type (DCMI): | Sound | |
Type (OLAC): | primary_text | |
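The Description and Format fields above state that the audio is 8000 Hz u-law stereo with one end of the call on each channel. The following is a minimal sketch, not part of the LDC release, of how such data could be separated into per-speaker sample streams. It assumes the raw interleaved sample bytes have already been extracted from a file's data chunk; the function names are hypothetical, and which channel holds the caller is not stated in this record, so the two streams are simply called "first" and "second". The decoder follows the standard G.711 u-law expansion.

```python
SAMPLE_RATE_HZ = 8000  # taken from the Format field of this record
ULAW_BIAS = 0x84       # bias constant used by G.711 u-law


def ulaw_to_linear(code: int) -> int:
    """Expand one 8-bit u-law code to a signed 16-bit linear sample."""
    code = ~code & 0xFF                      # u-law bytes are stored inverted
    exponent = (code >> 4) & 0x07
    mantissa = code & 0x0F
    magnitude = (((mantissa << 3) + ULAW_BIAS) << exponent) - ULAW_BIAS
    return -magnitude if code & 0x80 else magnitude


def split_ulaw_stereo(frames: bytes) -> tuple[list[int], list[int]]:
    """De-interleave stereo u-law bytes into two decoded channels.

    Assumption: samples alternate first channel, second channel, one byte
    each, which is the usual layout for interleaved 8-bit stereo audio.
    """
    if len(frames) % 2 != 0:
        raise ValueError("stereo data must contain an even number of bytes")
    first = [ulaw_to_linear(b) for b in frames[0::2]]
    second = [ulaw_to_linear(b) for b in frames[1::2]]
    return first, second


def duration_seconds(frames: bytes) -> float:
    """Duration of interleaved stereo u-law data at 8000 frames per second."""
    return (len(frames) // 2) / SAMPLE_RATE_HZ


if __name__ == "__main__":
    demo = bytes([0xFF, 0x80, 0x00, 0x7F])  # two stereo frames of test data
    print(split_ulaw_stereo(demo))
    print(duration_seconds(demo))
```

Keeping the channels separate preserves the one-speaker-per-channel property of the recordings, which is convenient for per-speaker analysis.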
OLAC Info |
Archive: | The LDC Corpus Catalog | |
Description: | http://www.language-archives.org/archive/www.ldc.upenn.edu | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info |
OaiIdentifier: | oai:www.ldc.upenn.edu:LDC2020S06 | |
DateStamp: | 2021-01-01 | |
GetRecord: | OAI-PMH request for simple DC format | |
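The GetRecord entries above refer to OAI-PMH requests for this record. As a rough sketch of what such a request looks like, the snippet below builds a GetRecord URL from the OaiIdentifier shown in this section. The base URL is a placeholder, since the repository endpoint is not given in this record, and the helper name is hypothetical. The `verb`, `identifier`, and `metadataPrefix` parameters come from the OAI-PMH protocol; `oai_dc` is the prefix for simple Dublin Core and `olac` is the prefix conventionally used for OLAC metadata.

```python
from urllib.parse import urlencode

# Placeholder endpoint: the real repository base URL is not stated in this record.
BASE_URL = "https://example.org/oai"


def build_get_record_url(base_url: str, identifier: str, metadata_prefix: str) -> str:
    """Return an OAI-PMH GetRecord request URL for one record."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


if __name__ == "__main__":
    oai_id = "oai:www.ldc.upenn.edu:LDC2020S06"  # OaiIdentifier from this record
    print(build_get_record_url(BASE_URL, oai_id, "oai_dc"))  # simple DC format
    print(build_get_record_url(BASE_URL, oai_id, "olac"))    # OLAC format
```

The colons in the identifier are percent-encoded by `urlencode`, which keeps the query string valid regardless of the characters an identifier contains.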
Search Info | ||
Citation: | Canavan, Alexandra; Zipperlen, George; Bartlett, John. 2020. Linguistic Data Consortium. | |
Terms: | area_Asia country_CN dcmi_Sound iso639_cmn olac_primary_text |