OLAC Record
oai:www.ldc.upenn.edu:LDC98T24

Metadata
Title:1997 Mandarin Broadcast News Transcripts (HUB4-NE)
Access Rights:Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:Huang, Shudong, et al. 1997 Mandarin Broadcast News Transcripts (HUB4-NE) LDC98T24. Web Download. Philadelphia: Linguistic Data Consortium, 1998
Contributor:Huang, Shudong
Liu, Jing
Wu, Xuling
Wu, Lei
Yan, Yongmin
Qin, Zhoakai
Date (W3CDTF):1998
Description:*Introduction* This collection consists of 30 hours of transcripts of Mandarin Chinese broadcast news recordings from the following sources: Voice of America (VOA), China Central TV (CCTV) and KAZN-AM, a commercial radio station based in Los Angeles, CA. Of these three sources, the first two comprise the bulk of the collection and are represented in roughly equal amounts. Only a relatively small sample of KAZN-AM recordings is included, owing to the relatively high proportion of unusable material in that source (e.g., commercials, local traffic reports). Corresponding audio files are released as 1997 Mandarin Broadcast News Speech (HUB4-NE) LDC98S73. *Data* The transcripts were created by native speakers of Mandarin working at LDC. They are in GB-encoded form with SGML tags to identify story boundaries, speaker turn boundaries and phrasal pauses. The tags include time stamps to align the text with the speech data. Word segmentation (white-space between words) is included. A working DTD is provided, and the markup is consistent with that of the 1997 English and Spanish HUB4 collections. *Updates* There are no updates at this time. *Additional Licensing Instructions* This 'members-only' corpus is available to current members who can request the data at the listed reduced-license fee. Contact ldc@ldc.upenn.edu for information about becoming a member.
Identifier:LDC98T24
https://catalog.ldc.upenn.edu/LDC98T24
ISBN: 1-58563-126-4
ISLRN: 915-625-485-899-5
DOI: 10.35111/qrcj-k950
Language:Mandarin Chinese
Language (ISO639):cmn
License:1997 Mandarin Broadcast News Agreement: https://catalog.ldc.upenn.edu/license/1997-mandarin-broadcast-news.pdf
Medium:Distribution: Web Download
Publisher:Linguistic Data Consortium
Publisher (URI):https://www.ldc.upenn.edu
Relation (URI):https://catalog.ldc.upenn.edu/docs/LDC98T24
Rights Holder:Portions © 1997 China Central TV, © 1997 MultiCultural Broadcasting Corporation, © 1997, 1998 Trustees of the University of Pennsylvania
Type (DCMI):Text
Type (OLAC):primary_text
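
The Description field above says the transcripts are GB-encoded, carry SGML tags, and separate words with white space. A minimal Python sketch of reading such text might look like the following. It assumes the encoding is GB2312 (the record says only "GB-encoded"), and the helper name `segment_words`, the tag name `turn`, and the sample line are invented for illustration; they are not taken from the corpus or its DTD.

```python
import re
from typing import List

# Matches any SGML-style tag, whatever its name or attributes.
TAG_RE = re.compile(r"<[^>]+>")

def segment_words(raw: bytes, encoding: str = "gb2312") -> List[str]:
    """Decode GB-encoded bytes, drop SGML tags, and split on white space."""
    text = raw.decode(encoding)      # assumption: GB2312 is the GB variant used
    text = TAG_RE.sub(" ", text)     # replace each tag with a space
    return text.split()              # words are already white-space separated

# Invented sample line; the <turn> tag is hypothetical, not from the DTD.
sample = "<turn> 新闻 广播 </turn>".encode("gb2312")
print(segment_words(sample))
```

Because the words are already delimited by white space, no separate Chinese word segmenter is needed. Time stamps carried in tag attributes are discarded here; an application that aligns text with audio would parse them instead.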

OLAC Info

Archive:  The LDC Corpus Catalog
Description:  http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.ldc.upenn.edu:LDC98T24
DateStamp:  2020-11-30
GetRecord:  OAI-PMH request for simple DC format
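
The GetRecord entries above refer to OAI-PMH requests. As a hedged sketch, such a request URL can be assembled in Python from the OaiIdentifier and a metadata prefix. The `verb`, `identifier`, and `metadataPrefix` parameters come from the OAI-PMH protocol; the base URL below is a placeholder, not the archive's actual endpoint, and `get_record_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

def get_record_url(base_url: str, identifier: str, metadata_prefix: str) -> str:
    """Build an OAI-PMH GetRecord request URL."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,  # e.g. "olac" or "oai_dc"
    })
    return f"{base_url}?{query}"

# Placeholder endpoint: substitute the repository's real OAI-PMH base URL.
BASE_URL = "https://example.org/oai"

url = get_record_url(BASE_URL, "oai:www.ldc.upenn.edu:LDC98T24", "olac")
print(url)
```

Passing `"oai_dc"` instead of `"olac"` would request the simple Dublin Core form of the same record.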

Search Info

Citation: Huang, Shudong; Liu, Jing; Wu, Xuling; Wu, Lei; Yan, Yongmin; Qin, Zhoakai. 1998. Linguistic Data Consortium.
Terms: area_Asia country_CN dcmi_Text iso639_cmn olac_primary_text


http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC98T24
Up-to-date as of: Thu Oct 24 7:30:06 EDT 2024