OLAC Record
oai:www.ldc.upenn.edu:LDC98S71

Metadata
Title:1997 English Broadcast News Speech (HUB4)
Access Rights:Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:Fiscus, Jonathan G., et al. 1997 English Broadcast News Speech (HUB4) LDC98S71. Web Download. Philadelphia: Linguistic Data Consortium, 1998
Contributor:Fiscus, Jonathan G.
Garofolo, John S.
Przybocki, Mark
Fisher, William
Pallett, David
Date (W3CDTF):1998
Description:LDC98S71 - Speech data; LDC98T28 - Transcripts. *Introduction* This release contains a total of 97 hours of recordings from radio and television news broadcasts, gathered between June 1997 and February 1998. It has been prepared as a supplement to the 1996 Broadcast News Speech collection (consisting of over 100 hours of similar recordings). The primary motivation for this collection is to provide additional training data for the DARPA "HUB4" Project on continuous speech recognition in the broadcast domain. *Data* All recordings in this publication have been transcribed. The transcripts are manually time-aligned to the phrasal level and annotated to identify boundaries between news stories, speaker turn boundaries, and the gender of the speakers. The transcription conventions are described in the file "transcrp.doc"; please note that this file describes the transcription methods by reference to text formatting conventions used internally by the LDC during the transcription process. The released version of the transcripts is in SGML format, comparable to the format used in the 1996 Broadcast News Speech transcriptions. Accompanying documentation and an SGML DTD file are included with the transcription release. *Updates* There are no updates at this time. *Additional Licensing Instructions* This 'members-only' corpus is available to current members, who can request the data at the listed reduced-license fee. Contact ldc@ldc.upenn.edu for information about becoming a member.
Format:Sampling Rate: 16000 Hz
Sampling Format: 1-channel PCM
Identifier:LDC98S71
https://catalog.ldc.upenn.edu/LDC98S71
ISBN: 1-58563-123-X
ISLRN: 331-835-398-589-3
DOI: 10.35111/q5w8-6v93
Language:English
Language (ISO639):eng
Medium:Distribution: Web Download
Publisher:Linguistic Data Consortium
Publisher (URI):https://www.ldc.upenn.edu
Relation (URI):https://catalog.ldc.upenn.edu/docs/LDC98S71
Rights Holder:Portions © 1997-1998 American Broadcasting Company, Inc., Cable News Network LP, LLLP, National Public Radio, Inc., National Satellite Cable Corporation, © 1998 Trustees of the University of Pennsylvania
Type (DCMI):Sound
Type (OLAC):primary_text

OLAC Info

Archive:  The LDC Corpus Catalog
Description:  http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.ldc.upenn.edu:LDC98S71
DateStamp:  2020-11-30
GetRecord:  OAI-PMH request for simple DC format

Search Info

Citation: Fiscus, Jonathan G.; Garofolo, John S.; Przybocki, Mark; Fisher, William; Pallett, David. 1998. Linguistic Data Consortium.
Terms: area_Europe country_GB dcmi_Sound iso639_eng olac_primary_text


http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC98S71
Up-to-date as of: Mon Mar 25 7:20:03 EDT 2024