OLAC Record: OGI Spelled and Spoken Word

OLAC Record
oai:www.ldc.upenn.edu:LDC94S18

Metadata

Title: OGI Spelled and Spoken Word

Access Rights: Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining

Bibliographic Citation: Cole, Ronald Allan, and Yeshwant Muthusamy. OGI Spelled and Spoken Word LDC94S18. Web Download. Philadelphia: Linguistic Data Consortium, 1994

Contributor: Cole, Ronald Allan

Muthusamy, Yeshwant

Date (W3CDTF): 1994

Description: The OGI Spelled and Spoken Telephone Corpus consists of speech recordings from over 3,650 telephone calls, each made by a different speaker to an automated prompting/recording system installed at the Oregon Graduate Institute. Speakers were asked to say their name, where they were calling from and where they grew up; they were asked to answer a couple of yes/no questions and to spell their first and last names; many were also asked to repeat a few specific words and to recite the letters of the alphabet. Each response to a prompt is stored as a separate waveform file and the files are organized according to prompt (response type); all responses from a given call have a unique caller-index number as part of the file named, so that responses can easily be sorted by speaker. Waveform data are stored in compressed form, using the NIST SPHERE 2.0 software package, which is available separately at no charge to users. SPHERE 2.0 provides the decompression software needed to extract the waveform data, as well as tools for accessing and modifying file headers. Time-aligned phonetic transcriptions are provided for a subset of responses and a complete log of each (giving speaker sex, quality judgments and orthographic transcriptions of all responses) is included in a form suitable for use as a relational data base.

Format: Sampling Rate: 8000

Sampling Format: 1-channel pcm compressed

Identifier: LDC94S18

https://catalog.ldc.upenn.edu/LDC94S18

ISBN: 1-58563-036-5

ISLRN: 718-988-956-252-7

DOI: 10.35111/7jzb-ek53

Language: English

Language (ISO639): eng

License: LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf

Medium: Distribution: Web Download

Publisher: Linguistic Data Consortium

Publisher (URI): https://www.ldc.upenn.edu

Relation (URI): https://catalog.ldc.upenn.edu/docs/LDC94S18

Type (DCMI): Sound

Type (OLAC): primary_text

OLAC Info

Archive: The LDC Corpus Catalog

Description: http://www.language-archives.org/archive/www.ldc.upenn.edu

GetRecord: OAI-PMH request for OLAC format

GetRecord: Pre-generated XML file

OAI Info

OaiIdentifier: oai:www.ldc.upenn.edu:LDC94S18

DateStamp: 2020-11-30

GetRecord: OAI-PMH request for simple DC format

Search Info
Citation: Cole, Ronald Allan; Muthusamy, Yeshwant. 1994. Linguistic Data Consortium.
Terms: area_Europe country_GB dcmi_Sound iso639_eng olac_primary_text

http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC94S18
Up-to-date as of: Wed Oct 29 7:00:32 EDT 2025

Metadata
Title:		OGI Spelled and Spoken Word
Access Rights:		Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:		Cole, Ronald Allan, and Yeshwant Muthusamy. OGI Spelled and Spoken Word LDC94S18. Web Download. Philadelphia: Linguistic Data Consortium, 1994
Contributor:		Cole, Ronald Allan
Contributor:		Muthusamy, Yeshwant
Date (W3CDTF):		1994
Description:		The OGI Spelled and Spoken Telephone Corpus consists of speech recordings from over 3,650 telephone calls, each made by a different speaker to an automated prompting/recording system installed at the Oregon Graduate Institute. Speakers were asked to say their name, where they were calling from and where they grew up; they were asked to answer a couple of yes/no questions and to spell their first and last names; many were also asked to repeat a few specific words and to recite the letters of the alphabet. Each response to a prompt is stored as a separate waveform file and the files are organized according to prompt (response type); all responses from a given call have a unique caller-index number as part of the file named, so that responses can easily be sorted by speaker. Waveform data are stored in compressed form, using the NIST SPHERE 2.0 software package, which is available separately at no charge to users. SPHERE 2.0 provides the decompression software needed to extract the waveform data, as well as tools for accessing and modifying file headers. Time-aligned phonetic transcriptions are provided for a subset of responses and a complete log of each (giving speaker sex, quality judgments and orthographic transcriptions of all responses) is included in a form suitable for use as a relational data base.
Format:		Sampling Rate: 8000
Format:		Sampling Format: 1-channel pcm compressed
Identifier:		LDC94S18
		https://catalog.ldc.upenn.edu/LDC94S18
		ISBN: 1-58563-036-5
		ISLRN: 718-988-956-252-7
		DOI: 10.35111/7jzb-ek53
Language:		English
Language (ISO639):		eng
License:		LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf
Medium:		Distribution: Web Download
Publisher:		Linguistic Data Consortium
Publisher (URI):		https://www.ldc.upenn.edu
Relation (URI):		https://catalog.ldc.upenn.edu/docs/LDC94S18
Type (DCMI):		Sound
Type (OLAC):		primary_text
OLAC Info
Archive:		The LDC Corpus Catalog
Description:		http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:		OAI-PMH request for OLAC format
GetRecord:		Pre-generated XML file
OAI Info
OaiIdentifier:		oai:www.ldc.upenn.edu:LDC94S18
DateStamp:		2020-11-30
GetRecord:		OAI-PMH request for simple DC format
Search Info
Citation:		Cole, Ronald Allan; Muthusamy, Yeshwant. 1994. Linguistic Data Consortium.
Terms:		area_Europe country_GB dcmi_Sound iso639_eng olac_primary_text