OLAC Record: TI 46-Word

OLAC Record
oai:www.ldc.upenn.edu:LDC93S9

Metadata

Title: TI 46-Word

Access Rights: Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining

Bibliographic Citation: Liberman, Mark, et al. TI 46-Word LDC93S9. Web Download. Philadelphia: Linguistic Data Consortium, 1993

Contributor: Liberman, Mark

Amsler, Robert

Church, Ken

Fox, Ed

Hafner, Carole

Klavans, Judy

Marcus, Mitch

Mercer, Bob

Pedersen, Jan

Roossin, Paul

Walker, Don

Warwick, Susan

Zampolli, Antonio

Date (W3CDTF): 1993

Description: *Introduction* This release contains a corpus of over five hours of speech that was designed and collected at Texas Instruments, Inc. (TI) in 1980 and used initially in performance assessment tests of isolated-word, speaker-dependent technology. (See "Speech Recognition: Turning Theory to Practice" by G. R. Doddington and T. B. Schalk, in IEEE Spectrum, Vol. 18, No. 9, September 1981.) The 46-word vocabulary consists of two sub-vocabularies: (1) the TI 20-word vocabulary (the digits "zero" through "nine" plus the words "enter," "erase," "go," "help," "no," "rubout," "repeat," "stop," "start," and "yes") and (2) the TI 26-word "alphabet set" (the letters "a" through "z"). *Data* The corpus contains read utterances from 16 speakers (eight male and eight female), each speaking 26 utterances of each word in the 46-word vocabulary: 16 tokens designated as training and ten as test. Note that these numbers reflect the aim of the collection; for various reasons, the full number of utterances was not reached for some speakers. See the included documentation for more information. The corpus was recorded at Texas Instruments in a quiet acoustic enclosure using an Electro-Voice RE-16 dynamic cardioid microphone, at a 12.5 kHz sample rate with 12-bit quantization. The files are in NIST SPHERE format and have a ".wav" filename extension. *Samples* Audio sample (sph) *Updates* As of October 5, 2016, the documentation was updated to more closely reflect the file inventory.
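
The design figures in the description can be turned into nominal token counts. The sketch below is illustrative only: it reads "26 utterances" as 26 tokens per vocabulary word per speaker (an assumption), and the names `VOCABULARY` and `nominal_counts` are hypothetical, not part of the corpus documentation. Because the record says the full number of utterances was not reached for some speakers, the real file inventory will differ from these numbers.

```python
import string

# Vocabulary as listed in the Description field of this record.
DIGITS = ["zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]
COMMANDS = ["enter", "erase", "go", "help", "no",
            "rubout", "repeat", "stop", "start", "yes"]
LETTERS = list(string.ascii_lowercase)  # the 26-word "alphabet set"
VOCABULARY = DIGITS + COMMANDS + LETTERS

# Design targets stated in the record (not the actual file inventory).
SPEAKERS = 16
TRAIN_TOKENS_PER_WORD = 16
TEST_TOKENS_PER_WORD = 10


def nominal_counts(speakers=SPEAKERS, words=len(VOCABULARY),
                   train=TRAIN_TOKENS_PER_WORD, test=TEST_TOKENS_PER_WORD):
    """Return nominal token counts, assuming every speaker says every
    word train + test times (an assumption about the collection design)."""
    per_split = speakers * words
    return {
        "train": per_split * train,
        "test": per_split * test,
        "total": per_split * (train + test),
    }


if __name__ == "__main__":
    print(len(VOCABULARY), nominal_counts())
```

These totals are upper bounds implied by the stated design; the included corpus documentation remains the authority on what was actually recorded.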

Format: Sampling Rate: 12500

Sampling Format: 1-channel 12-bit pcm
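
Since the files are in NIST SPHERE format despite the ".wav" extension, generic WAV readers may reject them. The following is a minimal sketch of reading a SPHERE header, assuming the common layout: a "NIST_1A" magic line, a line giving the header size in bytes, then "name -type value" fields terminated by "end_head". The function name `parse_sphere_header` is hypothetical, and which fields the TI 46-Word files actually carry is not stated in this record.

```python
def parse_sphere_header(data: bytes):
    """Parse a NIST SPHERE header from the start of a file's bytes.

    Returns (header_size, fields). Field types follow the usual SPHERE
    convention: -i integer, -r real, -sN string of length N.
    """
    first_lines = data.split(b"\n", 2)
    if len(first_lines) < 3 or first_lines[0].strip() != b"NIST_1A":
        raise ValueError("not a NIST SPHERE header")
    header_size = int(first_lines[1].strip())

    header = data[:header_size].decode("ascii", errors="replace")
    fields = {}
    for line in header.split("\n")[2:]:  # skip the magic and size lines
        line = line.strip()
        if line == "end_head":
            break
        if not line or line.startswith(";"):  # blank or comment line
            continue
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue
        name, ftype, value = parts
        if ftype == "-i":
            fields[name] = int(value)
        elif ftype == "-r":
            fields[name] = float(value)
        else:  # -sN string field
            fields[name] = value
    return header_size, fields
```

In use, one would read the first kilobyte or so of a file in binary mode and pass it to the function; the sample data begins at byte offset `header_size`.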

Identifier: LDC93S9

https://catalog.ldc.upenn.edu/LDC93S9

ISBN: 1-58563-017-9

ISLRN: 476-195-137-873-5

DOI: 10.35111/zx7a-fw03

Language: English

Language (ISO639): eng

License: LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf

Medium: Distribution: Web Download

Publisher: Linguistic Data Consortium

Publisher (URI): https://www.ldc.upenn.edu

Relation (URI): https://catalog.ldc.upenn.edu/docs/LDC93S9

Rights Holder: Portions © 1993 Trustees of the University of Pennsylvania

Type (DCMI): Sound

Type (OLAC): primary_text

OLAC Info

Archive: The LDC Corpus Catalog

Description: http://www.language-archives.org/archive/www.ldc.upenn.edu

GetRecord: OAI-PMH request for OLAC format

GetRecord: Pre-generated XML file

OAI Info

OaiIdentifier: oai:www.ldc.upenn.edu:LDC93S9

DateStamp: 2024-06-13

GetRecord: OAI-PMH request for simple DC format
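
The GetRecord entries above correspond to OAI-PMH requests. As a rough sketch, such a request URL can be assembled from a repository base URL, the record identifier, and a metadata prefix. The base URL used here (`https://example.org/oai`) is a placeholder, not the archive's real endpoint, and the function names are hypothetical; `oai_dc` is the standard OAI-PMH prefix for simple Dublin Core, while `olac` is assumed to be the prefix for the OLAC format.

```python
from urllib.parse import urlencode


def build_getrecord_url(base_url: str, identifier: str,
                        metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


def parse_oai_identifier(identifier: str):
    """Split 'oai:<namespace>:<local-id>' into (namespace, local_id)."""
    scheme, namespace, local_id = identifier.split(":", 2)
    if scheme != "oai":
        raise ValueError("identifier does not use the 'oai' scheme")
    return namespace, local_id


if __name__ == "__main__":
    oai_id = "oai:www.ldc.upenn.edu:LDC93S9"
    # Placeholder base URL; substitute the repository's actual endpoint.
    print(build_getrecord_url("https://example.org/oai", oai_id, "olac"))
    print(parse_oai_identifier(oai_id))
```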

Search Info
Citation: Liberman, Mark; Amsler, Robert; Church, Ken; Fox, Ed; Hafner, Carole; Klavans, Judy; Marcus, Mitch; Mercer, Bob; Pedersen, Jan; Roossin, Paul; Walker, Don; Warwick, Susan; Zampolli, Antonio. 1993. Linguistic Data Consortium.
Terms: area_Europe country_GB dcmi_Sound iso639_eng olac_primary_text

http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC93S9
Up-to-date as of: Wed Oct 29 7:00:30 EDT 2025
