OLAC Record: CSLU: Portland Cellular Telephone Speech Version 1.3

OLAC Record
oai:www.ldc.upenn.edu:LDC2008S01

Metadata

Title: CSLU: Portland Cellular Telephone Speech Version 1.3

Access Rights: Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining

Bibliographic Citation: Cole, Ronald Allan, et al. CSLU: Portland Cellular Telephone Speech Version 1.3 LDC2008S01. Web Download. Philadelphia: Linguistic Data Consortium, 2008

Contributor: Cole, Ronald Allan

Fanty, Mark

Noel, M

Lander, T.

Date (W3CDTF): 2008

Date Issued (W3CDTF): 2008-01-22

Description: *Introduction* CSLU: Portland Cellular Telephone Speech Version 1.3 was created by the Center for Spoken Language Understanding (CSLU) at OGI School of Science and Engineering, Oregon Health and Science University, Beaverton, Oregon. It consists of cellular telephone speech and corresponding transcripts: 7,571 utterances from 515 speakers who made calls in the Portland, Oregon area using cellular telephones. Speakers called the CSLU data collection system and were asked to repeat certain phrases and to respond to other prompts. Two prompt protocols were used: an In Vehicle Protocol for speakers calling from inside a vehicle and a Not in Vehicle Protocol for those calling from outside a vehicle. The protocols shared several questions, but each contained distinct queries designed to probe the conditions of the caller's in-vehicle or not-in-vehicle surroundings. Not every caller provided a response to each prompt. *Recording Details* The speech data was captured digitally from CSLU's T1 connection and saved as 8 kHz, 16-bit linear. *Transcriptions* The text transcriptions in this corpus were produced using the non-time-aligned word-level conventions described in The CSLU Labeling Guide, which is included in the documentation for this release. CSLU: Portland Cellular Telephone Speech Version 1.3 contains orthographic and phonetic transcriptions of the corresponding speech files. Non-time-aligned orthographic transcriptions provide quick access to the content of an utterance; they may contain markers for word boundaries to support access and retrieval at the lexical level. Phonetic/phonemic transcriptions represent the phonetic content of an utterance at a given level of detail that is made explicit by the use of diacritics. The phonetic phenomena transcribed include excessive nasalization, glottalization, frication on a stop, centralization, lateralization, rounding and palatalization.

Extent: Corpus size: 586752 KB

Format: Sampling Rate: 8000

Sampling Format: ulaw

Identifier: LDC2008S01

https://catalog.ldc.upenn.edu/LDC2008S01

ISBN: 1-58563-463-8

ISLRN: 614-115-041-059-2

DOI: 10.35111/gptm-4d61

Language: English

Language (ISO639): eng

License: CSLU Agreement: https://catalog.ldc.upenn.edu/license/cslu-corpora-non-commercial-research-only.pdf

Medium: Distribution: Web Download

Publisher: Linguistic Data Consortium

Publisher (URI): https://www.ldc.upenn.edu

Relation (URI): https://catalog.ldc.upenn.edu/docs/LDC2008S01

Rights Holder: Portions © 1995, 1998, 2000, 2002 Center for Spoken Language Understanding, Oregon Health & Science University, © 2008 Trustees of the University of Pennsylvania

Type (DCMI): Sound

Type (OLAC): primary_text

OLAC Info

Archive: The LDC Corpus Catalog

Description: http://www.language-archives.org/archive/www.ldc.upenn.edu

GetRecord: OAI-PMH request for OLAC format

GetRecord: Pre-generated XML file

OAI Info

OaiIdentifier: oai:www.ldc.upenn.edu:LDC2008S01

DateStamp: 2022-01-20

GetRecord: OAI-PMH request for simple DC format
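
The two GetRecord entries in this record differ only in the metadata format requested. As a minimal sketch, an OAI-PMH GetRecord request is an HTTP query carrying the arguments verb, identifier and metadataPrefix, where oai_dc selects simple Dublin Core and olac selects the OLAC format. The repository base URL is not given in this record, so BASE_URL below is a hypothetical placeholder, and build_get_record_url is an illustrative helper name rather than part of any library.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: the real repository base URL is not stated in this record.
BASE_URL = "https://example.org/oai"

# Identifier taken from the OaiIdentifier field of this record.
OAI_IDENTIFIER = "oai:www.ldc.upenn.edu:LDC2008S01"


def build_get_record_url(base_url: str, identifier: str, metadata_prefix: str) -> str:
    """Build an OAI-PMH GetRecord URL for one item in one metadata format."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


# OLAC-format record and simple Dublin Core record for the same identifier.
olac_url = build_get_record_url(BASE_URL, OAI_IDENTIFIER, "olac")
dc_url = build_get_record_url(BASE_URL, OAI_IDENTIFIER, "oai_dc")

print(olac_url)
print(dc_url)
```

Replacing BASE_URL with the repository's actual endpoint should yield requests equivalent to the two GetRecord entries listed in this record; the response in either case is an XML document.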

Search Info
Citation: Cole, Ronald Allan; Fanty, Mark; Noel, M; Lander, T. 2008. Linguistic Data Consortium.
Terms: area_Europe country_GB dcmi_Sound iso639_eng olac_primary_text

http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC2008S01
Up-to-date as of: Sat Jun 28 1:01:41 EDT 2025
