OLAC Record: West Point Company G3 American English Speech

OLAC Record
oai:www.ldc.upenn.edu:LDC2005S30

Metadata

Title: West Point Company G3 American English Speech

Access Rights: Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining

Bibliographic Citation: Morgan, John, et al. West Point Company G3 American English Speech LDC2005S30. Web Download. Philadelphia: Linguistic Data Consortium, 2005

Contributor: Morgan, John

LaRocca, Stephen

Bellinger, Sherri

Ruscelli, Charles (Chip)

Date (W3CDTF): 2005

Date Issued (W3CDTF): 2005-11-29

Description: *Introduction* West Point Company G3 American English Speech was developed by the Center for Technology Enhanced Language Learning and contains approximately 15.5 hours of read English speech recorded with headset microphones. During the 2000-2001 academic year, cadets, staff, and faculty members at the United States Military Academy volunteered to participate in a speech data collection project for American English. The goal of the project was to amass recordings from no fewer than 100 adult speakers (50 males and 50 females) to form a substantial corpus of high-quality read speech. The Center for Technology Enhanced Language Learning is part of the U.S. Military Academy's Department of Foreign Languages. Many of the 100-plus volunteers who provided the recordings were members of the staff and faculty of the Department of Foreign Languages. Other volunteers were friends and colleagues from other organizations who worked in offices in Washington Hall. The largest group of volunteers was from Cadet Company G, Third Regiment, United States Corps of Cadets. Cadet Company G3, encouraged by their tactical officer, Major Scott Custer, adopted the speech data collection effort as a community service project. Every female cadet in Company G3 recorded her voice, as did many of the male cadets, including the cadet company commander and Major Custer.

*Data* The 185 sentences comprising the data collection script were written to elicit examples of all or nearly all of the possible syllables used in spoken American English. The G3 Corpus audio data comes from 53 female and 56 male volunteers, each of whom recorded approximately 104 utterances. The recordings have 16-bit resolution and a sampling rate of 22,050 samples per second. Recordings were made using headset microphones (Shure M10) with preamplifiers attached to the line input jack of desktop computers.

*Samples* For an example of the data in this corpus, please listen to this sample (WAV).

*Updates* None at this time.
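
The figures in the Description above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the LDC record: it assumes single-channel (mono) uncompressed PCM, which the record does not state, and it treats the approximate figures (15.5 hours, 104 utterances per speaker) as if they were exact. The function and variable names are illustrative only.

```python
# Rough size and utterance estimates from the figures in the Description.
# Assumption (not stated in the record): mono, uncompressed PCM audio.

SAMPLE_RATE_HZ = 22_050        # "22,050 samples per second"
BYTES_PER_SAMPLE = 2           # 16-bit resolution
CHANNELS = 1                   # assumption: mono recordings
HOURS_OF_SPEECH = 15.5         # "approximately 15.5 hours"
SPEAKERS = 53 + 56             # 53 female and 56 male volunteers
UTTERANCES_PER_SPEAKER = 104   # "approximately 104 utterances"


def estimate_corpus():
    """Return raw PCM size, utterance count, and mean utterance length."""
    seconds = HOURS_OF_SPEECH * 3600
    pcm_bytes = int(seconds * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS)
    total_utterances = SPEAKERS * UTTERANCES_PER_SPEAKER
    mean_utterance_seconds = seconds / total_utterances
    return {
        "pcm_bytes": pcm_bytes,
        "total_utterances": total_utterances,
        "mean_utterance_seconds": mean_utterance_seconds,
    }


if __name__ == "__main__":
    for name, value in estimate_corpus().items():
        print(f"{name}: {value}")
```

Under these assumptions the raw audio comes to roughly 2.46 billion bytes across about 11,300 utterances, or a little under five seconds per utterance. That is the same order of magnitude as the corpus size listed under Extent; any difference may reflect file headers, documentation, or rounding in the stated duration.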

Extent: Corpus size: 2516582 KB

Format: Sampling Rate: 22050

Sampling Format: pcm

Identifier: LDC2005S30

https://catalog.ldc.upenn.edu/LDC2005S30

ISBN: 1-58563-349-6

ISLRN: 739-195-943-085-5

DOI: 10.35111/5yz9-yp59

Language: English

Language (ISO639): eng

License: LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf

Medium: Distribution: Web Download

Publisher: Linguistic Data Consortium

Publisher (URI): https://www.ldc.upenn.edu

Relation (URI): https://catalog.ldc.upenn.edu/docs/LDC2005S30

Rights Holder: Portions © 2001 United States Military Academy, Portions © 2005 Trustees of the University of Pennsylvania.

Type (DCMI): Sound

Type (OLAC): primary_text

OLAC Info

Archive: The LDC Corpus Catalog

Description: http://www.language-archives.org/archive/www.ldc.upenn.edu

GetRecord: OAI-PMH request for OLAC format

GetRecord: Pre-generated XML file

OAI Info

OaiIdentifier: oai:www.ldc.upenn.edu:LDC2005S30

DateStamp: 2022-01-20

GetRecord: OAI-PMH request for simple DC format
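
The GetRecord entries above refer to OAI-PMH requests for this record. As a hedged sketch of what such a request looks like, the snippet below builds a GetRecord URL from the OaiIdentifier. The base URL is a placeholder (the record does not give the archive's OAI-PMH endpoint), and the metadata prefixes `olac` and `oai_dc` are assumed to correspond to the "OLAC format" and "simple DC format" entries. The `verb`, `identifier`, and `metadataPrefix` parameter names come from the OAI-PMH protocol.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: substitute the archive's real OAI-PMH base URL.
PLACEHOLDER_BASE_URL = "https://example.org/oai"

OAI_IDENTIFIER = "oai:www.ldc.upenn.edu:LDC2005S30"


def get_record_url(base_url, identifier, metadata_prefix):
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


# Assumed prefixes: "olac" for OLAC format, "oai_dc" for simple Dublin Core.
olac_url = get_record_url(PLACEHOLDER_BASE_URL, OAI_IDENTIFIER, "olac")
dc_url = get_record_url(PLACEHOLDER_BASE_URL, OAI_IDENTIFIER, "oai_dc")

if __name__ == "__main__":
    print(olac_url)
    print(dc_url)
```

The colons in the identifier are percent-encoded by `urlencode`, which keeps the query string unambiguous; a server that follows the protocol should decode them back to the original identifier.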

Search Info
Citation: Morgan, John; LaRocca, Stephen; Bellinger, Sherri; Ruscelli, Charles (Chip). 2005. Linguistic Data Consortium.
Terms: area_Europe country_GB dcmi_Sound iso639_eng olac_primary_text

http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC2005S30
Up-to-date as of: Wed Oct 29 7:00:52 EDT 2025
