OLAC Record
oai:www.ldc.upenn.edu:LDC94S20

Metadata
Title:BRAMSHILL
Access Rights:Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:British Home Office. BRAMSHILL LDC94S20. Web Download. Philadelphia: Linguistic Data Consortium, 1994
Contributor:British Home Office
Date (W3CDTF):1994
Date Issued (W3CDTF):1994-01-01
Description:The recordings in this release were originally made in 1978-79 as part of a British Home Office study into speaker identification techniques. Subsequently, it was realized that a large body of unconstrained conversational material might be of interest to researchers working in other speech processing fields. The recordings were transcribed and the release prepared during 1993. The recordings were made at the Police Staff College, Bramshill, Hampshire, England. The participants were police officers taking part in the various courses at the college. This provided a wide range of regional accents and a range of ages from late teens to early fifties. Each speaker is described by nine demographic attributes. Three adjacent bedrooms were used. The two participants, each alone in their rooms, conversed by telephone. The third room was used as a monitoring and recording station. In addition to the telephone recordings, reference recordings were made using a high quality dynamic microphone in each room. It is these higher quality recordings, not the telephone speech, which are provided in the BRAMSHILL set. The recordings were made on a Sony Elcaset EL-7 cassette machine, chosen at the time because of its good speed stability. The microphone was a Shure SM-7 cardioid type. The speech data was sampled at 10 kHz, 16-bit resolution. Some attempt was made to control the acoustic environment. It is evident from listening to the recordings that, while these measures produced a reasonable recording environment, the rooms were far from soundproof. A variety of external noises (engines, aircraft, etc.) can be heard on some of the recordings. Each speaker was given a pile of photographs. In response to a bleep signal, each speaker introduced himself by name and read a set of test sentences.
After this, the main part of the conversation took place, in which participants were asked to determine which of each pair of photographs had been taken first (if indeed they were related at all). The conversations continued for 10 minutes until terminated by another bleep signal. During the digitization process, some periods of silence were removed, so some recordings now appear to be shorter than the original ten minutes. Furthermore, this means that recordings of the two sides of a conversation are no longer time-aligned. In addition, to preserve the anonymity of the speakers, some passages (mainly the introductions) have been erased by replacing them with binary zeroes. Finally, the bleep signals have also been erased with binary zeroes. The transcriptions indicate where this has occurred. The speech was transcribed verbatim. No attempt was made to correct grammar, fill in missing words, etc. Transcription conventions are detailed in the documentation. Every lexical word from the transcriptions is contained in the dictionary supplied in the INDEX directory. There are about 6,500 word types in the 600k words of the transcripts. Contractions, part-words, slang words, hesitation sounds and non-speech sounds are all treated as words in their own right in the dictionary.
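The description states that anonymized passages and bleep signals were overwritten with binary zeroes. A minimal Python sketch of how such erased spans might be located in the sample data follows. It assumes headerless 16-bit mono PCM in the machine's native byte order, which this record does not confirm; the distributed files may carry a header or use the opposite byte order. The names `read_pcm16` and `zero_runs` and the 0.1-second minimum run length are illustrative choices, not part of the corpus documentation.

```python
import array

SAMPLE_RATE_HZ = 10_000  # from the record: speech sampled at 10 kHz


def read_pcm16(path, byteswap=False):
    """Read headerless 16-bit mono PCM (an assumption about the file layout)."""
    with open(path, "rb") as f:
        data = f.read()
    data = data[: len(data) - len(data) % 2]  # drop a trailing odd byte, if any
    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(data)
    if byteswap:
        samples.byteswap()  # use when the file's byte order differs from the host's
    return samples


def zero_runs(samples, min_len=SAMPLE_RATE_HZ // 10):
    """Return (start, end) sample indices of runs of at least min_len zero samples.

    The end index is exclusive. The default min_len (0.1 s) is an arbitrary
    threshold chosen to skip the isolated zero samples that occur in ordinary speech.
    """
    runs = []
    start = None
    for i, s in enumerate(samples):
        if s == 0:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the data.
    if start is not None and len(samples) - start >= min_len:
        runs.append((start, len(samples)))
    return runs
```

Dividing each index by `SAMPLE_RATE_HZ` gives the position in seconds, which can then be compared against the erasure markers in the transcriptions.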
Extent:Corpus size: 4718592 KB
Format:Sampling Rate: 10000
Sampling Format: 1-channel pcm
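As a rough illustration of what these format values imply, the Python sketch below works out the storage needed for uncompressed audio at 10,000 samples per second, 16 bits per sample, one channel. The function names are hypothetical. The second function treats the whole Extent as uncompressed audio and takes 1 KB to mean 1024 bytes; both are assumptions, and since the release also contains transcripts and documentation, its result is at best a loose upper bound.

```python
SAMPLE_RATE_HZ = 10_000   # Format: Sampling Rate
BYTES_PER_SAMPLE = 2      # 16-bit resolution, per the Description
CHANNELS = 1              # Sampling Format: 1-channel pcm

BYTES_PER_SECOND = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS  # 20,000


def pcm_bytes(seconds):
    """Bytes of uncompressed PCM needed for the given duration."""
    return seconds * BYTES_PER_SECOND


def max_audio_hours(extent_kb, bytes_per_kb=1024):
    """Upper bound on audio hours if the entire extent were uncompressed PCM.

    Assumes 1 KB = 1024 bytes by default; the record does not say which
    convention is used.
    """
    return extent_kb * bytes_per_kb / BYTES_PER_SECOND / 3600


# A full ten-minute conversation side, before any silence removal:
ten_minute_side = pcm_bytes(10 * 60)  # 12,000,000 bytes
```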
Identifier:LDC94S20
https://catalog.ldc.upenn.edu/LDC94S20
ISBN: 1-58563-029-2
ISLRN: 361-674-631-516-1
DOI: 10.35111/4z1v-mm54
Language:English
Language (ISO639):eng
License:LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf
Medium:Distribution: Web Download
Publisher:Linguistic Data Consortium
Publisher (URI):https://www.ldc.upenn.edu
Relation (URI):https://catalog.ldc.upenn.edu/docs/LDC94S20
Rights Holder:Portions © 1994 Trustees of the University of Pennsylvania
Type (DCMI):Sound
Type (OLAC):primary_text

OLAC Info

Archive:  The LDC Corpus Catalog
Description:  http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.ldc.upenn.edu:LDC94S20
DateStamp:  2020-11-30
GetRecord:  OAI-PMH request for simple DC format
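The GetRecord entries in this record refer to OAI-PMH requests whose URLs are not shown here. As a hedged sketch, the Python below builds such a request from the three arguments the OAI-PMH protocol defines for GetRecord: `verb`, `identifier` and `metadataPrefix`. The base URL is a placeholder, since the repository's actual endpoint is not given in this record. The prefix `oai_dc` is the protocol's standard name for simple Dublin Core; `olac` is assumed here to be the prefix for the OLAC format.

```python
from urllib.parse import urlencode

# Placeholder endpoint: the real OAI-PMH base URL is not stated in this record.
BASE_URL = "https://example.org/oai"

OAI_IDENTIFIER = "oai:www.ldc.upenn.edu:LDC94S20"  # from the OaiIdentifier field


def get_record_url(base_url, identifier, metadata_prefix):
    """Build an OAI-PMH GetRecord request URL with properly escaped arguments."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


simple_dc_url = get_record_url(BASE_URL, OAI_IDENTIFIER, "oai_dc")
olac_url = get_record_url(BASE_URL, OAI_IDENTIFIER, "olac")  # prefix is an assumption
```

Escaping the identifier matters because it contains colons, which `urlencode` converts to `%3A`.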

Search Info

Citation: British Home Office. 1994. Linguistic Data Consortium.
Terms: area_Europe country_GB dcmi_Sound iso639_eng olac_primary_text


http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC94S20
Up-to-date as of: Mon Mar 25 7:19:52 EDT 2024