OLAC Record

Title:Spoken Digits in Hindi and Indian English
Access Rights:Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:Bhattacharya, Basabdatta Sen, et al. Spoken Digits in Hindi and Indian English LDC2022S03. Web Download. Philadelphia: Linguistic Data Consortium, 2022
Contributor:Bhattacharya, Basabdatta Sen
Subramanian, Aiswarya
Chatterjee, Purbayan
Dey, Sounak
Date (W3CDTF):2022
Date Issued (W3CDTF):2022-02-15
Description:*Introduction*
Spoken Digits in Hindi and Indian English was developed by the Birla Institute of Technology and Science Pilani. It contains approximately two hours of speech consisting of spoken digits from one to ten in Hindi and English with regional accents from across India.
*Data*
The speech data was collected in three ways: in person, using a recorder app on a mobile handset; via one-to-one online communications over social apps; and from social media sites. Each audio file represents a single spoken digit in either Hindi or Indian English. Background noise was mostly retained. Some data was recorded in a noise-free environment or cleaned after recording to remove abrupt noises such as car horns. The audio data is organized by number, language and gender. The gender breakdown for speakers is 17% female, 27% male, and 56% unspecified. A Google Colab notebook file, which can be used for basic functions such as removing noise or unwanted spaces, is also included in this release. All audio data is presented as single-channel, 16-bit, 16 kHz FLAC-compressed linear PCM.
*Samples*
Please view these samples:
* Hindi Female (FLAC)
* English Unspecified (FLAC)
* English Male (FLAC)
*Updates*
None at this time.
Extent:Corpus size: 90831 KB
ISBN: 1-58563-986-9
ISLRN: 452-404-795-171-3
DOI: 10.35111/5way-1446
Language (ISO639):eng
License:Spoken Digits in Hindi and Indian English Agreement: https://catalog.ldc.upenn.edu/license/spoken-digits-in-hindi-and-indian-english-agreement.pdf
Medium:Distribution: Web Download
Publisher:Linguistic Data Consortium
Publisher (URI):https://www.ldc.upenn.edu
Relation (URI):https://catalog.ldc.upenn.edu/docs/LDC2022S03
Rights Holder:Portions © 2022 Basabdatta Sen Bhattacharya, © 2022 Trustees of the University of Pennsylvania
Type (DCMI):Sound
Type (OLAC):primary_text
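
The Description field above states that all audio is single channel, 16-bit, 16kHz FLAC. The following is a minimal sketch, not part of the release, of how a downloaded file could be checked against that statement using only the Python standard library. It reads the STREAMINFO metadata block that the FLAC format places at the start of every stream. The function names and the example file name are hypothetical; the corpus documentation does not prescribe any such check.

```python
def read_flac_streaminfo(data: bytes) -> dict:
    """Parse the STREAMINFO block at the start of a FLAC byte stream."""
    if data[:4] != b"fLaC":
        raise ValueError("missing 'fLaC' stream marker")
    # Metadata block header: 1 bit last-block flag, 7 bits type, 24 bits length.
    block_type = data[4] & 0x7F
    length = int.from_bytes(data[5:8], "big")
    if block_type != 0 or length < 34:
        raise ValueError("first metadata block is not a valid STREAMINFO")
    body = data[8:8 + 34]
    if len(body) < 34:
        raise ValueError("truncated STREAMINFO block")
    # Bytes 10..17 pack: 20 bits sample rate, 3 bits (channels - 1),
    # 5 bits (bits per sample - 1), 36 bits total samples.
    packed = int.from_bytes(body[10:18], "big")
    return {
        "sample_rate": packed >> 44,
        "channels": ((packed >> 41) & 0x7) + 1,
        "bits_per_sample": ((packed >> 36) & 0x1F) + 1,
        "total_samples": packed & ((1 << 36) - 1),
    }


def matches_release_format(info: dict) -> bool:
    """True if the stream is mono, 16-bit, 16 kHz, as the record describes."""
    return (
        info["sample_rate"] == 16000
        and info["channels"] == 1
        and info["bits_per_sample"] == 16
    )


def duration_seconds(info: dict) -> float:
    """Duration implied by the header (0.0 if the sample count is unknown)."""
    return info["total_samples"] / info["sample_rate"]


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python check_flac.py some_digit.flac
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            info = read_flac_streaminfo(handle.read(42))
        print(path, info, matches_release_format(info))
```

Only the first 42 bytes of each file are read, so the check stays cheap even across the full corpus. It assumes the file begins directly with the FLAC stream marker, with no leading tag data.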


Archive:  The LDC Corpus Catalog
Description:  http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.ldc.upenn.edu:LDC2022S03
DateStamp:  2023-01-01
GetRecord:  OAI-PMH request for simple DC format
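
The GetRecord entries above refer to OAI-PMH requests for this record. As a rough illustration, such a request is an HTTP GET whose query string carries the `verb`, `identifier` and `metadataPrefix` arguments defined by the OAI-PMH protocol. The sketch below builds those URLs for the identifier listed in this record. The base URL is a placeholder assumption: this record does not state the repository endpoint, so it must be replaced with the real one. The prefixes `olac` and `oai_dc` are the conventional names for the OLAC and simple DC formats.

```python
from urllib.parse import urlencode

# Placeholder only: the actual OAI-PMH endpoint is not given in this record.
BASE_URL = "https://example.org/oai"

# Identifier taken from the OaiIdentifier field of this record.
OAI_IDENTIFIER = "oai:www.ldc.upenn.edu:LDC2022S03"


def get_record_url(base_url: str, identifier: str, metadata_prefix: str) -> str:
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


if __name__ == "__main__":
    # One request per metadata format mentioned on this page.
    for prefix in ("olac", "oai_dc"):
        print(get_record_url(BASE_URL, OAI_IDENTIFIER, prefix))
```

`urlencode` percent-encodes the colons in the identifier, which keeps the query string valid without any manual escaping.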

Search Info

Citation: Bhattacharya, Basabdatta Sen; Subramanian, Aiswarya; Chatterjee, Purbayan; Dey, Sounak. 2022. Linguistic Data Consortium.
Terms: area_Asia area_Europe country_GB country_IN dcmi_Sound iso639_eng iso639_hin olac_primary_text

Up-to-date as of: Sun Jun 16 7:35:07 EDT 2024