OLAC Record

Title:Polish Speech Database
Access Rights:Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:Szwelnik, Tomasz, Jacek Kawalec, and Dorota Gutowska. Polish Speech Database LDC2019S19. Web Download. Philadelphia: Linguistic Data Consortium, 2019
Contributor:Szwelnik, Tomasz
Kawalec, Jacek
Gutowska, Dorota
Date (W3CDTF):2019
Date Issued (W3CDTF):2019-10-15
Description:*Introduction* Polish Speech Database was developed by VoiceLab. It consists of 263,424 utterances of Polish speech data from 200 speakers, totaling approximately 280 hours, and corresponding transcripts. Data collection was performed in Poland. Speakers were asked to record themselves for at least 60 minutes on their home computers, using a headset, while reading text on a website. The text consisted of sentences covering most speech sounds in Polish. The database includes speaker metadata. There were 103 male speakers and 97 female speakers. Speaker ages ranged from 15 to 60 years, with most between 15 and 30.
*Data* Speech data is presented as 16,000 Hz, 16-bit, single-channel, FLAC-compressed WAV files. Transcripts are UTF-8 encoded plain text.
*Samples* Please view the following samples. * Female Speech * Female Transcript * Male Speech * Male Transcript
*Updates* None at this time.
Extent:Corpus size: 18666949 KB
Format:Sampling Rate: 16000
Sampling Format: pcm
ISBN: 1-58563-903-6
ISLRN: 803-554-461-385-1
DOI: 10.35111/twqh-f096
Language (ISO639):pol
License:LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf
Medium:Distribution: Web Download
Publisher:Linguistic Data Consortium
Publisher (URI):https://www.ldc.upenn.edu
Relation (URI):https://catalog.ldc.upenn.edu/docs/LDC2019S19
Rights Holder:Portions © 2019 VoiceLab.ai, © 2019 Trustees of the University of Pennsylvania
Type (DCMI):Sound
Type (OLAC):primary_text
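
The figures in the Description, Extent, and Format fields can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the LDC record: it copies the numbers stated above and assumes that "KB" in the Extent field means 1024 bytes and that "approximately 280 hours" can be treated as exactly 280 hours. The resulting ratio is only indicative, since the corpus size also covers transcripts and documentation.

```python
# Figures copied from the record above.
UTTERANCES = 263_424
SPEAKERS = 200
TOTAL_HOURS = 280             # "approximately 280 hours" (treated as exact here)
SAMPLE_RATE_HZ = 16_000       # Format: Sampling Rate
BYTES_PER_SAMPLE = 2          # 16-bit samples
CHANNELS = 1                  # single channel
CORPUS_SIZE_KB = 18_666_949   # Extent: Corpus size

total_seconds = TOTAL_HOURS * 3600

# Average length of one utterance, in seconds.
mean_utterance_s = total_seconds / UTTERANCES

# Average recording time per speaker, in minutes.
mean_speaker_min = TOTAL_HOURS * 60 / SPEAKERS

# Size of the audio as uncompressed PCM, in bytes.
raw_pcm_bytes = total_seconds * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS

# Assumption: 1 KB = 1024 bytes.
corpus_bytes = CORPUS_SIZE_KB * 1024

# Rough ratio of distributed size to uncompressed audio size.
size_ratio = corpus_bytes / raw_pcm_bytes

print(f"mean utterance length: {mean_utterance_s:.2f} s")
print(f"mean time per speaker: {mean_speaker_min:.0f} min")
print(f"uncompressed PCM size: {raw_pcm_bytes / 1e9:.2f} GB")
print(f"distributed / uncompressed: {size_ratio:.2f}")
```

Under these assumptions the per-speaker average is above the 60-minute minimum stated in the Description, and the distributed size is smaller than the uncompressed audio, as expected for FLAC compression.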


Archive:  The LDC Corpus Catalog
Description:  http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.ldc.upenn.edu:LDC2019S19
DateStamp:  2020-11-30
GetRecord:  OAI-PMH request for simple DC format
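
An OAI-PMH GetRecord request is an HTTP query with three arguments: `verb`, `identifier`, and `metadataPrefix`. The sketch below shows how such a request URL could be assembled for the identifier given above. The base URL is a hypothetical placeholder, since this record does not state the repository endpoint; `oai_dc` is the simple Dublin Core prefix defined by OAI-PMH, and `olac` is assumed here to be the prefix for the OLAC format.

```python
from urllib.parse import urlencode

def get_record_url(base_url: str, identifier: str, metadata_prefix: str) -> str:
    """Build an OAI-PMH GetRecord request URL."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"

# Hypothetical endpoint: replace with the repository's actual OAI-PMH base URL.
BASE_URL = "https://example.org/oai"

# Identifier taken from the OaiIdentifier field of this record.
OAI_ID = "oai:www.ldc.upenn.edu:LDC2019S19"

olac_url = get_record_url(BASE_URL, OAI_ID, "olac")    # OLAC format
dc_url = get_record_url(BASE_URL, OAI_ID, "oai_dc")    # simple DC format

print(olac_url)
print(dc_url)
```

The colons in the identifier are percent-encoded by `urlencode`, which is what OAI-PMH expects for special characters in request arguments.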

Search Info

Citation: Szwelnik, Tomasz; Kawalec, Jacek; Gutowska, Dorota. 2019. Linguistic Data Consortium.
Terms: area_Europe country_PL dcmi_Sound dcmi_Text iso639_pol olac_primary_text

Up-to-date as of: Tue May 7 7:25:44 EDT 2024