OLAC Record
oai:catalogue.elra.info:ELRA-S0484

Metadata
Title:ATCO2 Project Data
Access Rights: Rights available for: nonCommercialUse, commercialUse
Date Available (W3CDTF):2022-10-19
Date Issued (W3CDTF):2022-10-19
Description:The ATCO2 project aims to develop a unique platform for collecting, organizing and pre-processing air-traffic control (voice communication) data from air space. This project has received funding from the Clean Sky 2 Joint Undertaking (JU) under grant agreement No 864702. The JU receives support from the European Union’s Horizon 2020 research and innovation programme and the Clean Sky 2 JU members other than the Union. The project collected real-time voice communication between air-traffic controllers and pilots, available either directly through publicly accessible radio frequency channels or indirectly from air-navigation service providers (ANSPs). In addition to the voice communication data, contextual information is available in the form of metadata (i.e. surveillance data). The dataset consists of two distinct packages:
- A corpus of ca. 4000 hours (untranscribed) of air-traffic control speech collected across different airports (Sion, Bern, Zurich, etc.) in .wav format for speech recognition. Speaker distribution is 90/10% between males and females, and the group contains native and non-native speakers of English. The raw data, also provided, consists of:
  Overall size of the dataset (measured after voice activity detection):
  - 5281 hours (English + non-English)
  - 4465 hours (English only)
  Overall raw size of audio files (sum of wav file lengths):
  - 6225 hours (English + non-English)
- A corpus of ca. 4 hours (transcribed) of air-traffic control speech collected across different airports (Sion, Bern, Zurich, etc.) in .wav format for speech recognition. Speaker distribution is 90/10% between males and females, and the group contains native and non-native speakers of English. This corpus has been manually transcribed and automatically annotated with orthographic information in XML format, including speaker noise information, SNR values and others. Ca. 1 hour of the annotation has been re-checked by a human.
Identifier:ELRA-S0484
ISLRN: 589-403-577-685-7
Identifier (URI):https://catalog.elra.info/en-us/repository/browse/ELRA-S0484/
Language:English
Language (ISO639):eng
Medium:Not specified
Publisher:ELRA (European Language Resources Association)
Type (DCMI):Sound
Type (OLAC):primary_text

OLAC Info

Archive:  ELRA Catalogue of Language Resources
Description:  http://www.language-archives.org/archive/catalogue.elra.info
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:catalogue.elra.info:ELRA-S0484
DateStamp:  2022-10-19
GetRecord:  OAI-PMH request for simple DC format
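
The GetRecord entries in this record refer to OAI-PMH requests for a single identifier. As a minimal sketch of how such a request URL could be assembled, the snippet below uses the standard OAI-PMH query parameters (verb, identifier, metadataPrefix). The base URL is a placeholder, since this record does not state the archive's OAI-PMH endpoint, and the metadata prefixes "olac" and "oai_dc" are assumed to be the ones the archive exposes for the OLAC and simple DC formats.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption): the record does not give the real base URL.
BASE_URL = "https://example.org/oai"


def get_record_url(identifier, metadata_prefix, base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL for a single record."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# Identifier taken from the OaiIdentifier field of this record.
RECORD_ID = "oai:catalogue.elra.info:ELRA-S0484"

olac_url = get_record_url(RECORD_ID, "olac")    # OLAC format (assumed prefix)
dc_url = get_record_url(RECORD_ID, "oai_dc")    # simple Dublin Core format
```

Replacing BASE_URL with the archive's actual endpoint would yield requests corresponding to the two GetRecord entries listed for this record.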

Search Info

Citation: n.a. 2022. ELRA (European Language Resources Association).
Terms: area_Europe country_GB dcmi_Sound iso639_eng olac_primary_text


http://www.language-archives.org/item.php/oai:catalogue.elra.info:ELRA-S0484
Up-to-date as of: Fri Apr 19 6:30:37 EDT 2024