OLAC Record
oai:dspace-clarin-it.ilc.cnr.it:20.500.11752/ILC-999

Metadata
Title:ADAM Corpus
Bibliographic Citation:http://hdl.handle.net/20.500.11752/ILC-999
Creator:Cattoni, Roldano
Danieli, Morena
Soria, Claudia
Date (W3CDTF):2023-07-17T08:07:33Z
Date Available:2023-07-17T08:07:33Z
Description:The ADAM spoken corpus is a collection of 450 spoken dialogues: 200 human-human and 250 human-machine. All the dialogues are recordings and transcriptions of telephone conversations in the semantic domain of tourism and railway transportation. The format of the audio files is the standard format for telephone signal data recommended by the SPEECHDAT3 project directions. Each dialogue is annotated at five levels of linguistic information: prosody, morphosyntax, syntax, semantics and pragmatics. For each level, a corresponding annotation scheme has been defined that provides annotation instructions, examples and criteria. The result of each annotation is an XML file that encodes the content of a dialogue at a particular level, according to the annotation scheme of that level. The human-human dialogues are simulated telephone conversations between two experimental subjects playing the roles of a travel agent and a caller, respectively. The human-machine dialogues were collected in the field: they are interactions between callers and the automatic telephone information service of the Italian railway company, recorded during an experimental phase of that service. Each dialogue in the ADAM corpus is represented by an orthographic transcription (physically an XML file), which in turn is linked to an audio file containing the corresponding recording. In addition, the transcription of each dialogue is associated with five XML annotation files, one for each of the five levels of linguistic information listed above.
Identifier (URI):http://hdl.handle.net/20.500.11752/ILC-999
Language:Italian
Language (ISO639):ita
Publisher:Istituto di Linguistica Computazionale "A. Zampolli", Consiglio Nazionale delle Ricerche
Rights:Creative Commons - Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
http://creativecommons.org/licenses/by-nc/4.0/
Subject:human-human spoken dialogues
human-machine spoken dialogues
Type:corpus
Type (DCMI):Text
Type (OLAC):primary_text

OLAC Info

Archive:  ILC-CNR for CLARIN-IT repository hosted at Institute for Computational Linguistics "A. Zampolli", National Research Council, in Pisa
Description:  http://www.language-archives.org/archive/dspace-clarin-it.ilc.cnr.it
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:dspace-clarin-it.ilc.cnr.it:20.500.11752/ILC-999
DateStamp:  2023-07-17
GetRecord:  OAI-PMH request for simple DC format
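
The GetRecord entries in this record refer to OAI-PMH requests whose target URLs are not shown here. As a minimal sketch, a GetRecord request URL for this identifier could be assembled as follows. The `verb`, `identifier` and `metadataPrefix` parameters and the `oai_dc` prefix come from the OAI-PMH protocol; the `olac` prefix is assumed to be the one the data provider uses for OLAC format, and `BASE_URL` and `get_record_url` are hypothetical names, since this record does not state the endpoint.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: this record does not state the OAI-PMH base URL.
BASE_URL = "https://example.org/oai/request"

# Identifier copied from the OaiIdentifier field of this record.
OAI_ID = "oai:dspace-clarin-it.ilc.cnr.it:20.500.11752/ILC-999"


def get_record_url(identifier: str, metadata_prefix: str,
                   base_url: str = BASE_URL) -> str:
    """Build an OAI-PMH GetRecord request URL (no network access)."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# Simple DC format: "oai_dc" is the prefix every OAI-PMH repository must support.
dc_url = get_record_url(OAI_ID, "oai_dc")

# OLAC format: "olac" is an assumed prefix, to be checked against the provider.
olac_url = get_record_url(OAI_ID, "olac")

print(dc_url)
print(olac_url)
```

The sketch only builds the URLs. Fetching them and parsing the returned XML would require the real base URL of the data provider.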

Search Info

Citation: Cattoni, Roldano; Danieli, Morena; Soria, Claudia. 2023. Istituto di Linguistica Computazionale "A. Zampolli", Consiglio Nazionale delle Ricerche.
Terms: area_Europe country_IT dcmi_Text iso639_ita olac_primary_text


http://www.language-archives.org/item.php/oai:dspace-clarin-it.ilc.cnr.it:20.500.11752/ILC-999
Up-to-date as of: Tue Sep 19 0:43:08 EDT 2023