OLAC Record
oai:dspace-clarin-it.ilc.cnr.it:20.500.11752/OPEN-557

Metadata
Title:Italian Sense Inventory
Bibliographic Citation:http://hdl.handle.net/20.500.11752/OPEN-557
Creator:Poli, Francesca
Date (W3CDTF):2021-07-01T12:23:35Z
Date Available:2021-07-01T12:23:35Z
Description:The present Sense Inventory is an Italian language resource automatically derived from two Italian computational lexicons: ItalWordNet (https://dspace-clarin-it.ilc.cnr.it/repository/xmlui/handle/20.500.11752/ILC-62) and PAROLE-SIMPLE-CLIPS (https://dspace-clarin-it.ilc.cnr.it/repository/xmlui/handle/20.500.11752/ILC-88). It was built in collaboration with the CNR Institute of Computational Linguistics as an experiment related to the ELEXIS project (https://elex.is/), with the aim of producing a concise, structured inventory of senses to be used for the sense annotation of the ELEXIS WSD test corpus. The Sense Inventory is thus based on the lemmas occurring in the ELEXIS test corpus and on the merged sense information derived from the two existing lexicons.
The Python program developed for the automatic construction of the Sense Inventory takes the ELEXIS dataset as input, extracts the lemmas from its sentences and searches for all related senses in the above-mentioned resources. It also makes use of a sense mapping database for the cited lexicons, 'iwnmapdb', available upon request from CNR-ILC.
The extracted and checked data are then arranged in a formal structure in which the following details are given for each lemma-PoS pair:
- Unmapped senses extracted from PAROLE-SIMPLE-CLIPS (PSC),
- Mapped senses extracted from the mapping database 'iwnmapdb',
- Unmapped senses extracted from ItalWordNet (IWN).
All fields with no value are filled with None. The tab-separated format has the following columns: LEMMA, POS, CONCATENATED DEFINITION PSC-IWN, USEMID PSC, DEFINITION PSC, EXAMPLE PSC, SEMANTIC TYPE PSC, SYNSETID IWN, SENSEID IWN, DEFINITION IWN.
The total number of lemmas (with an ADV/ADJ/NOUN/VERB part of speech) included in the Sense Inventory is 3,860. The Sense Inventory reports 12,944 senses and mappings, out of a total of 15,672 senses extracted from PAROLE-SIMPLE-CLIPS and ItalWordNet; 3,461 mappings were extracted from the mapping database 'iwnmapdb' and then included in the Sense Inventory as relevant senses.
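As an illustration of the tab-separated structure described above, the following minimal Python sketch reads such a file into dictionaries and converts the literal string None into a real null value. It is not the program used to build the resource. The split of the header into ten column names, the possible presence of a header row, and the sample row (an invented placeholder, not real data from the Sense Inventory) are all assumptions.

```python
import csv
import io

# Assumption: the flattened header in the description splits into these ten columns.
COLUMNS = [
    "LEMMA", "POS", "CONCATENATED DEFINITION PSC-IWN", "USEMID PSC",
    "DEFINITION PSC", "EXAMPLE PSC", "SEMANTIC TYPE PSC",
    "SYNSETID IWN", "SENSEID IWN", "DEFINITION IWN",
]


def read_inventory(handle):
    """Yield one dict per data row, mapping the literal 'None' to Python None."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if row == COLUMNS:
            continue  # skip a header row, if the file has one (assumption)
        if len(row) != len(COLUMNS):
            continue  # skip rows that do not match the assumed layout
        yield {
            name: (None if value == "None" else value)
            for name, value in zip(COLUMNS, row)
        }


# Invented placeholder row, used only to exercise the reader.
sample = "\t".join([
    "lemma_x", "NOUN", "placeholder definition", "USEM_PLACEHOLDER",
    "placeholder definition", "None", "None", "None", "None", "None",
]) + "\n"

rows = list(read_inventory(io.StringIO(sample)))
print(rows[0]["LEMMA"], rows[0]["POS"], rows[0]["EXAMPLE PSC"])
```

To read an actual copy of the resource, the same function could be given a file opened with open(path, encoding="utf-8", newline=""), where path is whatever local file name the download has.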
Identifier (URI):http://hdl.handle.net/20.500.11752/OPEN-557
Language:Italian
Language (ISO639):ita
Publisher:Università di Pisa
Istituto di Linguistica Computazionale “A. Zampolli” - Consiglio Nazionale delle Ricerche (ILC-CNR)
Rights:Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
http://creativecommons.org/licenses/by-nc-sa/4.0/
Subject:lexical resources
word sense disambiguation
semantic dataset
Italian language
Subject (ISO639):ita
Type:lexicalConceptualResource
Type (DCMI):Text
Type (OLAC):lexicon

OLAC Info

Archive:  ILC-CNR for CLARIN-IT repository hosted at Institute for Computational Linguistics "A. Zampolli", National Research Council, in Pisa
Description:  http://www.language-archives.org/archive/dspace-clarin-it.ilc.cnr.it
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:dspace-clarin-it.ilc.cnr.it:20.500.11752/OPEN-557
DateStamp:  2021-07-01
GetRecord:  OAI-PMH request for simple DC format

Search Info

Citation: Poli, Francesca. 2021. Università di Pisa.
Terms: area_Europe country_IT dcmi_Text iso639_ita olac_lexicon

Inferred Metadata

Country: Italy
Area: Europe


http://www.language-archives.org/item.php/oai:dspace-clarin-it.ilc.cnr.it:20.500.11752/OPEN-557
Up-to-date as of: Tue Sep 19 0:42:10 EDT 2023