OLAC Record
oai:www.ldc.upenn.edu:LDC2008T24

Metadata
Title:COMNOM v 1.0
Access Rights:Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:Meyers, Adam, Ruth Reeves, and Catherine Macleod. COMNOM v 1.0 LDC2008T24. Web Download. Philadelphia: Linguistic Data Consortium, 2008
Contributor:Meyers, Adam
Reeves, Ruth
Macleod, Catherine
Date (W3CDTF):2008
Date Issued (W3CDTF):2007-09-15
Description:*Introduction*
COMNOM is an automatically enriched version of COMLEX Syntax that was created at New York University as part of the NomBank annotation project. COMLEX resources are distributed by the Linguistic Data Consortium (LDC) and consist of the following: COMLEX English Syntax Lexicon (LDC98L21), an English dictionary of approximately 38,000 lemmas with detailed information about the syntactic characteristics and subcategorization (complement structures) of each lexical item; and COMLEX Syntax Text Corpus Version 2.0 (LDC96T11).
COMNOM adds classes to COMLEX Syntax lexical entries using NOMLEX-PLUS, a dictionary with approximately 8,000 entries. COMNOM collected prepositions from NOMLEX-PLUS subcategorizations (:VERB-SUBC, :OBJECT, :SUBJECT, etc.), deduced essential complements from them, and added them to the existing COMLEX entry. Further information about the methodology used in COMNOM can be found in Meyers, "Those Other NomBank Dictionaries -- Manual for Dictionaries that Come with NomBank". Related resources and further information about COMNOM and NomBank are available from the NomBank project website.
A license to COMLEX English Syntax Lexicon (LDC98L21) or COMLEX Syntax Text Corpus Version 2.0 (LDC96T11) is required in order to obtain COMNOM v 1.0.
*Data*
This release includes three versions of COMNOM, which correspond to the three versions of NOMLEX-PLUS and are characterized by the amount of corpus training that influenced their creation. The data used for training are the Wall Street Journal materials in the Penn Treebanks (Treebank-2 and Treebank-3), with annotations from Proposition Bank I and NomBank 1.0. The three versions are:
* COMNOM-clean.1.0 -- contains no information derived from annotated data
* COMNOM.1.0 -- contains information from the entire annotated corpus
* COMNOM-training.1.0 -- contains information from annotated data in sections 02-21 of the corpus only
Extent:Corpus size: 36864 KB
Identifier:LDC2008T24
https://catalog.ldc.upenn.edu/LDC2008T24
ISBN: 1-58563-493-X
ISLRN: 419-167-670-549-0
DOI: 10.35111/fjkn-rv50
Language:English
Language (ISO639):eng
License:LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf
Medium:Distribution: Web Download
Publisher:Linguistic Data Consortium
Publisher (URI):https://www.ldc.upenn.edu
Relation (URI):https://catalog.ldc.upenn.edu/docs/LDC2008T24
Rights Holder:Portions © 1987-1989 Dow Jones & Company, Inc., © 1996, 1998, 2008 Trustees of the University of Pennsylvania
Type (DCMI):Text
Type (OLAC):primary_text

OLAC Info

Archive:  The LDC Corpus Catalog
Description:  http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.ldc.upenn.edu:LDC2008T24
DateStamp:  2020-11-30
GetRecord:  OAI-PMH request for simple DC format

Search Info

Citation: Meyers, Adam; Reeves, Ruth; Macleod, Catherine. 2008. Linguistic Data Consortium.
Terms: area_Europe country_GB dcmi_Text iso639_eng olac_primary_text


http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC2008T24
Up-to-date as of: Mon Mar 25 7:20:20 EDT 2024