OLAC Record: COMNOM v 1.0

OLAC Record
oai:www.ldc.upenn.edu:LDC2008T24

Metadata

Title: COMNOM v 1.0

Access Rights: Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining

Bibliographic Citation: Meyers, Adam, Ruth Reeves, and Catherine Macleod. COMNOM v 1.0 LDC2008T24. Web Download. Philadelphia: Linguistic Data Consortium, 2008

Contributor: Meyers, Adam

Reeves, Ruth

Macleod, Catherine

Date (W3CDTF): 2008

Date Issued (W3CDTF): 2007-09-15

Description: *Introduction*

COMNOM is an automatically enriched version of COMLEX Syntax that was created at New York University as part of the NomBank annotation project. COMLEX resources are distributed by the Linguistic Data Consortium (LDC) and consist of the following: COMLEX English Syntax Lexicon (LDC98L21), an English dictionary consisting of approximately 38,000 lemmas with detailed information about the syntactic characteristics of each lexical item and subcategorization (complement structures); and COMLEX Syntax Text Corpus Version 2.0 (LDC96T11).

COMNOM adds classes to COMLEX Syntax lexical entries using NOMLEX-PLUS, a dictionary with approximately 8,000 entries. COMNOM collected prepositions from NOMLEX-PLUS subcategorizations (:VERB-SUBC, :OBJECT, :SUBJECT, etc.), deduced essential complements from them and added them to the existing COMLEX entry.

Further information about the methodology used in COMNOM can be found in Meyers, "Those Other NomBank Dictionaries -- Manual for Dictionaries that Come with NomBank". Related resources and further information about COMNOM and NomBank are available from the NomBank project website.

A license to COMLEX English Syntax Lexicon (LDC98L21) or COMLEX Syntax Text Corpus Version 2.0 (LDC96T11) is required to obtain COMNOM v 1.0.

*Data*

This release includes three versions of COMNOM, which correspond to the three versions of NOMLEX-PLUS and are characterized by the amount of corpus training that influenced their creation. The data used for training are the Wall Street Journal materials in the Penn Treebanks (Treebank-2 and Treebank-3), with annotations from Proposition Bank I and NomBank 1.0. The three versions are:

* COMNOM-clean.1.0 -- contains no information derived from annotated data
* COMNOM.1.0 -- contains information from the entire annotated corpus
* COMNOM-training.1.0 -- contains information from annotated data in sections 02-21 of the corpus only

Extent: Corpus size: 36864 KB

Identifier: LDC2008T24

https://catalog.ldc.upenn.edu/LDC2008T24

ISBN: 1-58563-493-X

ISLRN: 419-167-670-549-0

DOI: 10.35111/fjkn-rv50

Language: English

Language (ISO639): eng

License: LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf

Medium: Distribution: Web Download

Publisher: Linguistic Data Consortium

Publisher (URI): https://www.ldc.upenn.edu

Relation (URI): https://catalog.ldc.upenn.edu/docs/LDC2008T24

Rights Holder: Portions © 1987-1989 Dow Jones & Company, Inc., © 1996, 1998, 2008 Trustees of the University of Pennsylvania

Type (DCMI): Text

Type (OLAC): primary_text

OLAC Info

Archive: The LDC Corpus Catalog

Description: http://www.language-archives.org/archive/www.ldc.upenn.edu

GetRecord: OAI-PMH request for OLAC format

GetRecord: Pre-generated XML file

OAI Info

OaiIdentifier: oai:www.ldc.upenn.edu:LDC2008T24

DateStamp: 2020-11-30

GetRecord: OAI-PMH request for simple DC format
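
The GetRecord entries above refer to OAI-PMH requests for this record in two metadata formats. As a rough sketch of what such a request could look like, the snippet below builds GetRecord URLs from the OaiIdentifier listed in this record. The base URL is a placeholder (this record does not state the repository endpoint), and the helper name `get_record_url` is hypothetical. The prefixes `olac` and `oai_dc` are assumed to be the ones the repository advertises for the OLAC and simple DC formats.

```python
from urllib.parse import urlencode

# Placeholder endpoint: the record does not state the OAI-PMH base URL,
# so substitute the repository's real one before use.
BASE_URL = "https://example.org/oai"

# Identifier copied from the OaiIdentifier field of this record.
OAI_IDENTIFIER = "oai:www.ldc.upenn.edu:LDC2008T24"


def get_record_url(identifier: str, metadata_prefix: str,
                   base_url: str = BASE_URL) -> str:
    """Build an OAI-PMH GetRecord request URL (hypothetical helper)."""
    query = urlencode({
        "verb": "GetRecord",            # OAI-PMH verb for fetching one record
        "identifier": identifier,       # OAI identifier of the record
        "metadataPrefix": metadata_prefix,  # requested metadata format
    })
    return f"{base_url}?{query}"


# Assumed prefixes: "olac" for OLAC format, "oai_dc" for simple Dublin Core.
olac_url = get_record_url(OAI_IDENTIFIER, "olac")
dc_url = get_record_url(OAI_IDENTIFIER, "oai_dc")

print(olac_url)
print(dc_url)
```

The colons in the identifier are percent-encoded by `urlencode`, so the printed URLs will not contain the identifier verbatim.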

Search Info
Citation: Meyers, Adam; Reeves, Ruth; Macleod, Catherine. 2008. Linguistic Data Consortium.
Terms: area_Europe country_GB dcmi_Text iso639_eng olac_primary_text

http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC2008T24
Up-to-date as of: Wed Oct 29 7:01:05 EDT 2025
