OLAC Record

Title:Chinese Proposition Bank 2.0
Access Rights:Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:Xue, Nianwen, et al. Chinese Proposition Bank 2.0 LDC2008T07. Web Download. Philadelphia: Linguistic Data Consortium, 2008
Contributor:Xue, Nianwen
Palmer, Martha
Chang, Meiyu
Jiang, Zixin
Date (W3CDTF):2008
Date Issued (W3CDTF):2008-05-19
Description:Chinese Proposition Bank 2.0 is a continuation of the Chinese Proposition Bank project, which aims to create a corpus of Chinese text annotated with information about basic semantic propositions. Chinese Proposition Bank 1.0 consists of predicate-argument annotation on 250,000 words from Chinese Treebank 5.0. Chinese Proposition Bank 2.0 adds predicate-argument annotation on 500,000 words from Chinese Treebank 6.0. The data sources include newswire from Xinhua News Agency, articles from Sinorama Magazine, news from the website of the Hong Kong Special Administrative Region, and transcripts from various Chinese broadcast news programs.
*Data*
This release contains the predicate-argument annotation of 81,009 verb instances (11,171 unique verbs) and 14,525 noun instances (1,421 unique nouns). The annotation of nouns is limited to nominalizations that have a corresponding verb. The general annotation guidelines and the lexical guidelines (called frame files) for each verbal and nominal predicate are included in this release.
Total propositions for verbs: 81,009
Total propositions for nouns: 14,525
Total verbs framed: 11,171
Total framesets: 11,776
Verbs with multiple framesets: 474
Average framesets per verb: 1.05
Total nouns framed: 1,421
Total noun framesets: 1,528
Nouns with multiple framesets: 48
Average framesets per noun: 1.08
*Samples*
For an example of the data in this corpus, please examine this sample image (JPEG) of a parse tree.
Extent:Corpus size: 159744 KB
ISBN: 1-58563-451-4
ISLRN: 794-819-316-121-4
DOI: 10.35111/nh9x-6n14
Language:Mandarin Chinese
Language (ISO639):cmn
License:LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf
Medium:Distribution: Web Download
Publisher:Linguistic Data Consortium
Publisher (URI):https://www.ldc.upenn.edu
Relation (URI):https://catalog.ldc.upenn.edu/docs/LDC2008T07
Rights Holder:Portions © 2000-2001 China Broadcasting System, © 2000-2001 China Central TV, © 2000-2001 China National Radio, © 2000-2001 China Television System, © 1997 The Government of the Hong Kong Special Administrative Region, © 1996-2001 Sinorama Magazine, © 1994-1998 Xinhua News Agency, © 2001, 2004, 2005, 2007, 2008 Trustees of the University of Pennsylvania
Type (DCMI):Text
Type (OLAC):primary_text


Archive:  The LDC Corpus Catalog
Description:  http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.ldc.upenn.edu:LDC2008T07
DateStamp:  2020-11-30
GetRecord:  OAI-PMH request for simple DC format
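
The GetRecord entries in this record refer to OAI-PMH requests. As a minimal sketch, such a request URL could be assembled from the OaiIdentifier shown above. The base URL below is a hypothetical placeholder, because this record does not state the repository endpoint; the helper name get_record_url is likewise illustrative. The metadataPrefix oai_dc is the standard OAI-PMH prefix for simple DC; olac is assumed here to be the prefix for the OLAC format.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: the record does not give the OAI-PMH base URL.
BASE_URL = "https://example.org/oai"

# Identifier taken from the OaiIdentifier field of this record.
OAI_IDENTIFIER = "oai:www.ldc.upenn.edu:LDC2008T07"


def get_record_url(identifier, metadata_prefix, base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL (illustrative helper)."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# Simple Dublin Core uses the standard prefix "oai_dc".
simple_dc_url = get_record_url(OAI_IDENTIFIER, "oai_dc")

# Assumption: the OLAC format is requested with the prefix "olac".
olac_url = get_record_url(OAI_IDENTIFIER, "olac")

print(simple_dc_url)
print(olac_url)
```

Replacing BASE_URL with the repository's actual endpoint would yield requests equivalent to the two GetRecord links listed in this record.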

Search Info

Citation: Xue, Nianwen; Palmer, Martha; Chang, Meiyu; Jiang, Zixin. 2008. Chinese Proposition Bank 2.0. Linguistic Data Consortium.
Terms: area_Asia country_CN dcmi_Text iso639_cmn olac_primary_text

Up-to-date as of: Tue May 7 7:24:54 EDT 2024