OLAC Record
oai:www.ldc.upenn.edu:LDC2021T13

Metadata
Title:Chinese Abstract Meaning Representation 2.0
Access Rights:Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:Li, Bin, et al. Chinese Abstract Meaning Representation 2.0 LDC2021T13. Web Download. Philadelphia: Linguistic Data Consortium, 2021
Contributor:Li, Bin
Xiao, Liming
Liu, Yihuan
Wen, Yuan
Song, Li
Chun, Jayeol
Feng, Minxuan
Zhou, Junsheng
Qu, Weiguang
Xue, Nianwen
Date (W3CDTF):2021
Date Issued (W3CDTF):2021-07-15
Description:*Introduction* Chinese Abstract Meaning Representation (CAMR) 2.0 was developed by Brandeis University and Nanjing Normal University and consists of semantic representations of a set of approximately 20,000 Chinese sentences from Chinese Treebank (CTB) 8.0 (LDC2013T21). CAMR 2.0 includes the content of Chinese Abstract Meaning Representation 1.0 (LDC2019T07) (CTB 8.0 weblog and discussion forum sentences), plus an additional 9,933 sentences from the newswire portion of CTB 8.0. Abstract Meaning Representation (AMR) captures "who is doing what to whom" in a sentence. Each sentence is paired with a graph that represents its whole-sentence meaning in a tree structure. LDC has released the following English AMR data sets: Abstract Meaning Representation (AMR) Annotation Release 1.0 (LDC2014T12), Abstract Meaning Representation (AMR) Annotation Release 2.0 (LDC2017T10) and Abstract Meaning Representation (AMR) Annotation Release 3.0 (LDC2020T02). Chinese AMR follows the basic principles developed for English (a compact, readable, whole-sentence semantic representation), with adaptations where necessary to handle Chinese-specific phenomena. For more information about the project, see the Chinese AMR homepage.
*Data* The text contains 20,078 sentences from the weblog, discussion forum, and newswire portions of CTB 8.0. Three sets of files are included: the original Chinese AMR data with concept-to-word and relation-to-word alignments, a converted English AMR format, and a Chinese syntactic dependency tree format. Each set is divided into training, development and test sets, and all files are presented as plain text in UTF-8 encoding.
*Samples* Please view this sample (TXT).
*Updates* None at this time.
Extent:Corpus size: 31459 KB
Identifier:LDC2021T13
https://catalog.ldc.upenn.edu/LDC2021T13
ISBN: 1-58563-970-2
ISLRN: 483-739-101-185-5
DOI: 10.35111/x61v-0p46
Language:Mandarin Chinese
Language (ISO639):cmn
License:LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf
Medium:Distribution: Web Download
Publisher:Linguistic Data Consortium
Publisher (URI):https://www.ldc.upenn.edu
Relation (URI):https://catalog.ldc.upenn.edu/docs/LDC2021T13
Rights Holder:Portions © 2006 Agence France Presse, © 2006 Anhui TV, © 2005 Cable News Network, LP, LLLP, © 2000-2001 China Broadcasting System, © 2000-2001, 2005-2006 China Central TV, © 2000-2001 China National Radio, © 2006 Chinanews.com, © 2000-2001 China Television System, © 2006 Guangming Daily, © 2006 National Broadcasting Company, Inc., © 2006 New Tang Dynasty TV, © 2006 Peoples Daily Online, © 2005-2006 Phoenix TV, © 1996-2001 Sinorama Magazine, © 1997 The Government of the Hong Kong Special Administrative Region, © 1994-1998, 2006 Xinhua News Agency, © 2019, 2021 Bin Li, © 2001, 2004, 2005, 2007, 2009, 2010, 2013, 2019, 2021 Trustees of the University of Pennsylvania
Type (DCMI):Text
Type (OLAC):primary_text

OLAC Info

Archive:  The LDC Corpus Catalog
Description:  http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.ldc.upenn.edu:LDC2021T13
DateStamp:  2022-01-01
GetRecord:  OAI-PMH request for simple DC format

Search Info

Citation: Li, Bin; Xiao, Liming; Liu, Yihuan; Wen, Yuan; Song, Li; Chun, Jayeol; Feng, Minxuan; Zhou, Junsheng; Qu, Weiguang; Xue, Nianwen. 2021. Chinese Abstract Meaning Representation 2.0. Linguistic Data Consortium.
Terms: area_Asia country_CN dcmi_Text iso639_cmn olac_primary_text


http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC2021T13
Up-to-date as of: Tue May 7 7:25:52 EDT 2024