OLAC Record
oai:www.ldc.upenn.edu:LDC2020T01

Metadata
Title:Chinese CogBank
Access Rights:Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:Li, Bin, et al. Chinese CogBank LDC2020T01. Web Download. Philadelphia: Linguistic Data Consortium, 2020
Contributor:Li, Bin
Yin, Siqi
Xu, Jie
Song, Li
Feng, Minxuan
Date (W3CDTF):2020
Date Issued (W3CDTF):2020-02-17
Description:*Introduction* Chinese CogBank is a database of cognitive properties of Chinese words intended for use in metaphor understanding and generation. It consists of 232,497 "word-property" pairs covering 83,104 words and 100,195 properties. Each "word-property" pair also has an associated frequency, which can serve as a functional measure of the importance of a property. *Data* The data was collected via the Chinese search engine Baidu.com. The original collection consisted of 1,258,430 types (5,637,500 tokens) of "word-adjective" pairs, which were reduced to the 232,497 "word-property" pairs in Chinese CogBank after a series of manual checks. The corpus is presented as a single tab-separated values (TSV) file encoded in UTF-8. *Updates* None at this time.
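The description above mentions a single UTF-8, tab-separated file of "word-property" pairs with frequencies. A minimal Python sketch of reading such a file follows. The three-column order (word, property, frequency), the helper names `load_pairs` and `top_properties`, and the sample rows are all assumptions made for illustration; they are not taken from the corpus or its documentation.

```python
import csv
import io


def load_pairs(lines):
    """Parse tab-separated rows into (word, property, frequency) tuples.

    Assumes three columns in that order; the real layout may differ,
    so check the corpus documentation before relying on this.
    """
    pairs = []
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 3:
            continue  # skip blank or malformed rows
        try:
            pairs.append((row[0], row[1], int(row[2])))
        except ValueError:
            continue  # skip rows whose third column is not an integer
    return pairs


def top_properties(pairs, word, n=3):
    """Return up to n (property, frequency) pairs for a word, most frequent first."""
    matches = [p for p in pairs if p[0] == word]
    matches.sort(key=lambda p: p[2], reverse=True)
    return [(prop, freq) for _, prop, freq in matches[:n]]


# Made-up sample rows for illustration only, not real corpus data.
sample = io.StringIO("狐狸\t狡猾\t120\n狐狸\t聪明\t45\n雪\t白\t300\n")
pairs = load_pairs(sample)
print(top_properties(pairs, "狐狸"))
```

For the actual file, the same function could be given a handle opened with `open(path, encoding="utf-8", newline="")`, where `path` is wherever the download was saved.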
Extent:Corpus size: 3480 KB
Identifier:LDC2020T01
https://catalog.ldc.upenn.edu/LDC2020T01
ISBN: 1-58563-917-6
ISLRN: 382-367-821-870-2
DOI: 10.35111/w8tv-1e21
Language:Mandarin Chinese
Language (ISO639):cmn
License:LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf
Medium:Distribution: Web Download
Publisher:Linguistic Data Consortium
Publisher (URI):https://www.ldc.upenn.edu
Relation (URI):https://catalog.ldc.upenn.edu/docs/LDC2020T01
Rights Holder:Portions © 2020 Bin Li, © 2020 Trustees of the University of Pennsylvania
Type (DCMI):Text
Type (OLAC):primary_text

OLAC Info

Archive:  The LDC Corpus Catalog
Description:  http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.ldc.upenn.edu:LDC2020T01
DateStamp:  2021-01-01
GetRecord:  OAI-PMH request for simple DC format
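The GetRecord entries above refer to OAI-PMH requests. As a rough sketch, such a request is an HTTP query with `verb`, `identifier`, and `metadataPrefix` arguments, as defined by the OAI-PMH protocol. The base URL below is a placeholder, not the archive's real endpoint, and the prefixes `olac` and `oai_dc` are assumed to be the ones this repository uses for the OLAC and simple DC formats.

```python
from urllib.parse import urlencode

# Placeholder endpoint: the repository's real OAI-PMH base URL is not given in this record.
BASE_URL = "https://example.org/oai"


def get_record_url(base_url, identifier, metadata_prefix):
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


identifier = "oai:www.ldc.upenn.edu:LDC2020T01"
olac_url = get_record_url(BASE_URL, identifier, "olac")    # OLAC format (assumed prefix)
dc_url = get_record_url(BASE_URL, identifier, "oai_dc")    # simple DC format
print(olac_url)
print(dc_url)
```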

Search Info

Citation: Li, Bin; Yin, Siqi; Xu, Jie; Song, Li; Feng, Minxuan. 2020. Linguistic Data Consortium.
Terms: area_Asia country_CN dcmi_Text iso639_cmn olac_primary_text


http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC2020T01
Up-to-date as of: Tue May 7 7:25:46 EDT 2024