OLAC Record
oai:www.ldc.upenn.edu:LDC2017T14

Metadata
Title:Ancient Chinese Corpus
Access Rights:Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:Chen, Xiaohe, et al. Ancient Chinese Corpus LDC2017T14. Web Download. Philadelphia: Linguistic Data Consortium, 2017
Contributor:Chen, Xiaohe
Li, Bin
Feng, Minxuan
Xu, Chao
Xu, Runhua
Shi, Min
Yu, Lili
Xiao, Lei
Wang, Qingqing
Date (W3CDTF):2017
Date Issued (W3CDTF):2017-10-18
Description:*Introduction* Ancient Chinese Corpus was developed at Nanjing Normal University. It contains word-segmented and part-of-speech tagged text from Zuozhuan, an ancient Chinese work believed to date from the Warring States Period (475-221 BC). Zuozhuan is a commentary on the Chunqiu, a history of the Chinese Spring and Autumn period (770-476 BC). This release is part of a continuing project to develop a large, part-of-speech tagged ancient Chinese corpus. *Data* Ancient Chinese Corpus consists of 180,000 Chinese characters and 195,000 segment units (including words and punctuation). The part-of-speech tag set was developed by Nanjing Normal University and contains 17 tags. This release contains two text files with 268 paragraphs and 10,560 lines. A line is one sentence; paragraphs are separated by one empty line. Each word is tagged with its part of speech, and words are separated by spaces. The files are UTF-8 plain text in traditional Chinese script. *Updates* None at this time.
Extent:Corpus size: 1584 KB
Identifier:LDC2017T14
https://catalog.ldc.upenn.edu/LDC2017T14
ISBN: 1-58563-816-1
ISLRN: 924-985-704-453-5
DOI: 10.35111/ctjv-ez04
Language:Literary Chinese
Language (ISO639):lzh
License:LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf
Medium:Distribution: Web Download
Publisher:Linguistic Data Consortium
Publisher (URI):https://www.ldc.upenn.edu
Relation (URI):https://catalog.ldc.upenn.edu/docs/LDC2017T14
Rights Holder:Portions © 2017 Xiaohe Chen, © 2017 Trustees of the University of Pennsylvania
Type (DCMI):Text
Type (OLAC):primary_text
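
The Description field above outlines the file layout: one sentence per line, paragraphs separated by one empty line, and space-separated words that each carry a part-of-speech tag. A minimal Python sketch of a reader for that layout might look like the following. The record does not say how a word is joined to its tag, so the `word/tag` delimiter is an assumption, as are the function name `parse_tagged_text` and the sample tags `n`, `v`, and `w`; the corpus documentation should be checked before relying on any of them.

```python
from typing import List, Tuple

Token = Tuple[str, str]        # (word, tag)
Sentence = List[Token]         # one line of the file
Paragraph = List[Sentence]     # lines between empty lines


def parse_tagged_text(text: str, delimiter: str = "/") -> List[Paragraph]:
    """Split tagged text into paragraphs, sentences, and (word, tag) pairs.

    Assumption (not confirmed by the record): each token is written as
    word<delimiter>tag. A token without the delimiter gets an empty tag.
    """
    paragraphs: List[Paragraph] = []
    current: Paragraph = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # An empty line closes the current paragraph.
            if current:
                paragraphs.append(current)
                current = []
            continue
        sentence: Sentence = []
        for token in line.split():
            # Split on the last delimiter so a word containing it stays whole.
            word, sep, tag = token.rpartition(delimiter)
            if not sep:
                word, tag = token, ""
            sentence.append((word, tag))
        current.append(sentence)
    if current:
        paragraphs.append(current)
    return paragraphs


# Hypothetical sample in the assumed format, not taken from the corpus.
sample = "王/n 曰/v 。/w\n公/n 至/v 。/w\n\n秋/n 。/w\n"
parsed = parse_tagged_text(sample)
print(len(parsed), "paragraphs;", sum(len(p) for p in parsed), "sentences")
```

Keeping the delimiter as a parameter means the same reader still works if the files turn out to use a different separator.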

OLAC Info

Archive:  The LDC Corpus Catalog
Description:  http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.ldc.upenn.edu:LDC2017T14
DateStamp:  2020-11-30
GetRecord:  OAI-PMH request for simple DC format
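
The GetRecord entries in this record refer to OAI-PMH requests for the same identifier in two metadata formats. As a rough sketch, such a request is an HTTP GET carrying the `verb`, `identifier`, and `metadataPrefix` arguments; `oai_dc` is the standard prefix for simple Dublin Core, and `olac` is the prefix used for OLAC format. The repository's base URL does not appear in this record, so `BASE_URL` below is a hypothetical placeholder, and `get_record_url` is an illustrative helper name.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: the record does not give the repository base URL.
BASE_URL = "https://example.org/oai"

OAI_IDENTIFIER = "oai:www.ldc.upenn.edu:LDC2017T14"


def get_record_url(identifier: str, metadata_prefix: str,
                   base_url: str = BASE_URL) -> str:
    """Build an OAI-PMH GetRecord request URL for one record and format."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


olac_url = get_record_url(OAI_IDENTIFIER, "olac")    # OLAC format
dc_url = get_record_url(OAI_IDENTIFIER, "oai_dc")    # simple DC format
print(olac_url)
print(dc_url)
```

The colons in the identifier are percent-encoded by `urlencode`, which keeps the query string valid.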

Search Info

Citation: Chen, Xiaohe; Li, Bin; Feng, Minxuan; Xu, Chao; Xu, Runhua; Shi, Min; Yu, Lili; Xiao, Lei; Wang, Qingqing. 2017. Ancient Chinese Corpus. Linguistic Data Consortium.
Terms: dcmi_Text iso639_lzh olac_primary_text


http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC2017T14
Up-to-date as of: Fri Dec 6 7:48:40 EST 2024