OLAC Record oai:www.ldc.upenn.edu:LDC2021T16 |
Metadata | ||
Title: | DiscAlign for Penn and RST Discourse Treebanks | |
Access Rights: | Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining | |
Bibliographic Citation: | Demberg, Vera, Fatemeh Asr, and Merel Scholman. DiscAlign for Penn and RST Discourse Treebanks LDC2021T16. Philadelphia: Linguistic Data Consortium, 2021 | |
Contributor: | Demberg, Vera | |
Asr, Fatemeh Torabi | ||
Scholman, Merel C.J. | ||
Date (W3CDTF): | 2021 | |
Date Issued (W3CDTF): | 2021-09-15 | |
Description: | *Introduction* DiscAlign for Penn and RST Discourse Treebanks was developed by Saarland University. It consists of alignment information for the discourse annotations contained in Penn Discourse Treebank Version 2.0 (LDC2008T05) (PDTB 2.0) and RST Discourse Treebank (LDC2002T07) (RST-DT). PDTB 2.0 and RST-DT annotations overlap for 385 newspaper articles in sections 6, 11, 13, 19 and 23 of the Wall Street Journal corpus contained in Treebank-2 (LDC95T7). DiscAlign for Penn and RST Discourse Treebanks contains approximately 6,700 alignments between PDTB 2.0 and RST-DT relations. DiscAlign for Penn and RST Discourse Treebanks is available at no cost to all licensees of PDTB 2.0 and RST-DT and appears in the download queues associated with these corpora as DiscAlign_Penn_RST_DTB_LDC2021T16.zip. *Data* The alignment table is presented as a single UTF-8 encoded CSV file in which each row represents a PDTB discourse relation that has been mapped to an RST relation from the RST-DT corpus. Table columns provide basic information about the source relation extracted from PDTB, the target relation extracted from RST-DT, and the quality of the alignment between the two. See the included documentation for more details on the columns and the mapping procedure. *Samples* Please view this sample (TXT). *Updates* None at this time. | |
Extent: | Corpus size: 2002 KB | |
Identifier: | LDC2021T16 | |
https://catalog.ldc.upenn.edu/LDC2021T16 | ||
ISBN: 1-58563-975-3 | ||
ISLRN: 013-086-726-491-1 | ||
DOI: 10.35111/cf0q-c454 | ||
Language: | English | |
Language (ISO639): | eng | |
License: | LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf | |
Publisher: | Linguistic Data Consortium | |
Publisher (URI): | https://www.ldc.upenn.edu | |
Relation (URI): | https://catalog.ldc.upenn.edu/docs/LDC2021T16 | |
Rights Holder: | Portions © 2021 Saarland University, © 2021 Trustees of the University of Pennsylvania | |
Type (DCMI): | Text | |
Type (OLAC): | primary_text | |
OLAC Info |
||
Archive: | The LDC Corpus Catalog | |
Description: | http://www.language-archives.org/archive/www.ldc.upenn.edu | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info |
||
OaiIdentifier: | oai:www.ldc.upenn.edu:LDC2021T16 | |
DateStamp: | 2021-09-15 | |
GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
Citation: | Demberg, Vera; Asr, Fatemeh Torabi; Scholman, Merel C.J. 2021. Linguistic Data Consortium. | |
Terms: | area_Europe country_GB dcmi_Text iso639_eng olac_primary_text |
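
The Description above says the alignment table ships as a single UTF-8 encoded CSV file with one row per PDTB relation. As a minimal sketch of how such a file might be inspected, the Python below reads the header and counts rows without assuming any particular column layout. The function name and the three column names in the stand-in sample are hypothetical and chosen for illustration only; the real column names are defined in the documentation included with the corpus.

```python
import csv
import io


def summarize_alignment_table(handle):
    """Return (column_names, row_count) for a CSV table read from `handle`."""
    reader = csv.DictReader(handle)
    # Iterating consumes the data rows; the header row is read first.
    row_count = sum(1 for _ in reader)
    return list(reader.fieldnames or []), row_count


# In-memory stand-in for the real file. These column names and values are
# invented for illustration and are NOT taken from the corpus.
sample = io.StringIO(
    "pdtb_relation,rst_relation,alignment_quality\n"
    "Contingency.Cause,explanation,good\n"
    "Comparison.Contrast,contrast,partial\n"
)

columns, n_rows = summarize_alignment_table(sample)
print(columns, n_rows)
```

For the actual CSV, the same function could be given a handle from `open(path, encoding="utf-8", newline="")`, where `path` is wherever the file was extracted from the zip archive.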