OLAC Record oai:www.ldc.upenn.edu:LDC2015T10 |
Metadata | ||
Title: | RST Signalling Corpus | |
Access Rights: | Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining | |
Bibliographic Citation: | Das, Debopam, Maite Taboada, and Paul McFetridge. RST Signalling Corpus LDC2015T10. Web Download. Philadelphia: Linguistic Data Consortium, 2015 | |
Contributor: | Das, Debopam | |
Taboada, Maite | ||
McFetridge, Paul | ||
Date (W3CDTF): | 2015 | |
Date Issued (W3CDTF): | 2015-06-15 | |
Description: | *Introduction* RST Signalling Corpus was developed at Simon Fraser University and contains annotations for signalling information added to RST Discourse Treebank (LDC2002T07). RST Discourse Treebank (RST-DT) is a collection of English news texts annotated for rhetorical relations under the RST (Rhetorical Structure Theory) framework. In RST Signalling Corpus, information about textual signals (such as although, because, thus) and about other signals such as tense, lexical chains or punctuation was added as an annotation layer to examine how rhetorical relations are signalled in discourse. *Data* The source data consists of 385 Wall Street Journal news articles from the Penn Treebank annotated for rhetorical relations in RST Discourse Treebank. As in RST-DT, the data in this release is divided into a training set (347 articles) and a test set (38 articles). The signalling annotation in this data set was performed using the UAM CorpusTool version 2.8.12. Files are presented as UTF-8 encoded XML and plain text. The corpus is divided into three annotation sub-directories: training, test and full. All sub-directories include source, metadata, signalling annotation, and DTD files. *Samples* Please view the following samples: * Metadata Sample * Signal Sample * Text Sample *Updates* None at this time. | |
Extent: | Corpus size: 38176 KB | |
Identifier: | LDC2015T10 | |
https://catalog.ldc.upenn.edu/LDC2015T10 | ||
ISBN: 1-58563-719-X | ||
ISLRN: 256-234-245-630-4 | ||
DOI: 10.35111/5sm9-m096 | ||
Language: | English | |
Language (ISO639): | eng | |
License: | LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf | |
Medium: | Distribution: Web Download | |
Publisher: | Linguistic Data Consortium | |
Publisher (URI): | https://www.ldc.upenn.edu | |
Relation (URI): | https://catalog.ldc.upenn.edu/docs/LDC2015T10 | |
Rights Holder: | Portions © 1987-1989 Dow Jones & Company, Inc., © 2015 Debopam Das, © 2015 Maite Taboada, © 1995, 1999, 2002, 2015 Trustees of the University of Pennsylvania | |
Type (DCMI): | Text | |
Type (OLAC): | primary_text | |
OLAC Info |
||
Archive: | The LDC Corpus Catalog | |
Description: | http://www.language-archives.org/archive/www.ldc.upenn.edu | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info |
||
OaiIdentifier: | oai:www.ldc.upenn.edu:LDC2015T10 | |
DateStamp: | 2020-11-30 | |
GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
Citation: | Das, Debopam; Taboada, Maite; McFetridge, Paul. 2015. Linguistic Data Consortium. | |
Terms: | area_Europe country_GB dcmi_Text iso639_eng olac_primary_text |
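
The GetRecord entries above refer to OAI-PMH requests for this record in two metadata formats. As a rough sketch of how such a request URL is formed, the Python below builds one from the OaiIdentifier listed in this record. The base URL is a placeholder, since the record does not state the repository endpoint. The `oai_dc` prefix is the one defined by OAI-PMH for simple Dublin Core; `olac` is assumed here to be the prefix for the OLAC format. The helper name `get_record_url` is hypothetical.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption): this record does not give the
# repository's actual OAI-PMH base URL.
BASE_URL = "https://example.org/oai"


def get_record_url(identifier: str, metadata_prefix: str,
                   base_url: str = BASE_URL) -> str:
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode({
        "verb": "GetRecord",            # OAI-PMH verb for fetching a single record
        "identifier": identifier,        # the record's OAI identifier
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# OaiIdentifier taken from the OAI Info section of this record.
OAI_ID = "oai:www.ldc.upenn.edu:LDC2015T10"

olac_url = get_record_url(OAI_ID, "olac")    # OLAC format (prefix assumed)
dc_url = get_record_url(OAI_ID, "oai_dc")    # simple DC format

print(olac_url)
print(dc_url)
```

Replacing `BASE_URL` with the repository's real endpoint would be required before either URL could be fetched.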