OLAC Record
oai:www.ldc.upenn.edu:LDC2017T15

Metadata
Title:English Web Treebank Propbank
Access Rights:Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:O'Gorman, Tim, Katherine Conger, and Martha Palmer. English Web Treebank Propbank LDC2017T15. Web Download. Philadelphia: Linguistic Data Consortium, 2017
Contributor:O'Gorman, Tim
Conger, Katherine
Palmer, Martha
Date (W3CDTF):2017
Date Issued (W3CDTF):2017-10-18
Description:*Introduction* English Web Treebank Propbank was developed by the University of Colorado Boulder - CLEAR (Computational Language and Education Research) and provides predicate-argument structure annotation for English Web Treebank (LDC2012T13). The goal of Propbank (or proposition bank) annotation is to develop annotations with information about basic semantic propositions. English Web Treebank Propbank provides semantic role annotation and predicate sense disambiguation for roughly 50,000 predicates, corresponding to all verbs, all adjectives in equational clauses and all nouns considered to be predicative. Mark-up is in the "unified" Propbank annotation format, which combines the representations for nouns, verbs and adjectives.
*Data* The source data consists of weblogs, newsgroups, email, reviews and question-answers. Human annotators followed the guidelines included with this release. Annotated propositions were automatically validated to ensure that (1) pointers to the tree nodes were valid, (2) Propbank labels were valid, and (3) Propbank annotation was consistent with the associated frameset. Additionally, XML frame files were validated against the included DTD and were checked for frame-internal consistency (e.g., misspellings, extraneous characters, general correctness). Data is presented in UTF-8 XML files.
*Samples* Please view the following samples: Source, Prop, Frame.
*Updates* None at this time.
Extent:Corpus size: 93696 KB
Identifier:LDC2017T15
https://catalog.ldc.upenn.edu/LDC2017T15
ISBN: 1-58563-818-8
ISLRN: 385-163-116-259-0
DOI: 10.35111/gp5k-be63
Language:English
Language (ISO639):eng
License:LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf
Medium:Distribution: Web Download
Publisher:Linguistic Data Consortium
Publisher (URI):https://www.ldc.upenn.edu
Relation (URI):https://catalog.ldc.upenn.edu/docs/LDC2017T15
Rights Holder:Portions © 2012 Google Inc., © 2011 Yahoo! Inc., © 2017 Colorado University - CLEAR, © 2012, 2017 Trustees of the University of Pennsylvania
Type (DCMI):Text
Type (OLAC):primary_text

OLAC Info

Archive:  The LDC Corpus Catalog
Description:  http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.ldc.upenn.edu:LDC2017T15
DateStamp:  2020-11-30
GetRecord:  OAI-PMH request for simple DC format

Search Info

Citation: O'Gorman, Tim; Conger, Katherine; Palmer, Martha. 2017. English Web Treebank Propbank. Linguistic Data Consortium.
Terms: area_Europe country_GB dcmi_Text iso639_eng olac_primary_text


http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC2017T15
Up-to-date as of: Mon Mar 25 7:20:56 EDT 2024