OLAC Record oai:www.ldc.upenn.edu:LDC2019T14 |
Metadata | ||
Title: | Machine Reading Phase 1 NFL Scoring Training Data | |
Access Rights: | Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining | |
Bibliographic Citation: | Simpson, Heather, et al. Machine Reading Phase 1 NFL Scoring Training Data LDC2019T14. Web Download. Philadelphia: Linguistic Data Consortium, 2019 | |
Contributor: | Simpson, Heather | |
Strassel, Stephanie | ||
Wright, Jonathan | ||
Griffitt, Kira | ||
Date (W3CDTF): | 2019 | |
Date Issued (W3CDTF): | 2019-09-16 | |
Description: | *Introduction* Machine Reading Phase 1 NFL Scoring Training Data was developed by the Linguistic Data Consortium (LDC) and contains 110 US NFL (National Football League) scoring source documents and 110 standoff annotation files used in the DARPA (Defense Advanced Research Projects Agency) Machine Reading program. The Machine Reading program aimed to develop automated reading systems to bridge the gap between knowledge contained in natural language texts and knowledge accessible to formal reasoning systems. The reading systems designed by program participants were required to extract and reason about facts from text in multiple domains. The data in this release constitutes the training data for the NFL Scoring Use Cases evaluation. The NFL Scoring Use Cases tested the sports domain by extracting information about scoring events and outcomes of US NFL games and by aligning that information with an NFL Scoring ontology. *Data* This release contains 110 source documents (70,233 words) from English newswire stories. The files were manually annotated for instances of NFL Scoring annotation categories defined with respect to the NFL Scoring ontology. Annotations are in GUI XML (traditional annotation) and RDF XML (formal knowledge representation) formats. All source and annotation files are presented as UTF-8 encoded XML files with associated DTDs. *Acknowledgments* The Linguistic Data Consortium gratefully acknowledges the support of the Defense Advanced Research Projects Agency (DARPA) Machine Reading Program under Air Force Research Laboratory (AFRL) prime contract no. FA8750-09 C-xxxx. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of DARPA, AFRL, or the US government. *Samples* Please view the following samples: * Source Sample * GUI XML Sample * RDF XML Sample *Updates* None at this time. | |
Extent: | Corpus size: 9190 KB | |
Identifier: | LDC2019T14 | |
https://catalog.ldc.upenn.edu/LDC2019T14 | ||
ISBN: 1-58563-900-1 | ||
ISLRN: 960-797-702-613-4 | ||
DOI: 10.35111/8pye-2w87 | ||
Language: | English | |
Language (ISO639): | eng | |
License: | LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf | |
Medium: | Distribution: Web Download | |
Publisher: | Linguistic Data Consortium | |
Publisher (URI): | https://www.ldc.upenn.edu | |
Relation (URI): | https://catalog.ldc.upenn.edu/docs/LDC2019T14 | |
Rights Holder: | Portions © 1995-1996, 2002-2005 Agence France Presse, © 1998, 2000-2001 The Associated Press, © 1994, 1996, 1998, 2005 New York Times, © 2003, 2005, 2007, 2009, 2011, 2019 Trustees of the University of Pennsylvania | |
Type (DCMI): | Text | |
Type (OLAC): | primary_text | |
OLAC Info |
Archive: | The LDC Corpus Catalog | |
Description: | http://www.language-archives.org/archive/www.ldc.upenn.edu | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info |
OaiIdentifier: | oai:www.ldc.upenn.edu:LDC2019T14 | |
DateStamp: | 2020-11-30 | |
GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
Citation: | Simpson, Heather; Strassel, Stephanie; Wright, Jonathan; Griffitt, Kira. 2019. Machine Reading Phase 1 NFL Scoring Training Data. Linguistic Data Consortium. |
Terms: | area_Europe country_GB dcmi_Text iso639_eng olac_primary_text |