OLAC Record
oai:www.ldc.upenn.edu:LDC2018T05

Metadata
Title:H2, E2, ERK1 Children's Writing
Access Rights:Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:Berkling, Kay. H2, E2, ERK1 Children's Writing LDC2018T05. Web Download. Philadelphia: Linguistic Data Consortium, 2018
Contributor:Berkling, Kay
Date (W3CDTF):2018
Date Issued (W3CDTF):2018-04-16
Description:*Introduction* H2, E2, ERK1 Children's Writing was developed by the Cooperative State University Baden-Württemberg, University of Education. It consists of approximately 2,000 texts written over four months by 173 German school children aged six through eleven. The data in this corpus was collected by elementary schools in Baden-Württemberg, Germany, and digitized at the Cooperative State University during the 2016/2017 school year. Three second, third, and fourth grade classrooms participated in the collection. Texts were written within regular class settings. The students were presented with a picture and were asked to write a story, to describe the picture, or, if unable to write a text, to list what they saw in the picture. The pictures were designed to enhance the output with respect to important spelling error categories, namely, the marking of short vowels with a silent consonant letter and the correct spelling of the long vowel . The children were allowed at least 15 minutes to write the texts. This exercise was repeated weekly for nine or sixteen weeks, depending on the program. LDC has also released H1 Children's Writing (LDC2016T01). *Data* There were 173 participants in total. Of these, 100 students were multilingual, and further metadata is available for 166 of the 173 children. The following is included for each text in the database: school week of collection; school type; age; gender; grade/classroom; language spoken at home; and school materials used. In all, 2,117 texts representing 118,621 tokens were collected. The texts were digitized in two forms: (1) the original (achieved) text, including all errors, and (2) the intended (target) text, from which all spelling errors were removed. Annotations were added to both the achieved text and the target text to mark words that should not be analyzed for spelling errors, such as names or foreign words. 
For sentence-level analysis, syntax errors were annotated by marking substitutions, deletions, and insertions at the word level. In such cases, the word actually used was analyzed for spelling, and the correct word was used for sentence structure analysis. The original handwriting is provided as PDF documents and the converted text as UTF-8 plain text in CSV files. *Samples* Please view this image sample and transcript sample. *Updates* None at this time.
Extent:Corpus size: 1725896 KB
Identifier:LDC2018T05
https://catalog.ldc.upenn.edu/LDC2018T05
ISBN: 1-58563-842-0
ISLRN: 553-412-087-213-4
DOI: 10.35111/mtbt-9b85
Language:German
Language (ISO639):deu
License:H2, E2, ERK1 Children’s Writing Agreement: https://catalog.ldc.upenn.edu/license/h2-e2-erk1-childrens-writing-agreement.pdf
Medium:Distribution: Web Download
Provenance:Collected by Cooperative State University of Karlsruhe in Baden-Württemberg, Germany.
Publisher:Linguistic Data Consortium
Publisher (URI):https://www.ldc.upenn.edu
Relation (URI):https://catalog.ldc.upenn.edu/docs/LDC2018T05
Rights Holder:Portions © 2018 Kay Berkling, Trustees of the University of Pennsylvania
Type (DCMI):StillImage
Text
Type (OLAC):primary_text

OLAC Info

Archive:  The LDC Corpus Catalog
Description:  http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.ldc.upenn.edu:LDC2018T05
DateStamp:  2020-11-30
GetRecord:  OAI-PMH request for simple DC format
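
The GetRecord entries in this record refer to OAI-PMH requests for the identifier listed above. As a rough illustration only, the following Python sketch shows how such a request URL could be assembled. The query parameters (verb, identifier, metadataPrefix) are defined by the OAI-PMH protocol, and oai_dc is its standard prefix for simple Dublin Core. The base URL is a placeholder, since this record does not state the endpoint, and the "olac" prefix for the OLAC format as well as the helper name get_record_url are assumptions made for the example.

```python
from urllib.parse import urlencode

# Assumption: placeholder endpoint. The real OAI-PMH base URL of the
# archive is not given in this record and must be substituted.
BASE_URL = "https://example.org/oai"


def get_record_url(identifier, metadata_prefix="oai_dc", base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL (hypothetical helper).

    metadata_prefix: "oai_dc" requests simple Dublin Core; "olac" is
    assumed here to be the prefix for the OLAC format.
    """
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# Simple DC and (assumed) OLAC-format requests for this record.
dc_url = get_record_url("oai:www.ldc.upenn.edu:LDC2018T05")
olac_url = get_record_url("oai:www.ldc.upenn.edu:LDC2018T05", "olac")
```

The returned string can be passed to any HTTP client; the response is an XML document whose layout depends on the requested metadata prefix.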

Search Info

Citation: Berkling, Kay. 2018. Linguistic Data Consortium.
Terms: area_Europe country_DE dcmi_StillImage dcmi_Text iso639_deu olac_primary_text


http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC2018T05
Up-to-date as of: Fri Dec 6 7:48:40 EST 2024