OLAC Record
oai:lindat.mff.cuni.cz:11234/1-5024

Metadata
Title:  A Human-Annotated Dataset for Language Modeling and Named Entity Recognition in Medieval Documents (2023-01-05)
Bibliographic Citation:  http://hdl.handle.net/11234/1-5024
Creator:  Novotný, Vít; Luger, Kristýna; Štefánik, Michal; Vrabcová, Tereza; Horák, Aleš
Date (W3CDTF):  2023-01-23T20:43:53Z
Date Available:  2023-01-23T20:43:53Z
Description:  This is an open dataset of sentences from 19th and 20th century letterpress reprints of documents from the Hussite era. The dataset contains a corpus for language modeling and human annotations for named entity recognition (NER).
Identifier (URI):  http://hdl.handle.net/11234/1-5024
Language:  Czech; English; German; Latin
Language (ISO639):  ces; eng; deu; lat
Publisher:  Masaryk University, Brno
Replaces (URI):  http://hdl.handle.net/11234/1-4936
Rights:  Public Domain Dedication (CC Zero); http://creativecommons.org/publicdomain/zero/1.0/
Subject:  NER; named entity recognition; Medieval
Type:  corpus
Type (DCMI):  Text
Type (OLAC):  primary_text

OLAC Info

Archive:  LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University
Description:  http://www.language-archives.org/archive/lindat.mff.cuni.cz
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:lindat.mff.cuni.cz:11234/1-5024
DateStamp:  2023-01-23
GetRecord:  OAI-PMH request for simple DC format
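
The GetRecord entries above refer to OAI-PMH requests for this record. As a rough sketch of how such a request URL is typically assembled, the Python snippet below combines the protocol's standard `verb`, `identifier`, and `metadataPrefix` arguments. The base URL is a hypothetical placeholder, not the archive's actual endpoint, and the function name `get_record_url` is invented for illustration. The prefixes `oai_dc` (simple DC) and `olac` are assumed to be the ones the repository exposes.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: substitute the archive's real OAI-PMH base URL.
BASE_URL = "https://example.org/oai/request"


def get_record_url(identifier: str,
                   metadata_prefix: str = "oai_dc",
                   base_url: str = BASE_URL) -> str:
    """Build an OAI-PMH GetRecord request URL (illustrative helper)."""
    query = urlencode({
        "verb": "GetRecord",                # standard OAI-PMH verb
        "identifier": identifier,           # the record's OAI identifier
        "metadataPrefix": metadata_prefix,  # e.g. "oai_dc" or "olac"
    })
    return f"{base_url}?{query}"


# Request URLs for the record described on this page.
dc_url = get_record_url("oai:lindat.mff.cuni.cz:11234/1-5024")
olac_url = get_record_url("oai:lindat.mff.cuni.cz:11234/1-5024", "olac")
print(dc_url)
print(olac_url)
```

`urlencode` percent-encodes the colons and slash in the identifier, which keeps the query string valid for the request.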

Search Info

Citation: Novotný, Vít; Luger, Kristýna; Štefánik, Michal; Vrabcová, Tereza; Horák, Aleš. 2023. Masaryk University, Brno.
Terms: area_Europe country_CZ country_DE country_GB country_VA dcmi_Text iso639_ces iso639_deu iso639_eng iso639_lat olac_primary_text


http://www.language-archives.org/item.php/oai:lindat.mff.cuni.cz:11234/1-5024
Up-to-date as of: Thu Oct 5 0:43:33 EDT 2023