OLAC Record
oai:catalogue.elra.info:ELRA-W0086

Metadata
Title:2006 CoNLL Shared Task - Ten Languages
Access Rights: Rights available for: nonCommercialUse
Date Available (W3CDTF):2015-12-02
Date Issued (W3CDTF):2015-12-02
Date Modified (W3CDTF):2015-12-02
Description:2006 CoNLL Shared Task - Ten Languages consists of dependency treebanks in ten languages used as part of the CoNLL 2006 shared task on multi-lingual dependency parsing. The languages covered in this release are: Bulgarian, Danish, Dutch, German, Japanese, Portuguese, Slovene, Spanish, Swedish and Turkish.
The Conference on Computational Natural Language Learning (CoNLL) is accompanied every year by a shared task intended to promote natural language processing applications and evaluate them in a standard setting. In 2006, the shared task was devoted to the parsing of syntactic dependencies using corpora from up to thirteen languages. The task aimed to define and extend the then-current state of the art in dependency parsing, a technology that complemented previous tasks by producing a different kind of syntactic description of input text. More information about CoNLL and the 2006 shared task is available, respectively, at: http://ifarm.nl/signll/conll and http://ilk.uvt.nl/conll.
The source data in the treebanks in this release consists principally of various texts (e.g., textbooks, news, literature) annotated in dependency format. In general, dependency grammar is based on the idea that the verb is the center of the clause structure and that other units in the sentence are connected to the verb by directed links, or dependencies. This is a one-to-one correspondence: for every element in the sentence there is one node in the sentence structure that corresponds to that element. In constituency or phrase structure grammars, on the other hand, clauses are divided into noun phrases and verb phrases, and in each sentence one or more nodes may correspond to one element.
All of the data sets in this release are dependency treebanks. The individual data sets are:
BulTreeBank (Bulgarian)
The Danish Dependency Treebank (Danish)
The Alpino Treebank (Dutch)
The TIGER Corpus (German)
Treebank Tuba-J/S (Japanese)
Floresta Sinta(c)tica (Portuguese)
Slovene Dependency Treebank, SDT V0.1 (Slovene)
Cast3LB (Spanish)
Talbanken05 (Swedish)
METU-Sabanci Turkish Treebank (Turkish)
This corpus is distributed jointly with LDC. The LDC Catalogue reference is: https://catalog.ldc.upenn.edu/LDC2015T11.
Identifier:ELRA-W0086
ISLRN: 578-227-532-044-0
Identifier (URI):https://catalog.elra.info/en-us/repository/browse/ELRA-W0086/
Language:Bulgarian
German
Portuguese
Danish
Spanish; Castilian
Japanese
Slovenian
Dutch; Flemish
Turkish
Swedish
Language (ISO639):bul
deu
por
dan
spa
jpn
slv
nld
tur
swe
Medium:downloadable
Publisher:ELRA (European Language Resources Association)
Type (DCMI):Text
Type (OLAC):primary_text

OLAC Info

Archive:  ELRA Catalogue of Language Resources
Description:  http://www.language-archives.org/archive/catalogue.elra.info
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:catalogue.elra.info:ELRA-W0086
DateStamp:  2015-12-02
GetRecord:  OAI-PMH request for simple DC format

Search Info

Citation: n.a. 2015. ELRA (European Language Resources Association).
Terms: area_Asia area_Europe country_BG country_DE country_DK country_ES country_JP country_NL country_PT country_SE country_SI country_TR dcmi_Text iso639_bul iso639_dan iso639_deu iso639_jpn iso639_nld iso639_por iso639_slv iso639_spa iso639_swe iso639_tur olac_primary_text


http://www.language-archives.org/item.php/oai:catalogue.elra.info:ELRA-W0086
Up-to-date as of: Fri Apr 19 6:29:55 EDT 2024