OLAC Record
oai:lindat.mff.cuni.cz:11234/1-3257

Metadata
Title:OAGL Paper Metadata Dataset
Bibliographic Citation:http://hdl.handle.net/11234/1-3257
Creator:Çano, Erion
Date (W3CDTF):2020-07-02T12:35:47Z
Date Available:2020-07-02T12:35:47Z
Description:OAGL is a paper metadata dataset consisting of 17528680 records that comprise various scientific publication attributes such as abstracts, titles, keywords, publication years, venues, etc. The last field of each record is the page length of the corresponding publication. Dataset records (samples) are stored as JSON lines in each text file. The data is derived from the OAG data collection (https://aminer.org/open-academic-graph), which was released under the ODC-BY license. This data (OAGL Paper Metadata Dataset) is released under the CC-BY license (https://creativecommons.org/licenses/by/4.0/). If using it, please cite the following paper: Çano Erion, Bojar Ondřej: How Many Pages? Paper Length Prediction from the Metadata. NLPIR 2020, Proceedings of the 4th International Conference on Natural Language Processing and Information Retrieval, Seoul, Korea, December 2020.
Identifier (URI):http://hdl.handle.net/11234/1-3257
Language:English
Language (ISO639):eng
Publisher:Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Rights:Creative Commons - Attribution 4.0 International (CC BY 4.0)
http://creativecommons.org/licenses/by/4.0/
Subject:Paper Length Prediction
Scientific Papers Corpus
Scientific Publication Metadata
Type:corpus
Type (DCMI):Text
Type (OLAC):primary_text
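
The description states that records are stored as JSON lines and that the last field of each record is the page length. A minimal Python sketch of reading such lines follows. The field names ("title", "year", "pages") and the sample values are hypothetical placeholders, since this record does not list the actual keys; the only property relied on is that the page length is the final field of each line.

```python
import json
from typing import Any, Dict, Iterable, Iterator, Tuple


def iter_records(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield one dict per non-empty JSON line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def last_field(record: Dict[str, Any]) -> Tuple[str, Any]:
    """Return (key, value) of the final field of a record.

    json.loads keeps keys in the order they appear in the line,
    so the last key is the last field as written in the file.
    """
    key = list(record)[-1]
    return key, record[key]


# Hypothetical sample lines; real records have other keys and values.
sample_lines = [
    '{"title": "A hypothetical paper", "year": 2019, "pages": 8}',
    "",
    '{"title": "Another hypothetical paper", "year": 2020, "pages": 12}',
]

records = list(iter_records(sample_lines))
page_lengths = [last_field(r)[1] for r in records]
print(page_lengths)
```

To read an actual dataset file, the same function can be given an open file object, for example `iter_records(open(path, encoding="utf-8"))`, where `path` points to one of the downloaded text files.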

OLAC Info

Archive:  LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University
Description:  http://www.language-archives.org/archive/lindat.mff.cuni.cz
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:lindat.mff.cuni.cz:11234/1-3257
DateStamp:  2021-06-29
GetRecord:  OAI-PMH request for simple DC format
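
The GetRecord entries above refer to OAI-PMH requests. As a rough sketch, such a request is an HTTP GET with the arguments `verb`, `identifier` and `metadataPrefix`. The base URL below is a hypothetical placeholder (this record does not state the repository endpoint), and the prefixes `oai_dc` for simple DC and `olac` for the OLAC format are assumed to be the ones the repository supports.

```python
from urllib.parse import urlencode


def get_record_url(base_url: str, identifier: str, metadata_prefix: str) -> str:
    """Build an OAI-PMH GetRecord request URL."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


# Hypothetical endpoint; replace with the repository's real OAI-PMH base URL.
BASE_URL = "https://example.org/oai/request"
OAI_ID = "oai:lindat.mff.cuni.cz:11234/1-3257"

dc_url = get_record_url(BASE_URL, OAI_ID, "oai_dc")  # simple DC format
olac_url = get_record_url(BASE_URL, OAI_ID, "olac")  # OLAC format (assumed prefix)
print(dc_url)
print(olac_url)
```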

Search Info

Citation: Çano, Erion. 2020. Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL).
Terms: area_Europe country_GB dcmi_Text iso639_eng olac_primary_text


http://www.language-archives.org/item.php/oai:lindat.mff.cuni.cz:11234/1-3257
Up-to-date as of: Thu Oct 5 0:41:06 EDT 2023