Archive Details

The LDC Corpus Catalog

Repository Name: The LDC Corpus Catalog
Location: 3600 Market Street, Suite 810, Philadelphia, PA 19104, USA
Short Location: Philadelphia, USA
Synopsis: The Linguistic Data Consortium (LDC) is an open consortium of universities, companies and research laboratories that supports language-related education, research and technology development by creating and sharing language resources, including data, tools and standards. LDC is hosted by the University of Pennsylvania, Philadelphia, PA, USA. The LDC Corpus Catalog contains close to 500 holdings in over 40 languages. These resources include multilingual lexicons and speech, text and video datasets. Between 30 and 36 publications are added to the catalog annually.
Access: LDC publications are generally available to both LDC members and nonmembers, with some exceptions. Licensing details can be found here. In certain cases, users must sign a corpus-specific license agreement.
Participants: Daniel Jaquette (Programmer), Denise DiPersio (Publications Manager)
Base URL
OAI Version: 2.0
OLAC Version: 1.1
Records in Archive
Last Harvested: 2023-06-03
Current As Of: 2014-06-08
Latest Datestamp: 2023-05-22
Reports: Archive Metrics and Integrity Checks
Up-to-date as of: Sun Jun 4 5:10:29 EDT 2023