OpenCitations Meta CSV dataset of all bibliographic metadata
Compared to the previous version, this release includes metadata related to citing and cited bibliographic resources added in the November 2024 version of Crossref, as well as the November 2024 dump of JaLC (Japan Link Center).
In this version, we have focused on correcting a specific type of error, namely the erroneous duplication of resources with the same identifier. We have successfully merged:
- 100% of duplicated identifiers (datacite:Identifier)
- 100% of duplicated responsible agents (foaf:Agent)
- 70% of duplicated bibliographic resources (fabio:Expression)
This dataset contains all the bibliographic metadata (in CSV format) included in OpenCitations Meta. In particular, each line of the CSV file defines a bibliographic resource, and includes the following information:
- [field "id"] the IDs for the document described within the line;
- [field "title"] the document's title;
- [field "author"] the authors of the document;
- [field "pub_date"] the date of publication;
- [field "venue"] information about the venue, i.e. the bibliographic resource to which the document belongs;
- [field "volume"] the volume sequence identifier (e.g. a number) to which the entity belongs;
- [field "issue"] the issue sequence identifier (e.g. a number) to which the entity belongs;
- [field "page"] the page range of the resource described in the row;
- [field "type"] the type of resource described in the row;
- [field "publisher"] the entity responsible for making the resource available;
- [field "editor"] the editors of the document.
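As a minimal sketch of how rows with these fields could be read, the Python snippet below parses a small in-memory sample with the standard `csv` module. The sample row and the helper name `read_rows` are hypothetical. The snippet also assumes that each file starts with a header line naming the eleven fields above, which should be checked against the actual files.

```python
import csv
import io

# Field names as listed above; their order in the real files is an assumption.
FIELDS = ["id", "title", "author", "pub_date", "venue", "volume",
          "issue", "page", "type", "publisher", "editor"]

# Hypothetical sample data, for illustration only (not taken from the dataset).
HEADER = ",".join(FIELDS) + "\n"
SAMPLE_ROW = ('example:1,"An example title, with a comma",Doe J,2020,'
              'Example Journal,1,2,10-20,journal article,Example Press,\n')
SAMPLE = HEADER + SAMPLE_ROW


def read_rows(handle):
    """Yield one dict per bibliographic resource, keyed by field name."""
    reader = csv.DictReader(handle)
    # Fail early if the header does not contain the expected columns.
    missing = set(FIELDS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing expected columns: {sorted(missing)}")
    for row in reader:
        yield row


rows = list(read_rows(io.StringIO(SAMPLE)))
print(rows[0]["title"], "|", rows[0]["type"])
```

Using `csv.DictReader` rather than splitting on commas keeps quoted values that contain commas, such as titles, intact.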
This version of the dataset contains:
- 121,302,680 bibliographic entities
- 368,061,399 authors, 2,718,222 editors, and 101,612,475 publishers (counted by their roles, without disambiguating individuals)
- 698,995 publication venues
The compressed dataset is 12G in size; once extracted, it occupies 48G on an ext4 filesystem.
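Given the extracted size, loading the whole dataset into memory at once may be impractical on many machines. The sketch below streams rows one at a time and tallies the "type" field. It assumes that the extracted data is a directory containing one or more UTF-8 encoded `.csv` files with a header line; the function name `count_types` and the example path are hypothetical.

```python
import csv
from collections import Counter
from pathlib import Path


def count_types(directory):
    """Count rows per value of the "type" field, reading one row at a time."""
    counts = Counter()
    # Assumption: the extracted dataset is a flat directory of .csv files.
    for path in sorted(Path(directory).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                counts[row.get("type") or "(no type)"] += 1
    return counts


# Hypothetical usage; replace the path with the extraction directory:
# for resource_type, n in count_types("path/to/extracted/csv").most_common():
#     print(resource_type, n)
```

Because only one row is held in memory at a time, memory use stays roughly constant regardless of the total size on disk.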
Additional information about OpenCitations Meta is available on the official webpage.
Funding
GraspOS: next Generation Research Assessment to Promote Open Science
European Commission