This dataset contains all the bibliographic metadata (in CSV format) included in OpenCitations Meta, released on 20 December 2022. In particular, each line of the CSV file defines a bibliographic resource, and includes the following information:
[field "id"] the IDs for the document described within the line;
[field "title"] the document's title;
[field "author"] the authors of the document;
[field "pub_date"] the date of publication;
[field "venue"] information about the venue, i.e. the bibliographical resource to which the document belongs;
[field "volume"] the volume sequence identifier (e.g. a number) to which the entity belongs;
[field "issue"] the issue sequence identifier (e.g. a number) to which the entity belongs;
[field "page"] the page range of the resource described in the row;
[field "type"] the type of resource described in the row;
[field "publisher"] the entity responsible for making the resource available;
[field "editor"] the editors of the document.
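As a minimal sketch of how the fields above might be read, the following Python snippet parses rows with the standard `csv` module and checks that all eleven columns are present. It assumes a comma-delimited file with a header row carrying exactly these field names. The helper `read_rows` and the sample row are hypothetical, invented for illustration, and do not reflect the actual formatting of values in the dataset.

```python
import csv
import io

# Column names as listed in the dataset description.
EXPECTED_FIELDS = [
    "id", "title", "author", "pub_date", "venue",
    "volume", "issue", "page", "type", "publisher", "editor",
]


def read_rows(handle):
    """Yield one dict per bibliographic resource, checking the header first."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_FIELDS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    for row in reader:
        yield row


# Hypothetical sample data (NOT taken from the dataset), for illustration only.
SAMPLE = io.StringIO(
    '"id","title","author","pub_date","venue","volume","issue","page","type","publisher","editor"\n'
    '"example-id","An Example Title","Doe, Jane","2020","Example Journal","1","2","10-20","journal article","Example Press",""\n'
)

rows = list(read_rows(SAMPLE))
print(rows[0]["title"])
```

For a real file, the same function could be given a handle opened with `open(path, newline="", encoding="utf-8")`; the UTF-8 encoding is likewise an assumption.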
This version of the dataset contains:
87,321,593 bibliographic entities
277,750,235 authors and 2,359,301 editors (counted by their roles, without disambiguating individuals)
710,226 publication venues
17,268 publishers
The compressed dataset is 7.62 GB; once extracted, it occupies 35.0 GB.
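Because the extracted data runs to tens of gigabytes, it may be preferable to process it row by row rather than load it into memory at once. The sketch below tallies resources by the "type" field while streaming. It assumes a comma-delimited file with a header row, and the function name `count_by_type` and the sample rows are hypothetical, not taken from the dataset.

```python
import csv
import io
from collections import Counter


def count_by_type(handle):
    """Stream rows from an open CSV handle and tally the 'type' column."""
    counts = Counter()
    for row in csv.DictReader(handle):
        counts[row["type"]] += 1
    return counts


# Hypothetical sample data (NOT taken from the dataset), reduced to two columns.
SAMPLE = io.StringIO(
    "id,type\n"
    "example-1,journal article\n"
    "example-2,book\n"
    "example-3,journal article\n"
)

counts = count_by_type(SAMPLE)
print(counts.most_common())
```

Only one row is held in memory at a time, so memory use depends on the number of distinct types rather than on the file size.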
Additional information about OpenCitations Meta is available on the official webpage.
Funding
OpenAIRE-Nexus Scholarly Communication Services for EOSC users