This dataset contains the provenance information (in N-Triples format) for all the citation data included in the OpenCitations Index, released on 29 November 2023. In particular, each citation in the dataset comes with the following provenance information:
[citation IRI] the Open Citation Identifier (OCI) for the citation, defined in the final part of the URL identifying the citation (https://w3id.org/oc/index/ci/[OCI]);
[property "prov:wasAttributedTo"] the IRI of the agent that created the citation data;
[property "prov:hadPrimarySource"] the IRI of the source dataset from which the citation data were extracted;
[property "prov:generatedAtTime"] the creation time of the citation data;
[property "prov:invalidatedAtTime"] the time at which the citation data were invalidated, i.e. the start of the destruction, cessation, or expiry of an existing entity by an activity;
[property "oco:hasUpdateQuery"] the SPARQL UPDATE query that keeps track of which metadata have been modified.
The size of the zipped archive is 79 GB, while the size of the unzipped N-Triples files is 2.5 TB.
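Given the size of the unzipped files, the triples are best read one line at a time rather than loaded into memory. The following Python sketch shows one possible way to do this for the properties listed above. It is only an illustration under stated assumptions: the regular expression covers only simple N-Triples lines (IRI subject, IRI predicate, IRI or literal object), the function names are hypothetical, and the sample triples (including the OCI "0123-0456" and the agent IRI) are invented for the example and are not taken from the dataset. It also assumes that the OCI can be recovered as the path segment that follows https://w3id.org/oc/index/ci/ in the subject IRI.

```python
import re

# Prefix of citation IRIs, as given in the dataset description.
CITATION_PREFIX = "https://w3id.org/oc/index/ci/"

# Simplified N-Triples pattern (an assumption, not a full parser):
# <subject> <predicate> object .
TRIPLE_RE = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+(.+?)\s*\.\s*$')


def oci_from_iri(iri):
    """Return the path segment after the citation prefix, or None."""
    if not iri.startswith(CITATION_PREFIX):
        return None
    return iri[len(CITATION_PREFIX):].split("/", 1)[0]


def local_name(iri):
    """Return the part of an IRI after the last '#' or '/'."""
    return re.split(r"[#/]", iri)[-1]


def iter_provenance(lines):
    """Lazily yield (oci, property_name, value) for each matching line."""
    for line in lines:
        match = TRIPLE_RE.match(line)
        if match is None:
            continue  # blank line, comment, or unsupported syntax
        subject, predicate, obj = match.groups()
        oci = oci_from_iri(subject)
        if oci is None:
            continue
        # Strip the angle brackets from IRI objects; keep literals as written.
        if obj.startswith("<") and obj.endswith(">"):
            obj = obj[1:-1]
        yield oci, local_name(predicate), obj


# Invented sample triples, for illustration only.
SAMPLE = [
    '<https://w3id.org/oc/index/ci/0123-0456> '
    '<http://www.w3.org/ns/prov#wasAttributedTo> '
    '<https://example.org/agent> .',
    '<https://w3id.org/oc/index/ci/0123-0456> '
    '<http://www.w3.org/ns/prov#generatedAtTime> '
    '"2023-11-29T00:00:00+00:00"'
    '^^<http://www.w3.org/2001/XMLSchema#dateTime> .',
]

for oci, prop, value in iter_provenance(SAMPLE):
    print(oci, prop, value)
```

Because iter_provenance is a generator, it can be passed an open file object (for example, one of the unzipped N-Triples files) and will process it line by line. For anything beyond a quick inspection, a full N-Triples parser is likely a safer choice than the simplified pattern used here.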