This dataset contains the provenance information (in N-Triples format)
of all the citation data included in COCI, released on 4 July 2020. In particular, each citation in the dataset comes with the following provenance information:
[citation IRI] the Open Citation Identifier (OCI) of the citation, which is the final part of the URL identifying the citation (https://w3id.org/oc/index/coci/ci/[OCI]);
[property "prov:wasAttributedTo"] the IRI of the agent that created the citation data;
[property "prov:hadPrimarySource"] the IRI of the source dataset from which the citation data were extracted;
[property "prov:generatedAtTime"] the creation time of the citation data.
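As an illustration of how these fields could be read back from the N-Triples data, here is a minimal Python sketch. It assumes that each provenance statement has the citation IRI as its subject and that "prov:" stands for the standard PROV-O namespace (http://www.w3.org/ns/prov#). The sample OCI, agent IRI, source IRI and timestamp are invented placeholders, not values taken from the dataset, and the function names are hypothetical. The regular expression handles only simple one-line statements and is not a full N-Triples parser.

```python
import re

# Assumption: "prov:" is the standard PROV-O namespace.
PROV = "http://www.w3.org/ns/prov#"
# Prefix of the citation IRIs, as given in the description above.
CITATION_PREFIX = "https://w3id.org/oc/index/coci/ci/"

# Matches one simple statement: <subject> <predicate> object .
TRIPLE_RE = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+(.+?)\s*\.\s*$')
# Matches the lexical form of a literal, ignoring datatype or language tag.
LITERAL_RE = re.compile(r'^"((?:[^"\\]|\\.)*)"')


def parse_line(line):
    """Return (subject, predicate, value) or None if the line does not match."""
    match = TRIPLE_RE.match(line)
    if match is None:
        return None
    subject, predicate, obj = match.groups()
    if obj.startswith("<") and obj.endswith(">"):
        value = obj[1:-1]  # object is an IRI
    else:
        literal = LITERAL_RE.match(obj)
        value = literal.group(1) if literal else obj
    return subject, predicate, value


def oci_from_iri(iri):
    """Extract the OCI, i.e. the final part of a citation IRI."""
    if iri.startswith(CITATION_PREFIX):
        return iri[len(CITATION_PREFIX):]
    return None


def collect_provenance(lines):
    """Group PROV properties by OCI: {oci: {property_name: value}}."""
    records = {}
    for line in lines:
        parsed = parse_line(line.strip())
        if parsed is None:
            continue
        subject, predicate, value = parsed
        oci = oci_from_iri(subject)
        if oci is None or not predicate.startswith(PROV):
            continue
        records.setdefault(oci, {})[predicate[len(PROV):]] = value
    return records


# Invented sample statements (placeholders, not real COCI data).
SAMPLE = [
    '<https://w3id.org/oc/index/coci/ci/0200101-0200202> '
    '<http://www.w3.org/ns/prov#wasAttributedTo> <https://example.org/agent> .',
    '<https://w3id.org/oc/index/coci/ci/0200101-0200202> '
    '<http://www.w3.org/ns/prov#hadPrimarySource> <https://example.org/source> .',
    '<https://w3id.org/oc/index/coci/ci/0200101-0200202> '
    '<http://www.w3.org/ns/prov#generatedAtTime> '
    '"2020-07-04T00:00:00"^^<http://www.w3.org/2001/XMLSchema#dateTime> .',
]

records = collect_provenance(SAMPLE)
```

For anything beyond a quick inspection, a dedicated RDF library would be a safer choice than a regular expression, since it also handles escapes and blank nodes.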
The size of the zipped archive is 38.1 GB, while the size of the unzipped N-Triples file is 1.58 TB.
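Given these sizes, it may be preferable to read the data directly from the archive instead of extracting it to disk first. The following is a minimal sketch using Python's standard zipfile module. It assumes the archive is a standard ZIP container and that the N-Triples members end in ".nt"; both are assumptions about the packaging, and the function name is hypothetical.

```python
import io
import zipfile


def iter_ntriples_lines(archive, suffix=".nt"):
    """Yield non-empty, non-comment lines from every matching archive member.

    `archive` may be a file path or a binary file-like object.
    The ".nt" suffix is an assumption about how the members are named.
    """
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if not name.endswith(suffix):
                continue
            # Members are decompressed as a stream, so memory use stays small.
            with zf.open(name) as raw:
                for line in io.TextIOWrapper(raw, encoding="utf-8"):
                    line = line.strip()
                    if line and not line.startswith("#"):
                        yield line
```

Because the function is a generator, statements can be processed one at a time, for example by feeding each line to a parser and discarding it afterwards.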
Additional information about COCI can be found on the official webpage.
Funding
Wellcome Trust 'Open Biomedical Citations in Context Corpus' - Open Research Fund 2018, https://wellcome.ac.uk/funding/people-and-projects/grants-awarded/open-biomedical-citations-context-corpus