18 files

COCI N-Triples dataset of all the citation data

dataset
Posted on 29.07.2021, 21:14; authored by OpenCitations
This dataset contains all the citation data (in N-Triples format) included in COCI, released on 29 July 2021. Each citation in the dataset is defined as an individual of the class cito:Citation and includes the following information:
  • [citation IRI] the Open Citation Identifier (OCI) for the citation, defined in the final part of the URL identifying the citation (https://w3id.org/oc/index/coci/ci/[OCI]);
  • [property "cito:hasCitingEntity"] the citing entity identified by its DOI URL (http://dx.doi.org/[DOI]);
  • [property "cito:hasCitedEntity"] the cited entity identified by its DOI URL (http://dx.doi.org/[DOI]);
  • [property "cito:hasCitationCreationDate"] the creation date of the citation (i.e. the publication date of the citing entity);
  • [property "cito:hasCitationTimeSpan"] the time span of the citation (i.e. the interval between the publication date of the cited entity and the publication date of the citing entity);
  • [type "cito:JournalSelfCitation"] it records whether the citation is a journal self-citation (i.e. the citing and the cited entities are published in the same journal);
  • [type "cito:AuthorSelfCitation"] it records whether the citation is an author self-citation (i.e. the citing and the cited entities have at least one author in common).
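
To illustrate the structure listed above, the following is a minimal Python sketch that groups N-Triples lines by citation IRI and extracts the properties of each citation. It is not an official OpenCitations tool. The sample triples, the DOIs, the OCI value and the literal datatypes are invented placeholders, and the regular expression only handles simple lines (it is not a complete N-Triples parser). The only thing taken as given is the CiTO namespace (http://purl.org/spar/cito/) behind the "cito:" prefix.

```python
import re
from collections import defaultdict

CITO = "http://purl.org/spar/cito/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Simplified N-Triples line: <subject> <predicate> <iri> .  or  ... "literal"^^<datatype> .
# Assumption: literals contain no escaped quotes; a full parser would handle more cases.
TRIPLE = re.compile(
    r'^<([^>]*)>\s+<([^>]*)>\s+(?:<([^>]*)>|"([^"]*)"(?:\^\^<[^>]*>)?)\s*\.\s*$'
)


def group_citations(lines):
    """Group triples by subject IRI into one record per citation."""
    citations = defaultdict(lambda: {"types": set()})
    for line in lines:
        match = TRIPLE.match(line)
        if match is None:
            continue  # skip blank lines or anything this sketch cannot read
        subject, predicate, iri, literal = match.groups()
        value = iri if iri is not None else literal
        record = citations[subject]
        if predicate == RDF_TYPE:
            record["types"].add(value)
        elif predicate.startswith(CITO):
            # Store the value under the local property name, e.g. "hasCitingEntity"
            record[predicate[len(CITO):]] = value
    return dict(citations)


def oci_of(citation_iri):
    """Return the OCI, i.e. the final part of the citation IRI."""
    return citation_iri.rsplit("/", 1)[-1]


# Hypothetical sample data: the OCI and DOIs below are made up for illustration.
CI = "https://w3id.org/oc/index/coci/ci/0200101-0200102"
XSD = "http://www.w3.org/2001/XMLSchema#"
SAMPLE = [
    f"<{CI}> <{RDF_TYPE}> <{CITO}Citation> .",
    f"<{CI}> <{CITO}hasCitingEntity> <http://dx.doi.org/10.1234/citing> .",
    f"<{CI}> <{CITO}hasCitedEntity> <http://dx.doi.org/10.1234/cited> .",
    f'<{CI}> <{CITO}hasCitationCreationDate> "2020-05-01"^^<{XSD}date> .',
    f'<{CI}> <{CITO}hasCitationTimeSpan> "P1Y2M"^^<{XSD}duration> .',
    f"<{CI}> <{RDF_TYPE}> <{CITO}JournalSelfCitation> .",
]

records = group_citations(SAMPLE)
for iri, record in records.items():
    is_journal_self = (CITO + "JournalSelfCitation") in record["types"]
    print(oci_of(iri), record["hasCitingEntity"], record["hasCitedEntity"], is_journal_self)
```

In this sketch a self-citation is detected by checking whether the corresponding class appears among the types of the citation; how the actual dump serialises dates and durations should be checked against the files themselves.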
This version of the dataset contains:
  • 1,094,394,688 citations;
  • 65,835,422 bibliographic resources.
The size of the zipped archive is 54.9 GB, while the size of the unzipped N-Triples file is 1.18 TB.
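
Given the unzipped size, it may be preferable to stream the data directly from the zipped archive rather than extracting it first. The following is a minimal Python sketch of that approach using only the standard library. It rests on assumptions not stated on this page: that the archive members holding triples end in ".nt", that they are UTF-8 encoded, and that each citation has exactly one cito:hasCitingEntity triple (so counting those triples counts citations). The function name and the tiny in-memory archive are hypothetical and serve only as a demonstration.

```python
import io
import zipfile

CITING = "<http://purl.org/spar/cito/hasCitingEntity>"


def count_citations_in_zip(zip_source):
    """Count citations by streaming every .nt member line by line.

    zip_source may be a file path or a binary file-like object.
    Nothing is extracted to disk and only one line is held in memory at a time.
    """
    total = 0
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if not name.endswith(".nt"):  # assumption about member naming
                continue
            with archive.open(name) as raw:
                for line in io.TextIOWrapper(raw, encoding="utf-8"):
                    if CITING in line:
                        total += 1
    return total


# Demonstration on a tiny in-memory archive with invented triples.
sample_nt = (
    "<https://w3id.org/oc/index/coci/ci/0200101-0200102> "
    "<http://purl.org/spar/cito/hasCitingEntity> <http://dx.doi.org/10.1234/a> .\n"
    "<https://w3id.org/oc/index/coci/ci/0200101-0200102> "
    "<http://purl.org/spar/cito/hasCitedEntity> <http://dx.doi.org/10.1234/b> .\n"
    "<https://w3id.org/oc/index/coci/ci/0200103-0200102> "
    "<http://purl.org/spar/cito/hasCitingEntity> <http://dx.doi.org/10.1234/c> .\n"
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("sample.nt", sample_nt)
buffer.seek(0)

citation_count = count_citations_in_zip(buffer)
print(citation_count)
```

The same function accepts a path to a downloaded archive in place of the in-memory buffer; on the full dataset a single pass of this kind will still take a long time, since every line has to be decompressed and scanned.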

Additional information about COCI can be found on the official webpage.

Funding

Wellcome Trust 'Open Biomedical Citations in Context Corpus' - Open Research Fund 2018, https://wellcome.ac.uk/funding/people-and-projects/grants-awarded/open-biomedical-citations-context-corpus
