Data Sharing Effect on Article Citation Rate in Paleoceanography
Datasets usually provide raw data for analysis. This raw data often comes in spreadsheet form, but can be any collection of data, on which analysis can be performed.
The validation of scientific results requires reproducible methods and data. Often, however, data sets supporting research articles are not openly accessible and interlinked. This analysis tests whether open sharing and linking of supporting data through the PANGAEA® data library measurably increases the citation rate of articles published between 1993 and 2010 in the journal Paleoceanography as reported in the Thomson Reuters Web of Science database. The 12.85% (171) of articles with publicly available supporting data sets received 19.94% (8,056) of the aggregate citations (40,409). Publicly available data were thus significantly (p=0.007, 95% confidence interval) associated with about 35% more citations per article than the average of all articles sampled over the 18-year study period (1,331), and the increase is fairly consistent over time (14 of 18 years). This relationship between openly available, curated data and increased citation rate may incentivize researchers to share their data.