Dataset of PLOS Computational Biology code sharing rates, 2019-2022

posted on 27.05.2022, 10:55 by Lauren Cadwallader, Tim Vines, Linda Andersson

This dataset contains article metadata and information about code generation and sharing rates for research articles published in PLOS Computational Biology from 1 January 2019 to 31 March 2022.

The descriptive metadata, e.g. article title, publication date, author countries, is taken from the article .xml files. Additional information about code generation and sharing rates was derived using Natural Language Processing. The code used for this is available at 

The Excel file contains 4 worksheets:


2) Raw data

3) Calculations of the code sharing rates that are used in the Editorial

4) Data used to create Figure 1 in the Editorial.
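
As a rough illustration of how a code sharing rate could be computed from per-article records such as those in the raw data, the sketch below counts, for each year, the share of code-generating articles that also shared their code. The field names (`year`, `code_generated`, `code_shared`), the example rows, and the definition of the rate itself are assumptions made for illustration; the actual column names and the calculation used in the Editorial may differ.

```python
from collections import defaultdict


def sharing_rate_by_year(records):
    """Return {year: shared / generated}, counting only articles that generated code.

    The keys "year", "code_generated" and "code_shared" are hypothetical
    field names, not the dataset's actual column headers.
    """
    generated = defaultdict(int)
    shared = defaultdict(int)
    for row in records:
        if not row["code_generated"]:
            continue  # articles without code do not enter the denominator
        generated[row["year"]] += 1
        if row["code_shared"]:
            shared[row["year"]] += 1
    return {year: shared[year] / generated[year] for year in sorted(generated)}


# Made-up rows for illustration only; not taken from the dataset.
example = [
    {"year": 2019, "code_generated": True, "code_shared": True},
    {"year": 2019, "code_generated": True, "code_shared": False},
    {"year": 2019, "code_generated": False, "code_shared": False},
    {"year": 2020, "code_generated": True, "code_shared": True},
]
rates = sharing_rate_by_year(example)
```

Restricting the denominator to articles that generated code keeps articles with no code from lowering the rate.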

The Editorial that this dataset supports is:

Cadwallader, L., F. Mac Gabhann, J. Papin & V. E. Pitzer (2022) Advancing code sharing in the computational biology community. PLOS Computational Biology 18(6): e1010193.


