Varieties of Reproducibility in Empirical and Computational Domains
Journal contribution posted on 14.07.2018, 10:33 by Olivia Guest
In this talk I will discuss the different types of replicability/reproducibility, focussing heavily on the two that involve software/computational models. I recall talking to people at The Software Sustainability Institute when I applied and became a fellow, and I realized that software people (coders, users) who aren't modelers predominantly care about code being time- and machine-independent. However, scientists who are modelers also care about the specification, i.e., the theory, being captured by newly created software written based on a journal or conference paper. In other words, modelers create models to capture theoretical frameworks in such a way that their ideas can be executed and the repercussions of the theoretical assumptions can be computationally tested. Machine learning is a field that cares deeply about both software and modeling/science. My PhD brought to light a lot of the issues that arise when both types of replication cannot be carried out, underlining that even if the code is available, all models should (strive to) be re-implementable given their specification/theory/article. I will elaborate on how code can be disseminated and preserved, and also on how, in many cases, even code from a few months ago might not work out-of-the-box. However useful it is to be able to re-run code from the past, it is often secondary to doing good science, because checking that the spec is generally correct (i.e., that the theory is actually computationally captured) is more important to science. Implementation-only details, for example, might need to be upgraded to theory-level if they turn out to be imperative to modeling a certain effect. And vice versa, theory-level assumptions could be relaxed if it is found that other important aspects of the theory are nonetheless captured with a variety of implementations.