Artifact Execution Curation for Repeatability in Artifact Evaluation

When faced with a crisis in reproducibility, scientists, and computer scientists in particular, responded with increased scrutiny of published results and the software behind them. One of those responses is Artifact Evaluation (AE, artifact-eval.org), which is gaining momentum within ACM and IEEE conferences. In fact, preliminary studies show that papers that go through AE receive more citations on average [1]. To support efforts such as AE, scientists need to develop tools that improve the repeatability and reproducibility of computational artifacts.

In this poster we present Occam, an Experiment and Artifact System for Evaluation (EASE). Occam preserves both software artifacts and their execution, and is a fundamental step towards repeatable and reproducible science. Occam packages software with metadata that describes how to build and run it, and preserves all of its dependencies by archiving them within the system. These software artifacts can then be deployed on demand in reproducible environments running in virtual machines or containers (e.g. Docker). In addition, Occam allows the definition of workflows: graphs that describe the execution of multiple artifacts as a sequence of operations. These workflows can then be executed to generate outputs that are also packaged and kept in the system.

All Occam-packaged artifacts and results keep their lineage and provenance information. As such, it is possible to inspect both current and past versions of packages, and to trace results all the way back to the experiments and artifacts that created them. This allows evaluators to inspect not only the results themselves but also the software and datasets that generated them. Moreover, it is possible to clone, modify, and re-execute workflows (e.g. with different parameters or input datasets). Thus, reviewers and future users of packaged artifacts can independently validate past results and reuse other scientists' work.

[1] Bruce R. Childers and Panos K. Chrysanthis. "Artifact Evaluation: Is it a Real Incentive?" Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE5.2). 2017.
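
As an illustration of the workflow and provenance ideas described in the abstract, the following is a minimal Python sketch, not Occam's actual implementation or API. It assumes a workflow can be modeled as a dependency graph of named steps, that each output gets a provenance record whose identifier is a hash of the step name, its parameters, and the identifiers of its inputs, and that lineage is recovered by walking those records backwards. The names `run_workflow`, `lineage`, `steps`, and `dependencies` are hypothetical and chosen only for this example.

```python
import hashlib
import json
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(steps, dependencies, params):
    """Run steps in dependency order and record provenance for each output.

    Hypothetical model: `steps` maps a step name to a callable taking
    (inputs, params); `dependencies` maps a step name to the steps it needs.
    """
    results, provenance = {}, {}
    for name in TopologicalSorter(dependencies).static_order():
        needed = sorted(dependencies.get(name, ()))
        inputs = {dep: results[dep] for dep in needed}
        results[name] = steps[name](inputs, params)
        # The record links this output to the exact inputs that produced it.
        record = {
            "step": name,
            "params": params,
            "inputs": {dep: provenance[dep]["id"] for dep in needed},
        }
        digest = hashlib.sha256(json.dumps(record, sort_keys=True).encode())
        record["id"] = digest.hexdigest()
        provenance[name] = record
    return results, provenance


def lineage(provenance, name):
    """Return the step `name` followed by every step it depends on."""
    seen, stack = [], [name]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(provenance[current]["inputs"])
    return seen


# A toy three-step workflow: generate data, transform it, summarize it.
steps = {
    "generate": lambda inputs, params: list(range(params["n"])),
    "square": lambda inputs, params: [x * x for x in inputs["generate"]],
    "total": lambda inputs, params: sum(inputs["square"]),
}
dependencies = {"generate": [], "square": ["generate"], "total": ["square"]}

results, provenance = run_workflow(steps, dependencies, {"n": 4})
print(results["total"])                # 14
print(lineage(provenance, "total"))    # ['total', 'square', 'generate']
```

Re-running the same workflow with different parameters, for example `run_workflow(steps, dependencies, {"n": 5})`, yields new results with different provenance identifiers, which loosely mirrors the clone, modify, and re-execute scenario described above.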