2019 Exascale Computing Project Annual Meeting Tutorial: Better Scientific Software
Dataset posted on 12.01.2019 by David E. Bernholdt, Anshu Dubey, Jared O'Neal
The computational science and engineering (CSE) community is in the midst of an extremely challenging period created by the confluence of disruptive changes in computing architectures, demand for greater scientific reproducibility, and new opportunities for greatly improved simulation capabilities, especially through coupling physics and scales. Computer architecture changes require new software design and implementation strategies, including significant refactoring of existing code. Reproducibility demands require more rigor across the entire software endeavor. Code coupling requires aggregate team interactions including integration of software processes and practices. These challenges demand large investments in scientific software development and improved practices. Focusing on improved developer productivity and software sustainability is both urgent and essential.
This half-day tutorial distills multi-project, multi-year experience from members of the IDEAS project and creators of the BSSw.io community website. It provides information and hands-on experience with software practices, processes, and tools explicitly tailored for CSE. The goals are to improve the productivity of those who develop CSE software and to increase the sustainability of software artifacts. We discuss practices that are relevant for projects of all sizes, with emphasis on small teams and on aggregate teams composed of small teams. Topics include reasons and motivation for caring about software productivity in science teams; effective models, tools, and processes for small teams (including agile workflow management); reproducibility; scientific software testing and verification (including automated testing and continuous integration); and refactoring.
* Why effective software practices are essential for CSE projects [30 min]
  - Terminology
  - Understanding what you want from your CSE software and how to achieve it
* Effective models, tools, processes and practices for CSE software teams [30 min]
  - Small team models, challenges
  - Aggregate team models, challenges
  - Agile workflow management
  - Hands-on: Issue tracking via Kanban in GitHub
* Reproducibility [30 min]
  - Increasing demand for reproducibility
  - Role of better software practices to support reproducibility
  - Preparing for next-generation publication requirements
* Software testing and evolution [45 min]
  - Introduction and motivation
  - How to develop tests and test-suites
  - Planning and verification for refactoring
* Software maintenance [45 min]
  - Workflow management with git
  - Using Travis CI
This tutorial offers a mix of content, most of it at the beginner and intermediate levels.