%0 Journal Article %A Cooperman, Gene %D 2018 %T NSCI: SI2-SSE: An Extensible Model to Support Scalable Checkpoint-Restart for DMTCP across Multiple Disciplines %U https://figshare.com/articles/journal_contribution/NSCI_SI2-SSE_An_Extensible_Model_to_Support_Scalable_Checkpoint-Restart_for_DMTCP_across_Multiple_Disciplines/6176465 %R 10.6084/m9.figshare.6176465.v1 %2 https://ndownloader.figshare.com/files/11183738 %K NSF-SI2-2018-Talk %K checkpoint-restart %K DMTCP %K Computer Software not elsewhere classified %X DMTCP (Distributed MultiThreaded CheckPointing) is a widely used package for transparent checkpoint-restart. Checkpoint-restart saves the state of a running process to disk and later restarts the process (possibly on a different computer) from where it left off. DMTCP has grown from a monolithic package into a highly adaptable one that supports HPC (e.g., MPI), GPUs, and high-performance networks, as well as applications in areas such as cyber-security, EDA, science, and engineering. %I figshare