SeedMe2: Data sharing building blocks
Journal contribution posted on 27.01.2017, 17:26 by Amit Chourasia, David Nadeau, John Moreland, Dmitry Mishin, Michael Norman
The need for data sharing and rapid data access has become central with the rise of collaborative research in many disciplines. Several data sharing approaches have emerged for consumer use cases that primarily need an easy way to share files using web browsers. However, these approaches are not well suited to the particular demands of large-scale data sharing for computational research. Whereas consumer approaches primarily support manual user interfaces to add and remove files, the huge number of files that can be generated during and after a large-scale computation job makes manual data sharing interfaces impractical. Instead, these tasks require mechanisms that integrate into computation workflows to automatically post files during and after computation jobs. Furthermore, scientific data sharing requires additional metadata and descriptive information that characterizes shared data to record job and compute platform characteristics, input data, job parameters, job completion status, and other record-keeping required to document the trajectory of computational research. Without these features, consumer data sharing approaches are not well suited for computational science.