CIF21 DIBBs: Middleware and High Performance Analytics Libraries for Scalable Data Science Poster
2020-02-17T14:47:46Z (GMT) by
PI: G. Fox, Co-PIs: Madhav Marathe, Shantenu Jha, Judy Qiu, Fusheng Wang
Institutions: Arizona State, Indiana (lead), Kansas, Rutgers, Stony Brook, Virginia, and Utah.
NSF 1443054 “Middleware and High-Performance Analytics Libraries for Scalable Data Science” is a collaboration among seven universities: Arizona State, Indiana (lead), Kansas, Rutgers, Stony Brook, Virginia, and Utah. It addresses the intersection of HPC and Big Data computing, with several application areas and communities driving the requirements for software systems and algorithms. The base architecture comprises HPC-ABDS (the High-Performance Computing Enhanced Apache Big Data Stack) and application use cases that identify the key features determining software and algorithm requirements. The middleware includes the Harp-DAAL collective communication layer, the Twister2 Big Data toolkit, and RADICAL pilot jobs for batch and streaming applications. SPIDAL (the Scalable Parallel Interoperable Data Analytics Library) spans core machine learning, image processing, and the application communities: Network Science, Polar Science, Biomolecular Simulations, Pathology, and Spatial Systems. Recent work focuses on the integration of ML with HPC, through HPCafterML in biomolecular simulations and a broad study of HPCforML.