Fast Dynamic Load Balancing Tools for Extreme Scale Systems

High-performance simulations running on distributed-memory parallel systems require even work distributions with minimal communication. To maintain these distributions efficiently on systems with accelerators, the balancing and partitioning procedures must themselves utilize the accelerator. This work presents algorithms and speedup results using OpenCL and Kokkos to accelerate critical portions of the EnGPar hypergraph-based diffusive load balancer. Focus is given to basic hypergraph traversal and selection procedures.
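
To make "hypergraph traversal" concrete, the following is a minimal serial sketch of a breadth-first traversal over a hypergraph stored in compressed sparse row (CSR) arrays. It is an illustration only, not EnGPar's implementation and not an OpenCL or Kokkos kernel; the function names `build_csr` and `hypergraph_bfs`, the CSR layout, and the example hypergraph are all assumptions introduced here.

```python
from collections import deque


def build_csr(num_rows, pairs):
    """Build CSR (offsets, values) arrays from a list of (row, value) pairs."""
    offsets = [0] * (num_rows + 1)
    for row, _ in pairs:
        offsets[row + 1] += 1
    # Prefix sum turns per-row counts into row start offsets.
    for i in range(num_rows):
        offsets[i + 1] += offsets[i]
    values = [0] * len(pairs)
    cursor = offsets[:-1].copy()  # next free slot in each row
    for row, val in pairs:
        values[cursor[row]] = val
        cursor[row] += 1
    return offsets, values


def hypergraph_bfs(num_vertices, hyperedges, seeds):
    """Return the BFS level of every vertex, or -1 if unreachable.

    A vertex's level is the minimum number of hyperedges that must be
    crossed to reach it from any seed vertex.
    """
    # Pins are (hyperedge, vertex) incidences (hypothetical layout).
    pins = [(e, v) for e, verts in enumerate(hyperedges) for v in verts]
    e_off, e_vals = build_csr(len(hyperedges), pins)
    v_off, v_vals = build_csr(num_vertices, [(v, e) for e, v in pins])

    level = [-1] * num_vertices
    edge_seen = [False] * len(hyperedges)
    queue = deque()
    for s in seeds:
        if level[s] == -1:
            level[s] = 0
            queue.append(s)

    while queue:
        v = queue.popleft()
        # Visit each hyperedge incident to v at most once overall.
        for i in range(v_off[v], v_off[v + 1]):
            e = v_vals[i]
            if edge_seen[e]:
                continue
            edge_seen[e] = True
            # Every unvisited vertex in the hyperedge is one step past v.
            for j in range(e_off[e], e_off[e + 1]):
                u = e_vals[j]
                if level[u] == -1:
                    level[u] = level[v] + 1
                    queue.append(u)
    return level


# Example: three hyperedges over seven vertices; vertex 6 is isolated.
EXAMPLE_HYPEREDGES = [[0, 1, 2], [2, 3], [3, 4, 5]]
print(hypergraph_bfs(7, EXAMPLE_HYPEREDGES, seeds=[0]))
```

Flat offset and value arrays are used here because such layouts map naturally onto accelerator memory; an accelerated version would typically process a whole frontier in parallel rather than popping one vertex at a time from a queue.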