A Compiler Framework for Optimizing Dynamic Parallelism on GPUs
Software posted on 2021-11-19, 09:13, authored by Juan Gómez Luna, Onur Mutlu, Wen-mei Hwu, Izzat El Hajj, Mhd Ghaith Olabi
Our artifact is a compiler, implemented within Clang, for optimizing applications that use dynamic parallelism. Because building the compiler requires building Clang/LLVM, which can be time- and resource-intensive, we provide pre-built binaries in a Docker image, along with the required dependencies and the benchmarks and datasets on which the compiler has been evaluated. Reviewers can use these binaries to transform CUDA code with our optimizations, then compile and run the transformed code on a CUDA-capable GPU. Scripts are provided to automate this process.