NSF-CSSI-Poster: Re-engineering Galaxy for Performance, Scalability and Energy Efficiency.

2020-01-31T16:44:07Z (GMT) by Mahmut Kandemir
Galaxy is an open-source, web-based framework used by more than 20,000 researchers worldwide to conduct research in areas such as genomics, molecular dynamics, chemistry, drug discovery, and natural language processing. It provides a web-based environment in which scientists perform computational analyses on their data, exchange the results of these analyses, explore new research concepts, facilitate student training, and preserve their results for future use. Galaxy currently runs on a wide variety of high-performance computing (HPC) platforms, including local clusters, supercomputers in national labs, public datacenters, and the cloud. While most of these systems supplement conventional CPUs with significant accelerator capabilities, in the form of Graphics Processing Units (GPUs) and/or Field-Programmable Gate Arrays (FPGAs), the current Galaxy implementation does not take advantage of these powerful accelerators. This is a missed opportunity because many Galaxy applications (e.g., sequence analysis, metabolomics, and metagenomics) are inherently parallelizable and can benefit from significant latency and throughput improvements when mapped to GPUs and FPGAs.

The main objectives of the proposed work are to (i) enable existing Galaxy tools to take full advantage of the immense computational capabilities offered by state-of-the-art GPUs and FPGAs and, at the same time, (ii) enlarge the Galaxy community by bringing Galaxy's unique tool, analytics, data preservation, and sharing capabilities to existing GPU- and FPGA-based applications from various domains that currently do not use Galaxy.