12 files

Superheat: An R Package for Creating Beautiful and Extendable Heatmaps for Visualizing Complex Data

Dataset posted on 2018-05-18, 16:03, authored by Rebecca L. Barter and Bin Yu.

The technological advancements of the modern era have enabled the collection of huge amounts of data in science and beyond. Extracting useful information from such massive datasets is an ongoing challenge as traditional data visualization tools typically do not scale well in high-dimensional settings. An existing visualization technique that is particularly well suited to visualizing large datasets is the heatmap. Although heatmaps are extremely popular in fields such as bioinformatics, they remain a severely underutilized visualization tool in modern data analysis. This article introduces superheat, a new R package that provides an extremely flexible and customizable platform for visualizing complex datasets. Superheat produces attractive and extendable heatmaps to which the user can add a response variable as a scatterplot, model results as boxplots, correlation information as barplots, and more. The goal of this article is two-fold: (1) to demonstrate the potential of the heatmap as a core visualization method for a range of data types, and (2) to highlight the customizability and ease of implementation of the superheat R package for creating beautiful and extendable heatmaps. The capabilities and fundamental applicability of the superheat package will be explored via three reproducible case studies, each based on publicly available data sources.

Funding

Air Force Office of Scientific Research [FA9550-14-1-0016]; National Human Genome Research Institute [1U01HG007031-01 (ENCODE)]; National Science Foundation [CCF-0939370, CDS&E-MSS 1228246, DMS-1107000, DMS-1160319 (FRG)].
