Visualizing the Effects of Data Transformations on Errors

Flight, Robert M; Moseley, Hunter

doi:10.6084/m9.figshare.3168775.v1

poster

Posted on 2016-04-11, 18:43; authored by Robert M Flight and Hunter Moseley

In many omics data analyses, a primary step is the application of a transformation to the data. Transformations are generally employed to convert proportional error (variance) to additive error, which most statistical methods handle appropriately. However, omics data frequently contain error sources that produce both additive and proportional errors [1-5]. To our knowledge, there has been no systematic study of how to detect proportional error in omics data, or of the effect of transformations on the error structure.

In this work, we demonstrate a set of three simple graphs that facilitate the detection of proportional and mixed error in omics data when multiple replicates are available. The graphs illustrate proportional and mixed error in a visually compelling manner that is straightforward both to recognize and to communicate. They plot 1) the absolute difference in the range, 2) the standard deviation, and 3) the relative standard deviation against the mean signal across replicates (Fig. 1 C-E). In addition to showing the presence of different types of error, these graphs readily demonstrate the effect of various transformations on the error structure.
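As a rough illustration of the quantities behind the three graphs, the following minimal Python sketch computes them for a single feature. It is not the authors' code; the function name `replicate_summaries` and the example values are hypothetical, and the range is assumed here to mean the maximum minus the minimum across replicates.

```python
import statistics


def replicate_summaries(replicates):
    """Summarize one feature measured across replicates.

    Returns the mean signal plus the three quantities plotted against it:
    the range (assumed here to be max - min), the sample standard
    deviation, and the relative standard deviation (SD / mean).
    """
    mean = statistics.mean(replicates)
    sd = statistics.stdev(replicates)
    return {
        "mean": mean,
        "range": max(replicates) - min(replicates),
        "sd": sd,
        "rsd": sd / mean if mean != 0 else float("nan"),
    }


# Hypothetical feature measured in three replicates.
summary = replicate_summaries([90.0, 100.0, 110.0])
print(summary)
```

Applying this to every feature and plotting `range`, `sd`, and `rsd` against `mean` gives the three views. Under purely additive error the standard deviation would be expected to stay roughly flat as the mean grows, whereas under proportional error it would be expected to rise with the mean while the relative standard deviation stays roughly flat.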

Using these graphical summaries, we find that the log and inverse hyperbolic sine (arcsinh) transforms are the most effective of the commonly used methods for removing proportional error.
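The effect of a transformation on proportional error can be checked numerically with a small simulation. The sketch below is an illustration under stated assumptions, not the analysis from the poster: it simulates purely proportional error (a hypothetical 10% noise level, 500 features, 5 replicates) and uses the correlation between per-feature standard deviation and mean as a crude stand-in for reading the graph by eye. The helper names `pearson` and `sd_mean_correlation` are invented for this example.

```python
import math
import random
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def sd_mean_correlation(features, transform=lambda v: v):
    """Correlate per-feature SD with per-feature mean after a transform."""
    means, sds = [], []
    for replicates in features:
        transformed = [transform(v) for v in replicates]
        means.append(statistics.mean(transformed))
        sds.append(statistics.stdev(transformed))
    return pearson(means, sds)


# Simulated data: each replicate is the true signal times (1 + noise),
# i.e. purely proportional error with an assumed 10% noise level.
rng = random.Random(1)
features = []
for _ in range(500):
    signal = rng.uniform(50, 1000)
    features.append([signal * (1 + rng.gauss(0, 0.1)) for _ in range(5)])

raw_corr = sd_mean_correlation(features)
log_corr = sd_mean_correlation(features, math.log)
asinh_corr = sd_mean_correlation(features, math.asinh)
print(f"raw: {raw_corr:.2f}  log: {log_corr:.2f}  asinh: {asinh_corr:.2f}")
```

In this simulation the untransformed data should show a clear positive relationship between standard deviation and mean, while the log and arcsinh versions should show little or none. Real omics data with mixed additive and proportional error may behave differently, particularly at low signal.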

All code used to generate the figures in this poster is available in the GitHub repository: https://github.com/rmflight/error_transformation