ci6b00260_si_001.txt (14.09 kB)

Automated Protocol for Large-Scale Modeling of Gene Expression Data

dataset
Posted on 2016-10-31; authored by Michelle Lynn Hall, David Calkins, Woody Sherman
With the continued rise of phenotypic and genotypic screening projects, computational methods to analyze, process, and ultimately make predictions in this field are growing in importance. Here we show how automated machine learning workflows can produce models that predict differential gene expression as a function of compound structure, using data from A673 cells as a proof of principle. In particular, we present predictive models with an average accuracy greater than 70% across a highly diverse ∼1000-gene expression profile. In contrast to the usual in silico design paradigm, in which one interrogates a particular target-based response, this work opens the opportunity for virtual screening and lead optimization against desired multitarget gene expression profiles.
