We developed a computational method (Celligner) that identifies and removes systematic differences between cell line and tumor gene expression profiles, allowing for direct integration of existing large-scale cancer cell line and tumor datasets. Celligner performs this alignment across cancer types in a completely unsupervised fashion, without relying on prior annotations of cancer types, tumor sample purity, or contaminating cell expression profiles. We applied Celligner to produce a global alignment of 12,236 tumor samples from the TCGA, TARGET, and Treehouse datasets and 1,249 cell lines from DepMap.
This dataset includes the Celligner-aligned data, a matrix of correlations between cell lines and tumors, associated cell line and tumor metadata, and other outputs from the Celligner method. See the Readme file for more details about the dataset contents and version history.
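
As a rough illustration of how a cell line by tumor correlation matrix might be used, the sketch below computes correlations between aligned expression profiles and ranks the tumors most similar to a given cell line. It is not the Celligner implementation and does not read the dataset files. The sample names, the toy values, the helper functions (`pearson`, `correlation_matrix`, `top_matches`), and the choice of Pearson correlation are all assumptions made for illustration; consult the Readme file for the actual file names and for how the distributed matrix was computed.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(cell_lines, tumors):
    """Nested dict: corr[cell_line][tumor] = correlation of their profiles."""
    return {
        cl: {t: pearson(cl_profile, t_profile) for t, t_profile in tumors.items()}
        for cl, cl_profile in cell_lines.items()
    }


def top_matches(corr, cell_line, k=1):
    """Names of the k tumors most correlated with the given cell line."""
    row = corr[cell_line]
    return sorted(row, key=row.get, reverse=True)[:k]


# Hypothetical aligned expression values (4 genes per sample), not real data.
cell_lines = {
    "CL_A": [1.0, 2.0, 3.0, 4.0],
    "CL_B": [4.0, 3.0, 2.0, 1.0],
}
tumors = {
    "T_1": [2.0, 4.0, 6.0, 8.5],
    "T_2": [8.0, 6.0, 4.5, 2.0],
    "T_3": [1.0, 3.0, 1.0, 3.0],
}

corr = correlation_matrix(cell_lines, tumors)
print(top_matches(corr, "CL_A", k=2))
```

With the real data, the same ranking step could be applied directly to a row of the provided correlation matrix, and the tumor metadata could then be used to summarize which cancer types the closest tumors belong to.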