figshare

ref_LCMV_Atlas_mouse_v1.rds (843.47 MB)

ProjecTILs murine reference atlas of virus-specific CD8 T cells, version 1

dataset
Posted on 2020-06-16, 12:44; authored by Massimo Andreatta, Santiago Carmona
We have developed ProjecTILs, a computational approach to project new data sets into a reference map of T cells, enabling their direct comparison in a stable, annotated system of coordinates. Because new cells are embedded in the same space as the reference, ProjecTILs enables the classification of query cells into annotated, discrete states, as well as their placement over a continuous space of intermediate states. By comparing multiple samples over the same map, and across alternative embeddings, the method allows exploring the effect of cellular perturbations (e.g. as the result of therapy or genetic engineering) and identifying genetic programs that are significantly altered in the query compared to a control set or to the reference map.

We illustrate the projection of several data sets from recent publications over two cross-study murine T cell reference atlases: the first describing tumor-infiltrating T lymphocytes (TILs), the second characterizing acute and chronic viral infection.

Single-cell data used to build the virus-specific CD8 T cell reference map were downloaded from GEO under the following entries: GSE131535, GSE134139 and GSE119943, selecting only samples in wild-type conditions. Data for the Ptpn2-KO, Tox-KO and CD4-depletion projections were obtained from entries GSE134139, GSE119943 and GSE137007, and were not included in the construction of the reference map.

To construct the LCMV reference map, we split the dataset into five batches that displayed strong batch effects, and applied STACAS (https://github.com/carmonalab/STACAS) to mitigate their confounding effects. We computed 800 variable genes per batch, excluding cell-cycle, ribosomal and mitochondrial genes, and computed pairwise anchors using 200 integration genes and otherwise default STACAS parameters. Anchors were filtered at the default threshold 0.8 percentile, and integration was performed with the IntegrateData function of Seurat 3, using the guide tree suggested by STACAS.
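The gene-selection step described above can be illustrated with a minimal, self-contained sketch. It is not the STACAS or Seurat implementation: plain variance stands in for the variance-stabilised ranking those packages use, the name prefixes are a crude assumption for spotting mouse ribosomal and mitochondrial genes, and the function names, the blocklist argument (used here for cell-cycle genes) and the toy data are all hypothetical.

```python
from statistics import pvariance

# Assumed, simplified name prefixes for mouse ribosomal and mitochondrial genes.
# A real pipeline would use curated gene lists instead of prefix matching.
EXCLUDED_PREFIXES = ("Rps", "Rpl", "mt-")


def select_variable_genes(expression, n_genes, blocklist=frozenset()):
    """Return the n_genes most variable genes, skipping excluded ones.

    expression maps a gene name to its expression values across cells.
    blocklist holds extra genes to skip (for example cell-cycle genes).
    """
    candidates = [
        gene for gene in expression
        if not gene.startswith(EXCLUDED_PREFIXES) and gene not in blocklist
    ]
    # Highest variance first; ties are broken by gene name for determinism.
    ranked = sorted(candidates, key=lambda g: (-pvariance(expression[g]), g))
    return ranked[:n_genes]


def integration_genes(batches, n_variable, n_integration, blocklist=frozenset()):
    """Keep the genes that are variable in the largest number of batches."""
    counts = {}
    for expression in batches:
        for gene in select_variable_genes(expression, n_variable, blocklist):
            counts[gene] = counts.get(gene, 0) + 1
    ranked = sorted(counts, key=lambda g: (-counts[g], g))
    return ranked[:n_integration]


# Toy data: two batches, three cells each.
batch_a = {
    "Cd8a": [0, 5, 10], "Rps3": [0, 50, 100], "mt-Co1": [0, 20, 40],
    "Gzmb": [1, 2, 3], "Mki67": [0, 9, 18], "Actb": [5, 5, 5],
}
batch_b = {"Cd8a": [1, 1, 1], "Gzmb": [0, 4, 8], "Tox": [0, 1, 2]}
cell_cycle = frozenset({"Mki67"})

print(select_variable_genes(batch_a, 2, cell_cycle))
print(integration_genes([batch_a, batch_b], 2, 1, cell_cycle))
```

With the settings reported above, n_variable would be 800 and n_integration 200; the toy call uses much smaller numbers so the result is easy to check by hand.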

Next, we performed unsupervised clustering of the integrated cell embeddings using the Shared Nearest Neighbor (SNN) clustering method implemented in Seurat 3 with parameters {resolution=0.4, reduction="pca", k.param=20}. We then manually annotated individual clusters (merging clusters when necessary) based on several criteria: i) average expression of key marker genes in individual clusters; ii) gradients of gene expression over the UMAP representation of the reference map; iii) gene-set enrichment analysis to determine over- and under-expressed genes per cluster using MAST. In order to have access to predictive methods for UMAP, we recomputed the PCA and UMAP embeddings independently of Seurat 3, using the prcomp function from the base R package "stats" and the "umap" R package (https://github.com/tkonopka/umap), respectively.
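The value of a predictive embedding is that the transformation learned on the reference can be reapplied to new cells without refitting, so the reference coordinates stay fixed. The sketch below illustrates this for a single principal component. It is an assumption-laden toy, not the code used for the atlas: power iteration on the covariance matrix stands in for the decomposition performed by prcomp, and the function names and data are hypothetical.

```python
import math


def fit_first_pc(rows, n_iter=200):
    """Learn the column means and first principal axis of a reference matrix.

    rows is a list of cells, each a list of feature values. Assumes the data
    have non-zero variance and that the start vector is not orthogonal to
    the leading axis.
    """
    n, d = len(rows), len(rows[0])
    center = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - center[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d); the divisor does not change the axis.
    cov = [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    # Power iteration converges to the leading eigenvector of the covariance.
    axis = [1.0] * d
    for _ in range(n_iter):
        nxt = [sum(cov[a][b] * axis[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(v * v for v in nxt))
        axis = [v / norm for v in nxt]
    return center, axis


def project(rows, center, axis):
    """Apply the stored centring and axis to any cells, reference or query."""
    return [
        sum((r[j] - center[j]) * axis[j] for j in range(len(axis)))
        for r in rows
    ]


# Reference cells lie on the line y = 2x; query cells are projected afterwards.
reference = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
center, axis = fit_first_pc(reference)
query_scores = project([[1.5, 3.0], [2.5, 5.0]], center, axis)
print(center, axis, query_scores)
```

Because the query is transformed with the stored center and axis only, adding or removing query cells never changes where the reference cells sit.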

Funding

Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung 180010
