scUTRquant SingleCellExperiment and SummarizedExperiment Objects
Overview
This dataset contains R Bioconductor objects that have 3'UTR isoform counts quantified with the scUTRquant pipeline (https://doi.org/10.5281/zenodo.8118393). These objects are part of the minimum dataset required for verifying the analysis reported in Fansler et al., bioRxiv, 2023.
Loading objects
The objects can be loaded into R 4.2 using Bioconductor 3.16. A minimal Conda environment definition is provided for creating compatible environments. For example:
conda env create -n sce_bioc_3_16 -f envs/sce_bioc_3_16.min.yaml
Within that environment, an individual .Rds object can be loaded in R with, for example:
sce <- readRDS("sce/utrome_mm10_v2/bleckwehl21.txs.Rds")
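To check that an object deserializes correctly without opening an interactive R session, a small shell wrapper around Rscript can print the object's standard summary. This is only a sketch: the function name `summarize_rds` is hypothetical, and it assumes that `Rscript` from a compatible environment is on the `PATH` and that the SingleCellExperiment package is installed there.

```shell
# Hypothetical helper (not part of the dataset): print the summary of one .Rds object.
# Assumes Rscript and the SingleCellExperiment package are available.
summarize_rds() {
  rds_path="$1"
  # Fail early with a clear message if the file is not present.
  if [ ! -f "$rds_path" ]; then
    echo "missing: $rds_path" >&2
    return 1
  fi
  # Arguments after the -e expressions are passed through to commandArgs().
  Rscript \
    -e 'suppressPackageStartupMessages(library(SingleCellExperiment))' \
    -e 'obj <- readRDS(commandArgs(trailingOnly = TRUE)[1])' \
    -e 'print(obj)' \
    "$rds_path"
}
```

For example, `summarize_rds sce/utrome_mm10_v2/bleckwehl21.txs.Rds` would print the object's class, dimensions, and assay names. Loading SingleCellExperiment also attaches SummarizedExperiment, so the same call should work for the objects under `se`.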
Descriptions
The objects are organized as follows:
envs - Conda environment definitions
  sce_bioc_3_16.full.yaml - solved version for osx-64 (for record only)
  sce_bioc_3_16.min.yaml - minimal version intended for reuse
sce - SingleCellExperiment objects
  gencode_vM25_pc_w500 - data quantified using truncated GENCODE vM25 protein coding transcripts
    dahlin18.txs.Rds - HSPCs from Dahlin et al., 2018
    guo19.txs.Rds - mESCs from Guo et al., 2019
    tmuris.txs.Rds - 10X Chromium 3' data from Tabula Muris
    ximerakis19.txs.Rds - aging brain samples from Ximerakis et al., 2019
  utrome_hg38_v1 - data quantified using the Microwell-seq-derived annotation provided in scUTRquant
    pbmc_10k_v3_fastq.txs.Rds - 10X v3 kit demonstration 10K PBMCs
    pbmc_1k_v2_fastq.txs.Rds - 10X v2 kit demonstration 1K PBMCs
    pbmc_1k_v3_fastq.txs.Rds - 10X v3 kit demonstration 1K PBMCs
    tsapiens.txs.full_annot.Rds - 10X Chromium 3' data from Tabula Sapiens
  utrome_mm10_v2 - data quantified using the Microwell-seq-derived annotation provided in scUTRquant
    bleckwehl21.txs.Rds - mESCs from Bleckwehl et al., 2021
    guo19.txs.Rds - mESCs from Guo et al., 2019
    heart_10k_v3_fastq.txs.Rds - 10X v3 kit demonstration 10K heart cells
    heart_1k_v2_fastq.txs.Rds - 10X v2 kit demonstration 1K heart cells
    heart_1k_v3_fastq.txs.Rds - 10X v3 kit demonstration 1K heart cells
    merged.txs.full_annot.Rds - combined atlas with annotations including HSPCs from Dahlin et al., 2018; mESCs from Guo et al., 2019; 10X Chromium 3' data from Tabula Muris; brain cells from Ximerakis et al., 2019
    neuron_10k_v3_fastq.txs.Rds - 10X v3 kit demonstration 10K neurons
    neuron_1k_v2_fastq.txs.Rds - 10X v2 kit demonstration 1K neurons
    neuron_1k_v3_fastq.txs.Rds - 10X v3 kit demonstration 1K neurons
se - SummarizedExperiment objects of bulk 3'-seq or pseudobulk 10X Chromium 3' data
  utrome_hg38_v1 - data quantified using the Microwell-seq-derived annotation provided in scUTRquant
    kd6_essential_bulk_expressed.Rds - K562 6-day essential Perturb-seq screen from Replogle et al., 2022
    rd7_essential_bulk_expressed.Rds - RPE1 7-day essential Perturb-seq screen from Replogle et al., 2022
  utrome_mm10_v2 - data quantified using the Microwell-seq-derived annotation provided in scUTRquant
    hspcs_bulk.txs.rds - HSCs and MPPs from Sommerkamp et al., 2020
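After downloading and unpacking the dataset, it can be useful to confirm that the expected files are in place before loading anything. The following is a minimal sketch of such a check. The function name `check_layout` is hypothetical, and the file list is an illustrative subset: the first two paths appear in the commands above, and the last two are inferred from the directory organization described here rather than quoted from the dataset.

```shell
# Hypothetical helper (not part of the dataset): report which expected files
# are missing under a given root directory. Extend the list as needed.
check_layout() {
  root="${1:-.}"
  missing=0
  # Read one relative path per line from the here-document below.
  while IFS= read -r rel; do
    [ -z "$rel" ] && continue
    if [ ! -f "$root/$rel" ]; then
      echo "missing: $rel"
      missing=$((missing + 1))
    fi
  done <<'EOF'
envs/sce_bioc_3_16.min.yaml
sce/utrome_mm10_v2/bleckwehl21.txs.Rds
se/utrome_hg38_v1/kd6_essential_bulk_expressed.Rds
se/utrome_mm10_v2/hspcs_bulk.txs.rds
EOF
  echo "$missing file(s) missing"
  # Exit status is zero only when every listed file was found.
  [ "$missing" -eq 0 ]
}
```

Calling `check_layout /path/to/dataset` lists each missing path and returns a non-zero status if any are absent, so it can gate later steps in a script. Note that `hspcs_bulk.txs.rds` uses a lowercase extension, which matters on case-sensitive file systems.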
Data Generation
UTRome Annotations
The annotations (kallisto indices) used for quantification were generated by the following pipelines:
gencode_vM25_pc_w500 - https://github.com/Mayrlab/txcutr-db (https://doi.org/10.5281/zenodo.8118405)
utrome_hg38_v1 - https://github.com/Mayrlab/hcl-utrome (https://doi.org/10.5281/zenodo.8118411)
utrome_mm10_v2 - https://github.com/Mayrlab/mca-utrome (https://doi.org/10.5281/zenodo.8118416)
Sample Sheets and Configurations
Sample sheets and configuration files specifying how the raw data were run through scUTRquant are available in the scUTRquant-inputs repository (https://doi.org/10.5281/zenodo.10901352).
All raw data were processed with scUTRquant (https://doi.org/10.5281/zenodo.8118393), except for hspcs_bulk.txs.rds, which used the pipeline at https://github.com/Mayrlab/sommerkamp20 (https://doi.org/10.5281/zenodo.10892210).
Post-processing Pipelines
Downstream of scUTRquant, additional annotation, filtering, merging, and summarization were performed in the following pipelines:
merged.txs.full_annot.Rds - https://github.com/Mayrlab/atlas-mm (https://doi.org/10.5281/zenodo.10895352)
tsapiens.txs.full_annot.Rds - https://github.com/Mayrlab/atlas-hs (https://doi.org/10.5281/zenodo.10895337)
(kd6|rd7)_essential_bulk_expressed.Rds - https://github.com/Mayrlab/gwps-sq (https://doi.org/10.5281/zenodo.10895730)
Downstream Analyses
Additional downstream analyses that use these objects are available at https://github.com/Mayrlab/scUTRquant-figures (https://doi.org/10.5281/zenodo.8118443).
Funding
Tri-Institutional Training Program in Computational Biology and Medicine (National Institute of General Medical Sciences)
3'UTR-mediated protein-protein interactions determine protein functions (National Institute of General Medical Sciences)
Regulation of protein multi-functionality by 3'UTRs (National Institute of General Medical Sciences)
Function and therapeutic targeting of 3'UTR-dependent protein localization (UDPL) in cancer (Pershing Square Foundation)