Last updated: 2019-07-16
Checks: 4 3
Knit directory: MAGL/
This reproducible R Markdown analysis was created with workflowr (version 1.4.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
The R Markdown file has unstaged changes. To know which version of the R Markdown file created these results, you’ll want to first commit it to the Git repo. If you’re still working on the analysis, you can ignore this warning. When you’re finished, you can run wflow_publish
to commit the R Markdown file and build the HTML.
The global environment had objects present when the code in the R Markdown file was run. These objects can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment. Use wflow_publish
or wflow_build
to ensure that the code is always run in an empty environment.
The following objects were defined in the global environment when these results were created:
Name | Class | Size |
---|---|---|
data | environment | 56 bytes |
env | environment | 56 bytes |
The command set.seed(20190311)
was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
To ensure reproducibility of the results, delete the cache directory 3-differential_cache
and re-run the analysis. To have workflowr automatically delete the cache directory prior to building the file, set delete_cache = TRUE
when running wflow_build()
or wflow_publish()
.
Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility. The version displayed above was the version of the Git repository at the time these results were generated.
Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish
or wflow_git_commit
). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Ignored files:
Ignored: .DS_Store
Ignored: ._.Renviron
Ignored: ._.gitignore
Ignored: ._README.md
Ignored: ._Snakefile
Ignored: ._config.yaml
Ignored: .snakemake/
Ignored: MAGL/.DS_Store
Ignored: MAGL/.RData
Ignored: MAGL/.Rhistory
Ignored: MAGL/._.DS_Store
Ignored: MAGL/._.Rprofile
Ignored: MAGL/._.gitignore
Ignored: MAGL/._data
Ignored: MAGL/analysis/._refs.bib
Ignored: MAGL/analysis/0-preprocessing_cache/
Ignored: MAGL/analysis/1-clustering_cache/
Ignored: MAGL/analysis/3-differential_cache/
Ignored: MAGL/analysis/4-visualization_cache/
Ignored: MAGL/analysis/5-geneset_cache/
Ignored: MAGL/analysis/6-more_cache/
Ignored: MAGL/code/._utils.R
Ignored: MAGL/data/
Ignored: MAGL/output/
Ignored: data/
Ignored: figures/
Ignored: meta/
Ignored: results/
Ignored: scripts/.DS_Store
Ignored: scripts/._.DS_Store
Ignored: scripts/._apply_scdd.R
Ignored: scripts/._plot_perf_by_expr.R
Ignored: scripts/._plot_perf_by_lfc.R
Ignored: scripts/._plot_perf_by_ss.R
Ignored: scripts/._plot_runtimes.R
Ignored: scripts/._plot_sim_ex.R
Ignored: scripts/._prep_data.R
Ignored: scripts/._prep_magl.R
Ignored: scripts/._prep_sim.R
Ignored: scripts/._sim_qc.R
Ignored: scripts/._utils.R
Untracked files:
Untracked: .RData
Untracked: .RDataTmp
Untracked: .Rhistory
Untracked: MAGL/code/utils.R
Untracked: MAGL/current
Untracked: MAGL/docs/figure/3-differential.Rmd/
Untracked: figs/
Unstaged changes:
Modified: MAGL/.Rprofile
Modified: MAGL/analysis/2-annotation.Rmd
Modified: MAGL/analysis/3-differential.Rmd
Deleted: scripts/fig_vehicle_vs_lps.R
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the R Markdown and HTML files. If you’ve configured a remote Git repository (see ?wflow_git_remote
), click on the hyperlinks in the table below to view them.
File | Version | Author | Date | Message |
---|---|---|---|---|
Rmd | 31c612f | HelenaLC | 2019-07-16 | update cluster anno; write res to .csv |
Rmd | 244411f | HelenaLC | 2019-07-11 | update lps analysis to include removal of doublets |
html | 244411f | HelenaLC | 2019-07-11 | update lps analysis to include removal of doublets |
Rmd | 08bf260 | HelenaLC | 2019-06-17 | update |
html | 08bf260 | HelenaLC | 2019-06-17 | update |
Rmd | 61c0b06 | HelenaLC | 2019-05-20 | update |
Rmd | faf858d | HelenaLC | 2019-05-17 | add DS & geneset analysis |
Rmd | dde5200 | HelenaLC | 2019-05-15 | add ds analysis (in progress) |
library(ComplexHeatmap)
library(cowplot)
library(dplyr)
library(limma)
library(magrittr)
library(muscat)
library(scater)
library(SingleCellExperiment)
muscat
# load data
sce <- readRDS(file.path("output", "MAGL-SCE.rds"))
# make WT reference group
sce$group_id <- factor(sce$group_id, levels = c("WT", "LPS"))
# reorder sample levels
m <- match(levels(sce$sample_id), sce$sample_id)
o <- order(sce$group_id[m])
sce$sample_id <- factor(sce$sample_id, levels = levels(sce$sample_id)[o])
# prep. SCE for `muscat`
sce <- prepSCE(sce, "cluster_id", "sample_id", "group_id")
(ei <- metadata(sce)$experiment_info)
sample_id group_id n_cells
1 LC016 WT 3899
2 LC017 LPS 2074
3 LC019 WT 2342
4 LC020 LPS 3375
5 LC022 WT 3273
6 LC023 LPS 3158
7 LC025 WT 2926
8 LC026 LPS 4510
In consideration of the visualizations provided in the Annotation tab and additional exploration with iSEE, we arrive at the following cluster annotations:
anno <- list(
"Astrocytes" = 3,
"Endothelial" = 13,
"Microglia" = 17,
"Oligodendrocytes" = 4,
"OPC" = 12,
"CPE cells" = c(18, 19),
"Excit. Neuron" = c(0, 1, 2, 5, 6, 7, 10, 14, 16),
"Inhib. Neuron" = c(8, 9, 11, 15))
m <- match(sce$cluster_id, unlist(anno))
ns <- vapply(anno, length, numeric(1))
lab <- rep.int(names(anno), ns)
sce$cluster_id <- factor(lab[m], levels = names(anno))
# cluster-sample cell-counts
table(sce$cluster_id, sce$sample_id)
LC016 LC019 LC022 LC025 LC017 LC020 LC023 LC026
Astrocytes 138 225 255 91 96 297 189 341
Endothelial 51 102 64 53 43 171 77 192
Microglia 20 31 47 35 30 111 34 98
Oligodendrocytes 130 202 344 114 61 291 128 168
OPC 94 79 121 66 40 90 115 158
CPE cells 20 9 42 104 66 96 71 22
Excit. Neuron 2887 1455 1948 1874 1456 1998 2132 2952
Inhib. Neuron 559 239 452 589 282 321 412 579
# normalize for visualization
sce <- normalize(sce)
system.time(pb <- aggregateData(sce))
user system elapsed
10.484 0.296 9.020
pb
class: SingleCellExperiment
dim: 11063 8
metadata(4): experiment_info log.exprs.offset agg_pars n_cells
assays(8): Astrocytes Endothelial ... Excit. Neuron Inhib. Neuron
rownames(11063): ENSMUSG00000051951.Xkr4 ENSMUSG00000089699.Gm1992 ...
ENSMUSG00000095041.PISD ENSMUSG00000063897.DHRSX
rowData names(0):
colnames(8): LC016 LC019 ... LC023 LC026
colData names(6): group_id orig.ident ... total_features_by_counts_drop
pct_counts_Mt_drop
reducedDimNames(0):
spikeNames(0):
Prior to conducting any formal testing, we can compute a multi-dimensional scaling (MDS) plot of aggregated signal to explore overall sample similarities. Ideally, such a represenation of the data should separate both clusters and groups from one another. Vice versa, samples from the same cluster/group should fall close to each other.
In our MDS plot on pseudobulk counts (Fig. @ref(fig:pb-mds)), we can observe that cell-populations (clusters) are separated quite well, while WT and stimulated samples (groups) are separated to a lesser (if any) extend.
pbMDS(pb)
Pseudobulk-level MDS plot. Each point represent one cluster-sample instance; points are colored by cluster ID and shaped by group ID.
Version | Author | Date |
---|---|---|
08bf260 | HelenaLC | 2019-06-17 |
system.time(res <- pbDS(pb, verbose = FALSE))
user system elapsed
28.040 0.648 28.713
tbl <- bind_rows(res$table[[1]])
tbl <- filter(tbl, p_adj.loc < 0.05, abs(logFC) > 1)
tbl <- mutate_if(tbl, is.numeric, format, digits = 4)
saveRDS(sce, file.path("output", "MAGL-SCE.rds"))
saveRDS(res, file.path("output", "MAGL-DS_res.rds"))
write.csv(tbl, file.path("output", "MAGL-DS_res.csv"), row.names = FALSE)
sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS
Matrix products: default
BLAS: /usr/local/R/R-3.6.0/lib/libRblas.so
LAPACK: /usr/local/R/R-3.6.0/lib/libRlapack.so
locale:
[1] LC_CTYPE=en_CA.UTF-8 LC_NUMERIC=C LC_TIME=en_CA.UTF-8
[4] LC_COLLATE=en_CA.UTF-8 LC_MONETARY=en_CA.UTF-8 LC_MESSAGES=en_CA.UTF-8
[7] LC_PAPER=en_CA.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 parallel grid stats graphics grDevices utils datasets methods
[10] base
other attached packages:
[1] scater_1.12.2 SingleCellExperiment_1.6.0 SummarizedExperiment_1.14.0
[4] DelayedArray_0.10.0 BiocParallel_1.18.0 matrixStats_0.54.0
[7] Biobase_2.44.0 GenomicRanges_1.36.0 GenomeInfoDb_1.20.0
[10] IRanges_2.18.1 S4Vectors_0.22.0 BiocGenerics_0.30.0
[13] muscat_0.99.10 magrittr_1.5 limma_3.40.2
[16] cowplot_0.9.4 ggplot2_3.2.0 ComplexHeatmap_2.0.0
[19] BiocStyle_2.12.0
loaded via a namespace (and not attached):
[1] backports_1.1.4 circlize_0.4.6 Hmisc_4.2-0
[4] blme_1.0-4 workflowr_1.4.0 plyr_1.8.4
[7] igraph_1.2.4.1 lazyeval_0.2.2 TMB_1.7.15
[10] splines_3.6.0 listenv_0.7.0 digest_0.6.20
[13] foreach_1.4.4 htmltools_0.3.6 viridis_0.5.1
[16] gdata_2.18.0 lmerTest_3.1-0 checkmate_1.9.4
[19] memoise_1.1.0 cluster_2.1.0 doParallel_1.0.14
[22] globals_0.12.4 annotate_1.62.0 prettyunits_1.0.2
[25] colorspace_1.4-1 blob_1.2.0 xfun_0.8
[28] dplyr_0.8.3 crayon_1.3.4 RCurl_1.95-4.12
[31] genefilter_1.66.0 lme4_1.1-21 zeallot_0.1.0
[34] survival_2.44-1.1 iterators_1.0.10 glue_1.3.1
[37] gtable_0.3.0 zlibbioc_1.30.0 XVector_0.24.0
[40] GetoptLong_0.1.7 BiocSingular_1.0.0 future.apply_1.3.0
[43] shape_1.4.4 scales_1.0.0 DBI_1.0.0
[46] edgeR_3.26.5 Rcpp_1.0.1 viridisLite_0.3.0
[49] xtable_1.8-4 progress_1.2.2 htmlTable_1.13.1
[52] clue_0.3-57 dqrng_0.2.1 foreign_0.8-71
[55] bit_1.1-14 rsvd_1.0.1 Formula_1.2-3
[58] htmlwidgets_1.3 gplots_3.0.1.1 RColorBrewer_1.1-2
[61] acepack_1.4.1 pkgconfig_2.0.2 XML_3.98-1.20
[64] nnet_7.3-12 locfit_1.5-9.1 dynamicTreeCut_1.63-1
[67] tidyselect_0.2.5 rlang_0.4.0 reshape2_1.4.3
[70] AnnotationDbi_1.46.0 munsell_0.5.0 tools_3.6.0
[73] RSQLite_2.1.1 evaluate_0.14 stringr_1.4.0
[76] yaml_2.2.0 knitr_1.23 bit64_0.9-7
[79] fs_1.3.1 caTools_1.17.1.2 purrr_0.3.2
[82] future_1.14.0 nlme_3.1-140 whisker_0.3-2
[85] scran_1.12.1 grr_0.9.5 pbkrtest_0.4-7
[88] compiler_3.6.0 rstudioapi_0.10 beeswarm_0.2.3
[91] png_0.1-7 variancePartition_1.14.0 statmod_1.4.32
[94] tibble_2.1.3 geneplotter_1.62.0 stringi_1.4.3
[97] highr_0.8 lattice_0.20-38 Matrix_1.2-17
[100] nloptr_1.2.1 vctrs_0.2.0 pillar_1.4.2
[103] BiocManager_1.30.4 GlobalOptions_0.1.0 BiocNeighbors_1.2.0
[106] data.table_1.12.2 bitops_1.0-6 irlba_2.3.3
[109] Matrix.utils_0.9.7 colorRamps_2.3 R6_2.4.0
[112] latticeExtra_0.6-28 KernSmooth_2.23-15 gridExtra_2.3
[115] vipor_0.4.5 codetools_0.2-16 gtools_3.8.1
[118] boot_1.3-23 MASS_7.3-51.4 assertthat_0.2.1
[121] DESeq2_1.24.0 rprojroot_1.3-2 rjson_0.2.20
[124] withr_2.1.2 sctransform_0.2.0 GenomeInfoDbData_1.2.1
[127] hms_0.5.0 rpart_4.1-15 glmmTMB_0.2.3
[130] minqa_1.2.4 rmarkdown_1.13 DelayedMatrixStats_1.6.0
[133] git2r_0.26.1 numDeriv_2016.8-1.1 base64enc_0.1-3
[136] ggbeeswarm_0.6.0