Last updated: 2019-07-16

Checks: 4 3

Knit directory: MAGL/

This reproducible R Markdown analysis was created with workflowr (version 1.4.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


The R Markdown file has unstaged changes. To know which version of the R Markdown file created these results, you’ll want to first commit it to the Git repo. If you’re still working on the analysis, you can ignore this warning. When you’re finished, you can run wflow_publish to commit the R Markdown file and build the HTML.

The global environment had objects present when the code in the R Markdown file was run. These objects can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment. Use wflow_publish or wflow_build to ensure that the code is always run in an empty environment.

The following objects were defined in the global environment when these results were created:

Name Class Size
data environment 56 bytes
env environment 56 bytes

The command set.seed(20190311) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

The following chunks had caches available:
  • aggregation
  • cluster-anno
  • ds-analysis
  • load-libs
  • pb-mds
  • prep-data
  • save-rds
  • session-info-chunk-inserted-by-workflowr
  • unnamed-chunk-1

To ensure reproducibility of the results, delete the cache directory 3-differential_cache and re-run the analysis. To have workflowr automatically delete the cache directory prior to building the file, set delete_cache = TRUE when running wflow_build() or wflow_publish().

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility. The version displayed above was the version of the Git repository at the time these results were generated.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .DS_Store
    Ignored:    ._.Renviron
    Ignored:    ._.gitignore
    Ignored:    ._README.md
    Ignored:    ._Snakefile
    Ignored:    ._config.yaml
    Ignored:    .snakemake/
    Ignored:    MAGL/.DS_Store
    Ignored:    MAGL/.RData
    Ignored:    MAGL/.Rhistory
    Ignored:    MAGL/._.DS_Store
    Ignored:    MAGL/._.Rprofile
    Ignored:    MAGL/._.gitignore
    Ignored:    MAGL/._data
    Ignored:    MAGL/analysis/._refs.bib
    Ignored:    MAGL/analysis/0-preprocessing_cache/
    Ignored:    MAGL/analysis/1-clustering_cache/
    Ignored:    MAGL/analysis/3-differential_cache/
    Ignored:    MAGL/analysis/4-visualization_cache/
    Ignored:    MAGL/analysis/5-geneset_cache/
    Ignored:    MAGL/analysis/6-more_cache/
    Ignored:    MAGL/code/._utils.R
    Ignored:    MAGL/data/
    Ignored:    MAGL/output/
    Ignored:    data/
    Ignored:    figures/
    Ignored:    meta/
    Ignored:    results/
    Ignored:    scripts/.DS_Store
    Ignored:    scripts/._.DS_Store
    Ignored:    scripts/._apply_scdd.R
    Ignored:    scripts/._plot_perf_by_expr.R
    Ignored:    scripts/._plot_perf_by_lfc.R
    Ignored:    scripts/._plot_perf_by_ss.R
    Ignored:    scripts/._plot_runtimes.R
    Ignored:    scripts/._plot_sim_ex.R
    Ignored:    scripts/._prep_data.R
    Ignored:    scripts/._prep_magl.R
    Ignored:    scripts/._prep_sim.R
    Ignored:    scripts/._sim_qc.R
    Ignored:    scripts/._utils.R

Untracked files:
    Untracked:  .RData
    Untracked:  .RDataTmp
    Untracked:  .Rhistory
    Untracked:  MAGL/code/utils.R
    Untracked:  MAGL/current
    Untracked:  MAGL/docs/figure/3-differential.Rmd/
    Untracked:  figs/

Unstaged changes:
    Modified:   MAGL/.Rprofile
    Modified:   MAGL/analysis/2-annotation.Rmd
    Modified:   MAGL/analysis/3-differential.Rmd
    Deleted:    scripts/fig_vehicle_vs_lps.R

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the R Markdown and HTML files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view them.

File Version Author Date Message
Rmd 31c612f HelenaLC 2019-07-16 update cluster anno; write res to .csv
Rmd 244411f HelenaLC 2019-07-11 update lps analysis to include removal of doublets
html 244411f HelenaLC 2019-07-11 update lps analysis to include removal of doublets
Rmd 08bf260 HelenaLC 2019-06-17 update
html 08bf260 HelenaLC 2019-06-17 update
Rmd 61c0b06 HelenaLC 2019-05-20 update
Rmd faf858d HelenaLC 2019-05-17 add DS & geneset analysis
Rmd dde5200 HelenaLC 2019-05-15 add ds analysis (in progress)

load packages

library(ComplexHeatmap)
library(cowplot)
library(dplyr)
library(limma)
library(magrittr)
library(muscat)
library(scater)
library(SingleCellExperiment)

Data preparation

Load data & prep. for muscat

# load data
sce <- readRDS(file.path("output", "MAGL-SCE.rds"))

# make WT reference group
sce$group_id <- factor(sce$group_id, levels = c("WT", "LPS"))

# reorder sample levels
m <- match(levels(sce$sample_id), sce$sample_id)
o <- order(sce$group_id[m])
sce$sample_id <- factor(sce$sample_id, levels = levels(sce$sample_id)[o])

# prep. SCE for `muscat`
sce <- prepSCE(sce, "cluster_id", "sample_id", "group_id")
(ei <- metadata(sce)$experiment_info)
  sample_id group_id n_cells
1     LC016       WT    3899
2     LC017      LPS    2074
3     LC019       WT    2342
4     LC020      LPS    3375
5     LC022       WT    3273
6     LC023      LPS    3158
7     LC025       WT    2926
8     LC026      LPS    4510

Cluster annotation

In consideration of the visualizations provided in the Annotation tab and additional exploration with iSEE, we arrive at the following cluster annotations:

anno <- list(
    "Astrocytes" = 3,
    "Endothelial" = 13,
    "Microglia" = 17,
    "Oligodendrocytes" = 4,
    "OPC" = 12,
    "CPE cells" = c(18, 19),
    "Excit. Neuron" = c(0, 1, 2, 5, 6, 7, 10, 14, 16),
    "Inhib. Neuron" = c(8, 9, 11, 15))

m <- match(sce$cluster_id, unlist(anno))
ns <- vapply(anno, length, numeric(1))
lab <- rep.int(names(anno), ns)
sce$cluster_id <- factor(lab[m], levels = names(anno))

# cluster-sample cell-counts
table(sce$cluster_id, sce$sample_id)
                  
                   LC016 LC019 LC022 LC025 LC017 LC020 LC023 LC026
  Astrocytes         138   225   255    91    96   297   189   341
  Endothelial         51   102    64    53    43   171    77   192
  Microglia           20    31    47    35    30   111    34    98
  Oligodendrocytes   130   202   344   114    61   291   128   168
  OPC                 94    79   121    66    40    90   115   158
  CPE cells           20     9    42   104    66    96    71    22
  Excit. Neuron     2887  1455  1948  1874  1456  1998  2132  2952
  Inhib. Neuron      559   239   452   589   282   321   412   579
# normalize for visualization
sce <- normalize(sce)

Aggregation of single-cell to pseudo-bulk data

system.time(pb <- aggregateData(sce))
   user  system elapsed 
 10.484   0.296   9.020 
pb
class: SingleCellExperiment 
dim: 11063 8 
metadata(4): experiment_info log.exprs.offset agg_pars n_cells
assays(8): Astrocytes Endothelial ... Excit. Neuron Inhib. Neuron
rownames(11063): ENSMUSG00000051951.Xkr4 ENSMUSG00000089699.Gm1992 ...
  ENSMUSG00000095041.PISD ENSMUSG00000063897.DHRSX
rowData names(0):
colnames(8): LC016 LC019 ... LC023 LC026
colData names(6): group_id orig.ident ... total_features_by_counts_drop
  pct_counts_Mt_drop
reducedDimNames(0):
spikeNames(0):

Pseudobulk-level MDS plot

Prior to conducting any formal testing, we can compute a multi-dimensional scaling (MDS) plot of aggregated signal to explore overall sample similarities. Ideally, such a represenation of the data should separate both clusters and groups from one another. Vice versa, samples from the same cluster/group should fall close to each other.

In our MDS plot on pseudobulk counts (Fig. @ref(fig:pb-mds)), we can observe that cell-populations (clusters) are separated quite well, while WT and stimulated samples (groups) are separated to a lesser (if any) extend.

pbMDS(pb)
Pseudobulk-level MDS plot. Each point represent one cluster-sample instance;
points are colored by cluster ID and shaped by group ID.

Pseudobulk-level MDS plot. Each point represent one cluster-sample instance; points are colored by cluster ID and shaped by group ID.

Version Author Date
08bf260 HelenaLC 2019-06-17

DS analysis

system.time(res <- pbDS(pb, verbose = FALSE))
   user  system elapsed 
 28.040   0.648  28.713 
tbl <- bind_rows(res$table[[1]])
tbl <- filter(tbl, p_adj.loc < 0.05, abs(logFC) > 1)
tbl <- mutate_if(tbl, is.numeric, format, digits = 4)

Save SCE & results

saveRDS(sce, file.path("output", "MAGL-SCE.rds"))
saveRDS(res, file.path("output", "MAGL-DS_res.rds"))
write.csv(tbl, file.path("output", "MAGL-DS_res.csv"), row.names = FALSE)

sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS

Matrix products: default
BLAS:   /usr/local/R/R-3.6.0/lib/libRblas.so
LAPACK: /usr/local/R/R-3.6.0/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C               LC_TIME=en_CA.UTF-8       
 [4] LC_COLLATE=en_CA.UTF-8     LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8   
 [7] LC_PAPER=en_CA.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
 [1] stats4    parallel  grid      stats     graphics  grDevices utils     datasets  methods  
[10] base     

other attached packages:
 [1] scater_1.12.2               SingleCellExperiment_1.6.0  SummarizedExperiment_1.14.0
 [4] DelayedArray_0.10.0         BiocParallel_1.18.0         matrixStats_0.54.0         
 [7] Biobase_2.44.0              GenomicRanges_1.36.0        GenomeInfoDb_1.20.0        
[10] IRanges_2.18.1              S4Vectors_0.22.0            BiocGenerics_0.30.0        
[13] muscat_0.99.10              magrittr_1.5                limma_3.40.2               
[16] cowplot_0.9.4               ggplot2_3.2.0               ComplexHeatmap_2.0.0       
[19] BiocStyle_2.12.0           

loaded via a namespace (and not attached):
  [1] backports_1.1.4          circlize_0.4.6           Hmisc_4.2-0             
  [4] blme_1.0-4               workflowr_1.4.0          plyr_1.8.4              
  [7] igraph_1.2.4.1           lazyeval_0.2.2           TMB_1.7.15              
 [10] splines_3.6.0            listenv_0.7.0            digest_0.6.20           
 [13] foreach_1.4.4            htmltools_0.3.6          viridis_0.5.1           
 [16] gdata_2.18.0             lmerTest_3.1-0           checkmate_1.9.4         
 [19] memoise_1.1.0            cluster_2.1.0            doParallel_1.0.14       
 [22] globals_0.12.4           annotate_1.62.0          prettyunits_1.0.2       
 [25] colorspace_1.4-1         blob_1.2.0               xfun_0.8                
 [28] dplyr_0.8.3              crayon_1.3.4             RCurl_1.95-4.12         
 [31] genefilter_1.66.0        lme4_1.1-21              zeallot_0.1.0           
 [34] survival_2.44-1.1        iterators_1.0.10         glue_1.3.1              
 [37] gtable_0.3.0             zlibbioc_1.30.0          XVector_0.24.0          
 [40] GetoptLong_0.1.7         BiocSingular_1.0.0       future.apply_1.3.0      
 [43] shape_1.4.4              scales_1.0.0             DBI_1.0.0               
 [46] edgeR_3.26.5             Rcpp_1.0.1               viridisLite_0.3.0       
 [49] xtable_1.8-4             progress_1.2.2           htmlTable_1.13.1        
 [52] clue_0.3-57              dqrng_0.2.1              foreign_0.8-71          
 [55] bit_1.1-14               rsvd_1.0.1               Formula_1.2-3           
 [58] htmlwidgets_1.3          gplots_3.0.1.1           RColorBrewer_1.1-2      
 [61] acepack_1.4.1            pkgconfig_2.0.2          XML_3.98-1.20           
 [64] nnet_7.3-12              locfit_1.5-9.1           dynamicTreeCut_1.63-1   
 [67] tidyselect_0.2.5         rlang_0.4.0              reshape2_1.4.3          
 [70] AnnotationDbi_1.46.0     munsell_0.5.0            tools_3.6.0             
 [73] RSQLite_2.1.1            evaluate_0.14            stringr_1.4.0           
 [76] yaml_2.2.0               knitr_1.23               bit64_0.9-7             
 [79] fs_1.3.1                 caTools_1.17.1.2         purrr_0.3.2             
 [82] future_1.14.0            nlme_3.1-140             whisker_0.3-2           
 [85] scran_1.12.1             grr_0.9.5                pbkrtest_0.4-7          
 [88] compiler_3.6.0           rstudioapi_0.10          beeswarm_0.2.3          
 [91] png_0.1-7                variancePartition_1.14.0 statmod_1.4.32          
 [94] tibble_2.1.3             geneplotter_1.62.0       stringi_1.4.3           
 [97] highr_0.8                lattice_0.20-38          Matrix_1.2-17           
[100] nloptr_1.2.1             vctrs_0.2.0              pillar_1.4.2            
[103] BiocManager_1.30.4       GlobalOptions_0.1.0      BiocNeighbors_1.2.0     
[106] data.table_1.12.2        bitops_1.0-6             irlba_2.3.3             
[109] Matrix.utils_0.9.7       colorRamps_2.3           R6_2.4.0                
[112] latticeExtra_0.6-28      KernSmooth_2.23-15       gridExtra_2.3           
[115] vipor_0.4.5              codetools_0.2-16         gtools_3.8.1            
[118] boot_1.3-23              MASS_7.3-51.4            assertthat_0.2.1        
[121] DESeq2_1.24.0            rprojroot_1.3-2          rjson_0.2.20            
[124] withr_2.1.2              sctransform_0.2.0        GenomeInfoDbData_1.2.1  
[127] hms_0.5.0                rpart_4.1-15             glmmTMB_0.2.3           
[130] minqa_1.2.4              rmarkdown_1.13           DelayedMatrixStats_1.6.0
[133] git2r_0.26.1             numDeriv_2016.8-1.1      base64enc_0.1-3         
[136] ggbeeswarm_0.6.0