
scDynaBar: Using CRISPR barcoding as a molecular clock to capture dynamic processes at single-cell resolution

dataset
Posted on 2024-11-07, 07:50. Authored by Yolanda Andrés, Irene Hernando-Herraez, Wolf Reik, Stephen J. Clark, Christopher Todd, Celia Alda-Catalinas, Ioannis Kafetzopoulos

Abstract

scDynaBar is an innovative approach that combines CRISPR-Cas9 dynamic barcoding with single-cell sequencing to record temporal cellular events. Over a 4-week period, genetic barcodes accumulate mutations, which are then sequenced together with the transcriptome of each single cell. This enables the creation of a time-ordered record of cellular events, providing a unique perspective on biological dynamics.

In this study, we applied scDynaBar to track the transition from a pluripotent state to a two-cell (2C)-like state in mouse embryonic stem cells (mESCs). Our results demonstrate the transient nature of the 2C-like state. Additionally, we show consistent mutation rates across different cell types in a mouse gastruloid model, underscoring the robustness and versatility of the system across diverse biological contexts.

Overview

This repository contains data and scripts for analyzing single-cell RNA-seq and bulk RNA-seq experiments. The scripts process barcode sequences and their associated metadata, create Seurat objects, and generate the visualizations for the results presented in the associated paper.

For more information, code, and data updates, visit our GitHub repository: https://github.com/socyol/scDynaBar.git

Data/

  • barcode_sequences/ --> This directory contains CSV files with barcode sequence data.
  • metadatas/ --> This directory contains metadata CSV files. For single-cell experiments, they report the number of reads/alleles (coverage) for each cell and allele features such as mean diversity, mean length, and percentage of original sequences. For bulk experiments, the same information is provided for each sample, together with the system used (Cas9 or BE3) and the spacer used (one of 7).
  • seurat_objects/ --> This directory contains the Seurat objects created from the processed single-cell data, restricted to cells that passed quality control.
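
As an illustration of the per-cell allele features listed under metadatas/, the sketch below computes coverage, mean allele length, and percentage of original sequences from a toy barcode table. It is written in Python for brevity and is not the repository's R code. The column names `cell_id` and `sequence`, the `ORIGINAL` reference barcode, and the `summarize_alleles` helper are all assumptions made for the example; the real CSV layout may differ.

```python
import csv
import io
from collections import defaultdict

# Hypothetical unedited barcode; the real reference sequence is not given here.
ORIGINAL = "ACGTACGTAC"


def summarize_alleles(rows, original=ORIGINAL):
    """Return per-cell coverage, mean allele length and % original sequences."""
    by_cell = defaultdict(list)
    for row in rows:
        by_cell[row["cell_id"]].append(row["sequence"])

    summary = {}
    for cell, seqs in by_cell.items():
        n = len(seqs)
        summary[cell] = {
            "coverage": n,  # number of reads/alleles seen for this cell
            "mean_length": sum(len(s) for s in seqs) / n,
            "pct_original": 100.0 * sum(s == original for s in seqs) / n,
        }
    return summary


# Toy input standing in for one barcode_sequences CSV (assumed layout).
toy_csv = """cell_id,sequence
cell1,ACGTACGTAC
cell1,ACGTACGT
cell2,ACGTACGTAC
cell2,ACGTACGTAC
"""
rows = list(csv.DictReader(io.StringIO(toy_csv)))
summary = summarize_alleles(rows)
for cell, stats in sorted(summary.items()):
    print(cell, stats)
```

In this toy input, a cell whose barcodes are all unedited gets 100% original sequences, and that percentage drops as edits accumulate.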

scripts/

#### Single-cell experiment analysis

  • **`sc_1_SeuratObject_analysis.R`**: This script converts the filtered_feature_bc_matrix (the matrix deposited under the Gene Expression Omnibus (GEO) accession) into a Seurat object.
  • **`sc_2_Barcode_sequences_analysis.R`**: This script processes the barcode sequence data and merges it with the Seurat objects to create the metadata (located in the `metadatas` folder). This metadata is required for running the analyses, viewing results, and generating figures.
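
The merge step described above can be pictured as a join on the cell identifier. The following Python sketch is only an illustration under assumed field names (`cell_id`, `cluster`, `coverage`, `pct_original`) and a hypothetical `merge_by_cell` helper. The actual script works on Seurat objects in R and may handle unmatched cells differently.

```python
def merge_by_cell(cell_metadata, barcode_summary):
    """Left-join per-cell barcode statistics onto cell metadata.

    Cells without any recovered barcode keep their metadata and receive
    a coverage of 0 and no allele statistics (an assumption of this sketch).
    """
    merged = []
    for cell in cell_metadata:
        row = dict(cell)  # copy so the input records are not modified
        stats = barcode_summary.get(cell["cell_id"])
        if stats is None:
            row.update({"coverage": 0, "pct_original": None})
        else:
            row.update(stats)
        merged.append(row)
    return merged


# Toy stand-ins: cells that passed QC, and barcode statistics per cell.
cells = [
    {"cell_id": "cell1", "cluster": "pluripotent"},
    {"cell_id": "cell2", "cluster": "2C-like"},
    {"cell_id": "cell3", "cluster": "pluripotent"},
]
barcodes = {
    "cell1": {"coverage": 12, "pct_original": 75.0},
    "cell2": {"coverage": 8, "pct_original": 25.0},
}

metadata = merge_by_cell(cells, barcodes)
for row in metadata:
    print(row)
```

A left join keeps every quality-controlled cell in the table, so cells lacking barcode reads stay visible rather than being dropped silently.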



#### Bulk analysis

  • **`bulk_1_analysis.R`**: First step of the bulk analysis. This script organizes the FASTQ data into separate folders so that the subsequent steps can process each file independently.
  • **`bulk_2_analysis.sh`**: Second step of the bulk analysis. This Bash script creates one job per data file, and each job executes the code in `bulk_3_analysis.R`. Running the jobs in parallel speeds up processing. At the end of this step, an output file similar to the barcode sequence files is generated for the bulk data, summarizing the processing.
  • **`bulk_3_analysis.R`**: Third and final step of the bulk analysis. This R script analyzes the previously organized data, including the computations and data filtering.
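
The one-job-per-file pattern of the bulk steps can be sketched as follows. This is a dry-run illustration in Python, not the repository's Bash script. The `build_jobs` helper, the `*.fastq.gz` file pattern, and the idea that `bulk_3_analysis.R` takes the FASTQ path as its only argument are all assumptions; the real script may submit jobs to a scheduler with a different interface.

```python
import tempfile
from pathlib import Path


def build_jobs(data_dir, script="bulk_3_analysis.R"):
    """Return one command (as an argument list) per FASTQ file under data_dir."""
    fastqs = sorted(Path(data_dir).rglob("*.fastq.gz"))
    # Assumed invocation: Rscript <script> <fastq path>
    return [["Rscript", script, str(f)] for f in fastqs]


# Build a toy folder layout: one sub-folder per sample, one FASTQ each.
with tempfile.TemporaryDirectory() as tmp:
    for sample in ("sampleA", "sampleB"):
        sample_dir = Path(tmp) / sample
        sample_dir.mkdir()
        (sample_dir / f"{sample}.fastq.gz").touch()

    jobs = build_jobs(tmp)
    job_names = [Path(cmd[2]).name for cmd in jobs]
    for cmd in jobs:
        # Dry run: print the command instead of executing it.
        print(" ".join(cmd))
```

Because each command depends on a single input file, the jobs are independent and can run concurrently.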



#### Plots

  • **`plots_1-bulk.R`**: Code for the visualizations of the bulk data.
  • **`plots_2-timecourse.R`**: Code for the visualizations of the time-course analyses.
  • **`plots_3-zscan4.R`**: Code for the visualizations of the Zscan4 experiment.
  • **`plots_4-gastruloids.R`**: Code for the visualizations of the gastruloid data.


- **`settings.R`**: This script includes all necessary libraries and custom functions created for this project.

