RNA-seq simulated data for differential transcript usage (DTU) analyses
Published by Simone Tiberi
The project contains the fastq files (gzipped) of simulated RNA-seq data for 12 paired-end samples (i.e., biological replicates); all reads are 101 base pairs long. The first subscript denotes the sample id (1 to 12), while the second subscript indicates which of the two paired-end reads (mates) the file contains (1 or 2). Samples belong to two groups: samples 1 to 6 constitute the first group, while samples 7 to 12 constitute the second group. Between the two groups, 1,000 genes exhibit differential transcript usage (DTU) and a further 1,000 genes (partially overlapping with the previous set) show differential gene expression (DGE). Genes showing DTU were simulated by randomly permuting the relative abundances of the four most expressed transcripts; for genes with only two or three transcripts, the relative abundances of all transcripts were permuted. The dataset is used to benchmark DTU methods in the manuscript entitled "BANDITS: Bayesian differential splicing accounting for sample-to-sample variability and mapping uncertainty". The code for simulating the data is available on GitHub at https://github.com/SimoneTiberi/BANDITS_manuscript .
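The permutation step used to induce DTU can be illustrated with a minimal Python sketch. This is not the simulation code from the repository above. The function name `permute_top_abundances`, its parameters, and the example proportions are hypothetical. Re-drawing the shuffle until the order actually changes is also an assumption, made here so that the example gene always shows a difference between groups.

```python
import random


def permute_top_abundances(proportions, n_top=4, rng=None):
    """Permute the n_top largest relative abundances among themselves.

    Hypothetical helper for illustration only. `proportions` holds the
    relative abundances of one gene's transcripts (summing to 1).
    """
    rng = rng or random.Random()
    # Transcript indices sorted by decreasing relative abundance.
    order = sorted(range(len(proportions)),
                   key=lambda i: proportions[i], reverse=True)
    # For genes with fewer than n_top transcripts, this selects all of them.
    top = order[:n_top]
    shuffled = top[:]
    if len(top) > 1:
        # Assumption: re-draw until the permutation is not the identity.
        while shuffled == top:
            rng.shuffle(shuffled)
    permuted = list(proportions)
    for src, dst in zip(top, shuffled):
        permuted[dst] = proportions[src]
    return permuted


# Made-up relative abundances for a gene with five transcripts.
group_a = [0.5, 0.2, 0.15, 0.1, 0.05]
group_b = permute_top_abundances(group_a, rng=random.Random(1))
print(group_a)
print(group_b)
```

Because only the positions of the values change, the permuted proportions still sum to 1, and the fifth transcript (outside the four most expressed) keeps its relative abundance.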