matlab files for Tabula Microcebus

dataset

posted on 2021-12-14, 15:06, authored by Angela Pisco, Camille Ezran, Shixuan Liu, The Tabula Microcebus Consortium

Instructions to read the h5ad file in Matlab: A mat file of the complete lemur cell atlas dataset, converted from the h5ad file, is provided in the Figshare files. We also provide a Matlab script that converts an h5ad file to a mat file: download the h5ad file of interest, the Matlab script “LCA_h5ad2Mat.m”, and the Matlab function “read_csmatrix.m” to the same folder, then run “LCA_h5ad2Mat.m”.
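For readers curious about what such a conversion involves, here is a minimal sketch of turning a compressed sparse row (CSR) matrix into a Matlab sparse matrix. h5ad files written by AnnData commonly store count matrices as three arrays (data, indices, indptr). Whether this dataset uses CSR or the column-oriented variant is an assumption here, and this is not the code of “read_csmatrix.m”. The toy arrays and variable names are hypothetical.

```matlab
% Toy CSR arrays for a 3-by-4 matrix (values invented for illustration).
data    = [10 5 2 8 1];     % nonzero values, stored row by row
indices = [0 2 2 3 1];      % 0-based column index of each value
indptr  = [0 2 4 5];        % row r owns data(indptr(r)+1 : indptr(r+1))
nCols   = 4;                % the column count must be known separately

nRows  = numel(indptr) - 1;
rowLen = diff(indptr);                      % number of nonzeros in each row
rows   = repelem((1:nRows)', rowLen(:));    % 1-based row index of each value

% Matlab indices are 1-based, so shift the column indices by one.
M = sparse(rows, double(indices(:)) + 1, double(data(:)), nRows, nCols);

disp(full(M));
```

For a column-compressed layout the same steps apply with the roles of rows and columns swapped.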

The mat file contains a single variable named “rawData”, a Matlab structure variable with the following fields:

cells: a table of the sequenced cells, with metadata for each cell. The table variables include the “/obs” and “/obsm” entries of the h5ad file (e.g., cell_name, tissue, free_annotation_v1, and X_umap), but not the MHC counts, which are stored in tabMHC (see below).
genes: a table of the genes, with the following variables:

name: NCBI gene symbol.
highly_variable: whether the gene is highly variable (calculated for the entire dataset).

mat_raw: a sparse cell-by-gene matrix of raw transcript counts.
mat_X: a sparse cell-by-gene matrix of transcript levels after library-size normalization and natural-log transformation (i.e., smartseq2: ln(reads/N *1e4 +1); 10x: ln(UMI/N *1e4 +1), where N denotes the total number of reads or UMIs of the cell).
tabMHC: a table of the calculated counts for the major histocompatibility complex (MHC) genes (see the Tabula Microcebus manuscript for details). Counts are available only for cells sequenced by the 10x method; they are NaN for cells sequenced by the smartseq2 method. Both raw counts and normalized counts (labeled with the prefix letter ‘n’) are provided.

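MHC_C_I, MHC_NC_I, MHC_all_II: sums of raw counts from the classical Class I, non-classical Class I, and all Class II genes, respectively.
nMHC_C_I, nMHC_NC_I, nMHC_all_II: sums of normalized counts from the classical Class I, non-classical Class I, and all Class II genes, respectively.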
counts and normalized counts from individual classical Class I genes (Mimu_168, Mimu_W03, Mimu_W04, Mimu_249, nMimu_168, nMimu_W03, nMimu_W04, nMimu_249), non-classical Class I genes (Mimu_180ps, Mimu_191, Mimu_202, Mimu_208, Mimu_218, Mimu_229ps, Mimu_239ps, nMimu_180ps, nMimu_191, nMimu_202, nMimu_208, nMimu_218, nMimu_229ps, nMimu_239ps), and Class II genes (Mimu_DMA, Mimu_DMB, Mimu_DPA, Mimu_DPB, Mimu_DQA, Mimu_DQB, Mimu_DRA, Mimu_DRB, nMimu_DMA, nMimu_DMB, nMimu_DPA, nMimu_DPB, nMimu_DQA, nMimu_DQB, nMimu_DRA, nMimu_DRB).
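
As a rough illustration of the mat_X formula and the missing-value convention in tabMHC, the sketch below builds a small toy stand-in for rawData (all numbers are invented) and works through both. It assumes that N is the per-cell row sum of mat_raw; the actual pipeline may have computed N differently, so the result is not guaranteed to reproduce the provided mat_X exactly.

```matlab
% Toy stand-in for rawData; every value below is made up for illustration.
rawData.mat_raw = sparse([10 0 5 0; 0 0 2 8; 1 1 1 1]);   % 3 cells x 4 genes
rawData.tabMHC  = table([3; NaN; 7], [1; NaN; 0], ...
    'VariableNames', {'MHC_C_I', 'MHC_NC_I'});   % row 2 mimics a smartseq2 cell

% Library-size normalization followed by ln(x + 1), as described for mat_X.
% Assumption: N is the total count of each cell, taken as the row sum.
N      = full(sum(rawData.mat_raw, 2));
nCells = numel(N);
scaled = spdiags(1e4 ./ N, 0, nCells, nCells) * rawData.mat_raw;
mat_X  = spfun(@log1p, scaled);   % acts on nonzeros only, so the result stays sparse

% MHC counts exist only for 10x cells; other cells carry missing values.
hasMHC        = ~isnan(rawData.tabMHC.MHC_C_I);
meanClassical = mean(rawData.tabMHC.MHC_C_I, 'omitnan');

fprintf('%d of %d cells have MHC counts\n', nnz(hasMHC), nCells);
```

A cell with zero total counts would yield a division by zero in this sketch, so such rows would need to be filtered out first.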