Allen Brain Atlas: Mouse gene expression data

2017-10-06T13:34:49Z (GMT) by Ben Fulcher
These data were downloaded using Python and the AllenSDK, and then imported and processed in Matlab. The dataset is a newer version of the one analyzed in our paper: Fulcher, B. D. & Fornito, A. A transcriptional signature of hub connectivity in the mouse connectome. Proc. Natl. Acad. Sci. USA 113, 1435 (2016).

The data file, AllenGeneDataset.mat, contains gene expression data from 25469 section datasets, across 19419 genes, for 213 structures in the Allen Brain Atlas.
The data are formatted in a Matlab structure with four components:

GeneExpData contains the fields 'energy' and 'density' (the expression energy and expression density of each section dataset), and 'gene_energy' and 'gene_density' (the expression energy and density of each gene, after averaging across repeat section datasets).

sectionDatasetInfo is a table that contains information about each section dataset: the entrez_id of the gene, the plane_of_section_id of the experiment (1 for coronal, 2 for sagittal), and the section_id. Each row labels a column of the 'energy' and 'density' matrices of GeneExpData.

geneInfo contains information about each gene: its acronym, entrez_id, gene_id, and name. Each row labels a column of the 'gene_energy' and 'gene_density' matrices of GeneExpData.

structInfo contains information about the 213 structures in the mesoscale mouse connectome reported by Oh, S. W. et al. A mesoscale connectome of the mouse brain. Nature 508, 207 (2014). Each structure is labeled with its acronym, color_hex_triplet, id, name, and divisionLabel. Each row in this table corresponds to a row of the matrices in GeneExpData.
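
To illustrate how the gene-level matrices relate to the section-dataset matrices, here is a minimal Python sketch of averaging across repeat section datasets. It is not the code used to build this dataset. The function name average_repeats, the toy values, and the choice to ignore missing (NaN) entries when averaging are all assumptions made for illustration. The layout follows the description above: rows are structures, columns are section datasets, and each column carries the entrez_id of its gene.

```python
import numpy as np


def average_repeats(energy, entrez_ids):
    """Average columns that share an entrez_id (hypothetical helper).

    energy     : (n_structures, n_section_datasets) array
    entrez_ids : length n_section_datasets, one gene id per column
    Returns (unique_ids, gene_energy), where gene_energy has one
    column per unique gene id, in sorted id order.
    """
    energy = np.asarray(energy, dtype=float)
    entrez_ids = np.asarray(entrez_ids)
    if energy.shape[1] != entrez_ids.shape[0]:
        raise ValueError("need one entrez_id per column of energy")

    unique_ids = np.unique(entrez_ids)  # sorted unique gene ids
    n_struct = energy.shape[0]
    gene_energy = np.full((n_struct, unique_ids.size), np.nan)

    for j, gene_id in enumerate(unique_ids):
        cols = energy[:, entrez_ids == gene_id]
        valid = ~np.isnan(cols)
        counts = valid.sum(axis=1)
        sums = np.where(valid, cols, 0.0).sum(axis=1)
        # Assumption: NaN entries are skipped; a structure with no
        # valid repeats for this gene stays NaN.
        gene_energy[:, j] = np.divide(
            sums, counts, out=np.full(n_struct, np.nan), where=counts > 0
        )
    return unique_ids, gene_energy


# Toy data: 3 structures, 5 section datasets covering 3 genes.
toy_energy = np.array([
    [1.0, 2.0, 3.0, 4.0, 6.0],
    [np.nan, 1.0, 5.0, 2.0, 3.0],
    [0.0, np.nan, 2.0, 1.0, np.nan],
])
toy_entrez_ids = np.array([11, 22, 11, 33, 22])

unique_ids, gene_energy = average_repeats(toy_energy, toy_entrez_ids)
print(unique_ids)
print(gene_energy)
```

The same function applies unchanged to a 'density' matrix. To restrict the average to one plane of section, the columns could first be filtered on plane_of_section_id before calling it.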

Note that ALL DATA in this repository were retrieved directly from the Allen Institute's API. If these data help you, then please acknowledge the amazing work and open science policies of the Allen Institute. For specific details of attribution, please refer to the Allen Institute's citation policy (link below).

Please contact me if you'd like any more information about how these data were put together, for example, if you'd like the raw files from the AllenSDK that haven't been processed in Matlab, or if you need more detail about how the data were retrieved with the AllenSDK and processed in Matlab. I plan to make the code framework available on GitHub when I find the time.