Enhancer information and related codes for a new gene-based analysis
It remains challenging to boost the statistical power of GWAS to identify more risk variants or loci that can account for “missing heritability”. Furthermore, since most identified variants are not in gene coding regions, a biological interpretation of their function is largely lacking. On the other hand, recent biotechnological advances have made it feasible to experimentally measure the three-dimensional organization of the genome, including enhancer-promoter interactions, at high resolution. Because enhancer-promoter interactions are known to play critical roles in regulating gene expression programs, such data have been used to link GWAS risk variants to their putative target genes, giving insight into the underlying biological mechanisms. However, their direct use in GWAS association testing has yet to be exploited.
Here we propose integrating enhancer-promoter interactions into GWAS association analysis to both boost statistical power and enhance interpretability. Through an application to two large schizophrenia (SCZ) GWAS summary datasets, we demonstrate that the proposed method can identify novel SCZ-associated genes and pathways that contain no significant SNPs.
To help other researchers reuse enhancer information and conduct follow-up or applied research, we provide the following information:
1. MCF7_allenhancer.rds: Enhancer information for each gene based on MCF7 data (Li et al., 2012);
2. Hippo_allenhancer.rds: Enhancer information for each gene based on computational predictions for the brain hippocampus region (Cao et al., 2017);
3. CP_allenhancer.rds: Enhancer information based on the cortical and subcortical plate (Won et al., 2016);
4. GZ_allenhancer.rds: Enhancer information based on the germinal zone (Won et al., 2016);
5. Enhancer_only.R: Source code for the enhancer-only method. To run it, change the working directory and prepare the necessary input files;
6. Enhancer_plus_gene_body.R: Source code for the enhancer-plus-gene-body method.
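
As a starting point for reusing the enhancer files listed above, the following is a minimal sketch of loading one of them in R. It assumes the file sits in the current working directory. The helper name `load_enhancers` is hypothetical, and the internal structure of the stored object is not described here, so the sketch only inspects it rather than assuming any gene or column names.

```r
# Hypothetical helper (not part of the provided scripts): read one of the
# enhancer .rds files, returning NULL with a message if the file is missing.
load_enhancers <- function(path) {
  if (!file.exists(path)) {
    message("File not found: ", path)
    return(NULL)
  }
  readRDS(path)
}

# Assumption: the file is in the current working directory.
enh <- load_enhancers("Hippo_allenhancer.rds")

if (!is.null(enh)) {
  # The layout of the object is not documented here, so look at its
  # top-level structure before relying on any element or column names.
  str(enh, max.level = 1)
}
```

The same call should work for the other three .rds files by changing the file name. Check the output of `str()` against what Enhancer_only.R and Enhancer_plus_gene_body.R expect before building on it.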