Association of the IKZF1 5' UTR variant rs1456896 with lupus nephritis in a northern Han Chinese population.

Objectives: Polymorphisms of IKAROS family zinc finger 1 (IKZF1) have been found to be associated with systemic lupus erythematosus (SLE) by genome-wide association studies (GWAS). The aim of the current study was to investigate the association between IKZF1 functional variants and lupus nephritis (LN) in a northern Han Chinese population and to analyse their relationship with clinical and pathological phenotypes in LN. Method: The association between IKZF1 functional variants and LN was analysed for the lead variant rs1456896, which had both GWAS and expression quantitative trait loci (eQTL) top hits, in 500 LN patients and 500 healthy controls. Replication was conducted in an independent cohort comprising 798 LN patients and 704 healthy controls. Functional annotations and differential gene expression data were evaluated using the ENCODE (Encyclopedia of DNA Elements) databases. Results: A significant association between the single nucleotide polymorphism (SNP) rs1456896 and susceptibility to LN was observed in the two cohorts (p = 9.32 × 10−3 and p = 3.00 × 10−2) and was reinforced in the combined analysis (p = 1.36 × 10−3). In silico analysis indicated that rs1456896 is a regulatory variant, and lower mRNA expression of IKZF1 was observed in both peripheral blood mononuclear cells (PBMCs) and renal biopsies from SLE patients compared to normal controls. Although patients with the protective genotype AA of rs1456896 seemed to have milder clinical manifestations and a lower ratio of histological classes III and IV, no significant associations between rs1456896 genotypes and sub-phenotypes of LN were detected. Conclusions: Our results suggest that the rs1456896 A allele is associated with protection against LN. However, this association did not seem to extend to the disease severity or histopathological severity of LN in the current population.

Systemic lupus erythematosus (SLE) is a chronic autoimmune disease characterized by diverse clinical features and outcomes (1). The dysregulation of T-cell responses and B-cell activation and the formation of immune complexes are thought to play an important role in SLE damage to organs or tissues (2). Although the exact pathogenesis remains to be determined, numerous studies have suggested that SLE has a strong genetic component (3)(4)(5). Lupus nephritis (LN), a major disease manifestation of SLE, is an independent indicator of poor prognosis in SLE. It has been suggested that susceptibility to LN is the result of complex interactions between organ-specific reactions and polymorphic genetic factors involved in the regulation of immune responses, and that some of the genetic risk factors for SLE contribute to specific disease manifestations such as LN (2).
IKAROS family zinc finger 1 (IKZF1) encodes a zinc-finger DNA-binding protein that is involved in B- and T-lymphocyte differentiation (6)(7)(8). Using genetically engineered mice in which the endogenous Ikaros gene is disrupted, research has demonstrated that continued high-level expression of Ikaros is essential for homeostasis of peripheral lymphocytes and maintenance of B-cell tolerance (9). A hypothesis-free genome-wide association study (GWAS) conducted in a Han Chinese population (10) identified an association between single nucleotide polymorphisms (SNPs) of IKZF1 and SLE. A disease sub-phenotype analysis revealed a significant association between the IKZF1 variant and LN, suggesting a potential role of IKZF1 in the pathogenesis of LN (11). However, the genetic association between IKZF1 and LN remains to be confirmed (12,13). Thus, independent association studies are needed, and such endeavours should be centred on both the association signal and the functional signal. In the present study, we focused on a lead variant with both GWAS and expression quantitative trait loci (eQTL) top hits and its association with LN.

Study population
To identify the association of IKZF1 variants with LN, 500 LN patients (mean age 31.9 ± 11.2 years; 423 females) and 500 healthy blood donors (mean age 40.0 ± 8.6 years; 140 females), all of Han Chinese origin in Beijing, were enrolled from the renal division of Peking University First Hospital. For replication, a second population containing 798 LN patients (mean age 32.8 ± 12.2 years; 672 females) and 704 healthy blood donors (mean age 32.7 ± 8.9 years; 219 females) was enrolled. All the patients with LN met the American College of Rheumatology (ACR) revised criteria for SLE (14), and diagnoses were confirmed by renal biopsy using light microscopy, immunofluorescence, and electron microscopy.
The ethics review committee of Peking University First Hospital approved the study and all participants provided written informed consent.

SNP selection and genotyping
For SNP selection, a region spanning about 200 kb (chr7: 50304461-50492271, hg19) encompassing IKZF1 as the RefSeq gene and its regulatory regions was searched in the updated HaploReg database (HaploReg v4.1, www.broadinstitute.org/mammals/haploreg/haploreg.php). Among the 2041 SNPs within this region, 15 variants had NHGRI-EBI GWAS hits, 72 variants had GRASP QTL hits, and 13 variants had both NHGRI-EBI GWAS hits and GRASP QTL hits. Among these 13 SNPs, rs1456896 had the most GRASP QTL hits (321 hits). rs1456896 was previously reported to be the top signal associated with inflammatory bowel disease (IBD) by GWAS (15,16), suggesting that it may be a lead variant within the region with regard to both genetic association and functional significance. Thus, rs1456896 was selected and genotyped by TaqMan allele discrimination assays (Applied Biosystems, Foster City, CA, USA) to search for the association between IKZF1 variants and LN. A flow chart of the genetic analysis of IKZF1 functional variants is shown in Figure 1.

Bioinformatic analysis
The effect of rs1456896 on regulatory motifs was investigated using the HaploReg v4.1 database and was quantified as the difference between the logarithm of odds (LOD) scores, LOD(alternative) - LOD(reference). A negative score suggested a relatively higher affinity for the reference sequence, while a positive score indicated a relatively higher affinity for the alternative. DNA features and regulatory elements of the region that contains rs1456896 were identified using the RegulomeDB database (http://regulome.stanford.edu/), and the scoring standard for RegulomeDB is provided in Supplementary Table S1. In addition, regulatory annotations on rs1456896 were searched in the rSNPBase database (http://rsnp.psych.ac.cn/). To explore whether rs1456896 had an eQTL effect, the gene expression profiling of Epstein-Barr virus (EBV)-transformed lymphoblastoid cell lines of 160 unrelated Asian individuals (HapMap3) from the Gene Expression Variation project (GENEVAR project, www.sanger.ac.uk/humgen/genevar/) was used. In addition, differential expression of IKZF1 between SLE patients and healthy controls was analysed using large-scale genome-wide gene expression data sets from peripheral blood mononuclear cells (PBMCs) (E-GEOD-50772) and renal biopsies (E-GEOD-32591) in the ArrayExpress Archive database (www.ebi.ac.uk/arrayexpress/).
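The sign convention for the motif score can be illustrated with a minimal Python sketch. The function names and the two input LOD scores below are hypothetical and chosen only for illustration; they are not values taken from HaploReg.

```python
def lod_difference(lod_reference: float, lod_alternative: float) -> float:
    """Return LOD(alternative) - LOD(reference), the motif-change score."""
    return lod_alternative - lod_reference


def favoured_sequence(delta: float) -> str:
    """Interpret the sign of the score as described in the text."""
    if delta < 0:
        return "reference"    # motif matches the reference allele better
    if delta > 0:
        return "alternative"  # motif matches the alternative allele better
    return "neither"


# Hypothetical motif LOD scores for the reference and alternative alleles.
delta = lod_difference(lod_reference=12.0, lod_alternative=1.9)
print(round(delta, 1), favoured_sequence(delta))
```

A negative difference therefore means that the predicted binding motif fits the reference sequence better than the alternative sequence.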

Statistical analyses
The genotype frequency of rs1456896 was tested for Hardy-Weinberg equilibrium separately in cases and controls. Allele frequencies were compared between LN cases and controls by using the χ2 test, and logistic regression analysis adjusted for sex and age was used to control for age and sex bias. In addition, the association of different degrees of severity and outcomes of LN with different genotypes of rs1456896 was determined. The results of the measurements are expressed as mean ± standard deviation (sd), and the χ2 test or one-way analysis of variance (ANOVA) was used. Statistical analyses were performed using SPSS version 16.0 (SPSS Inc, Chicago, IL, USA). A two-tailed p value of less than 0.05 was considered statistically significant.
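The Hardy-Weinberg check and the allelic χ2 comparison described above can be sketched in Python as follows. This is an illustrative sketch rather than the SPSS procedure used in the study: it assumes a Pearson χ2 statistic without continuity correction and 1 degree of freedom for both tests, and the genotype counts are hypothetical, not the study data.

```python
import math


def chi2_p_value_1df(chi2: float) -> float:
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def hwe_test(n_aa: int, n_ag: int, n_gg: int):
    """Goodness-of-fit test of genotype counts against Hardy-Weinberg proportions."""
    n = n_aa + n_ag + n_gg
    p = (2 * n_aa + n_ag) / (2 * n)  # frequency of the A allele
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ag, n_gg)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, chi2_p_value_1df(chi2)


def allelic_test(cases, controls):
    """Compare A/G allele counts between cases and controls (2 x 2 table)."""
    a, b = 2 * cases[0] + cases[1], 2 * cases[2] + cases[1]              # A, G in cases
    c, d = 2 * controls[0] + controls[1], 2 * controls[2] + controls[1]  # A, G in controls
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    odds_ratio = (a * d) / (b * c)  # odds of carrying A in cases vs. controls
    return chi2, chi2_p_value_1df(chi2), odds_ratio


# Hypothetical (AA, AG, GG) genotype counts, for illustration only.
cases = (20, 150, 330)
controls = (35, 180, 285)

hwe_chi2, hwe_p = hwe_test(*controls)
chi2, p_value, odds_ratio = allelic_test(cases, controls)
print(f"HWE in controls: chi2 = {hwe_chi2:.2f}, p = {hwe_p:.3f}")
print(f"Allelic test: chi2 = {chi2:.2f}, p = {p_value:.4f}, OR = {odds_ratio:.2f}")
```

An odds ratio below 1 for the A allele corresponds to a lower A-allele frequency in cases than in controls, which is the pattern expected for a protective allele. Adjustment for sex and age would additionally require a logistic regression model, which is not shown here.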

Association of rs1456896 with LN
None of the genotypes in the controls or patients showed significant deviation from Hardy-Weinberg equilibrium. Logistic regression analyses adjusted for sex and age also suggested that the protective allele A of rs1456896 was significantly associated with LN compared to controls.
Furthermore, associations between the genotypes of rs1456896 and disease severity and outcomes, including age of onset, gender, proteinuria, estimated glomerular filtration rate (eGFR), serum creatinine level, C3 level, the SLE Disease Activity Index (SLEDAI) score, percentage of crescents, histological types, response to treatment, and development of end-stage renal disease (ESRD), were examined in 279 LN patients who were routinely followed up for at least 1 year, with a mean follow-up time of 56.58 months. Among these patients, response to treatment was defined as a urine protein level that decreased by 50% or was below 1 g/24 h after treatment, and the development of ESRD was defined as dialysis or death. As shown in Supplementary Table S2, the patients with the protective genotype AA of rs1456896 seemed to have later onset, a smaller male percentage, lower proteinuria levels, higher eGFR levels, lower serum creatinine levels, lower SLEDAI scores, a lower ratio of histological classes III and IV, and a higher treatment remission rate. However, no significant associations between rs1456896 genotypes and sub-phenotypes of LN were detected.

Regulatory effect prediction
The HaploReg v4.1 database predicted that the region containing rs1456896, in the 5ʹ untranslated region (UTR) of IKZF1, overlaps a binding-site motif for the transcription factor hepatocyte nuclear factor 4 (HNF4). The difference between the LOD scores for the A and G (reference) alleles was -10.1, predicting a lower affinity of HNF4 to the regulatory motif for the variant A (protective) allele relative to the G allele. We further located the set of SNPs in strong linkage disequilibrium (LD) (r2 ≥ 0.8) with rs1456896 in an Asian population. Among the 21 SNPs, six were located in regions with promoter histone marks, 11 were in regions with enhancer histone marks, 12 were in DNase regions, three were in protein binding regions, 17 were in motif changed regions, and four had NHGRI-EBI GWAS hits and/or GRASP QTL hits (Supplementary Table S3). In the RegulomeDB database, rs1456896 scored 4 (TF binding + DNase peak) and was predicted to be within regions of protein binding, chromatin structure, and histone modifications. In the rSNPBase database, an eQTL effect of rs1456896 was detected. Furthermore, five SNPs in strong LD (r2 > 0.8) with rs1456896 were regarded as regulatory SNPs (rSNPs): one variant was involved in proximal and distal transcriptional regulation, one variant was involved in distal transcriptional regulation and showed an eQTL effect, and the other three variants were involved in distal transcriptional regulation. The evidence above suggests that rs1456896 is located in a regulatory region.

Gene expression analysis
To validate the cis-eQTL effect of rs1456896, the gene expression data of an Asian population (n = 160) in the GENEVAR project was used. As shown in Supplementary Table S4, although the rs1456896 A allele seemed to be associated with a higher expression level of IKZF1, no significant association was identified. We also ascertained whether IKZF1 was expressed differently in SLE patients and healthy controls using the ArrayExpress Archive database. We observed that IKZF1 mRNA expression was significantly down-regulated in SLE PBMCs (mean ± sd normalized fluorescence intensity 67843.41 ± 1334.21 among 61 SLE patients vs. 9040.20 ± 773.33 among 20 normal donor controls, p = 2.85 × 10−4) and in tubulointerstitial samples from LN patients (4.17 ± 0.10 among 32 LN patients vs. 4.28 ± 0.14 among 15 pre-transplant living donor controls, p = 5.00 × 10−3) (Figure 2).
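As a rough illustration of how two groups reported as mean ± sd can be compared, the following Python sketch computes a Welch t statistic and its approximate degrees of freedom from summary statistics. The choice of Welch's test is an assumption made for this sketch; the test actually applied to the ArrayExpress data is not stated. The inputs are the tubulointerstitial values quoted above, and the function name is hypothetical.

```python
import math


def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1  # squared standard error of the mean, group 1
    v2 = sd2 ** 2 / n2  # squared standard error of the mean, group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Tubulointerstitial IKZF1 expression: LN patients vs. living donor controls.
t_stat, df = welch_t_from_summary(4.17, 0.10, 32, 4.28, 0.14, 15)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

A negative t statistic indicates lower mean expression in the first group (here, LN patients). Converting the statistic to a p value requires the t distribution, which is available in statistical packages but not in the Python standard library.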

Discussion
In the present study, the association between rs1456896 of IKZF1 and LN was detected and replicated in an independent population. Various regulatory effects of rs1456896 were predicted, and a significantly lower expression level of IKZF1 was detected in PBMCs and renal biopsy samples of SLE patients compared with controls, providing important indications for the role of IKZF1 in LN.
The results of recent genetic studies have largely supported the notion of shared genetics in immune-related diseases, including SLE (17)(18)(19). The polymorphisms of IKZF1 have been reported to be associated with several immune-related diseases, such as IBD (15,16), type 1 diabetes (T1DM) (20), childhood acute lymphoblastic leukaemia (C-ALL) (21), and SLE (10,12,13). IKZF1 encodes a transcription factor that belongs to the family of zinc-finger DNA-binding proteins associated with chromatin remodelling. It is reported to be involved in the pathogenesis of autoimmune diseases by regulating lymphocyte differentiation. SLE is a prototypic, systemic autoimmune disease that has a strong genetic component. Studies have suggested that Ikaros could regulate the transcription of STAT4, which was reported to be associated with SLE by several GWAS, by binding to its 5ʹ sequence (22,23). In addition, interferon (IFN) is thought to be closely associated with SLE, especially with regard to skin and renal involvement. In a mouse model, the low expression level of IKZF1 was attributed to the induced number of plasmacytoid dendritic cells (pDCs) and IFN production (24). Ikaros was found to be essential for DC activation and T-helper 1 (Th1) cell differentiation, and IKZF1 might play an important role in IFN-γ production by inhibiting T-bet, a Th1-specific transcription factor that regulates IFN production (25).
In the present study, a significant difference was detected for the protective allele (A allele) of the SNP rs1456896 located in the 5ʹ UTR of IKZF1. As SLE is widely regarded as a polygenic disease, the functional variant rs1456896 in IKZF1 is likely to play some role in SLE/LN but is probably not the only causal factor in the disease. This may explain why we did not observe significant differences in clinical findings, pathological features, or renal outcomes among our LN patients (n = 279) with different genotypes of rs1456896. In the in silico analysis, regulatory effects of rs1456896 were predicted. In addition, individuals with the rs1456896 AA genotype seemed to have a higher expression level of IKZF1 in an Asian population in the GENEVAR project, although no significant association was detected, which may be due to the small sample size and the low frequency of the rs1456896 A allele. Consistent with a previous report on Chinese SLE patients (26), lower IKZF1 expression levels were observed in PBMCs and tubulointerstitial samples of SLE patients by genome-wide gene expression analyses. These bioinformatics analyses may shed more light on the role of IKZF1 in the pathogenesis of LN. However, wider replication in larger populations and functional studies are needed in the future.
In summary, our study identified an association between the rs1456896 A allele and protection against LN, which may help to improve the understanding of the pathogenesis of the disease.