Screening of candidate genes at GLC3B and GLC3C loci in Chinese primary congenital glaucoma patients with targeted next generation sequencing

ABSTRACT
Background: Primary congenital glaucoma (PCG) is characterized by developmental abnormalities of the anterior chamber angle. Although several genes have been associated with PCG, pathogenic mutations could only be detected in about 20% of Chinese patients. The GLC3B (1p36.2–36.1) and GLC3C (14q24.3) loci were previously identified in PCG pedigrees via linkage analysis. However, no causative genes have been reported in these loci. This study was designed to search for novel PCG-related genes in these genetic regions.
Materials and methods: DNA samples from 100 PCG patients and 200 normal controls were pooled and sequenced using a customized panel of 133 positional candidate genes located around the GLC3B and GLC3C loci (±1Mb). PCG-related genes were prioritized by the distribution of variants between patients and controls. Confirmation of selected variants and co-segregation analysis were performed using Sanger sequencing.
Results: The patient and control groups contained 116 and 147 rare variants, respectively, after screening. Three genes (ZC2HC1C, VPS13D, and PGF) were prioritized according to the distribution of variants between the two groups. Rare variants of PGF were only identified in PCG patients.
Conclusions: To the best of our knowledge, this is the first study aimed at exploring novel PCG-related genes at the GLC3B and GLC3C loci. Our preliminary results suggest that there are potential associations between ZC2HC1C, VPS13D, PGF, and PCG. However, larger cohort studies and functional assays are required to provide further evidence for the proposed genotype-phenotype association.


Background
Primary congenital glaucoma (PCG) is the major cause of pediatric glaucoma (1) which occurs before the age of three without any overt ocular defects other than trabeculodysgenesis (2). Early symptoms may be non-specific including tearing, photophobia, and blepharospasm. Elevated intraocular pressure (IOP) gradually leads to the enlargement of eyes (buphthalmos), hazy cornea, Haab's striae, optic nerve cupping, and irreversible visual disability if left untreated (3).
This genetic heterogeneity, together with the fact that none of the known causative genes shows a high prevalence in the Chinese population, implies that other causal genes may remain to be discovered. Notably, GLC3B and GLC3C, the other two PCG-related loci, were mapped to 1p36.2-p36.1 and 14q24.3 via linkage analysis approximately two decades ago, yet no causative genes have been reported. Intriguingly, our previous work confirmed the existence of the GLC3C locus in PCG patients of Han origin based on significant transmission disequilibrium between the phenotype and the haplotype TAACG (rs2111701-rs4020123-rs4903696-rs11159318-rs177216) (18). Thus, it might be rewarding to screen these two chromosomal regions for novel candidate genes.
Next generation sequencing (NGS) enables simultaneous screening of multiple genes in a rapid and cost-effective manner. Targeted NGS (t-NGS), in particular, is well suited to uncovering novel genotype-phenotype correlations with improved sequencing coverage and precision (19). In this study, we developed a panel of 133 positional candidate genes for genomic DNA capture. These positional candidates were located within the GLC3B or GLC3C locus or the 1Mb flanking sequences. Here, we present the findings after applying this panel to 100 unrelated PCG patients and 200 normal controls of Chinese Han descent.

Subjects
One hundred PCG patients and 200 normal controls were recruited through the EENT biobank from June 2004 to October 2018. Participants in the control group underwent comprehensive ophthalmic examinations including detailed medical history taking, slit-lamp biomicroscopy, fundus examination under mydriasis, B-scan, Goldmann applanation tonometry, and standard automated perimetry to rule out major ocular disorders. All diagnoses of PCG were confirmed by glaucoma specialists under general anesthesia before surgery. Specifically, IOP was measured using a Tono-Pen tonometer (Mentor, Norwell, MA, USA), and anterior segment examination and corneal diameter measurement were performed under a surgical microscope. A gonioscope was used to observe the structure of the anterior chamber angle as well as the optic disc whenever possible. The diagnostic criteria were: (1) IOP >21 mmHg, (2) expansion of corneal diameter or corneal edema, (3) increased or asymmetric cup-to-disc ratio. Patients who experienced onset of the disease before age three (reported by parents) and met any two of the three criteria listed above were diagnosed as having PCG. Patients with other ocular or systemic disorders, such as megalocornea, Axenfeld-Rieger syndrome, Peters anomaly, aniridia, Sturge-Weber syndrome, or ocular or birth trauma, were excluded.
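The inclusion rule described above (onset before age three, at least two of the three criteria, no exclusion condition) can be sketched as a small function. This is an illustrative sketch only; the `Exam` record and its field names are hypothetical and were not part of the study's workflow.

```python
from dataclasses import dataclass


@dataclass
class Exam:
    """Hypothetical record of one patient's examination findings."""
    onset_age_years: float              # age at onset as reported by parents
    iop_mmhg: float                     # intraocular pressure
    corneal_enlargement_or_edema: bool  # criterion (2)
    abnormal_cup_to_disc: bool          # criterion (3): increased or asymmetric
    has_exclusion_condition: bool = False  # e.g. aniridia, birth trauma


def meets_pcg_criteria(exam: Exam) -> bool:
    """Return True if the exam satisfies the stated diagnostic rule."""
    # Excluded disorders or onset at/after age three rule out inclusion.
    if exam.has_exclusion_condition or exam.onset_age_years >= 3:
        return False
    criteria_met = sum([
        exam.iop_mmhg > 21,                 # criterion (1)
        exam.corneal_enlargement_or_edema,  # criterion (2)
        exam.abnormal_cup_to_disc,          # criterion (3)
    ])
    # Any two of the three criteria are sufficient.
    return criteria_met >= 2
```

For example, a six-month-old with an IOP of 25 mmHg and corneal enlargement would qualify, whereas elevated IOP alone would not.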

Sample preparation and targeted re-sequencing
A panel of 133 candidate genes (full list in Supplementary Table 1) located within the GLC3B or GLC3C locus and the 1Mb upstream and downstream sequences was customized to cover the coding regions and 20bp sequences flanking the splice sites. The boundaries of GLC3B were marked by D1S489 and D1S1176 (hg19, chr1: 12,048,051-15,210,056) and those of GLC3C were delineated by D14S61 and D14S1000 (hg19, chr14: 76,335,308-82,106,058). However, six protein-coding genes at the telomeric end of GLC3C were not covered in this panel, including DIO2, CEP128, TSHR, GTF2A1, STON2, and SEL1L. Genomic DNA was extracted from peripheral blood samples, amplified by multiplex PCR and mixed into pooled libraries (2 individuals/pool for patients, 4 individuals/pool for controls) before being sequenced on the Illumina HiSeq 2500 platform (Illumina, San Diego, CA, USA).
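As a rough illustration of the design above, the following sketch computes the ±1Mb windows around the stated hg19 locus boundaries and the allele fraction expected for a single heterozygous variant in a pool. It assumes diploid autosomes, equal DNA contribution from each individual, and fully filled pools; the function names are hypothetical. Note that the actual panel did not cover the whole GLC3C window, since six telomeric genes were omitted.

```python
FLANK_BP = 1_000_000  # 1Mb flanking sequence on each side

# Locus boundaries as stated in the text (hg19 coordinates).
LOCI = {
    "GLC3B": ("chr1", 12_048_051, 15_210_056),
    "GLC3C": ("chr14", 76_335_308, 82_106_058),
}


def capture_window(chrom, start, end, flank=FLANK_BP):
    """Extend a locus by `flank` bp on both sides (clamped at position 1)."""
    return chrom, max(1, start - flank), end + flank


def expected_het_allele_fraction(individuals_per_pool):
    """Fraction of reads expected to carry a variant that is heterozygous
    in exactly one pooled individual: 1 allele among 2 * n chromosomes."""
    return 1 / (2 * individuals_per_pool)


def n_pools(n_individuals, individuals_per_pool):
    """Number of pools needed (ceiling division)."""
    return -(-n_individuals // individuals_per_pool)


if __name__ == "__main__":
    for name, (chrom, start, end) in LOCI.items():
        print(name, capture_window(chrom, start, end))
    print("patients:", n_pools(100, 2), "pools, het fraction",
          expected_het_allele_fraction(2))
    print("controls:", n_pools(200, 4), "pools, het fraction",
          expected_het_allele_fraction(4))
```

Under these assumptions a singleton heterozygous variant would appear at an allele fraction of about 25% in a patient pool and 12.5% in a control pool, which is one reason pooled designs depend on high read depth.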

Bioinformatics analysis
Sequencing reads were aligned to the human reference genome assembly (hg19). Variants were called by Genome Analysis Toolkit (version 3.3) and annotated by ANNOVAR (20). To predict the pathogenicity of non-synonymous variants, assessments from 11 in silico tools were integrated including SIFT Additionally, the expression patterns of prioritized genes were accessed through the public resource (https://singlecell.broadinstitute.org/single_cell/study/SCP780) containing single-cell RNA sequencing data of human aqueous humor outflow pathways.

Prioritization of candidate genes
Variants complying with all of the following criteria were kept for further analysis: (1) exonic nonsynonymous variants and splice site variants (±5 bp from exons); (2) an allele frequency <1% (gnomAD_genome, gnomAD_exome, and ExAC_EAS); (3) not annotated as benign or likely benign by ClinVar and InterVar. Insertions and deletions were excluded because of concerns about calling accuracy. Genes with a significantly different distribution of variants between the two groups, or those with more than one variant found only in the PCG group, were reviewed for expression pattern in the aqueous outflow pathway and for gene function based on the published literature. Statistical analysis and data visualization were performed in R (version 3.6.1). P values were adjusted by false discovery rate (FDR) in multiple comparisons and the significance level was set to 0.05.
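The filtering and multiple-testing steps above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study (which relied on ANNOVAR annotations and R). The `Variant` record, its field names, and the annotation labels are hypothetical; treating a variant absent from a database as rare is an assumption; and the Benjamini-Hochberg procedure is assumed as the FDR method because the text does not name one.

```python
from dataclasses import dataclass

KEPT_EFFECTS = {"nonsynonymous", "splicing"}   # criterion (1), assumed labels
BENIGN_LABELS = {"benign", "likely benign"}    # criterion (3), assumed labels
MAX_AF = 0.01                                  # criterion (2): frequency < 1%


@dataclass
class Variant:
    """Hypothetical annotated variant record."""
    gene: str
    effect: str          # e.g. "nonsynonymous", "splicing", "synonymous"
    is_indel: bool
    allele_freqs: dict   # database name -> frequency, or None if absent
    clinvar: str = ""
    intervar: str = ""


def passes_filters(v: Variant) -> bool:
    """Apply the three stated criteria plus the indel exclusion."""
    if v.is_indel or v.effect not in KEPT_EFFECTS:
        return False
    # Assumption: a variant missing from a database counts as rare there.
    if any(af is not None and af >= MAX_AF for af in v.allele_freqs.values()):
        return False
    labels = {v.clinvar.lower(), v.intervar.lower()}
    return not (labels & BENIGN_LABELS)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Genes whose adjusted P value falls below 0.05 would then be carried forward for review of expression pattern and function.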

Validation by Sanger sequencing and co-segregation analysis
Potentially pathogenic variants in selected candidate genes detected by t-NGS were confirmed with PCR amplification and Sanger sequencing. Briefly, the amplified products were purified and sequenced by an automated sequencer (3730xl DNA Analyzer, Applied Biosystems). When available, DNA samples from parents were also sequenced to analyze cosegregation.

Clinical characteristics
The case group included 100 unrelated PCG patients, 37% of whom were female, and the median age of disease onset was 6 months (Table 1). A family history was found for one patient, consistent with previous findings (32). Notably, we only recruited patients with typical clinical signs and disease involvement in both eyes, because genetic factors are considered to contribute more strongly to the phenotype in such cases. In addition, a total of 200 healthy controls were recruited in this study, 82% of whom were female, and the mean age at sample collection was 58 ± 18.3 years.

Mutation profiles
An average read depth of 273.9x was achieved in this study, and 86.7% of the targeted regions had at least 50x coverage per base. On average, 858 single nucleotide variations were called per sample in the PCG group and 970 in controls. After prioritization, 230 variants (nonsense, missense, and splice site) with allele frequency <1% were retained. Of these, 152 belonged to 49 genes in the GLC3B locus, and the remaining 78 variants were detected in 28 genes around the GLC3C locus.

Discussion
In this study, 100 PCG patients and 200 normal controls were recruited to explore the possible existence of PCG-related genes around two previously reported loci (GLC3B and GLC3C). A total of 133 positional candidates were sequenced with targeted exome capture, and the results were validated by Sanger sequencing selectively.
Sequencing a large number of individuals in population-based genetic studies is still economically demanding despite the falling costs of massively parallel sequencing. One of the frequently adopted alternatives is pooled sequencing. Several research groups have applied this method to identify rare causal variants in ocular diseases such as retinoblastoma and inherited retinal dystrophies (39,40). In this study, 92% (265/289) of the variants subjected to Sanger sequencing were confirmed, suggesting that our sequencing and bioinformatics pipeline was reliable. However, sequencing errors introduced by sample pooling are unavoidable, and the results should be interpreted with caution (41).
Unlike in certain inbred populations in Eastern Europe or the Middle East, the molecular basis of PCG could only be elucidated for nearly 20% of all Chinese patients. Therefore, further investigation of novel genetic mechanisms is urgently required. Until now, no causative genes have been identified in the GLC3B and GLC3C loci. GLC3B was first mapped to the short arm of chromosome 1 (D1S1597/D1S489/D1S228-D1S1176/D1S507/D1S407) in 1996 using four pedigrees unlinked to the GLC3A locus (42). Six years later, a five-generation consanguineous Turkish family unlinked to both GLC3A and GLC3B was used to locate the third genetic locus for PCG (GLC3C, D14S61-D14S1000) (43). Our previous work identified a 22kb interval of risk haplotype within GLC3C in Chinese patients (18). Nevertheless, this region was not covered in our panel since no functional genes but one pseudogene (FXNP1) have been identified within the 100kb flanking sequence of the haplotype. Apart from our research, few studies have been conducted to uncover the association between these two loci and PCG. Sivadorai et al. reported the lack of a common haplotype motif in GLC3B and GLC3C in four CYP1B1-negative Gypsy patients, although the evidence was hardly conclusive (44). Lee et al. analyzed copy number variations (CNVs) across the whole exome in 20 Korean PCG trios and found that no CNVs were located in known disease loci or any predicted functional partners of the target genes (45). More recently, Alsaif et al. conducted whole exome sequencing in 34 CYP1B1-negative patients and identified neither mutations in other known disease-causing genes nor novel candidate genes in GLC3B and GLC3C (46).
Considering the lack of family history in most Chinese PCG cases, we adopted a case-control design in this study. Candidate genes were prioritized by the distribution of their variants in the PCG and control groups. Three genes (ZC2HC1C, VPS13D, and PGF) met the screening standards. The molecular function of ZC2HC1C is still unknown. Seven variants were detected by t-NGS. However, four of them were shared by patients and controls (Supplementary Table 2). In addition, ZC2HC1C is only weakly expressed in human iridocorneal angle structures (Supplementary Figure 1), undermining its possible involvement in the pathogenesis of PCG. VPS13D belongs to the yeast vacuolar protein sorting-associated protein 13 family. Two variants (p.R946H and p.T3687M) identified in patients were predicted to be pathogenic by more than half of the in silico tools. Mutations of VPS13D have already been linked to early-onset movement disorders (33). Therefore, the association between VPS13D and PCG requires further investigation (Supplementary Table 2). Interestingly, variants of PGF were only detected in the PCG group. PGF encodes placental growth factor, which is a member of the VEGF family (47). PGF initiates angiogenesis in both direct (binding with Vegfr1) and indirect (competing with Vegf for Vegfr1, allowing Vegf to bind with Vegfr2) manners (48). Intriguingly, the receptors for Pgf, including Vegfr1, Nrp-1, and Nrp-2, are also expressed in angle tissues (Supplementary Figure 1). Impaired cerebral vascular development and angiogenesis response towards pathologic stimuli were reported in Pgf−/− mice (49,50), underscoring its vital role in vascular homeostasis. On the other hand, studies have demonstrated that the development of Schlemm's canal (SC; now identified as a hybrid vessel with both vascular and lymphatic features) strikingly resembles the processes of angiogenesis and lymphangiogenesis (51,52). Additionally, some of the well-studied molecular cues orchestrating vascular and lymphatic formation (e.g. Ang1/2-Tie2, Vegfc-Vegfr3) are also indispensable in modulating the genesis of the SC (9,10,51). Recently, a single-cell RNA-seq analysis showed that the expression of Pgf was significantly higher in SC endothelial cells than in both lymphatic and vascular endothelial cells (53). Therefore, PGF could potentially regulate the development and maintenance of the SC, which warrants further investigation.
There are some limitations to the present research. The aim of the study was to explore whether there were PCG-related genes at the GLC3B and GLC3C loci; relevant genes may also exist in other genomic regions. It should also be noted that not all types of genetic variations were included in this study. For instance, we discarded insertions and deletions because of concerns about calling accuracy. Additionally, CNVs and deep intronic variants lie beyond the detection range of this customized panel.
To the best of our knowledge, this was the first attempt to screen for PCG-related genes around the GLC3B and GLC3C loci using targeted NGS. Our findings suggest a potential association of three genes with PCG (ZC2HC1C, VPS13D, and PGF). However, the genotype-phenotype correlation needs to be replicated in larger cohorts and supported by functional assays. A considerable proportion of PCG cases still have unknown genetic causes. More comprehensive investigations (e.g. whole genome sequencing) are required to shed new light on the molecular basis of PCG.

Conclusions
In sum, our customized NGS panel and prioritization strategy enabled the identification of three putative PCG-related genes (ZC2HC1C, VPS13D, and PGF). However, more population- and pedigree-based studies, as well as in vivo and in vitro studies, are needed to replicate and confirm the indicated genotype-phenotype associations.