Linking endotypes to omics profiles in difficult-to-control asthma using the diagnostic Chinese medicine syndrome differentiation algorithm

Abstract Objective: Patients with difficult-to-control asthma have difficulty breathing almost all of the time, even leading to life-threatening asthma attacks. However, only a few diagnostic markers for this disease have been identified. We aimed to take advantage of unique Chinese medicine theories for phenotypic classification and to explore molecular signatures in difficult-to-control asthma. Methods: The Chinese medicine syndrome differentiation algorithm (CMSDA) is a syndrome-scoring classification method based on the Chinese medicine overall observation theory. Patients with difficult-to-control asthma were classified into Cold- and Hot-pattern groups according to the CMSDA. DNA methylation and metabolomic profiles were obtained using the Infinium Human Methylation 450 BeadChip and gas chromatography-mass spectrometry, respectively. Subsequently, an integrated bioinformatics analysis was performed to compare the two patterns and identify Cold/Hot-associated candidates, followed by functional validation studies. Results: A total of 20 patients with difficult-to-control asthma were enrolled in the study. Ten were grouped as Cold and 10 as Hot according to the CMSDA. We identified distinct whole-genome DNA methylation and metabolomic profiles between the Cold- and Hot-pattern groups. The ALDH3A1 gene exhibited variations at the DNA methylation probe cg10791966, while two metabolic pathways were associated with the two patterns. Conclusions: Our study introduced a novel diagnostic classification approach, the CMSDA, for difficult-to-control asthma. This is an alternative way to categorize diverse syndromes and link endotypes with omics profiles of this disease. ALDH3A1 might be a potential biomarker for precision diagnosis of difficult-to-control asthma.


Introduction
Asthma is a complex inflammatory disease due to its phenotypic complexity, genetic heterogeneity, and environmental conditions, as well as the mutual interaction of these factors, especially in the case of difficult-to-control asthma [1]. Difficult-to-control asthma is asthma that remains uncontrolled despite treatment with high-dose inhaled glucocorticoids or other controllers [2]. Currently available treatments for this disease are still not satisfactory. Previous genome-wide association studies (GWAS) have highlighted several cellular and molecular endotypes of asthma driven by significant alterations in single nucleotide polymorphisms, copy number variations, and mRNA expression of key pathogenic players [3], such as ORMDL3, GSDMB, interleukin 33, and HLA-DR/DQ [4][5][6][7]. However, only a few effective biomarkers for diagnosis and treatment of difficult-to-control asthma have been identified in early GWAS.
DNA hypermethylation in CpG islands has been associated with smoking and pollution, which are considered as risk factors for respiratory diseases [8][9][10][11]. Compared to transcriptional and translational changes, metabolites are more intuitive and dynamic to time-sensitively reflect changes in biochemical effects induced by diseases or extrinsic interventions [12][13][14][15]. Therefore, DNA methylation and metabolomic profiles might influence phenotypic transmission and thereby modulate the development of asthma [2,16,17]. Moreover, studies on linking phenotypic heterogeneity to whole-genome DNA methylation and metabolomic profiles would be helpful to understand the intrinsic mechanisms and identify diagnostic biomarkers for difficult-to-control asthma.
To date, there are various approaches for the clinical classification of asthma, including clinical symptoms, lung function, exacerbations, and the need for rescue medications [1,3,14]. Thus, a simplified, precise, and effective classification for abundant phenotypes would be useful to identify novel diagnostic markers and guide effective therapy for difficult-to-control asthma. Until now, few studies have taken advantage of the unique Chinese medicine theories for phenotypic classification, in order to explore molecular signatures in patients with difficult-to-control asthma. In the present study, we introduced a diagnostic classification approach, namely the Chinese medicine syndrome differentiation algorithm (CMSDA), to categorize the comprehensive phenotypic natures of difficult-to-control asthma [18].
The CMSDA is a syndrome scoring method based on the Chinese medicine overall observation theory. This theory is a holistic approach defined as the observation of both the body's interior and exterior, reflecting the rationale of Yin-Yang body balance and environmental factor influence, as well as their interaction [18][19][20]. Accordingly, all internal organs, the human body, superficial orifices, and the environment are unified. In contrast, exterior characteristic symptoms, including patient-reported symptoms, sputum or skin color, tongue, and pulse, reflect changes in interior organs [21,22]. Based on the overall observation theory, the CMSDA has been used for distinguishing many complex diseases and has a very long history of more than 5000 years, which might be considered a clinical observational study of diverse diseases over a 5000-year period, using a very large sample size. Furthermore, previous studies have demonstrated that asthmatic patients classified by using the CMSDA are treated differently and show effective responses [17,21,23,24]. Therefore, this diagnostic classification approach is a simple method to distinguish subtypes of difficult-to-control asthma.
In the present study, we introduced this novel diagnostic classification approach, CMSDA, for difficult-to-control asthma. We hypothesized that there would be apparent differences between Cold and Hot patterns in the pathobiological characteristics that lead to distinct clinical phenotypes and epigenomic and metabolomic profiles, and that the respective patient sub-populations would need to be treated differently. This classification approach might provide new insights into the clinical implementation of phenotypic and endotypic profiles, as the first step toward precision diagnosis of difficult-to-control asthma.

Study design and omics data information
The study design is described in detail in Figure 1. Briefly, 90 subjects were recruited, of which 85 were diagnosed with asthma according to the GINA guidelines [2,25]. Among those, we selected 20 patients with difficult-to-control asthma who met the requirements and further classified them into Cold- and Hot-pattern groups (10 patients per group; see details below) according to the definitions of the CMSDA (Supplementary Table S1A). Six of those patients (three per group) were used to compare genomic DNA methylation profiles. We further investigated the metabolomic profiles of the 20 selected patients and performed an integrated bioinformatics analysis and functional validation. Details on patient selection, sample collection, and omics data analysis are described in the Supplementary file.
All participants provided written informed consent for their participation in the study, in accordance with the rules of the local ethics committee and protocol. The research was approved by the Institutional Review Board of Guang'anmen Hospital (Registration No.: 2014EC088-01, Registered 21 March 2014). No organs/tissues were procured from prisoners in our study.

Pyrosequencing analysis
The different CpG sites were selected for pyrosequencing validation of the Infinium Human Methylation450 BeadChip data. The EpiTect Bisulfite Kit (Qiagen) was used for bisulfite conversion of human islet DNA, which was then purified and recovered. Bisulfite-converted DNA was amplified with the PyroMark PCR kit (Qiagen) with the following thermal profile: 95 °C for 3 min, 40 cycles of 94 °C for 30 s, 52 °C for 30 s, and 72 °C for 1 min, followed by a final extension at 72 °C for 7 min. Primers were designed using the PyroMark Assay Design 2.0 (Supplementary Table S2A). Pyrosequencing was performed with the PyroMark Q96 ID (Qiagen), and data were analyzed with the Pyro Q-CpG software.
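As a quick check on the thermal profile described above, the following minimal sketch adds up the programmed hold times. The constant and function names are illustrative only, and instrument ramping time between temperatures is not included.

```python
# Hold times taken from the thermal profile in the text; names are hypothetical.
INITIAL_DENATURATION_S = 3 * 60          # 95 °C for 3 min
CYCLES = 40
CYCLE_STEPS_S = {
    "denature_94C": 30,                  # 94 °C for 30 s
    "anneal_52C": 30,                    # 52 °C for 30 s
    "extend_72C": 60,                    # 72 °C for 1 min
}
FINAL_EXTENSION_S = 7 * 60               # 72 °C for 7 min


def total_hold_minutes():
    """Sum all programmed hold times (ramping excluded) and return minutes."""
    cycling = CYCLES * sum(CYCLE_STEPS_S.values())
    return (INITIAL_DENATURATION_S + cycling + FINAL_EXTENSION_S) / 60


print(total_hold_minutes())  # 90.0
```

Under these assumptions the program holds for 90 min in total, of which 80 min is cycling.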

Real-time quantitative reverse transcription polymerase chain reaction (QRT-PCR)
Total RNA from peripheral blood was isolated using Trizol Reagent (Ambion by Life Technologies), purified with the NucleoSpin RNA Clean-up kit (MACHEREY-NAGEL), and used for cDNA synthesis and real-time QRT-PCR using the Transcriptor First Strand cDNA Synthesis Kit (Roche) and the SYBR® Premix Ex Taq™ II kit (TaKaRa), respectively. All experiments were performed in triplicate with the glyceraldehyde 3-phosphate dehydrogenase (GAPDH) gene as an internal control. The 2^−ΔΔCt method was used for comparative quantification [26]. The primers used in this process are listed in Supplementary Table S2B.
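The comparative quantification step can be illustrated with a minimal sketch. It assumes the usual formulation in which ΔCt = Ct(target) − Ct(GAPDH) and ΔΔCt = ΔCt(sample) − ΔCt(calibrator), with triplicate Ct values averaged first. The function name and the Ct values are hypothetical and are not data from this study.

```python
from statistics import mean


def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Return the fold change 2^(-ddCt) from replicate Ct values.

    ct_target / ct_reference: target and reference-gene Ct values of the sample.
    ct_target_cal / ct_reference_cal: the same for the calibrator sample.
    """
    d_ct_sample = mean(ct_target) - mean(ct_reference)
    d_ct_calibrator = mean(ct_target_cal) - mean(ct_reference_cal)
    dd_ct = d_ct_sample - d_ct_calibrator
    return 2 ** (-dd_ct)


# Hypothetical triplicates: the target crosses threshold 2 cycles earlier
# in the sample than in the calibrator, at equal reference-gene Ct.
fold = relative_expression(
    ct_target=[24.0, 24.0, 24.0],
    ct_reference=[18.0, 18.0, 18.0],
    ct_target_cal=[26.0, 26.0, 26.0],
    ct_reference_cal=[18.0, 18.0, 18.0],
)
print(fold)  # 4.0
```

A ΔΔCt of −2 corresponds to a fourfold higher relative expression, which is why small Ct shifts translate into large fold changes.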

Statistical analyses
Statistical analyses for DNA methylation and metabolomic data were conducted as described in Supplemental Methods. Other results from QRT-PCR and pyrosequencing assays were expressed as the mean ± standard deviation. Statistical significance was assessed using ANOVA followed by Student's t test (p < 0.05).
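For orientation, a minimal sketch of the summary statistics and the two-sample Student's t statistic is given below. It assumes equal variances (pooled estimate) and uses only the Python standard library, so it returns the t statistic and degrees of freedom rather than a p value. The function names and input values are hypothetical, and the software actually used for the analysis is not stated here.

```python
from math import sqrt
from statistics import mean, stdev


def summarize(values):
    """Return (mean, sample standard deviation) for reporting as mean ± SD."""
    return mean(values), stdev(values)


def student_t(group_a, group_b):
    """Two-sample Student's t statistic with pooled variance, plus df."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, sd_a = summarize(group_a)
    mean_b, sd_b = summarize(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    t_stat = (mean_a - mean_b) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t_stat, df


# Hypothetical measurements for two groups (not study data).
t_stat, df = student_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

The resulting t statistic would then be compared against the t distribution with `df` degrees of freedom to obtain the p value.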

Results
Characteristics of patients with difficult-to-control asthma and study design
As summarized in Figure 1, among the 85 patients with asthma who met the definition of the GINA guidelines, 20 were diagnosed with difficult-to-control asthma and further classified into groups of Cold and Hot patterns using the CMSDA (Supplementary Table S1A), according to previous reports [2,25]. We randomly selected three patients from each group to compare DNA methylation profiles, and 10 from each group to obtain the metabolomic profiles. Table 1 shows the demographic characteristics of the 20 patients, as well as information on their clinical treatment. The majority of the patients were women in both groups, and the age ranged from 38 to 65 years. The baseline demographics and clinical characteristics were well-matched between groups (Table 1). The recruited subjects had inadequately controlled asthma (see Supplemental Methods). After we collected blood samples from the patients to generate omics profiles, we treated patients with either Xiaoqinglong (XQLD) decoction for Cold symptoms or Ding Chuan Tang (DCT) for Hot symptoms, in addition to routine therapeutic agents for difficult-to-control asthma (Table 1). Both treatments improved asthma-related symptoms, including coughing, wheezing, shortness of breath, and chest tightness. The cold- and heat-related symptoms of patients, such as feeling cold/heat and the color of the sputum and urine, also improved, returning to the state of Yin-Yang balance.
Figure 1. Workflow of the study. We recruited 90 subjects from an Asthma outpatient clinic and confirmed 85 patients with asthma who met the definition of the GINA guidelines. Among the 85 asthmatic patients, 20 patients were diagnosed with difficult-to-control asthma, according to previous studies, and were further classified as having Cold and Hot patterns according to the CMSDA. Subsequently, three patients with Cold and three with Hot patterns were selected to compare DNA methylation profiles using Infinium Human Methylation 450 BeadChips, followed by KEGG analysis for differentially methylated genes and associated pathways. Next, blood samples from the 20 patients with difficult-to-control asthma were collected to obtain metabolomic profiles and distinguish different molecular metabolites using the Agilent 7890A-5975C GC-MS system. Finally, an integrative analysis was performed to identify diagnostic biomarkers, followed by a functional validation analysis.

Differences in DNA methylation profiles between cold and hot patterns
We utilized the Human Methylation 450 K BeadChip to obtain the genome-wide DNA methylation profiles of six patients with difficult-to-control asthma, three from the Cold- and three from the Hot-pattern group. A total of 3098 differentially methylated CpG sites mapping to 1499 genes were identified (adjusted p cutoff of 0.05; Figure 2A and Supplementary Table S3A). Among these, 1722 were hypermethylated in patients of the Hot-pattern group; of these, probe cg14651435 in the 3′ UTR of DNAJB6 showed the greatest difference (β-difference = −0.723, compared to the Cold-pattern group). The remaining 1376 CpG sites were hypomethylated; of these, probe cg18845598 in TSS1500 of OR2A5 showed the biggest difference (β-difference = 0.598) between the two groups. Functional enrichment analysis indicated that the 1499 differentially methylated genes between the Cold- and Hot-pattern groups were enriched in pathways associated with metabolism, viral myocarditis, type I diabetes mellitus, allograft rejection, and cell adhesion molecules. Among these, metabolic pathways were the most significant, including 95 differentially methylated genes (Figure 2B and Supplementary Tables S3B and S3C). As shown in Supplementary Tables S3B and S3C, these 95 differentially methylated genes were mapped to 128 DNA methylation sites and were prevalent in metabolic pathways.
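The β-difference reported for each probe can be sketched as follows. This assumes the common Illumina convention β = M / (M + U + 100), where M and U are the methylated and unmethylated signal intensities, and it assumes the difference is taken as mean β (Cold) minus mean β (Hot), which would give negative values for probes hypermethylated in the Hot pattern. Both conventions, the function names, and all numbers are assumptions for illustration, not values from the study.

```python
from statistics import mean


def beta_value(methylated, unmethylated, offset=100):
    """Methylation level in [0, 1) from probe intensities (offset stabilizes low signal)."""
    return methylated / (methylated + unmethylated + offset)


def beta_difference(cold_betas, hot_betas):
    """Group-level difference, assumed here to be mean(Cold) - mean(Hot)."""
    return mean(cold_betas) - mean(hot_betas)


# Hypothetical beta values for one probe in three Cold and three Hot samples.
cold = [0.10, 0.20, 0.15]
hot = [0.80, 0.90, 0.85]
diff = beta_difference(cold, hot)  # about -0.70: higher methylation in Hot
```

With only three samples per group, a large absolute β-difference is easier to interpret than a p value alone, which is one reason such effect sizes are commonly reported alongside significance.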
To confirm that the DNA methylation profiles in the six patients with difficult-to-control asthma were distinct from those in normal healthy people, we adopted an independent dataset from Gene Expression Omnibus (GEO: GSE36369) as a normal control. Between the six patients and six healthy Han Chinese-American individuals, we identified 7063 differentially methylated CpG sites annotated to 3476 genes (P cutoff = 0.001). KEGG pathway analysis showed that these differentially methylated genes were mainly enriched in asthma or asthma-related pathways, such as type I diabetes mellitus, graft-versus-host disease, and intestinal immune network for IgA production (Supplementary Figure S1 and Supplementary Table S4). As expected, the six patients, who had been diagnosed using the GINA guideline, showed apparently different profiles from those of normal, healthy subjects. Analysis of two other datasets of 96 Caucasian-American and 96 African-American patients showed similar results (Supplementary Figure S2 and Supplementary Table S5), thereby confirming the robustness of our analysis.
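Pathway enrichment of a gene list is often assessed with a one-sided hypergeometric (over-representation) test. The sketch below shows that calculation under the assumption that such a test is applicable; the exact enrichment method used for the KEGG analysis is not specified here, and the function name and example counts are hypothetical.

```python
from math import comb


def enrichment_p(total_genes, pathway_genes, selected_genes, overlap):
    """P(X >= overlap) when drawing `selected_genes` from `total_genes`
    without replacement, where `pathway_genes` belong to the pathway."""
    upper = min(pathway_genes, selected_genes)
    denominator = comb(total_genes, selected_genes)
    numerator = sum(
        comb(pathway_genes, k) * comb(total_genes - pathway_genes, selected_genes - k)
        for k in range(overlap, upper + 1)
    )
    return numerator / denominator


# Toy example: 10 genes in the background, 5 in the pathway, 5 selected,
# and all 5 selected genes fall in the pathway.
p_toy = enrichment_p(total_genes=10, pathway_genes=5, selected_genes=5, overlap=5)
```

In practice the resulting p values would be corrected for the number of pathways tested before being interpreted.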

Differences in genomic profiles between cold and hot patterns
To assess differences in genomic profiles between the two asthma patterns, we also analyzed copy number variations for the selected six patients. Indeed, we identified 96 copy number variable regions (CNVRs) mapped to 92 genes between the two patterns. The majority of observed CNVRs were copy number gains (85.4%), and there were only 14 loci with copy number loss (Supplementary Table S6A). This suggests that some genomic markers might contribute to the differences between Cold and Hot patterns, although this requires further studies. Among the 92 genes, 22 were enriched in metabolic pathways as shown by KEGG analysis (Supplementary Table S6B), further supporting our hypothesis and CMSDA reliability.
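The reported proportion of copy number gains follows directly from the counts above, as this short check shows (variable names are illustrative):

```python
total_cnvr = 96          # copy number variable regions identified
losses = 14              # loci with copy number loss
gains = total_cnvr - losses
gain_pct = round(100 * gains / total_cnvr, 1)
print(gains, gain_pct)   # 82 85.4
```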

Metabolomic analysis
Based on DNA methylation analysis results, we hypothesized that the two groups of patients with difficult-to-control asthma might also differ in metabolomics. To explore this, we collected plasma samples from 20 patients with difficult-to-control asthma and performed a metabolomic analysis. We selected metabolites with both multivariate and univariate significance (VIP > 1.0 and p < 0.05). The score plot from principal component analysis of GC-MS metabolic profiles showed a distinct separation of differentially expressed metabolites between patients of the Cold- and Hot-pattern groups (Figure 3A). Using KEGG analysis, we identified 18 different metabolites between the two groups (shown in Table 2) mapped to two pathways, namely the beta-alanine metabolism (hsa00410) and phenylalanine metabolism (hsa00360) pathways. These different metabolic profiles might help to diagnose and treat the two asthma patterns clinically, which are likely due to the altered methylation status of genes in the above two metabolic pathways. The 18 different metabolites corresponded to a total of 47 genes (Table 2). We further performed an integrated analysis to identify genes whose DNA methylation sites might be associated with these altered metabolites. Thereafter, we obtained five genes (ABAT, ALDH3A1, ACADM, ALDH6A1, and ALDH7A1) for further studies (Table 3 and Figure 3B), which were also among the 95 differentially methylated genes identified from the DNA methylation analysis (Figure 2B and Supplementary Table S3B). Specifically, ALDH3A1 was associated with both the beta-alanine and phenylalanine metabolic pathways, suggesting that it might be important in classifying patients as "Cold" and "Hot."
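The two selection steps described above, the combined VIP and p value filter and the overlap between metabolite-related genes and differentially methylated genes, can be sketched as follows. The record layout, function names, metabolite names, and placeholder gene symbols (GENE_X, GENE_Y) are hypothetical; only the thresholds and the gene symbols ABAT and ALDH3A1 come from the text.

```python
def select_metabolites(records, vip_cutoff=1.0, p_cutoff=0.05):
    """Keep metabolites that pass both the multivariate (VIP) and univariate (p) cutoffs."""
    return [r["name"] for r in records if r["vip"] > vip_cutoff and r["p"] < p_cutoff]


def overlap_genes(metabolite_genes, methylated_genes):
    """Genes linked to altered metabolites that are also differentially methylated."""
    return sorted(set(metabolite_genes) & set(methylated_genes))


# Hypothetical metabolite statistics (not study data).
records = [
    {"name": "metabolite_A", "vip": 1.8, "p": 0.01},   # passes both cutoffs
    {"name": "metabolite_B", "vip": 0.7, "p": 0.01},   # fails VIP
    {"name": "metabolite_C", "vip": 1.4, "p": 0.20},   # fails p value
]
selected = select_metabolites(records)

# Placeholder gene lists; GENE_X and GENE_Y stand in for non-overlapping genes.
candidates = overlap_genes(
    metabolite_genes=["ALDH3A1", "ABAT", "GENE_X"],
    methylated_genes=["ABAT", "GENE_Y", "ALDH3A1"],
)
```

Requiring both criteria keeps the candidate list short, which matters when the downstream validation can only cover a handful of genes.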

Functional validation of candidates
To technically validate the results of the microarray assay for the methylation probe in ALDH3A1, we used genomic DNA samples from the 20 patients with difficult-to-control asthma and performed pyrosequencing assays. Figure 4A shows that the DNA methylation level of the probe (cg10791966) in ALDH3A1 was significantly lower in patients of the Cold- than the Hot-pattern group (p < 0.01), consistent with our microarray results. Next, we tested these samples using a real-time QRT-PCR assay. The ALDH3A1 mRNA expression levels were significantly higher in patients with Cold- than with Hot-pattern asthma (Figure 4B), as expected based on the hypomethylation of ALDH3A1 in the Cold-pattern group. This suggests that ALDH3A1 mRNA expression might be transcriptionally regulated in a manner dependent on its DNA methylation status.

Discussion
In the present study, we used the CMSDA classification approach to distinguish a variety of diverse phenotypes and define variations in this disease status. Considering the diversities in genotype, phenotype, and their interactions with the environment in patients with difficult-to-control asthma, this approach could be a simple method to sort and summarize diverse phenomena from a wide variety of sources, including surveys and medical records. To identify novel biomarkers of precision diagnosis for difficult-to-control asthma, we linked genome-wide epigenomic and metabolomic data to extensive phenotypic patterns according to the CMSDA classification approach. Our study employed an idea highly similar to a recently developed approach, PheWAS (phenome-wide association study), an alternative methodology to understand etiologies of complex diseases and to compensate or provide solutions for a variety of limiting factors in GWAS. PheWAS usually investigates associations of a genotype with a wide spectrum of human phenotypes, namely the phenome. With this phenotype-to-genotype approach, many phenotypes of previously unappreciated etiologies in comprehensive diseases have been linked to some genes/pathways [27]. Although some researchers have made progress in endotypic classification of patients with asthma, most studies have only focused on airway inflammation, thereby ignoring the holistic nature of the human body. In contrast, our CMSDA classification approach focuses not only on known symptoms, like allergy, inflammation, or other asthma-associated symptoms, but also on unknown systematic phenotypes [28,29]. Therefore, the CMSDA could help identify more phenotypes not yet shown to be associated with asthma-related genes or asthma in general. The combined examination of local airway response and systematic evaluation of the human body could be more intuitive and helpful in future PheWAS.
Moreover, this approach might provide new insights regarding this disease, in order to develop methodologies that define diverse phenotypes and enhance PheWAS practicality [30][31][32]. Certainly, more precise PheWAS stratification would be obtained by this approach in the future, to define the phenome using electronic medical records for larger sample size. Chinese medicine has been well known for its holistic approach to complicated human diseases, with its sophisticated theory, long-term phenotyping, and effective therapies [21][22][23][24]33]. In both the Cold- and Hot-pattern groups, patients were responsive to the corresponding Chinese herbal medicine (CHM) treatment. CHMs exhibit broad actions on multiple pathological mechanisms of asthma, including anti-inflammatory and immunomodulatory effects, and airway remodeling inhibition, which are different from the actions of corticosteroids [34][35][36]. On one hand, XQLD, known as Sho-seiryu-to (SST) in Japan, is a traditional Chinese herbal formula developed and used to treat asthma in Asian countries for centuries. Previous reports have shown that SST possesses an anti-inflammatory activity, modulates T helper 1 (Th1)/Th2 balance in the lungs, restores normal expression of spectrin α2, and augments nerve growth factor expression in the lungs [37][38][39]. On the other hand, a randomized, double-blind, placebo-controlled clinical trial showed that DCT significantly improves airway hyperresponsiveness, as well as symptom and medication scores, in children with persistent asthma. This suggested that more stable airways are achieved by this add-on complementary therapy in patients with mild-to-moderate persistent asthma. The antiasthmatic effect of DCT on allergen-induced airway inflammation is mainly attributed to its bronchodilation effect and its ability to inhibit eosinophils from entering the airway [40,41].
Based on these distinct features, XQLD and DCT are thought to cure different types of asthmatic patient populations [28,29]. The two formulas might alter the corresponding pathobiological characteristics of different patient subpopulations, while the responsiveness to these different treatments could reflect variations in the characteristics of asthma patterns. Such variations differently affect the selection of therapeutic strategies, as well as their effectiveness. Indeed, the effective responses that we observed verified the reliability and practicability of the CMSDA approach, to some degree, further supporting that the apparent differences between the two asthma patterns, attributed to epigenomic and metabolomic profiles, might contribute to distinct clinical phenotypes. Therefore, responsive sub-populations of patients should be treated differently.
In our study, we found quite different epigenetic profiles even when using only three Cold and three Hot samples, with metabolite-related pathways being the most significantly different. These results allowed us to extract substantial features by using metabolomic data. The combination of epigenetic profiling with metabolic data in the follow-up analysis revealed ALDH3A1 as a potential marker for the diagnostic classification of difficult-to-control asthma, further verified in our validation analysis. This gene belongs to the aldehyde dehydrogenase family, with oxidoreductase and NAD activities related to P450 metabolic pathways. ALDH3A1 is highly induced by particulate matter (PM) extracts and serves as a marker of oxidative stress in human lung epithelial cells. Two previous studies [42,43] have reported that environmental polycyclic aromatic hydrocarbon content may play a role in oxidative stress. The authors found that air pollution with PM2.5 and PM10 increased the expression of the ALDH3A1 gene by more than 30-fold. Another study indicated that ALDH3A1 increased in patients with COPD after smoking and concluded that smoking exerts a different effect on ALDH3A1 in patients with established COPD than in two control groups. Those findings imply that ALDH3A1 might also be involved in oxidative stress induced by COPD and air pollution. This is not surprising considering that ALDH3A1 serves as a marker gene for the activation of antioxidant response element signaling pathways. Moreover, it has been shown that oxidative stress damages pulmonary function and might worsen asthma symptoms [44,45].
Intragenic DNA methylation might influence gene expression by transcriptional elongation and alternative splicing, which can interfere with transcription by preventing false initiation by an active transcription unit [8][9][10]. This process might explain the association of the hypermethylated probe (cg10791966) in the ALDH3A1 gene with its lower mRNA expression in the Hot pattern, which is consistent with the clinical observation of worse treatment outcomes in patients with Hot-pattern asthma. Therefore, the methylation status of the ALDH3A1 probe might play a key role in interactions of this gene with the environment and might lead to a significant decrease in its mRNA expression as shown in Figure 4, thus altering oxidative stress responses to alternative cellular metabolism and pulmonary function in difficult-to-control asthma. The identification of ALDH3A1 as a marker to classify Cold and Hot patterns of difficult-to-control asthma further confirms the importance of this gene in respiratory diseases, as well as the reliability of our classification approach.
The present study has some limitations. (1) First, we classified and characterized only two typical phenotypes, Cold and Hot patterns, based on Chinese medicine theory and clinical practice. However, asthma is complex, and, according to the Chinese medicine phenotyping classification approach, it involves diverse phenotypes besides the Cold and Hot patterns, such as hot-prone or cold-prone neutral patterns, mixed hot and cold pattern, and less-apparent prone patterns. Further studies on CMSDA classification and identification of diagnostic biomarkers for difficult-to-control asthma remain to be conducted for other well-defined phenotypes. (2) Second, we selected only three patients from each group to obtain DNA methylation data due to our insufficient data source and financial limitations. However, to avoid false positives as far as possible, patients were classified as Cold or Hot pattern by at least three Chinese medicine experts, who deeply understood the CMSDA scoring and GINA diagnostic standards. Moreover, we incorporated the whole-genome DNA methylation profiles of these six patients with additional metabolomic data from 20 blood samples in our study, and performed an integrative bioinformatics analysis and functional validation [46][47][48]. More robust next-generation sequencing technologies, such as whole-genome and whole-exome sequencing, as well as DNA methylation sequencing techniques, would help to accurately reflect phenotypic differences. Future studies including more patients to obtain multiple omics datasets from plasma samples would help draw more robust conclusions. (3) Third, we identified only one gene (ALDH3A1) for validation studies, possibly due to the small sample size; more genes could be identified with a larger sample size. Due to relevant biological functions of ALDH3A1, it is important to elucidate the mechanisms by which it might contribute to Cold- and Hot-pattern classification in difficult-to-control asthma.
More studies with additional patient samples are required to obtain genomic data and determine if other polymorphisms in ALDH3A1 are associated with the development of difficult-to-control asthma. (4) Finally, environmental factors impact not only on disease development itself, but also on the classification of asthma phenotypes [49]. Future studies with larger samples should also focus on the interactions between environmental oxidative exposure and the genome and epigenome. Thus, more genes involved in antioxidant pathways or other molecular signaling pathways should be explored to explain the differences in clinical phenotypes of this disease.

Conclusions
In the present study, we utilized the CMSDA classification approach to compare epigenomic and metabolomic data between the Cold and Hot patterns of difficult-to-control asthma. We identified a novel diagnostic biomarker, ALDH3A1, which might provide new insights into precision diagnosis for difficult-to-control asthma. Our findings demonstrate the practicability and scientific nature of the CMSDA. This approach might provide novel insights in categorizing diverse syndromes and defining endotypes for asthma and would be the first step toward precision diagnosis for difficult-to-control asthma.

Author contributions
Li L, Li H, Li J, Liu SG, Li GX, and Shao RG designed the experiments. Li M and Zhang X carried out the sample collection. Cao R and Ye C executed sample processing. Zheng S performed bioinformatics analysis. Li L and Song WP performed the functional validation and wrote the paper. All authors reviewed the manuscript.