A DNA Damage Response Related Signature to Predict Prognosis in Patients with Acute Myeloid Leukemia

Abstract
The prognosis of acute myeloid leukemia (AML) varies widely and is disappointing in most subtypes. DNA damage response (DDR) is associated with prognosis and immunotherapy in multiple cancers. Here, we identify a signature of eight DDR-related genes associated with overall survival, which stratifies AML patients into high- and low-risk groups. Patients in the low-risk group were more likely to respond to sorafenib. The signature could be an independent prognostic predictor for patients treated with ADE (cytarabine, daunorubicin, and etoposide) or ADE plus gemtuzumab ozogamicin. Therefore, this DDR prognostic signature might be applied to prognostic stratification and treatment selection in AML patients, which warrants further study.


Introduction
Acute myeloid leukemia (AML) is a hematologic malignancy characterized by the accumulation of myeloid blasts in the peripheral blood and bone marrow as a result of clonal expansion (1). Over the past forty years, the prognosis of AML patients has remained discouraging, mainly because of limited treatment options (2). The 5-year overall survival (OS) of AML patients is below 30% (3), despite advances in understanding the molecular biology of AML and the influx of newly approved targeted drugs in recent years (4,5). High heterogeneity in cytomorphology, cytogenetics, immunophenotypes, and genetics could account for the poor outcomes of AML patients (6).
Currently, the European LeukemiaNet (ELN) risk classification, which combines genetic mutations with cytogenetic abnormalities, is commonly used for AML diagnosis and management (7,8). Unfortunately, it cannot efficiently predict the outcomes of individual patients. According to the ELN, about 50% of normal karyotype AML patients are conventionally classified as favorable or intermediate risk. However, some of them still have poor outcomes (8,9). Therefore, an efficient prognostic stratification tool that reflects biological mechanisms would benefit AML patients. Several prognostic models have been constructed based on miRNA (10,11), leukemia hematopoietic stem cells (12), and gene signatures (13,14). However, as the outcomes of AML patients treated with standard therapy vary (15), responses to treatment should be taken into account to optimize such a signature.
Malfunction of the DNA damage response (DDR) results in genomic instability, which is one of the essential carcinogenic mechanisms and is considered a hallmark of cancer. The role of the DDR in prognostic stratification has been studied in lung adenocarcinoma (16), esophageal carcinoma (17), glioma (18), and multiple other cancers (19). Increased DNA damage and altered DNA damage responses are signs of genetic instability in the pathogenesis of AML (20). Dysregulation of DDR genes has also been investigated in AML. The DDR gene expression profile of acute promyelocytic leukemia (APL) patients was different from that of non-APL AML patients (21). DNA repair genes including MRE11A, XRCC1, BRCA1, RAD51, and PARP1 were overexpressed and associated with poor prognosis in patients with non-APL AML (21). In addition to prognostic stratification, a previous study showed that a signature of ten genes in the DDR pathway involved in the response to calicheamicin could predict clinical outcomes of AML patients treated with ADE and gemtuzumab ozogamicin (GO) (22), providing a potential tool to evaluate the clinical application of GO based on gene expression in the diagnostic leukemic cells of pediatric AML patients. These results suggest the potential of DDR-related genes for prognostic stratification and treatment selection in subgroups of AML patients. However, a practical DDR-based tool for prognostic stratification and treatment selection has yet to be fully developed for patients with AML.
In this study, an eight-gene DDR expression signature was constructed and validated for prognostic stratification and treatment selection, and its association with OS in AML was demonstrated. In addition, the eight-gene DDR expression signature could be an independent prognostic factor for patients treated with chemotherapy with or without GO. Altogether, this signature could potentially be applied to prognostic stratification and treatment guidance for AML patients.

Data collection
Gene expression profiles and clinical data of 173 AML patients and 337 normal blood samples were obtained from The Cancer Genome Atlas (TCGA)-LAML and Genotype-Tissue Expression (GTEx)-Blood cohorts (https://xenabrowser.net/datapages/). Microarray and clinical data of the GSE71014 (n = 104) (23) and GSE37642_GPL96 (n = 422) cohorts were obtained from the Gene Expression Omnibus (GEO) database. Expression and clinical data of patients treated with standard ADE chemotherapy, comprising cytarabine, daunorubicin, and etoposide, or with ADE plus GO were obtained from a previous study (22). The detailed baseline characteristics are shown in Supplementary Table S1.
For data preprocessing, Ensembl IDs or probes were converted into gene symbols, and when multiple IDs or probes mapped to the same gene symbol, their median expression was used. In total, 276 DDR-related genes were retrieved from previous research (Supplementary Table S2) (24), 265 of which were identified in TCGA-LAML.
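The median-collapsing step can be illustrated with a minimal Python sketch. It assumes that expression values are stored per probe as one list per sample set and that a probe-to-symbol map is available; the function name, probe IDs, and values are hypothetical and do not come from the study's pipeline.

```python
from collections import defaultdict
from statistics import median


def collapse_to_symbols(expression, id_to_symbol):
    """Collapse probe-level rows to one row per gene symbol.

    expression:   {probe_id: [value for each sample]}
    id_to_symbol: {probe_id: gene_symbol}
    Probes without a symbol are dropped (an assumption of this sketch).
    """
    grouped = defaultdict(list)
    for probe_id, values in expression.items():
        symbol = id_to_symbol.get(probe_id)
        if symbol is not None:
            grouped[symbol].append(values)
    # zip(*rows) walks the rows sample by sample, so the median is
    # taken across probes separately for every sample.
    return {
        symbol: [median(column) for column in zip(*rows)]
        for symbol, rows in grouped.items()
    }


# Hypothetical toy input: two probes map to PARP1, one probe is unmapped.
toy_expression = {
    "probe_a": [2.0, 4.0],
    "probe_b": [4.0, 8.0],
    "probe_c": [1.0, 1.5],
    "probe_d": [9.0, 9.0],
}
toy_map = {"probe_a": "PARP1", "probe_b": "PARP1", "probe_c": "RNF8"}
collapsed = collapse_to_symbols(toy_expression, toy_map)
```

Taking the median per sample, rather than one median over all values, keeps the output aligned with the original sample order.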

Construction and validation of a DDR prognostic signature
To identify differentially expressed DDR-related genes (DEGs), AML samples from the TCGA-LAML cohort were compared with normal blood samples from the GTEx cohort using the "limma" R package. The filter thresholds were set as absolute log2 fold change >2 and false discovery rate (FDR) <0.05 (25). Then, univariable Cox regression was implemented to explore the prognostic value of these DDR-related DEGs. The DDR-related DEGs with p value <0.05 were subjected to least absolute shrinkage and selection operator (LASSO) penalized Cox regression to construct a prognostic signature. The risk score of the prognostic signature was calculated as the sum, over the signature genes, of the expression level of each gene multiplied by its corresponding regression coefficient. All patients were divided into high- and low-risk groups according to the median cutoff value. The prognostic value of the DDR signature was further validated in the GSE71014 and GSE37642_GPL96 cohorts.
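The risk-score calculation and median split described above can be sketched in a few lines of Python. The coefficients and expression values below are hypothetical placeholders chosen for easy arithmetic; they are not the fitted LASSO coefficients, and only three of the eight signature genes are used. Assigning patients exactly at the median to the low-risk group is also an assumption of this sketch.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Weighted sum of gene expression: sum(coef_g * expr_g) over signature genes."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def split_by_median(scores):
    """Label each patient 'high' if the score exceeds the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical coefficients (NOT the values fitted in the study).
toy_coefficients = {"RNF8": 0.5, "PARP1": 0.25, "MNAT1": -0.5}

# Hypothetical expression values for four patients.
toy_patients = {
    "P1": {"RNF8": 2.0, "PARP1": 4.0, "MNAT1": 1.0},
    "P2": {"RNF8": 1.0, "PARP1": 2.0, "MNAT1": 3.0},
    "P3": {"RNF8": 3.0, "PARP1": 2.0, "MNAT1": 2.0},
    "P4": {"RNF8": 2.0, "PARP1": 2.0, "MNAT1": 2.0},
}

scores = {pid: risk_score(expr, toy_coefficients) for pid, expr in toy_patients.items()}
groups = split_by_median(scores)
```

With these toy numbers the scores are 1.5, -0.5, 1.0, and 0.5, the median is 0.75, and patients P1 and P3 fall into the high-risk group.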

Association between the DDR signature and efficacy of ADE or ADE plus GO
A previous study showed that ADE plus GO improved the event-free survival (EFS) of de novo AML in the clinical trial AAML0531 (NCT00372593) (26). The transcriptomic data of the blood and tissue specimens and the clinical information of the patients receiving the different treatments (ADE arm, n = 207; ADE plus GO arm, n = 219) were downloaded from the TARGET database (https://target-data.nci.nih.gov/Public/AML/mRNA-seq) and analyzed. The definitions of EFS and OS in this study were identical to those in the AAML0531 trial. The median cutoff value of the risk score of the DDR signature was used to stratify the patients treated with the ADE regimen or ADE plus GO.

Immune relevance
The TCGA-LAML cohort was used to study the infiltration of 28 immune cell types in each patient; gene expression levels were translated into the relative abundance of immune cells using the ssGSEA algorithm.

Drug sensitivity
The "oncoPredict" R package was used to estimate the drug sensitivity of each sample based on the Genomics of Drug Sensitivity in Cancer (GDSC) database (https://www.cancerrxgene.org/). To determine correlations between drug sensitivity and the DDR-signature risk score, Spearman's correlation between the risk score and the half maximal inhibitory concentration (IC50) of each drug was calculated.
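As an illustration of the correlation step, the following minimal Python sketch computes Spearman's coefficient as the Pearson correlation of average ranks. It is a standard-library stand-in, not the R routine used in the analysis, and the risk scores and IC50 values are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values: IC50 falls as the risk score rises.
toy_risk = [0.1, 0.4, 0.9, 1.3]
toy_ic50 = [5.0, 3.0, 2.5, 1.0]
rho = spearman(toy_risk, toy_ic50)
```

A negative coefficient means that the IC50 decreases as the risk score increases, that is, higher-risk samples are predicted to be more sensitive to the drug.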

Statistical analysis
The R software (version 4.0.0; https://www.R-project.org) was used for all statistical analyses. Kaplan-Meier survival curves with log-rank tests were plotted to compare the OS between the two risk groups using the "survminer" R package. The "survminer" and "timeROC" R packages were used for the time-dependent receiver operating characteristic (ROC) analysis to evaluate the predictive performance of the DDR signature. To account for other clinical and molecular covariables, the hazard ratio (HR) and 95% confidence interval (CI) were estimated using multivariable Cox regression to explore the independent prognostic value of the DDR-related signature. The "glmnet" and "survival" packages were used for LASSO regression analysis. The Wilcoxon rank-sum test or Fisher's exact test was used to study the differences in variables between the two risk groups. A p value < 0.05 was considered significant unless otherwise stated.
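To make the survival comparison concrete, the sketch below implements a Kaplan-Meier estimator and a two-group log-rank test using only the Python standard library. It is an illustration of what these methods compute, not the R code used for the analysis; the function names and the toy follow-up times are hypothetical.

```python
from math import erfc, sqrt


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (event = 1, censored = 0)."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed  # events and censored cases both leave the risk set
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square with 1 df, p value).

    Raises ZeroDivisionError if the variance is zero (e.g. no events).
    """
    pooled = [(t, e, 0) for t, e in zip(times_a, events_a)]
    pooled += [(t, e, 1) for t, e in zip(times_b, events_b)]
    obs_minus_exp = 0.0
    variance = 0.0
    for t in sorted({u for u, e, _ in pooled if e}):
        n_a = sum(1 for u, _, g in pooled if u >= t and g == 0)
        n_b = sum(1 for u, _, g in pooled if u >= t and g == 1)
        d_a = sum(1 for u, e, g in pooled if u == t and e and g == 0)
        d = sum(1 for u, e, _ in pooled if u == t and e)
        n = n_a + n_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / variance
    p_value = erfc(sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return chi2, p_value


# Hypothetical follow-up data (arbitrary time units).
toy_curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])
toy_chi2, toy_p = logrank_test([1, 2], [1, 1], [3, 4], [1, 1])
```

In the first toy example, survival drops to 0.8 at the first event and then to about 0.53 and 0.27, because the censored patient at time 2 leaves the risk set without counting as an event.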

Construction of a prognostic DDR signature based on the TCGA cohort
The schematic diagram of the study design is depicted in Figure 1. RNA expression levels were compared between 173 AML samples and 337 normal samples to identify DEGs in the DDR pathway. Among the 7,806 DEGs, 163 DDR-related DEGs were identified between AML and normal samples, of which 2 were downregulated and 161 were upregulated in AML (Figure 2(A)). The 163 DDR-related DEGs are illustrated in the heatmap (Figure 2(B)).
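The thresholding behind these counts (absolute log2 fold change >2 and FDR <0.05) can be sketched as follows. The sketch assumes Benjamini-Hochberg adjustment of the p values, and the gene names, fold changes, and p values are hypothetical; the differential expression statistics themselves would come from a package such as limma.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_degs(genes, log2_fc, p_values, fc_cutoff=2.0, fdr_cutoff=0.05):
    """Split genes passing both thresholds into up- and downregulated lists."""
    fdr = benjamini_hochberg(p_values)
    up, down = [], []
    for gene, lfc, q in zip(genes, log2_fc, fdr):
        if q < fdr_cutoff and abs(lfc) > fc_cutoff:
            (up if lfc > 0 else down).append(gene)
    return up, down


# Hypothetical statistics for five genes.
toy_genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
toy_log2_fc = [3.1, -2.5, 1.2, 2.4, 4.0]
toy_p = [0.001, 0.002, 0.003, 0.004, 0.5]
toy_up, toy_down = select_degs(toy_genes, toy_log2_fc, toy_p)
```

Here GENE_C is excluded by the fold-change threshold and GENE_E by the FDR threshold, which shows that both filters must be passed at once.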
We performed univariable Cox regression on the 163 DDR-related DEGs to narrow down the candidate DDR-related genes for prognostic signature construction; 14 of these genes showed significant correlations with OS in AML patients (p < 0.05, Figure 2(C)). For instance, MNAT1 was positively associated with OS (p = 0.007; HR, 0.59; 95% CI, 0.40-0.87, Figure 2(D)), while RNF8 was negatively associated with OS (p = 0.01; HR, 1.62; 95% CI, 1.10-2.38, Figure 2(E)), which suggested a prognostic role of DDR genes in AML. Furthermore, an eight-gene signature, including RNF8, PARP1, POLN, IDH1, SMC5, MNAT1, SHPRH, and SMARCC1, was constructed with the minimum value of λ in the LASSO regression (Supplementary Figure 1(A-C)). Then, AML patients were divided into high- and low-risk groups by the median cutoff value. The OS of patients in the high-risk group was significantly shorter than that of patients in the low-risk group (p < 0.001; HR, 3.00; 95% CI, 1.96-4.61, Figure 2(F)). Consistently, lower expression of the signature genes associated with favorable prognosis (MNAT1, SMC5, SHPRH, and SMARCC1) was observed in the high-risk group, while the expression of the genes associated with a worse prognosis (PARP1, POLN, and RNF8) was elevated in the same group (Figure 2(G)). Furthermore, the AUC of the time-dependent ROC curves reached 0.780, 0.725, and 0.748 at 1, 3, and 5 years, respectively (Figure 2(H)). We also studied the distribution of the ELN risk groups within the high- and low-risk groups of the DDR-related signature. Compared with the other ELN risk groups, the proportion of favorable risk was significantly higher in the low-risk group (Figure 2(I)).
To further explore whether the DDR-related signature could independently predict prognosis, we performed both univariable and multivariable Cox regressions on the DDR-related signature and other clinical factors. In the univariable Cox regression, besides the DDR signature, age was also associated with OS: patients aged 65 or above had worse OS than those younger than 65 years (HR, 2.95; 95% CI, 1.93-4.51, Table 1). In the multivariable Cox regression, the association between the DDR signature and OS remained significant (HR, 3.10; 95% CI, 1.99-4.84, Table 1). Altogether, these results indicated the potential of the DDR-related signature as a prognostic biomarker.
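The reported hazard ratios and confidence intervals can be related to the underlying Cox coefficients with a short calculation. The sketch below assumes a Wald-type 95% interval that is symmetric on the log scale, which is the usual form for Cox models but is not stated explicitly here; the function name is hypothetical.

```python
from math import exp, log


def log_hr_and_se(ci_lower, ci_upper, z=1.96):
    """Recover the log hazard ratio and its standard error from a CI.

    Assumes the interval was built as exp(beta +/- z * se).
    """
    beta = (log(ci_lower) + log(ci_upper)) / 2
    se = (log(ci_upper) - log(ci_lower)) / (2 * z)
    return beta, se


# Multivariable result for the DDR signature: 95% CI, 1.99-4.84.
beta, se = log_hr_and_se(1.99, 4.84)
hr_from_ci = exp(beta)  # geometric midpoint of the interval
```

The geometric midpoint of the interval is approximately 3.10, which matches the reported multivariable HR and is consistent with the assumed interval form.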

Validation of the DDR-related signature in the validation cohorts
The prognostic value of the DDR-related signature was further validated in the GSE71014 and GSE37642_GPL96 cohorts, both of which were divided into high- and low-risk groups according to their respective median cutoff values (Figure 3(A-B)). Consistent with the results in the training cohort, the OS of patients in the high-risk group was significantly worse than that of the low-risk group in both GSE71014 (p = 0.01; HR, 2.32; 95% CI, 1.17-4.60, Figure 3(C)) and GSE37642_GPL96 (p = 0.04; HR, 1.27; 95% CI, 1.01-1.58, Figure 3(D)). In GSE71014, RNF8 and IDH1 had significantly higher expression in the high-risk group, while SMARCC1 had lower expression (p < 0.05, Figure 3(E)). Similar results were observed in GSE37642_GPL96 (p < 0.05, Figure 3(F)), indicating the robustness of the DDR-related signature as a prognostic biomarker.

Association between the DDR-related signature and immune status and drug sensitivity
The role of DDR in immunotherapy has been reported previously (27). In this study, we explored the association between the DDR-gene signature and immune infiltration by comparing the immune infiltrations of the two risk groups in the TCGA-LAML cohort. Of note, the high-risk patients of the DDR signature had higher fractions of CD56dim natural killer (NK) cells (p < 0.001), macrophages (p < 0.001), monocytes (p < 0.001), plasmacytoid dendritic cells (p < 0.001), activated dendritic cells (p = 0.002), central memory CD4+ T cells (p = 0.002), gamma  (Figure 4(A)), which are demonstrated in the heatmap of infiltrations (Figure 4(B)). Spearman's correlation between the ssGSEA scores of immune infiltration and the risk score of the DDR signature showed that in the low-risk group, the relative proportion of central memory CD4+ T cells was positively correlated with the risk score (Supplementary Figure 2(A)), while in the high-risk group, a variety of immune cells were positively correlated with the risk score, such as natural killer cells, regulatory T cells, T follicular helper cells, and macrophages.
Furthermore, we analyzed the correlation between drug sensitivity and the risk score of the DDR signature using Spearman's correlation, and studied the differences in drug sensitivity between the two risk groups. The half maximal inhibitory concentrations (IC50) of obatoclax and pevonedistat, two promising potential treatments for AML (28,29), were positively and negatively correlated with the risk score, respectively (Figure 4(C)). Moreover, DDR-signature low-risk patients were more likely to respond to obatoclax (p = 0.002) and sorafenib (p = 0.03) but less likely to respond to pevonedistat (p = 0.002) and EPZ004777 (p = 0.03) (Figure 4(D)). Given these results, the DDR signature might provide some clues to immunotherapy and chemotherapy of AML patients.

Association between the DDR-related signature and efficacy of ADE or ADE plus GO
In a previous study, a prognostic signature was constructed by comparing the DDR-related DEGs of AML patients treated with ADE with or without GO, the results of which suggested that a DDR-related signature could be a potential means to predict the outcomes of AML (22). To find out whether our DDR-related signature could also be associated with complete remission (CR) and, perhaps, with the survival of patients receiving different treatments, the association between our DDR-related signature and the efficacy of ADE or ADE plus GO was explored in the AAML0531 trial (30). The baseline characteristics of the high- and low-risk groups are shown in Supplementary Table S3. The CR results showed the same trend as the prognostic value of the DDR signature described above. When treated with ADE alone (79.0% vs 90.2%, p = 0.03, Figure 5(A)) or ADE plus GO (75.9% vs 87.4%, p = 0.03, Figure 5(B)), patients in the high-risk groups had a lower complete remission rate than those in the low-risk groups.
For the patients treated with ADE, the high-risk group had worse OS (p = 0.009; HR, 1.90; 95% CI, 1.17-3.09, Figure 5(C)) and EFS (p < 0.001; HR, 2.16; 95% CI, 1.43-3.27, Figure 5(D)). However, when treated with ADE plus GO, the high-risk group only tended to have worse OS (p = 0.37; HR, 1.23; 95% CI, 0.79-1.92, Figure 5(E)) and EFS (p = 0.16; HR, 1.33; 95% CI, 0.89-1.99, Figure 5(F)) than the low-risk group, and the associations were not statistically significant. We then performed multivariable Cox regression including treatment, the DDR signature, and the treatment-by-DDR-signature interaction. The results suggested that the DDR signature was still associated with poor OS (p = 0.009; HR, 1.90; 95% CI, 1.17-3.07) and EFS (p < 0.001; HR, 2.17; 95% CI, 1.44-3.29) irrespective of the treatment (Table 2). Altogether, these results added to the evidence that the DDR signature was a prognostic factor for AML patients.

Discussion
A current challenge in AML is the lack of accurate prognostic stratification tools for better treatment and outcomes, owing to the immense heterogeneity of the disease and the inefficacy of the standard first-line regimens for a large portion of patients (31). A prognostic signature of eight DDR-related genes was constructed in this study. The risk score of the signature divided AML patients into high- and low-risk groups with significantly different survival in both the training and validation datasets. Furthermore, the prognostic signature was associated with immune infiltration, small molecule drug sensitivity, and the efficacy of AML treatment regimens. Altogether, this prognostic signature based on DNA damage response DEGs could be a potential prognostic tool for clinical management.
This prognostic signature consisted of eight genes, RNF8, PARP1, POLN, IDH1, SMC5, MNAT1, SHPRH, and SMARCC1, some of which have been reported to be related to clinical outcomes of AML patients (32)(33)(34)(35)(36). IDH1 mutation is generally associated with poor outcomes in AML patients (37). PARP1 encodes a DNA damage sensor, PARP-1, higher expression of which was associated with a higher frequency of FLT3-ITD mutation. AML patients with a higher level of PARP-1 were reported to have significantly shorter OS and EFS (36). Consistent with these previous results, both IDH1 and PARP1 adversely impacted the OS and EFS of AML patients in our study.
Immune landscapes, which are important features of AML pathophysiology, can predict chemotherapy resistance and immunotherapy response (38). In the present study, we investigated whether the DDR signature could also indicate the immune status of AML. Hence, we evaluated immune cell infiltration using ssGSEA to assess the immune relevance of the risk score. Patients in the high-risk group of the TCGA cohort had higher fractions of CD56dim NK cells, macrophages, monocytes, plasmacytoid dendritic cells, etc. CD56dim NK cells, the predominant subset of NK cells, accumulate during aging. Since AML is also considered a disease of the elderly, the higher fraction of CD56dim NK cells in AML is consistent with previous findings (39). However, CD56dim NK cells have been reported to be more abundant in AML patients with a better prognosis, although they have not so far been considered a prognostic indicator (40). Given that NK cells have been investigated for cancer immunotherapy, further exploration of CD56dim NK cells might help identify patients who can benefit from immunotherapy. In addition, a previous study revealed that a high absolute monocyte count was a poor prognostic factor for OS of AML patients (41). In the high-risk group of the DDR-related signature, the levels of monocytes and macrophages were higher. In AML, malignant cells could polarize macrophages to a cancer-supporting status, thereby promoting the progression of leukemia (42). Plasmacytoid dendritic cells are the principal natural type I interferon-producing dendritic cells. In a recent study, AML patients with plasmacytoid dendritic cell expansion were reported to have adverse risk stratification with poor outcomes (43). Given the significant correlation with immune status, our DDR signature might function as a predictive factor for changes in immune cell infiltration and immunotherapy response.
In this study, the association between the DDR signature and drug sensitivity was also investigated. As AML is characterized by the uncontrolled proliferation of abnormal myeloid progenitors, inducing cell differentiation or triggering cell apoptosis might lead to new treatments for AML. In our study, a low risk score of the DDR signature was associated with a better predicted response to obatoclax and sorafenib. Obatoclax, a pan-BCL-2 inhibitor, binds to anti-apoptotic BCL-2 to induce apoptosis in AML (29). It has been considered a promising treatment for AML (44). Clinical trials of sorafenib have generated some promising results implying antileukemic activity, in which patients treated with sorafenib plus chemotherapy, irrespective of their FLT3 mutational status, tended to have longer 5-year OS (45).
Meanwhile, we also observed that a high risk score was associated with a better predicted response to pevonedistat, an inhibitor of the NEDD8-activating enzyme (NAE) in the ubiquitin-proteasome system (46). A previous study in an AML cell line and a xenograft mouse model demonstrated that pevonedistat combined with histone deacetylase inhibitors could target the DDR pathway, causing cancer cell death and longer survival in the xenograft model. Recently, the results of a randomized phase II trial comparing pevonedistat plus azacitidine versus azacitidine alone in a small sample of patients with low-blast (LB) AML were reported, in which the median OS with pevonedistat plus azacitidine tended to be longer (47). Therefore, the application of these suggested treatments still needs further investigation in clinical trials for AML patients, while effective biomarkers could also improve treatment outcomes by identifying suitable patients.
Recently, Gbadamosi et al. reported a DDR-related gene signature that could predict the CR and clinical outcome of AML patients treated with ADE plus GO, but not of those treated with standard ADE chemotherapy (22). Their study used AAML0531, a randomized phase III trial comparing the clinical outcomes of ADE and ADE plus GO in AML (30). Therefore, we used the AAML0531 trial to further analyze whether our DDR signature could also stratify patients into high- and low-risk groups with significant differences in clinical outcomes. In both the ADE and ADE plus GO arms, the low-risk groups had a higher proportion of patients who achieved CR. Furthermore, the DDR signature was also associated with worse OS and EFS in the multivariable Cox regression. Interestingly, in Gbadamosi's study, the differences in OS and EFS were shown in patients treated with the ADE plus GO regimen. The distinct results of the two DDR signatures could be caused by the selection of DDR-related genes, as they used the DEGs between AML patients receiving the two different regimens, while we used DEGs screened in the TCGA cohort independently of these treatments.
Our work has some limitations. Firstly, this prognostic signature was based on retrospective data and has not been validated in prospective clinical trials. Secondly, further experimental studies are urgently needed to unveil the functional role of this signature in AML and the molecular mechanisms of the eight genes as potential therapeutic targets or predictive biomarkers.

Conclusion
In this study, we constructed a prognostic signature of DDR-related genes for AML, which was independently associated with the OS of AML patients. The associations between the DDR-related signature and immune infiltration, drug sensitivity, and the ADE or ADE plus GO regimens suggest that this signature is a potential indicator for AML treatment. Taken together, the DDR-related signature might be applied to optimize prognostic stratification in AML, and its clinical effectiveness needs to be further studied.