ZNF503 combined with GATA3 is a prognostic factor in triple-negative breast cancer

Abstract Purpose Triple-negative breast cancer (TNBC) is a subtype of breast cancer with poor prognosis. Therefore, there is an urgent need to identify prognostic markers to improve current treatment and therapeutic strategies. The transcription factor ZNF503 has been reported to promote aggressive breast cancer development through the down-regulation of GATA3 expression and has been identified as a candidate predictive marker. In this study, we explored whether ZNF503 and GATA3 could serve as prognostic markers independently or in combination. Material and Methods We performed a survival analysis of 989 breast cancer patients from The Cancer Genome Atlas (TCGA) and validated the findings in 202 breast cancer patients from tissue microarrays (TMAs). Results In the TCGA database, the mRNA expression of GATA3 or ZNF503 alone could not predict TNBC prognosis, whereas the ratio index ZNF503/GATA3 could be a novel prognostic biomarker in TNBC patients. In the TMA cohort, we detected the protein expression of ZNF503 and GATA3 and found that the combination of the two genes, the ZNF503-GATA3 index, significantly improved the prediction of clinical outcomes. Conclusions The results indicated that the combined index of ZNF503 and GATA3 could be used as a prognostic biomarker in TNBC.

Triple-negative breast cancer (TNBC) is a subtype of breast cancer with negative expression of oestrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor-2 (HER2) (Reid et al. 2021, Yin et al. 2020, Zhao et al. 2020). Compared to other subtypes, TNBC patients have the poorest prognosis, with a mortality rate of 40% within the first five years after diagnosis (Pareja et al. 2016). Owing to its distinct molecular phenotype, TNBC is not sensitive to endocrine therapy or molecular targeted therapy. Chemotherapy is the main treatment; however, the efficacy of conventional postoperative adjuvant chemoradiotherapy is poor (Waks and Winer 2019, Yin et al. 2020). Immunotherapy is a promising treatment strategy for breast cancer. Immune checkpoint inhibitor monotherapy appears to be effective in metastatic TNBC patients (Rizzo et al. 2022b). Ladiratuzumab vedotin, an antibody-drug conjugate, has recently been assessed in metastatic TNBC patients (Rizzo et al. 2022a). Immune checkpoint inhibitors alone increase the risk of early death, whereas immunotherapy in combination with other drugs can prevent early mortality (Viscardi et al. 2022). Therefore, predictive biomarkers are required to guide targeted therapy.
In recent years, many studies have been devoted to finding and validating molecular prognostic and predictive biomarkers for breast cancer. Several studies have shown that PD-L1 is an important predictive marker, and chemoimmunotherapy has become a novel first-line treatment for metastatic TNBC patients with PD-L1 overexpression (Ricci 2022, Rizzo et al. 2022b). The classic molecular markers include ER, PR, HER2, and Ki67 (Taneja et al. 2010). However, most classical markers are negative in TNBC. Therefore, new biomarkers are needed for TNBC patients. GATA binding protein 3 (GATA3), a member of the GATA family, plays a pivotal role in the tissue determination of many organs, particularly the mammary gland (Asselin-Labat et al. 2007, Kouros-Mehr et al. 2006, Querzoli et al. 2021). Among all GATA family members, GATA3 is distinctively overexpressed in breast cancer (Lin et al. 2017). GATA3 expression in human breast cancer is positively associated with ER expression. Voduc et al. concluded that low GATA3 expression can identify a subgroup of ER-positive (ER+) tumours with a higher risk of recurrence, and that GATA3 was very useful in distinguishing luminal A and luminal B subtypes (Voduc et al. 2008). GATA3 antibody (together with cytokeratin 7 antibody) is also widely used by pathologists for immunohistochemical identification and verification of breast carcinoma metastasis in lymph nodes and distant organs. GATA3 is more frequently expressed in ER+ tumours than in ER-negative (ER-) tumours (Albergaria et al. 2009, Ciocca et al. 2009). Although the data on the independent prognostic significance of GATA3 are inconsistent, its expression in ER+ tumours seems to predict good survival (Gulbahce et al. 2013, Parikh et al. 2005, Yoon et al. 2010).
We previously showed that GATA3 expression could predict progression-free survival in ER+ breast cancer patients who received first-line tamoxifen for recurrent disease (Liu et al. 2016). In TNBC, GATA3 expression can inhibit tumour progression, epithelial-mesenchymal transition (EMT), and metastatic potential, and the absence of GATA3 induces resistance to chemotherapy (El-Arabey and Abdalla 2022).
ZNF503/Zeppo2 (zinc finger elbow-related proline domain protein 2, Zpo2/Nolz1/Zfp503) is a conserved regulatory factor in many species (Nakamura et al. 2004, Shahi et al. 2015). Recently, Shahi et al. found that ZNF503 was a target transcription factor expressed in mammary epithelial cells. A high ZNF503 expression level promoted mammary epithelial cell proliferation (Lu and Zhang 2019). Moreover, ZNF503 expression was up-regulated during breast cancer progression, and the overexpression of ZNF503 led to a higher incidence of lung metastases in mammary tumour transplant experiments (Shahi et al. 2015). Shahi et al. found that ZNF503 targeted the GATA3 gene and inhibited the expression of GATA3 in ER+ breast cancer cells. A high ZNF503 level was correlated with poor survival of breast cancer patients (Shahi et al. 2017). However, there was limited evidence that ZNF503 could be used as a prognostic marker in TNBC.
In this study, we explored whether GATA3 and ZNF503 could serve as prognostic markers independently or in combination. Survival analysis and receiver operating characteristic (ROC) analysis of the breast cancer patient data in TCGA database and TMA database were performed to investigate their prognostic roles for future diagnostic and therapeutic strategies.

Clinical significance
• TNBC is a subtype of breast cancer with poor prognosis. There is a paucity of prognostic markers for TNBC.
• The expression of GATA3 in TNBC was lower than that in non-TNBC, while the expression of ZNF503 was higher in TNBC.

• The combination of ZNF503 and GATA3 could be a potential prognostic factor in TNBC for future diagnostic and therapeutic strategies.

Data analysis of the TCGA breast cancer cohort
The RNA sequencing data and clinical information of breast cancer patients were obtained from The Cancer Genome Atlas (TCGA) (https://www.cancer.gov/about-nci/organization/ccg/research/structural-genomics/tcga). Breast cancer cases with incomplete clinical information were removed. In total, 989 cases, comprising 137 TNBC and 852 non-TNBC cases, were included (Figure 1). The mRNA expression values of ZNF503 and GATA3 were provided as transcripts per million (TPM). The ZNF503/GATA3 index (Z/G) = TPM value of ZNF503 / TPM value of GATA3. The relationship between ZNF503 and GATA3 expression and clinical survival in patients was investigated using the TCGA database.
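The Z/G index defined above can be illustrated with a minimal Python sketch. The function name `zg_ratio`, the example case labels, and the TPM values are hypothetical and are not taken from the TCGA cohort; how samples with zero GATA3 expression were handled is not stated in the text, so the sketch simply rejects them.

```python
def zg_ratio(znf503_tpm: float, gata3_tpm: float) -> float:
    """Return the Z/G index: TPM of ZNF503 divided by TPM of GATA3."""
    if gata3_tpm <= 0:
        # Assumption: the text does not say how zero GATA3 values were handled.
        raise ValueError("GATA3 TPM must be positive to form the ratio")
    return znf503_tpm / gata3_tpm


# Hypothetical (ZNF503, GATA3) TPM pairs, for illustration only; not TCGA data.
example_tpm = {"case_a": (12.0, 240.0), "case_b": (30.0, 15.0)}
example_ratios = {case: zg_ratio(z, g) for case, (z, g) in example_tpm.items()}
```

A larger ratio corresponds to relatively more ZNF503 than GATA3 transcript in the same sample.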

TMA construction and immunohistochemistry
TMA is a widely used, cost-effective, and tissue- and reagent-conserving method for molecular analysis (Matthew et al. 2019). The TMAs contained a total of 202 female patient samples across TMA-1 and TMA-2 (TMA-2A and TMA-2B). TMA-1 was purchased from Shanghai Xinchao Biotechnology Co., Ltd. (Shanghai, China), and 130 samples were available for data analysis, of which 129 had survival information (Tables S1 and S2, Figures S1 and S2). TMA-2 was purchased from Xi'an Elina Biotechnology Co., Ltd. (Xi'an, China). TMA-2A included 30 breast cancer samples with two repetitions (Tables S3 and S4, Figures S3 and S4), and TMA-2B included 42 breast cancer samples with three repetitions (Tables S5 and S6, Figures S5 and S6). In total, there were 72 tissue samples without survival information from TMA-2 (Figure 1). The TMA slides were dried at 60 °C overnight. After deparaffinization and endogenous peroxidase blocking, the sections were transferred to the antigen retrieval solution (Tris-EDTA, pH 9.0) for 3 min and washed with water. Sections were then incubated with GATA3 antibody (ab199428, Abcam, Cambridge, UK) or ZNF503 antibody (HPA026848, Sigma-Aldrich, USA) for 1 h at 37 °C. Thereafter, the sections were rinsed 3 times with Tris-buffered saline (TBS), incubated with anti-rabbit secondary antibody (HA1001, HUA Bio, Hangzhou, China) at room temperature for 50 min, and washed with TBS. Finally, positive staining signals were visualized with diaminobenzidine and counterstained with haematoxylin.
TMA sections were scored based on the proportion of stained tumour cells and the staining intensity. Positive GATA3 expression appeared as nuclear staining. Intensity was scored as follows: 0, negative; 1, weak; 2, moderate; and 3, strong. The proportion of positive cells was scored as follows: 0, <5%; 1, 5%-25%; 2, 26%-50%; 3, 51%-75%; and 4, >75%. A staining index (values, 0-12) was defined by multiplying the score for staining intensity by the score for the positive area (Guo et al. 2021). When the staining was heterogeneous, each component was scored independently and the scores were summed. For statistical analysis, we defined 0-7 as the low-expression group and 8-12 as the high-expression group. The ZNF503-GATA3 index (Z-G) = score value of ZNF503 - score value of GATA3. The cut-off value was defined as the median of the Z-G index.
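The scoring scheme above can be sketched in Python as follows. All function names and the example scores are hypothetical. Two details are assumptions rather than statements from the text: percentages falling between the listed bins (for example 25.5%) are assigned to the lower bin, and the handling of heterogeneous staining is omitted.

```python
from statistics import median


def proportion_score(percent_positive: float) -> int:
    """Map the percentage of positive tumour cells to the 0-4 proportion score."""
    if percent_positive < 5:
        return 0
    if percent_positive <= 25:
        return 1
    if percent_positive <= 50:  # assumption: values between bins go to the lower bin
        return 2
    if percent_positive <= 75:
        return 3
    return 4


def staining_index(intensity: int, percent_positive: float) -> int:
    """Staining index (0-12) = intensity score (0-3) x proportion score (0-4)."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2 or 3")
    return intensity * proportion_score(percent_positive)


def expression_group(index: int) -> str:
    """Indices 0-7 form the low-expression group, 8-12 the high-expression group."""
    return "high" if index >= 8 else "low"


def zg_difference(znf503_index: int, gata3_index: int) -> int:
    """Z-G index = staining index of ZNF503 minus staining index of GATA3."""
    return znf503_index - gata3_index


# Hypothetical (ZNF503, GATA3) staining indices, for illustration only.
example_scores = [(12, 2), (4, 8), (6, 6), (9, 1)]
example_zg = [zg_difference(z, g) for z, g in example_scores]
example_cutoff = median(example_zg)  # median of the Z-G index as the cut-off
```

Because the Z-G index is a difference of two 0-12 scores, it ranges from -12 to 12, with positive values indicating stronger ZNF503 than GATA3 staining.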

Statistical analysis
Statistical analysis was performed using Student's t-test or the nonparametric Mann-Whitney U test. IBM SPSS Statistics 21 was used for ROC curve analysis, univariate analysis, and Cox multivariate analysis. We defined survival status as 0 for patients with recurrence-free survival (RFS) greater than 5 years, and as 1 for patients with RFS less than 5 years. The optimal cut-off thresholds were determined at the point on the ROC curve at which the Youden index was maximal. Survival curve analysis was performed for breast cancer patients using the Kaplan-Meier method. Interrelationships among clinical parameters, PR and HER2 status were calculated using the chi-square test. Results were considered statistically significant when P < 0.05.
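As an illustration of the cut-off selection described above, the following Python sketch labels five-year status and picks the threshold that maximises the Youden index J = sensitivity + specificity - 1. It is not the SPSS procedure itself. The function names and data are hypothetical, and three choices are assumptions: an RFS of exactly 5 years is labelled 0, candidate cut-offs are midpoints between adjacent distinct scores, and a sample is called positive when its score exceeds the cut-off.

```python
def five_year_status(rfs_years: float) -> int:
    """Return 1 if RFS is below 5 years, otherwise 0 (exactly 5 -> 0 is an assumption)."""
    return 1 if rfs_years < 5 else 0


def youden_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity, J) for the cut-off with maximal J."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    values = sorted(set(scores))
    if n_pos == 0 or n_neg == 0 or len(values) < 2:
        raise ValueError("need both classes and at least two distinct scores")
    best = None
    # Candidate cut-offs: midpoints between adjacent distinct scores (assumption).
    for low, high in zip(values, values[1:]):
        cutoff = (low + high) / 2
        tp = sum(1 for s, y in zip(scores, labels) if s > cutoff and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
        sensitivity = tp / n_pos
        specificity = tn / n_neg
        j = sensitivity + specificity - 1
        if best is None or j > best[3]:  # on ties, the lowest cut-off is kept
            best = (cutoff, sensitivity, specificity, j)
    return best


# Hypothetical index values and RFS times in years, for illustration only.
example_index = [0, 2, 4, 6, 8, 10]
example_rfs = [7.0, 6.5, 8.0, 3.0, 2.5, 9.0]
example_labels = [five_year_status(t) for t in example_rfs]
example_best = youden_cutoff(example_index, example_labels)
```

Note that this simple labelling does not distinguish patients censored before five years from those who relapsed; the text does not describe how such cases were treated.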

GATA3 and ZNF503 expression was analysed using TCGA database
GATA3 expression was reported to be significantly correlated with ER expression (Parikh et al. 2005, Usary et al. 2004). A total of 989 patient samples from the TCGA breast cancer database were analysed. The results showed that GATA3 expression was significantly lower in ER- breast cancer than in ER+ breast cancer (P < 0.0001, Figure 2A). However, there was little difference in ZNF503 expression between ER- and ER+ breast cancer (P > 0.05, Figure 2B). Among the 989 breast cancer patients from TCGA, 137 were TNBC and 852 were non-TNBC. GATA3 expression was significantly down-regulated in TNBC (P < 0.0001, Figure 2C), while there was no significant difference in ZNF503 expression between TNBC and non-TNBC (Figure 2D). Heat map results showed a slight negative correlation between ZNF503 and GATA3 expression in both TNBC and non-TNBC (Figure 2E, F).

Prognostic value of GATA3 and ZNF503 alone and in combination in breast cancer
To explore the prognostic value of GATA3 and ZNF503, we performed an in silico analysis using the TCGA database. At the low 25% cut-off value, the results showed that in 989 cases, high GATA3 expression predicted a good prognosis (P = 0.0011,  As ZNF503 and GATA3 have opposing patterns of expression, we investigated whether the expression ratio of ZNF503 to GATA3, ZNF503/GATA3 (Z/G), could provide a better composite predictor. We found that the higher the ZNF503/GATA3 ratio, the worse the prognosis in TNBC at the 50% cut-off value (P = 0.0409, Figure 3G-I). The results indicated that the ZNF503/GATA3 ratio could be a novel prognostic biomarker in TNBC.
We compared the predictive effects of ZNF503, GATA3 and the ZNF503/GATA3 ratio on five-year survival in breast cancer patients. Both t-test and ROC analyses demonstrated that the ZNF503/GATA3 ratio had a stronger correlation with five-year survival than either gene alone in the TNBC subtype (P = 0.003, Table 1), but not in overall breast cancer or the non-TNBC subtype (Table 1).
To verify the accuracy of the ROC analysis, we calculated the best cut-off value and then performed the survival analysis. The best cut-off values were determined at the point on the ROC curve at which the Youden index was maximal. Among the 137 TNBC cases, the AUC value was 0.687 (sensitivity = 57.3%, specificity = 77.8%, Figure 4A), and the optimal cut-off value for the five-year prognosis was 0.182.
Applying this cut-off point, the high ZNF503/GATA3 ratio also predicted poor clinical survival in TNBC patients (P = 0.0409, Figure 4B). Interestingly, it was the same cut-off point as shown in Figure 3H.
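For readers unfamiliar with the AUC values reported above, the following Python sketch computes the area under the ROC curve through its rank interpretation: the probability that a randomly chosen patient with an event has a higher marker value than a randomly chosen patient without one, with ties counted as one half. The function name and the ratio values are hypothetical and are not derived from the TCGA cohort.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (event, non-event) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    # Each correctly ordered pair counts 1, each tied pair counts 0.5.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical Z/G ratios with five-year status (1 = event), for illustration only.
example_ratio_values = [0.30, 0.10, 0.25, 0.05, 0.20, 0.12, 0.08]
example_status = [1, 1, 1, 0, 0, 0, 0]
example_auc = roc_auc(example_ratio_values, example_status)
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of the two outcome groups.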

GATA3 and ZNF503 expression was negatively correlated in TMA database
To verify whether the expression of ZNF503 and GATA3 in breast cancer patients was consistent with the prediction, tissue microarrays were used for validation. A total of 202 well-annotated patient samples were collected, including 68 TNBC and 134 non-TNBC.
The results demonstrated that the expression of GATA3 was higher in non-TNBC than in TNBC (P < 0.05, Figure 5A), while the expression of ZNF503 was higher in TNBC (P < 0.0001, Figure 5B). Subsequently, we divided the 202 breast cancer samples into two groups, ER+ breast cancer (N = 101) and ER- breast cancer (N = 101). The results showed that GATA3 expression was lower in ER- breast cancer than in ER+ breast cancer (P < 0.0001, Figure 5C), whereas ZNF503 expression was higher in ER- breast cancer (P < 0.0001, Figure 5D).
In both non-TNBC and TNBC, the tumour samples with the highest GATA3 expression showed the lowest ZNF503 expression, whereas the samples with the lowest GATA3 expression showed the highest ZNF503 expression (Figure 6A-D, F-I). In non-TNBC, 70.9% of samples showed low GATA3 expression, whereas 93.3% showed low ZNF503 expression (P < 0.0001, Figure 6E). In TNBC, 94.1% of samples showed low GATA3 expression, whereas 54.4% showed low ZNF503 expression (P < 0.0001, Figure 6J). The results verified a negative correlation between the expression of GATA3 and ZNF503 in both non-TNBC and TNBC.
Figure 2. Differential expression of GATA3 (A) and ZNF503 (B) between ER+ and ER- breast cancer samples; differential expression of GATA3 (C) and ZNF503 (D) between TNBC and non-TNBC subtypes; heat maps of TCGA database analysis for ZNF503 and GATA3 expression in TNBC patients (E) and non-TNBC patients (F), generated with TBtools. ****P < 0.0001; ns, not significant.

Association of GATA3 and ZNF503, alone or in combination, with prognosis in breast cancer patients
We explored the relationship between these two genes and clinical characteristics of 202 breast cancer patients, and found that GATA3 and ZNF503 expression was significantly associated with ER, PR, HER2, and grade (Tables 2 and 3). ZNF503 expression was also significantly associated with age and stage (Table 3).
Of the 202 breast cancer samples, 129 had complete RFS information, including 15 TNBC and 114 non-TNBC cases. We explored the relationship between the two genes and clinical prognosis. GATA3 had predictive value only in overall breast cancer patients (P = 0.0045), but not in the non-TNBC or TNBC subtype (Figure 7A-C). The expression of ZNF503 had no relationship with survival in any breast cancer subtype (Figure 7D-F). However, we found that a high ZNF503-GATA3 index predicted poor survival in both overall breast cancer (P = 0.0127) and the TNBC subtype (P = 0.0079, Figure 7G-I). These results indicated that the combination of ZNF503 and GATA3 was a prognostic indicator in overall breast cancer and TNBC.
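The survival curves referred to above were produced with the Kaplan-Meier method named in the Statistical analysis section. A minimal Python sketch of the product-limit estimator is given here for orientation only; the function name and the follow-up data are hypothetical, and the sketch omits confidence intervals and the between-group significance test.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times: follow-up time per patient; events: 1 = recurrence observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after time t.
        at_risk -= n_leaving
    return curve


# Hypothetical follow-up times (years) and event flags, for illustration only.
example_times = [1, 2, 2, 3, 4]
example_events = [1, 1, 0, 1, 0]
example_curve = kaplan_meier(example_times, example_events)
```

Computing one such curve for each group defined by a cut-off (for example high versus low Z-G index) gives the pairs of curves compared in the survival figures.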

The prognostic value of ZNF503 combined with GATA3 for predicting breast cancer outcomes
We performed a univariate analysis of ZNF503, GATA3 and ZNF503-GATA3 index in 129 breast cancer patients to explore the relationship with clinical indicators (Table 4). The results demonstrated that GATA3 expression was related to ER, PR, grade, type and recurrence. The expression of ZNF503 had no relationship with these indicators, whereas the ZNF503-GATA3 index was related to ER, PR, grade and recurrence.
We compared the predictive effects of ZNF503, GATA3 and the ZNF503-GATA3 index on five-year survival. The AUC value of the ZNF503-GATA3 index reached 0.674 in the 129 breast cancer patients and 0.795 in the 15 TNBC patients, which was higher than the AUC value of ZNF503 or GATA3 alone (Table 5). In non-TNBC patients, the combination of the two genes did not differentiate patients with good survival from patients with poor survival (Table 5). Thereafter, we calculated the Youden index from the ROC curve data and plotted the survival curve based on the optimal Youden index. In overall breast cancer, the optimal cut-off value for the ZNF503-GATA3 index was 3.5 (sensitivity = 50%, specificity = 82.5%). Among the 15 TNBC cases, the optimal cut-off value for the ZNF503-GATA3 index was also 3.5 (sensitivity = 100%, specificity = 75%). Survival analysis at the optimal cut-off value confirmed that the ZNF503-GATA3 index could be a prognostic marker in breast cancer patients (P = 0.0037, Figure 8A) and in the TNBC subtype (P = 0.0015, Figure 8B). In summary, the combination of ZNF503 and GATA3 may improve predictive ability.

Discussion
In recent years, many prognostic markers for breast cancer have been studied (Petrelli et al. 2015). IHC4, which combines the ER, PR, HER2 and Ki67 markers, serves as an inexpensive prognostic test for patients with breast cancer (Yeo et al. 2015). Many novel markers for metastatic breast cancer have also been discovered, including gross cystic disease fluid protein-15 (GCDFP-15), mammaglobin, and GATA3 (Gown et al. 2016). Research on biomarkers of TNBC is ongoing.
GATA3 is a zinc finger transcription factor that regulates morphogenesis and differentiation in many tissues, including the breast (Byrne et al. 2017, Cimino-Mathews et al. 2013). GATA3 and ER are involved in a positive cross-regulatory loop and are frequently co-expressed in breast cancer. In some studies, GATA3 expression has been shown to be an independent predictor of overall and disease-free survival in ER+ breast cancer (Cakir et al. 2017); however, it is unclear whether it could be a predictor in ER- breast cancer.
In this study, through the analysis of TCGA data, we found that the expression of GATA3 was higher in ER+ breast cancer than in ER- breast cancer, which was consistent with the findings of other studies (Husni Cangara et al. 2021, Liu et al. 2014). Moreover, we found that the expression of GATA3 was lower in the TNBC group and could be used as a prognostic marker in overall breast cancer patients. However, the expression of GATA3 was not a prognostic biomarker in either the non-TNBC or the TNBC subtype. Several studies have shown that ZNF503 binds to GATA3 and that there is a negative regulatory relationship between the two genes (Shahi et al. 2015, Yin et al. 2019). We confirmed a degree of negative correlation between ZNF503 and GATA3 expression in the TCGA and TMA data. ZNF503 expression was higher in TNBC than in non-TNBC, while GATA3 expression was higher in non-TNBC than in TNBC. In ER+ breast cancer, GATA3 is co-expressed with ER in a positive cross-regulatory loop (Cakir et al. 2017), and thus GATA3 is highly expressed. GATA3 expression is regulated by other factors and plays an important role in TNBC progression (Kong et al. 2016). GATA3 lies downstream of BRCA1, which interacts with the enhancer of zeste homolog 2 (EZH2) and DNA methyltransferases to control GATA3 transcription. Moreover, GATA3 expression suppresses TNBC metastasis by inhibiting lysyl oxidase (El-Arabey and Abdalla 2022). ZNF503 has been found to drive tumour development and progression in lung cancer (Qi and Li 2020), breast cancer (Shahi et al. 2017) and hepatocellular carcinoma (Matthew et al. 2019). ZNF503 could be a potential candidate gene for future cancer diagnostic and therapeutic strategies. However, the prognostic value of ZNF503 has not yet been investigated in breast cancer. In this study, we envisioned ZNF503 as a prognostic molecule in breast cancer and evaluated its potential using the TCGA and TMA data.
The results showed no significant difference in ZNF503 expression between the TNBC and non-TNBC subtypes in the TCGA database. However, in the TMA data, ZNF503 expression was higher in TNBC than in the non-TNBC subtype. We speculated that this difference was due to the inconsistency between the RNA level and the protein level. Nevertheless, ZNF503 alone was not found to be a prognostic biomarker.
Various traditional molecular prognostic markers and prognostic gene expression signatures for patients with breast cancer have been identified (Lal et al. 2017). Gene signatures are sets of genes that together can predict clinical outcomes. The 21-gene OncotypeDx assay was developed for early-stage ER+ breast cancer (Paik et al. 2004). A two-gene expression ratio of HOXB13 and IL17BR, the HOXB13:IL17BR ratio, was associated with a high risk of recurrence in patients with primary breast cancer (Jansen et al. 2007, Ma et al. 2004). The continuous identification of relevant prognostic biomarkers and signatures has improved the prediction of prognosis and guided the use of anti-cancer therapy in breast cancer. In this study, since GATA3 and ZNF503 alone were not effective prognostic biomarkers, we combined the two genes to predict breast cancer prognosis. In the TCGA database, we found that the ratio of ZNF503 to GATA3 expression could be a prognostic marker only in TNBC. In the TMA data, the ZNF503-GATA3 index could be used to predict the prognosis of overall breast cancer and TNBC patients (Figure 9). The main limitation of this study was the sample size. Although the results are meaningful, the predictions based on 15 TNBC samples may not be sufficiently accurate. More extensive clinical data are needed to evaluate the prognostic value of the ZNF503-GATA3 index.
Previous studies have shown that ZNF503 binds to the promoter region of GATA3 as a transcriptional repressor in association with ZBTB32 (Repressor of GATA) and then down-regulates GATA3 expression (Shahi et al. 2017). In this study, we only investigated the expression of ZNF503 and GATA3. For further validation and exploration of the mechanisms, large cohort studies and subsequent functional studies are necessary.
To our knowledge, this is the first study to assess the combination of ZNF503 and GATA3 as a prognostic factor in TNBC. It could be applied to distinguish the prognosis of TNBC patients and to inform therapeutic strategies. However, given the small sample size and the lack of functional studies, further work is warranted to consolidate the role of ZNF503 and GATA3 in TNBC. In the next five years, many potential prognostic biomarkers will be discovered. The combination of ZNF503 and GATA3 could act as a complementary biomarker alongside other identified biomarkers in clinical practice.
In general, the predictive power of ZNF503/GATA3 ratio in TNBC patients from TCGA was better than that of ZNF503 and GATA3 expression alone. In TMA database, ZNF503-GATA3 index predicted TNBC survival. In conclusion, the combination of ZNF503 and GATA3 could be a useful prognostic factor in triple-negative breast cancer.

Disclosure statement
No potential conflict of interest was reported by the authors.

Data availability statement
The data that support the findings of this study are available from the corresponding author upon reasonable request.