Major osteoporosis fracture prediction in type 2 diabetes: a derivation and comparison study

The widely recommended fracture prediction tool FRAX was developed based on and for the general population. Although several adjusted FRAX methods have been suggested for type 2 diabetes (T2DM), they still need to be evaluated in T2DM cohorts. This study was undertaken to develop a prediction model for Chinese diabetes fracture risk (CDFR) and to compare its performance with that of FRAX. In this retrospective cohort study, 1730 patients with T2DM were enrolled from August 2009 to July 2013. Major osteoporotic fractures (MOFs) during follow-up were collected from electronic health records (EHRs) and telephone interviews. Multivariate Cox regression with backward stepwise selection was used to fit the model. The performances of the CDFR model, FRAX, and adjusted FRAX were compared in terms of discrimination and calibration. Overall, 6.3% of participants experienced an MOF during a median follow-up of 10 years. The final model (CDFR) included 8 predictors: age, gender, previous fracture, insulin use, diabetic peripheral neuropathy (DPN), total cholesterol, triglycerides, and apolipoprotein A. This model had a C statistic of 0.803 (95%CI 0.761–0.844) and a calibration χ² of 4.63 (p = 0.86). The unadjusted FRAX underestimated the MOF risk (calibration χ² 134.5, p < 0.001; observed/predicted ratio 2.62, 95%CI 2.17–3.08), and there was still significant underestimation after diabetes adjustments. Compared with FRAX, the CDFR had a higher AUC, a lower calibration χ², and better reclassification of MOF. The CDFR model has good performance in 10-year MOF risk prediction in T2DM, especially in patients with insulin use or DPN. Future work is needed to validate our model in external cohort(s).


Introduction
Xiao-ke Kong and Zhi-yun Zhao contributed equally to this work.
* Wei-qing Wang wqingw@163.com * Jian-min Liu ljm10586@rjh.com.cn
Type 2 diabetes (T2DM) is a rapidly growing public health problem. In China, the prevalence of T2DM in the adult population is approximately 11% [1]. Meanwhile, the aging population and increasing life expectancy have further increased the global burden of osteoporosis, especially in China [2]. The pathophysiological interaction between T2DM and osteoporosis is complex. T2DM affects the bone mineral density (BMD) and bone quality, while certain antidiabetic medications also affect bone metabolism, and there is an association between chronic diabetic complications and the risk of falls and subsequent fractures [3]. It should be noted that diabetic patients have impaired bone quality despite displaying a normal or even high BMD [4,5]. However, screening or targeting T2DM patients at high fracture risk is a challenge due to the inconsistency among BMDs, fracture risk, various chronic diabetic complications and antidiabetic medications [6,7]. The use of risk calculators has greatly facilitated the management of chronic diseases since the development of the first cardiovascular risk assessment equation in the 1970s [8]. In the field of fracture prediction, the most widely used tool is FRAX, which is an online tool recommended by the WHO for predicting the 10-year probability of major osteoporotic fracture (MOF) and hip fracture (HF). Fracture risk is calculated from easily assessed clinical risk factors for fracture and (optionally) the femoral neck BMD. The FRAX tool is country-specific and is currently calibrated for over 60 countries [9]. The international community has reinforced its role in guiding treatment decisions [10]. However, on the one hand, patients with T2DM have increased fracture risk despite the presence of normal or high BMDs, indicating that BMDs are not sensitive enough to predict fracture in T2DM [11]. On the other hand, the FRAX does not include T2DM and diabetes-related variables and underestimates the fracture risk in T2DM [12].
Thus, in recent years, several adjustments to the FRAX have been proposed to solve this limitation, such as selecting rheumatoid arthritis (RA) as an equivalent replacement of T2DM, increasing the input age by 10 years, and reducing the BMD T score by 0.5, which produce similar effects on improving the accuracy of FRAX in T2DM [13]. However, to what extent the FRAX score underestimates the fracture risk and whether these adjusted methods could correct for the underestimation of FRAX scores in Chinese patients with T2DM still remain unclear.
In recent years, researchers in many countries have attempted to construct predictive tools for fractures in patients with T2DM, for example, the simple risk prediction tool for HF based on an Australian T2DM cohort [14] and the scoring assessment tool for vertebral fracture derived from a Japanese T2DM cohort [15]. It must be emphasized that T2DM is an independent clinical condition associated with an increased risk of fracture and is even independent of the BMD and components of the FRAX tool [16], while its related factors, such as disease duration, chronic complications, poor glycemic control and the use of insulin, are all associated with increased fracture risk [17]. This suggests that in addition to verifying the accuracy of the FRAX tool in T2DM in various countries, it is paramount to introduce diabetes-related variables into the fracture prediction model for patients with T2DM.
Therefore, the purpose of this study is to accomplish the following aims: 1) evaluate and compare the predictive value between the FRAX and diabetes-adjusted FRAX for MOF risk in T2DM patients; 2) derive and internally validate a prediction model for Chinese diabetes fracture risk (CDFR) based on clinically accessible variables in the management of T2DM; 3) compare the performance of the CDFR against the FRAX in T2DM patients with different MOF risks.

Participants
This was a retrospective cohort study. The clinical information of patients with T2DM who were hospitalized in the Department of Endocrine and Metabolic Diseases, Ruijin Hospital, Shanghai Jiao-tong University School of Medicine between August 2009 and July 2013 was extracted from electronic health records (EHRs). Each patient had a unique identification number and the date of initial admission was defined as the baseline date. At baseline, patients with primary or secondary hyperparathyroidism, end-stage renal disease (eGFR < 30 ml/min/1.73 m²) and malignant tumors were excluded. The outcomes of 1876 patients were retrospectively collected through telephone interviews and then double-checked according to the EHRs. Ultimately, a total of 1730 participants (92.2%) completed the retrospective survey with a median follow-up of 10 years [interquartile range (IQR) 9-11]. The flow chart is shown in Supplementary Figure S1.
Our study was approved by the Ethics Committee of Ruijin Hospital, Shanghai Jiao-tong University School of Medicine. This research has been retrospectively registered in the Chinese Clinical Trials Registry (ChiCTR2100050913).

Baseline variables
The standardized self-administered questionnaires pertaining to sociodemographic characteristics, medication records, and any medical or surgical histories were conducted by trained residents. Height was measured to the nearest 0.1 cm, and weight was recorded to the nearest 0.1 kg while participants were wearing lightweight clothing. The body mass index (BMI) was calculated as the body weight divided by the squared height (kg/m²). Diabetes duration was self-reported by the patients, and insulin use was identified based on the patients' hypoglycemic agents at discharge in the EHRs. A history of hypoglycemia was defined by typical symptoms such as hunger, palpitations and sweating with or without documented blood glucose levels lower than 3.9 mmol/l during the treatment of diabetes.
All participants were questioned regarding neurologic symptoms and examined for neurologic signs. Diabetic peripheral neuropathy (DPN) was identified by both at least one neurologic symptom/sign and an abnormality of peripheral nerve conduction velocity [18]. Diabetic retinopathy was diagnosed based on fundus examination (any grade detected by ophthalmoscopy and/or ophthalmologist assessment). Osteoporosis was diagnosed based on BMDs measured by dual-energy X-ray absorptiometry (DXA, Lunar Expert-1313, Lunar Corp, Madison, WI) and prior medical history, according to the expert consensus in China [19]. Chronic-disease history was self-reported or diagnosed in the medical records, including hypertension, coronary heart disease, and cerebrovascular disease. Medication use (including insulin, oral hypoglycemic drugs, lipid-lowering medication, and antihypertensive medication) was obtained from the EHRs.
Patients fasted for at least 10 h before morning blood collection. Glycated hemoglobin A1c (HbA1C) was measured using a hemoglobin testing system (Variant II, Bio-Rad, Hercules, CA). Total serum protein, serum albumin, total cholesterol, triglycerides, LDL, HDL, apolipoprotein A, apolipoprotein B and creatinine levels were measured using an automatic biochemical analyzer (Modular E170, Roche, Basel, Switzerland). The estimated glomerular filtration rate (eGFR) was calculated according to the CKD-EPI equation [20].
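The eGFR calculation can be illustrated with a minimal sketch. It assumes the 2009 creatinine-based CKD-EPI equation without the race coefficient and serum creatinine in mg/dL; the text only cites "the CKD-EPI equation [20]", so the exact variant used in the study is an assumption here, and the function names are hypothetical.

```python
def umol_per_l_to_mg_per_dl(scr_umol_l: float) -> float:
    """Convert serum creatinine from umol/L to mg/dL (1 mg/dL = 88.4 umol/L)."""
    return scr_umol_l / 88.4


def ckd_epi_2009(scr_mg_dl: float, age_years: float, female: bool) -> float:
    """eGFR in mL/min/1.73 m^2 (assumed 2009 CKD-EPI creatinine equation, race term omitted)."""
    kappa = 0.7 if female else 0.9          # sex-specific creatinine threshold
    alpha = -0.329 if female else -0.411    # sex-specific exponent below the threshold
    ratio = scr_mg_dl / kappa
    egfr = (141.0
            * min(ratio, 1.0) ** alpha
            * max(ratio, 1.0) ** -1.209
            * 0.993 ** age_years)
    return egfr * 1.018 if female else egfr


def meets_renal_exclusion(egfr: float) -> bool:
    """Exclusion criterion stated in the Participants section: eGFR < 30."""
    return egfr < 30.0
```

Under these assumptions, a patient would be flagged for exclusion by `meets_renal_exclusion(ckd_epi_2009(...))` when the estimated value falls below 30 ml/min/1.73 m².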

Outcome
The primary outcome of the study was MOF (hip, clinical spine, proximal humerus, and forearm). Fractures occurring before or within the first 30 days of follow-up were considered previous fractures. Multiple open fractures or fractures from traffic accidents were not considered MOFs. MOFs were identified by self-reporting by telephone and/or radiographic documents in the EHRs. All disputed fractures were decided after discussion by a panel of experts.

FRAX and adjusted FRAX
The 10-year MOF risk was calculated using the China FRAX® model with or without the femoral neck BMD T-score. We compared three adjustments intended to enhance the performance of the FRAX in the presence of BMD. First, the FRAX was tested with RA input as a proxy for the effect of T2DM; second, the age input to the FRAX was increased by 10 years; and third, the femoral neck T-score input to the FRAX was reduced by 0.5 SD. In the absence of the BMD, we compared two adjustments intended to enhance the performance of the FRAX, including the RA adjustment and age adjustment [13].
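The three adjustments only change the inputs passed to FRAX, which can be sketched as follows. The `FraxInputs` container and the function names are hypothetical and cover only the fields touched by the adjustments; the FRAX calculation itself is not reproduced.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional


@dataclass(frozen=True)
class FraxInputs:
    """Hypothetical subset of FRAX inputs relevant to the diabetes adjustments."""
    age: int
    rheumatoid_arthritis: bool
    femoral_neck_t_score: Optional[float] = None  # None means FRAX without BMD


def ra_adjusted(x: FraxInputs) -> FraxInputs:
    """Tick the RA input as a proxy for the effect of T2DM."""
    return replace(x, rheumatoid_arthritis=True)


def age_adjusted(x: FraxInputs) -> FraxInputs:
    """Increase the age input by 10 years."""
    return replace(x, age=x.age + 10)


def t_score_adjusted(x: FraxInputs) -> FraxInputs:
    """Reduce the femoral neck T-score input by 0.5 SD (requires a BMD)."""
    if x.femoral_neck_t_score is None:
        raise ValueError("T-score adjustment needs a femoral neck T-score")
    return replace(x, femoral_neck_t_score=x.femoral_neck_t_score - 0.5)


def adjusted_variants(x: FraxInputs) -> Dict[str, FraxInputs]:
    """Return the input sets compared in the study: two without BMD, three with BMD."""
    variants = {"unadjusted": x, "RA": ra_adjusted(x), "age+10": age_adjusted(x)}
    if x.femoral_neck_t_score is not None:
        variants["T-score-0.5"] = t_score_adjusted(x)
    return variants
```

Each variant would then be entered into the China FRAX model separately, so that the adjusted and unadjusted predictions can be compared for the same patient.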

Statistical analysis
All analyses were conducted using the IBM SPSS Statistics 25 (version 9.2) and R (version 4.0.5). Continuous variables are presented as the mean ± SD for normal distributions and as the median (IQR) for skewed distributions. Independent-samples t tests and Mann-Whitney U tests were used for comparisons between two groups. Categorical variables are expressed as percentages, and chi-square (χ²) tests were used to compare them between two groups. The data were substantially complete for most variables, except for some biochemical variables; however, no single variable was missing in more than 3% of participants. Missing values were imputed with the mean value of the corresponding variable. Sensitivity analyses were carried out using only data without imputed values.
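The handling of missing data can be sketched as below, assuming missing entries are represented as `None`; the function names are hypothetical and the study's own implementation in SPSS or R is not shown in the text.

```python
from statistics import fmean
from typing import Dict, List, Optional


def impute_mean(values: List[Optional[float]]) -> List[float]:
    """Replace missing entries (None) with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a variable with no observed values")
    mean_value = fmean(observed)
    return [mean_value if v is None else v for v in values]


def complete_cases(rows: List[Dict[str, Optional[float]]]) -> List[Dict[str, Optional[float]]]:
    """Keep only rows with no missing value, as in the sensitivity analysis."""
    return [row for row in rows if all(v is not None for v in row.values())]
```

Mean imputation leaves the variable's mean unchanged but shrinks its variance, which is one reason a complete-case sensitivity analysis is a useful check.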
Cox regression was used to estimate the coefficients and hazard ratios associated with the first MOF event. The assumptions of proportional hazards were checked using log-log curves. The prediction model was constructed using a backward stepwise selection procedure based on the p value, effect size, and clinical importance of the variables. Interactions between variables were tested, and any interaction terms with p < 0.05 or statistically significant integrated discrimination improvement (IDI) values were included in the model [21]. Model performance was evaluated by discrimination using the C statistic [22] and by calibration using the Hosmer-Lemeshow goodness-of-fit test [23]. The final model was internally validated by bootstrap resampling 1000 times to correct for overoptimism in the discrimination measures [24]. According to the MOF risk scores calculated by the final model, participants were divided into three categories (< 10%, 10-20%, and > 20%), which were based on clinically meaningful cut-points [25]. The cumulative incidence curve was drawn using the Kaplan-Meier method, and any differences were evaluated with a stratified log-rank test.
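To make the discrimination measure concrete, the following is a minimal sketch of a concordance (C) statistic for right-censored data. It assumes Harrell's definition, ignores pairs with tied follow-up times, and uses a hypothetical function name; the text cites [22] without specifying the exact estimator.

```python
from typing import Sequence


def harrell_c(times: Sequence[float], events: Sequence[bool], risks: Sequence[float]) -> float:
    """Fraction of usable pairs in which the earlier-event subject has the higher predicted risk."""
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is anchored on a subject with an observed event
        for j in range(n):
            if times[i] < times[j]:  # subject j was still event-free when i fractured
                usable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable
```

A value of 0.5 corresponds to predictions no better than chance, and 1.0 to perfect ranking of who fractures first.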
In addition to the area under the receiver-operating characteristic curve (AUC), the categorical net reclassification index (NRI) [26] and IDI [21] were calculated to evaluate the degree of improvement in discrimination. Calibration charts were drawn in deciles of predicted risk to illustrate differences in calibration performance. The observed 10-year MOF probability was derived from the cumulative incidence function (CIF) [27]. In general, an observed MOF risk within 10% of the predicted risk (an observed-to-expected calibration ratio between 0.90 and 1.10) was considered to represent good calibration [28].
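The calibration criterion above can be written as a short sketch. The function names are hypothetical, and the `underestimation_fraction` helper is an assumption inferred from the pairing of "over 67%" with an observed/predicted ratio "over 3" in the subgroup results, not a definition given in the text.

```python
def calibration_ratio(observed_risk: float, mean_predicted_risk: float) -> float:
    """Observed 10-year MOF incidence divided by the mean predicted 10-year MOF risk."""
    if mean_predicted_risk <= 0:
        raise ValueError("mean predicted risk must be positive")
    return observed_risk / mean_predicted_risk


def is_well_calibrated(ratio: float, tolerance: float = 0.10) -> bool:
    """Good calibration: ratio between 0.90 and 1.10 for the default tolerance."""
    return 1.0 - tolerance <= ratio <= 1.0 + tolerance


def underestimation_fraction(ratio: float) -> float:
    """Assumed definition: share of the observed risk that the prediction misses."""
    return 1.0 - 1.0 / ratio
```

For instance, an observed risk of 6.3% against a mean predicted risk of 6.8% gives a ratio of about 0.93, which lies inside the 0.90 to 1.10 band, whereas a ratio of 2.62 lies far outside it.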

Participant characteristics and outcomes
The 1730 eligible participants had a mean age of 55.1 ± 11.9 years, and 66% were males. During a median follow-up of 10 years, 109 (6.3%) participants experienced at least one MOF, and the fracture site with the highest incidence was the hip (39%) (Supplementary Table S1). The incidence of MOF increased with age in both genders; however, the prevalence of MOF in females [13% (95%CI, 10.2-15.7%)] was significantly higher than that in males [2.9% (95%CI, 1.9-3.9%)] (P < 0.001) (Supplementary Figure S2). The baseline characteristics of those who experienced an MOF during follow-up and those who did not (N-MOF) are shown in Table 1. Age, diabetes duration, HDL, apolipoprotein A, and the percentages of females, DPN, insulin use, previous fracture, hypertension and osteoporosis were significantly higher in the MOF group than in the N-MOF group; however, the triglycerides and

Model development and internal validation
Age, gender, previous fracture, DPN, insulin use, total cholesterol, triglycerides, and apolipoprotein A were selected by backward stepwise selection in multivariate Cox regression. As shown in Table 2, although current smoking, current drinking, diabetes duration, eGFR category, HDL, hypertension, and osteoporosis were significantly correlated with MOF in univariate Cox regression (p < 0.05), none had a significant effect in multivariate Cox regression (p > 0.05). Moreover, available interaction terms between age and other variables (including insulin use, DPN and apolipoprotein A) were also added to the model as covariates (the parameters used for the final model are shown in Table 3). For example, for a woman aged 68 years with a total cholesterol of 4.14 mmol/L, a triglyceride level of 2.39 mmol/L, an apolipoprotein A of 1.14 g/L, insulin use, DPN, and no previous fracture, the 10-year MOF risk predicted by the final model was 23.1%. The CDFR predicted a mean 10-year MOF risk of 6.8% compared with the observed risk of 6.3% (95%CI 5.2-7.4%), and the calibration ratio was 0.93 (95%CI 0.76-1.09). The C statistic of the CDFR model was 0.803 (95%CI 0.761-0.844), while the calibration χ² was 4.63 (p = 0.86), indicating excellent goodness of fit for our model. The C statistic of the alternative model developed using complete-case data was 0.795 (95%CI 0.754-0.836), which was similar to that of the model with imputation. The optimism-corrected C statistic was 0.79 after bootstrap resampling repeated 1000 times, suggesting good internal consistency. By using CDFR-predicted MOF risks of 10% and 20% as cutoff values, 78.1% of participants were classified as low-risk, 14.6% as medium-risk, and 7.3% as high-risk. As shown in Supplementary Figure S3, there was a statistically significant difference in the cumulative MOF incidence among the three groups (P < 0.001).
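The way a Cox model of this kind turns a linear predictor into a 10-year risk and then into a risk category can be sketched as follows. The formula follows the note to Table 3 (one minus the baseline 10-year survival raised to the exponential of the individual sum minus the mean sum). The function names are hypothetical, the Table 3 coefficients are not reproduced here, and the handling of a risk of exactly 10% or 20% is an assumption.

```python
import math


def ten_year_risk(baseline_survival: float, individual_sum: float, mean_sum: float) -> float:
    """Risk = 1 - S10 ** exp(individual sum of coefficient x value - mean sum)."""
    return 1.0 - baseline_survival ** math.exp(individual_sum - mean_sum)


def risk_category(risk: float) -> str:
    """Categories from the Statistical analysis section: < 10%, 10-20%, > 20%."""
    if risk < 0.10:
        return "low"
    if risk <= 0.20:
        return "medium"  # assumption: the boundary values fall in the middle band
    return "high"
```

A patient whose individual sum equals the cohort mean sum has a predicted risk of exactly one minus the baseline survival, and the worked example of 23.1% in the text would fall in the high-risk category under these cut-points.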
As shown in Table 4, the AUC was 0.752 for the unadjusted FRAX, 0.749 for the RA-adjusted FRAX, 0.759 for the age-adjusted FRAX, and 0.803 for the CDFR. The AUCs of the FRAX with RA adjustment and the FRAX with age adjustment were similar to that of the unadjusted FRAX (p > 0.05), while the AUC of the CDFR was significantly higher than that of the unadjusted FRAX (p < 0.001). The overall predictive power of the FRAX for the 10-year MOF risk was improved after the RA adjustment (NRI: 4.88%;
Table 3 note: 10-year MOF risk = $1 - S_{10}^{\exp(\mathrm{Ind}\,X'B - \mathrm{Mean}\,X'B)}$, where $S_{10}$ is the survival rate for MOF at 10 years ("Baseline survival" in Table 3), Ind X'B is the "Individual sum" in Table 3, and Mean X'B is the mean "Coefficient × Value" sum, which is shown as "Mean (coefficient × value)" in Table 3.
(Fig. 1a). The calibration χ² was 63.3 (p < 0.001) for the RA-adjusted FRAX, 63.7 (p < 0.001) for the age-adjusted FRAX, and 4.63 (p = 0.86) for the CDFR. Although all models showed better calibration than the unadjusted FRAX, only the calibration plot of the CDFR showed no substantial differences between the observed and predicted rates of MOF according to the risk deciles (Fig. 1d), while the RA-adjusted FRAX and age-adjusted FRAX both still significantly underestimated risk in the top three deciles of predicted MOF risk (Fig. 1b, c). There was no significant difference in the 10-year incidence of MOF between patients who had BMD screening and those who did not (p > 0.05) (Supplementary Figure S5). As shown in Supplementary Tables S2, S3 and Supplementary Figure S4, the FRAX showed no improvement in discrimination or calibration for MOF prediction when the BMD was added. In the presence of the BMD, the AUC of the FRAX did not increase significantly after any adjustment method (p > 0.05), and all three adjusted FRAX models still underestimated the 10-year MOF risk.
The cumulative incidence plots revealed that patients with T2DM who had one or more of the characteristics including older age, longer course of diabetes, female gender, insulin use or DPN were at higher risk of MOF (Supplementary Figure S5). The AUCs of the CDFR in the subgroups were higher than those of the FRAX, while the AUCs of the FRAX scores in the subgroups did not improve significantly after RA adjustment (Supplementary Table S4). As shown in Fig. 2 and Supplementary Table S5, the CDFR showed no significant miscalibration across the different subgroups. However, the FRAX significantly underestimated the 10-year MOF risk in all subgroups, and the degree of underestimation was over 67% (observed/predicted ratio was over 3) in subjects with ages older than 60 years, diabetes duration over 10 years, female gender, insulin use or DPN. Even with the RA adjustment, significant underestimation still occurred for subgroups except those younger than 50 years, those with a diabetes duration below 5 years and males, with the degree
Fig. 1 Calibration plots of the observed vs. predicted 10-year MOF risk by decile. Calibration ratio: the ratio of the observed 10-year MOF incidence to the predicted mean 10-year MOF risk; overall underestimation ratio =

Discussion
Using longitudinal data from EHRs and telephone interviews, we developed a model for 10-year MOF risk prediction in T2DM (CDFR) based on readily accessible clinical variables. The CDFR had a better performance in MOF risk prediction than the FRAX. Our study also demonstrated that the unadjusted FRAX significantly underestimated the MOF risk, and diabetes-adjusted FRAX models still failed to correct for this underestimation, especially in patients with older ages, female gender, longer courses of diabetes, DPN, and insulin use.
All of the variables in our model are included in routine assessments of patients with T2DM. Age, gender and previous fracture, which are classical risk factors included in the FRAX [29], are also major predictors of MOF in patients with T2DM. We did not construct separate models by gender since there were no significant interactions between other variables and gender. Originally, T2DM and its related factors were not considered predictors when the FRAX was developed [29]. However, subsequent studies repeatedly suggested an increased risk of MOF in diabetes independent of the FRAX score and BMDs [16]. A longer duration of T2DM and poor glycemic control were considered to be associated with higher fracture risk [30,31]. A recent study of more than 150,000 people showed that the longitudinal 2-year HbA1c was independently associated with an elevated fracture risk in T2DM individuals during a 2-year follow-up period [32]. However, we did not find a correlation between the baseline HbA1c and 10-year MOF risk. This is not unexpected, because HbA1c reflects glycemic control only within the most recent 2-3 months, while fractures occur over a long period of time. Thus, measuring HbA1c multiple times and calculating the average levels during a 10-year period might be more effective than using the baseline HbA1c alone to explore the relationship between glycemic control and the 10-year MOF risk. In addition, in recent years, several studies have demonstrated that diabetic complications, insulin use and hypoglycemia were associated with higher fracture risk [33-35]. However, these substantially elevated risks are not captured by FRAX [36]. It should be noted that patients with T2DM who undergo insulin treatment often have a long diabetes duration and multiple chronic complications of T2DM, which increases the risk of hypoglycemia. All of these factors increase the risk of falls and post-fall fractures [37].
In our study, it was also likely that the effects of the diabetes duration and hypoglycemia were covered by highly correlated variables such as insulin use and DPN in the final model. It is well recognized that patients with T2DM often display obesity, high BMI, and dyslipidemia [38]. Obesity and high BMI have been considered as protective factors against fracture, as they were associated with high bone mass in previous studies [39], while recent studies have shown that dyslipidemia was associated with bone loss and increased bone brittleness [40]. The Tromsø study showed that nonfasting HDL levels increased the risk of fracture in both men and women [41], while another prospective study in America found that midlife women with high fasting plasma triglyceride levels had an increased risk of incident low-traumatic fracture [42]. In our study, after accounting for the influences and interactions of other variables in the multivariate Cox regression model, apolipoprotein A and triglycerides were positively correlated with the 10-year MOF risk, while total cholesterol was negatively correlated with the 10-year MOF risk. Our study also showed that the BMI was negatively correlated with the 10-year MOF risk by univariate Cox regression. However, due to the complex glucolipid metabolic environment in patients with T2DM, the role of a low BMI was not significant, and it was covered by lipid metabolism parameters (total cholesterol, triglycerides, and apolipoprotein A) in our model predicting the 10-year MOF risk.
The FRAX, which uses common clinical risk factors to calculate fracture risk, was developed to compensate for BMD limitations. The FRAX can be used without the BMD for identifying individuals at higher risk of fracture within the general population, while BMD tests are reserved for those close to a probability-based intervention threshold [43]. Similar to previous studies, although the FRAX provided good discrimination in patients with T2DM, the absolute risk of MOF was significantly underestimated in our study. Several methods for adjusting the FRAX have been proposed to address the systematic underestimation of fracture risk in T2DM. Our study showed that RA and age adjustments could improve the calibration of the FRAX to some extent, but the 10-year risk of MOF was still significantly underestimated in T2DM, especially in older subgroups and those with longer courses of diabetes, female gender, insulin use and DPN. It is worth noting that the FRAX conservatively assumes that the fracture risk in secondary osteoporosis (such as in premature menopause and chronic liver disease) is mediated by a low BMD, but in the absence of the BMD, the risk ratios for these secondary causes are assumed to be similar to the risk associated with RA. If patients have T2DM concomitant with RA or other causes of secondary osteoporosis, the RA adjustment for the FRAX is not appropriate [43]. Using the age adjustment might be less desirable in older individuals as the effect of competing mortality may paradoxically reduce the fracture probability [13].
In the general population, measuring the BMD adds predictive value to the FRAX [9]. Although a lower BMD remains a risk factor in T2DM [44], an increased risk of T2DM fracture is not always associated with a decreased BMD [45]. This paradox suggests that the increased fracture risk in T2DM is not fully captured by the DXA-based BMD [7]. Our study also showed that the addition of the BMD did not significantly improve the predictive power of the FRAX in T2DM. Even in the presence of the BMD, the three adjusted FRAX models still underestimated the MOF risk. The emerging conclusion is that the FRAX with or without the BMD and their adjustments might be inappropriate for 10-year MOF risk prediction in Chinese patients with T2DM.
The main advantage of our study was that patients with T2DM were routinely examined for glucose and lipid metabolism and screened for complications of diabetes during admission. Another advantage was that the primary outcome event, MOF, was captured during up to 10 years of follow-up. Recently, the management of T2DM has become much more standardized even in developing countries such as China; however, patients with T2DM often experience fractures despite high BMD values, especially older women who have a longer course of diabetes accompanied by DPN and insulin therapy. Moreover, because the FRAX underestimates fracture risk in these patients, their high fracture risk is often overlooked in clinical practice. This is very unfavorable for the early prevention and control of osteoporotic fractures in T2DM patients. In this study, the CDFR model was constructed by introducing diabetes-related variables on the basis of FRAX. Compared with the adjusted FRAX, the MOF risk predicted by the CDFR was closer to the actual risk, especially in patients with older ages, female gender, longer courses of diabetes, DPN, and insulin use. This means that the CDFR may be more helpful in compensating for the systematic underestimation by the FRAX in T2DM patients than the methods that adjust only the input variables.
Nevertheless, there are several limitations in our study. First, the CDFR was developed based on patients with T2DM at a single center, and the lack of an external validation cohort limits the generalizability of the model to other regions. Second, further studies are needed to determine whether these results are applicable to all T2DM patients. Third, this retrospective study might inevitably have recall bias, although the investigators tried their best to help patients recall the details of fractures during telephone interviews and then searched the radiographic documents in the EHRs to double-check the reports. Fourth, during follow-up, information about the dosage and duration of anti-osteoporosis drugs and other drugs that influence bone metabolism was not obtained, which might affect the evaluation of fracture risk. Finally, whether use of the CDFR translates into better outcomes or a decrease in MOF risk needs to be verified in prospective studies.

Conclusion
Using 10-year records of MOFs in patients with T2DM, our study developed the CDFR model, which showed a good ability to predict the 10-year MOF risk in T2DM. The application of this tool will help to compensate for the FRAX's underestimation of fracture risk in T2DM, especially in older women who have a longer course of diabetes accompanied by DPN and insulin therapy. This will be of great significance for the early prevention and management of T2DM patients with high fracture risk. Future work is needed to validate our model in external cohort(s).