Towards individualized dose constraints: Adjusting the QUANTEC radiation pneumonitis model for clinical risk factors.

Abstract Background. Understanding the dose-response of the lung in order to minimize the risk of radiation pneumonitis (RP) is critical for optimization of lung cancer radiotherapy. We propose a method to combine the dose-response relationship for RP in the landmark QUANTEC paper with known clinical risk factors, in order to enable individual risk prediction. The approach is validated in an independent dataset. Material and methods. The prevalence of risk factors in the patient populations underlying the QUANTEC analysis was estimated, and a previously published method to adjust dose-response relationships for clinical risk factors was employed. Effect size estimates (odds ratios) for risk factors were drawn from a recently published meta-analysis. Baseline values for D50 and γ50 were found. The method was tested in an independent dataset (103 patients), comparing the predictive power of the dose-only QUANTEC model and the model including risk factors. Subdistribution cumulative incidence functions were compared for patients with high/low-risk predictions from the two models, and concordance indices (c-indices) for the prediction of RP were calculated. Results. The reference dose-response relationship for a patient without pulmonary co-morbidities or a caudally located tumor, with no history of smoking, younger than 63 years, and receiving no sequential chemotherapy was estimated as $D_{50}^{0}$ = 34.4 Gy (95% CI 30.7, 38.9), $\gamma_{50}^{0}$ = 1.19 (95% CI 1.00, 1.43). Individual patient risk estimates were calculated. The cumulative incidences of RP in the validation dataset were not significantly different in high/low-risk patients when doing risk allocation with the QUANTEC model (p = 0.11), but were significantly different using the individualized model (p = 0.006). C-indices were significantly different between the dose-only and the individualized model. Conclusion. This study presents a method to combine a published dose-response function with known clinical risk factors and demonstrates the increased predictive power of the combined model. The method allows for individualization of dose constraints and individual patient risk estimates.


Background
Understanding the dose-response of normal tissue is of critical importance for the optimization of radiotherapy (RT). Radiation-induced pneumonitis (RP) remains a dose-limiting toxicity in the treatment of lung cancer patients, and hence identification of relationships between dose and the probability of RP is vital if constraints on the safely deliverable radiation dose are to be determined. Many studies have proposed predictive models for the risk of developing RP after curative RT [1]. Although progress has been made [2], the generalizability of such models remains a considerable challenge. Even when sophisticated statistical techniques are employed in order to avoid over-fitting, model results may not be generalizable to different patient cohorts [3].
A major step forward for the understanding of the dose-dependence of radiation-induced lung toxicity was the organ-specific paper from the Quantitative Analysis of Normal Tissue Effects in the Clinic (QUANTEC) initiative [1], which performed a collective analysis of a number of published studies. The authors related mean dose to the lungs (MLD) to the risk of RP for non-small cell lung cancer (NSCLC) patients, and provided a quantitative estimate of the dose-response relationship. This represents an impressive and important undertaking, which will aid both clinical dose planning and design of future RT studies. However, the published dose-response relationship is an estimate of the risk in an 'average' NSCLC patient. Prediction of the risk of RP for individual patients is still fraught with uncertainties, and this may be part of the reason for the limited predictive power of even the best published models. Patient-, disease- and treatment-related factors (which we will collectively refer to as clinical risk factors) are well known to influence the risk of developing RP. The quantitative effects of such risk factors have recently been summarized in a large meta-analysis [4]; but even though a number of individual studies have examined the specific effects of various clinical factors on the RP dose-response (e.g. [2,5,6]), these studies remain limited in their power to detect and accurately quantify the simultaneous effects of several risk factors. Relatively recently, molecular biomarkers for the development of RP have also been added to the mix (e.g. [7,8]), but absence of validation in independent datasets remains a problem in this setting [9].
In this study, we demonstrate how the QUANTEC estimate for the dose-response relationship for RP can be adjusted for the effects of the multiple clinical factors identified in the meta-analysis by Vogelius & Bentzen [4], and how individual risk estimates and dose constraints can subsequently be calculated. We validate this method in an independent dataset by illustrating the improved predictive value of the individual risk estimates compared to the dose-only QUANTEC relationship.

Material and methods
The QUANTEC paper suggested a relationship between MLD and the risk (normal tissue complication probability, NTCP) of 'symptomatic RP' described by a logistic function (Equation 1).
In the present study, the aim was to obtain the dose-response relationship for a patient with a known set of clinical factors. (For the sake of brevity, all such factors are, in the following, referred to as 'risk factors', despite some demonstrating a protective effect.) This was done employing our previously published method [11] for correcting dose-response relationships for clinical risk factors, using the estimated prevalence of the various risk factors as well as odds ratios (ORs) for the effects. From our paper, if a dose-response relationship (with parameters $D_{50}$ and $\gamma_{50}$) is based on a study population wherein a fraction s of the patients demonstrated a risk factor (with a corresponding OR), then the dose-response in another population, with the risk factor completely absent, can be approximately described by baseline parameters $D_{50}^{0}$ and $\gamma_{50}^{0}$ (Equation 4). Note that Equation 4 implicitly assumes that the OR for a given risk factor is independent of dose. Dose-response relationships for individual patients with a specific set of risk factors were based on these parameter estimates. The modified logistic function in the presence of a single risk factor was

$$NTCP(D) = \frac{1}{1 + OR^{-1}\exp\left[4\gamma_{50}^{0}\left(1 - D/D_{50}^{0}\right)\right]}$$

and hence the effect of the risk factor on the parameters describing the dose-response relationship was given by:

$$D_{50} = D_{50}^{0}\left(1 - \frac{\ln(OR)}{4\gamma_{50}^{0}}\right) \qquad (7)$$

$$\gamma_{50} = \gamma_{50}^{0} - \frac{\ln(OR)}{4} \qquad (8)$$

All ORs were assumed to be independent and hence multiplicative in the logistic model (cf. Equation 1). For multiple risk factors, this multiplicity of odds ratios (i.e. $OR_{combined} = OR_{risk\ factor\ 1} \times OR_{risk\ factor\ 2} \times \dots$) was exploited, and the model parameters were found from Equations 7 and 8 with $OR_{combined}$.
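The adjustment described above can be sketched in Python. This is a minimal illustration, assuming the logistic function is written in the usual D50/γ50 form, NTCP = 1/(1 + exp[4γ50(1 − D/D50)]), and that each risk factor multiplies the odds of RP by its OR. The function names and the two example ORs are hypothetical; the baseline values are those reported in the abstract.

```python
import math

def ntcp_logistic(dose, d50, gamma50):
    """Logistic dose-response in the D50/gamma50 parameterization."""
    return 1.0 / (1.0 + math.exp(4.0 * gamma50 * (1.0 - dose / d50)))

def adjust_parameters(d50_0, gamma50_0, odds_ratios):
    """Shift baseline parameters by the combined (multiplied) odds ratio."""
    log_or = sum(math.log(o) for o in odds_ratios)
    gamma50 = gamma50_0 - log_or / 4.0
    d50 = d50_0 * (1.0 - log_or / (4.0 * gamma50_0))
    return d50, gamma50

def ntcp_with_odds_ratio(dose, d50_0, gamma50_0, odds_ratios):
    """Same risk, computed directly by multiplying the baseline odds."""
    p0 = ntcp_logistic(dose, d50_0, gamma50_0)
    odds = p0 / (1.0 - p0) * math.prod(odds_ratios)
    return odds / (1.0 + odds)

# Baseline values from the text; the two odds ratios are hypothetical.
D50_0, GAMMA50_0 = 34.4, 1.19
example_ors = [2.0, 1.5]
d50_adj, gamma50_adj = adjust_parameters(D50_0, GAMMA50_0, example_ors)
risk_via_parameters = ntcp_logistic(20.0, d50_adj, gamma50_adj)
risk_via_odds = ntcp_with_odds_ratio(20.0, D50_0, GAMMA50_0, example_ors)
```

Both routes give the same risk, which is why the combined OR can be folded into adjusted values of D50 and γ50 without changing the functional form of the model.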
The constraint on the MLD corresponding to a stipulated probability of normal tissue toxicity (e.g. an NTCP of 20%) could be found directly by inverting the dose-response equation.
ORs for the effects of the various clinical risk factors were drawn from Vogelius and Bentzen [4], see Table I. The prevalence of the various risk factors in the study populations included in the QUANTEC analysis was only reported in a subset of the studies. However, assuming that those studies were representative of the remaining ones, the overall prevalence was estimated by calculation of weighted averages.
Confidence intervals for the calculated parameter values, risk estimates and dose constraints were approximated using random sampling. For each variable entering the model, we considered an analytical probability density function (see below) for that variable around the point estimate, based on the 95% confidence intervals found in the original reports. Random values were then sampled from the probability distributions and used to calculate estimates of parameters, risks and dose constraints. This was repeated $10^6$ times to provide probability distributions for each of the reported estimates; the 95% confidence interval reported in this study corresponds to the 2.5 to 97.5 percentile of the sampled distribution. The analytical probability density functions were assumed lognormal for ORs for risk factors. For the QUANTEC parameter estimates, no simple parametric distribution matched the reported asymmetric confidence intervals. The probability distributions were therefore approximated by one normal distribution matching the point estimate to the lower confidence bound and another normal distribution matching the point estimate to the upper confidence bound. The two distributions were then conjoined to form an approximation of the probability distribution of the parameter considered. For the prevalence estimates from the QUANTEC studies, only point values were used.
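The sampling scheme can be sketched as follows. This is an illustration under stated assumptions, not the authors' MATLAB code: the two conjoined normal distributions are implemented as two half-normals with equal probability mass on either side of the point estimate, the OR and its interval are hypothetical, the baseline intervals from the abstract are used as stand-in inputs, and far fewer repetitions are drawn than in the study.

```python
import math
import random

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def sample_lognormal_or(point, lower, upper, rng):
    """Draw an odds ratio whose logarithm is normal around ln(point)."""
    sigma = (math.log(upper) - math.log(lower)) / (2.0 * Z95)
    return math.exp(rng.gauss(math.log(point), sigma))

def sample_split_normal(point, lower, upper, rng):
    """Draw from two half-normals joined at the point estimate."""
    half_width = abs(rng.gauss(0.0, 1.0))
    if rng.random() < 0.5:
        return point - half_width * (point - lower) / Z95
    return point + half_width * (upper - point) / Z95

def percentile_interval(samples, low=2.5, high=97.5):
    """Empirical percentile interval of a list of samples."""
    ordered = sorted(samples)
    last = len(ordered) - 1
    return (ordered[round(low / 100.0 * last)],
            ordered[round(high / 100.0 * last)])

def adjusted_d50(d50_0, gamma50_0, odds_ratio):
    """D50 after applying one odds ratio to the baseline parameters."""
    return d50_0 * (1.0 - math.log(odds_ratio) / (4.0 * gamma50_0))

rng = random.Random(0)
draws = []
for _ in range(20000):  # the study used 10^6 repetitions
    d50 = sample_split_normal(34.4, 30.7, 38.9, rng)
    gamma50 = sample_split_normal(1.19, 1.00, 1.43, rng)
    odds_ratio = sample_lognormal_or(2.0, 1.5, 2.7, rng)  # hypothetical OR
    draws.append(adjusted_d50(d50, gamma50, odds_ratio))
ci_low, ci_high = percentile_interval(draws)
```

With equal mass on each side, the 2.5 and 97.5 percentiles of the split distribution fall on the reported confidence bounds, which is the property the construction is meant to preserve.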

Validation in an independent patient cohort
The predictive values of the QUANTEC and individualized NTCP risk estimates were tested in a cohort of NSCLC patients treated in a single institution from 2007 to 2009 (details of this cohort are reported in a study currently in preparation). Patient records were reviewed retrospectively to evaluate the incidence and severity of RP as well as tumor recurrence and survival. Clinical factors (pulmonary co-morbidities, smoking status and history, age, type and scheduling of chemotherapy) were also recorded. RP was graded according to the CTCAE v. 3.0 radiation morbidity scoring scheme during the first year after RT by evaluation of CT scans and notes from follow-up visits. Time to RP was measured from the first day of RT treatment. Patients with RP below grade 3 were recorded as having not experienced RP, while patients with grade 3 and above were recorded as experiencing RP.
Radiation was primarily delivered as three-dimensional (3D) conformal RT. Dose distributions were calculated using an analytical anisotropic algorithm, and dose-volume histograms (DVHs) were extracted using CERR [12]. Mean doses to the combined lungs minus the solid tumor (MLD) were calculated, and the predicted risk of RP from the QUANTEC estimate was computed using Equation 1. Patients were subsequently divided into two risk groups, based on NTCP above or below the cohort median. Additionally, individual risk estimates were calculated based on the method outlined above, and these estimates provided an alternative division into risk groups (again, above/below median NTCP).
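The risk-group allocation can be illustrated with a short Python sketch. It assumes the QUANTEC logistic function has parameter values D50 = 30.75 Gy and γ50 = 0.97, which are not stated in this text; the mean lung doses are hypothetical, and assigning patients exactly at the median to the low-risk group is an arbitrary choice made here.

```python
import math
from statistics import median

# Assumed QUANTEC parameter values; they are not stated in this text.
QUANTEC_D50 = 30.75   # Gy
QUANTEC_GAMMA50 = 0.97

def quantec_ntcp(mld):
    """Dose-only risk of RP as a logistic function of mean lung dose."""
    exponent = 4.0 * QUANTEC_GAMMA50 * (1.0 - mld / QUANTEC_D50)
    return 1.0 / (1.0 + math.exp(exponent))

def split_by_median(ntcp_values):
    """Label each patient 'high' if above the cohort median, else 'low'."""
    cutoff = median(ntcp_values)
    return ["high" if value > cutoff else "low" for value in ntcp_values]

example_mld = [10.0, 15.0, 18.2, 22.0, 29.4]  # hypothetical doses in Gy
example_ntcp = [quantec_ntcp(mld) for mld in example_mld]
example_groups = split_by_median(example_ntcp)
```

The same split applies unchanged to the individualized estimates; only the list of NTCP values differs, so a patient can move between groups when clinical factors are taken into account.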
Subdistribution cumulative incidence functions for RP were estimated in a competing risk analysis [13], with competing failure causes being death from lung cancer, death from other (or unknown) causes, and local progression. Patients were stratified according to NTCP risk groups, and a test of difference of subdistributions across groups was conducted using the K-sample test by Gray [14]. The concordance index (c-index) for risk groups as a predictor for RP was calculated using standard methods, but with time for all competing risk events set at $t_{max} + 1$ (i.e. beyond the maximum follow-up time, as suggested by Wolbers et al. [15], since this ensured that patients experiencing a competing risk were kept 'at risk' for RP at all evaluated times). The c-index is defined as the proportion of evaluable patient pairs for which predictions and outcomes are concordant; i.e. the proportion of patient pairs for which the higher risk prediction (comparing the two patients involved) corresponds to (earlier) experienced toxicity. A c-index of 0.5 corresponds to the null hypothesis, where the risk prediction is no better than a random allocation of patients. Additionally, the c-index for (continuous) NTCP as a predictor for RP was calculated. Both were tested for difference from 0.5. All analyses were performed using both the QUANTEC NTCP model and the individual risk estimate, taking clinical factors into account. C-indices for the two methods of NTCP calculation were tested for difference using a two-sided Student's t-test for dependent samples. Confidence intervals for the cumulative incidence functions were based on the variance of the point estimates, and approximated using the ln(-ln) transformation, as suggested by Choudhury [16]. Survival analysis was conducted in R [17], using the 'cmprsk' [18] and 'survcomp' [19] packages. NTCP calculation and confidence interval estimations were done in MATLAB® (2010b, The MathWorks Inc, Natick, MA) using customized functions.
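The c-index calculation with recoded competing risks can be sketched in Python. This is a simplified pairwise implementation written for illustration, not the 'survcomp' routine used in the study; the function names, cause labels and example data are hypothetical, and ties in event time are simply treated as non-evaluable.

```python
def recode_competing_risks(times, causes, rp_label="RP"):
    """Move competing events beyond the maximum follow-up time.

    `causes` holds rp_label, None (censored), or any other competing cause.
    """
    t_beyond = max(times) + 1.0
    new_times, rp_events = [], []
    for time, cause in zip(times, causes):
        if cause == rp_label:
            new_times.append(time)
            rp_events.append(True)
        elif cause is None:
            new_times.append(time)
            rp_events.append(False)
        else:
            new_times.append(t_beyond)  # stays 'at risk' for RP throughout
            rp_events.append(False)
    return new_times, rp_events

def concordance_index(times, events, risks):
    """Proportion of evaluable pairs where higher risk means earlier RP."""
    concordant, evaluable = 0.0, 0
    for i in range(len(times)):
        if not events[i]:
            continue
        for j in range(len(times)):
            if times[j] > times[i]:  # j was RP-free when i developed RP
                evaluable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if evaluable == 0:
        raise ValueError("no evaluable patient pairs")
    return concordant / evaluable

# Hypothetical follow-up in months, causes of failure and predicted risks.
months = [3.0, 5.0, 8.0, 12.0, 12.0]
causes = ["RP", "death", "RP", None, None]
predicted = [0.30, 0.10, 0.15, 0.12, 0.20]
recoded_times, rp_events = recode_competing_risks(months, causes)
c_index = concordance_index(recoded_times, rp_events, predicted)
```

In this toy example the patient who died at 5 months is moved to 13 months, so that patient still forms an evaluable pair with the RP event at 8 months instead of dropping out of the risk set.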

Results
A summary of the prevalence of clinical risk factors in the patient cohorts included in the QUANTEC analysis can be found in Table II. The mean age in the reported studies (65 years) can be compared to the mean age used as a cut-off point in the risk factor meta-analysis (63 years); for the sake of all subsequent analyses these were considered equal. Even though the studies included in the meta-analysis compared sequential chemotherapy with concomitant chemotherapy (rather than with no chemotherapy), patients receiving no chemotherapy were grouped with the ones receiving concomitant chemotherapy. For details of the individual studies, as well as further assumptions underlying the analysis, see Appendix A (available online at http://informahealthcare.com/doi/abs/10.3109/0284186X.2013.820341).
The baseline dose-response parameters were estimated as $D_{50}^{0}$ = 34.4 Gy (95% CI 30.7, 38.9) and $\gamma_{50}^{0}$ = 1.19 (95% CI 1.00, 1.43). This baseline relationship is valid for a patient without any pulmonary co-morbidities, a tumor in the upper lobe, no history of smoking or current smoking habit, below 63 years old, and not treated with sequential chemotherapy. Note that there is a slight dependence of the result on the order in which the corrections are carried out, since the correction method is not exact; however, it is on the order of 0.1 Gy for $D_{50}^{0}$ and 0.05 for $\gamma_{50}^{0}$. Dose-response relationships for specific risk groups were constructed based on adjusted dose-response parameters, cf. Equations 5 and 6. Figure 1 compares the risk predictions for patients with the highest and the lowest risk of RP with the crude QUANTEC estimate. While the QUANTEC study suggested a constraint on the MLD of 19.8 Gy (in order to limit the risk of RP to below approx. 20%), optimal constraints, as estimated with the additional information from clinical risk factors, varied considerably between patients with the highest and lowest number of risk factors.
Patients currently smoking and with no risk factors (no pulmonary co-morbidities, no lower/middle tumor, below 63 years old, and no sequential chemotherapy) could potentially receive an MLD up to 27.8 Gy (95% CI 23.8, 32.5) without the predicted risk of RP exceeding 20%. However, the group of patients with the highest risk of RP (pulmonary co-morbidities, middle or lower tumor location, no history of smoking, above 63 years of age, and sequential chemotherapy) had an estimated dose constraint of 7.0 Gy (95% CI 0, 14.5) to yield the same risk of RP.
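The dose constraints follow from inverting the logistic function for a stipulated risk. The sketch below assumes the D50/γ50 form of the logistic model with the combined OR multiplying the odds; the function name is hypothetical, the QUANTEC parameter values (D50 = 30.75 Gy, γ50 = 0.97) are not given in this text and are assumed, and truncating negative constraints at zero is a choice made here.

```python
import math

def dose_constraint(target_ntcp, d50_0, gamma50_0, combined_or=1.0):
    """Mean lung dose at which the predicted risk equals target_ntcp."""
    logit_term = math.log((1.0 - target_ntcp) / target_ntcp)
    shift = (logit_term + math.log(combined_or)) / (4.0 * gamma50_0)
    return max(0.0, d50_0 * (1.0 - shift))

# Dose-only model with the assumed QUANTEC parameter values.
quantec_limit = dose_constraint(0.20, 30.75, 0.97)

# Baseline patient (no risk factors) with the values reported in the text.
baseline_limit = dose_constraint(0.20, 34.4, 1.19)

# A hypothetical combined OR above 1 tightens the constraint.
high_risk_limit = dose_constraint(0.20, 34.4, 1.19, combined_or=3.0)
```

Under the assumed QUANTEC values the dose-only limit rounds to the 19.8 Gy quoted above, while a combined OR below 1 (a protective factor such as current smoking) pushes the limit above the baseline value.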
As a further example of the individualized dose-response curves for the risk of RP which can be constructed using this approach, consider an elderly patient (above 63 years old) with pulmonary co-morbidity, without any history of smoking, with an upper lung tumor and not receiving sequential chemotherapy. The relationship between the mean lung dose and the risk of RP is then described by the logistic function with the baseline parameters adjusted by the combined OR for age and pulmonary co-morbidity.

Validation in an independent patient cohort
A total of 103 patients were included in the validation dataset. Median prescribed dose was 60 Gy (range 59.5-70 Gy), and the median value of the mean physical dose to the lungs was 18.2 Gy (range 0.79-29.4 Gy). Prevalence of clinical risk factors can be found in Table III, and the incidence of RP is reported in [...]. The competing risks considered (RP grade 3 and above, local progression, death from lung cancer and death from other or unknown causes) are shown in Figure 2. The estimated cumulative incidence of RP grade ≥ 3 one year after start of treatment was 15.5% (95% CI 9.3%, 23.2%) in this population. Since no information on tumor location was available for these patients, an alternative baseline dose-response relationship, valid without tumor location, was estimated. This relationship is suitable for a patient without any pulmonary co-morbidities, no history of smoking or current smoking habit, below 63 years old, and not treated with sequential chemotherapy. Figure 3 contains the estimated cumulative incidence curves for RP for risk groups (splitting the cohort above/below median NTCP risk) constructed using either the crude QUANTEC risk calculation (Figure 3a) or the individualized risk estimate (Figure 3b). The test for difference of subdistributions between groups was not statistically significant using the QUANTEC estimate (p = 0.11), but was significant using the individual risk estimates (p = 0.006). For the QUANTEC estimate, the c-index for the risk groups was 0.69 (95% CI 0.46, 0.91, p = 0.10), while for NTCP as a continuous predictive parameter the c-index was 0.62 (0.51, 0.73, p = 0.03). Using the individualized risk estimates, the c-index for risk groups was 0.83 (0.65, 1.00, p = 0.0003) and for the continuous NTCP values 0.63 (0.52, 0.74, p = 0.02).
Comparison of the two NTCP calculation techniques (crude QUANTEC estimate and individualized risk estimate) showed a significant increase in the predictive value, as characterized by the c-index, by correcting for clinical factors, both for division into high/low-risk groups (p = 0.004) and for NTCP as a continuous variable (p = 0.04).

Discussion
We have demonstrated a method to calculate dose-response relationships for RP which takes clinical risk factors into account, and which is based entirely on data from large, published meta-analyses. The improved predictive power of this method compared to a dose-only approach has been illustrated in a single clinical cohort. The validation set used for testing the model is completely independent in the sense that no parameters have been fitted to the validation set; moreover, the method can easily be extended to accommodate risk factors identified in the future. It allows for individualization of dose constraints and for individual risk estimation. One limitation of the method is the dependence on the determination of the prevalence of risk factors in the patient cohorts underlying the dose-response relationship. Only a subset of the studies included in the QUANTEC analysis reported sufficient patient characteristics and treatment details; the present study had to assume that the studies reporting those details were representative of the remainder, and that no correlation existed with the radiation dose to the lung (inter- and intra-study). These are optimistic assumptions causing uncertainty in the prevalence estimation. Furthermore, an important model assumption is that the effects of the various risk factors (including dose) are independent, as required for multiplicative ORs. Unfortunately, even first-order interactions between risk factors are hardly ever reported in the published literature. See [4] for a detailed discussion of these assumptions and other potential limitations of the OR estimates.
Including risk factors as ORs is only one out of several possible approaches to incorporating clinical risk factors in dose-response relationships. Our approach is equivalent to using the QUANTEC dose-response model, but with the addition of a 'risk factor equivalent dose' to the dose variable, such that $MLD_{risk} = MLD + \ln(OR)\,D_{50}/(4\gamma_{50})$. One alternative would have been to use 'dose modifying factors', included as multipliers to the dose variable [6]. Current knowledge does not allow determination of which approach provides the best fit to data. However, a pragmatic first approximation is to keep the model linear in all parameters. Furthermore, our approach has the advantage of allowing the utilization of odds ratios as estimated using standard meta-analysis methodology.
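The equivalence with a risk factor equivalent dose can be checked in two lines, assuming the logistic model is written in the D50/γ50 form, so that the baseline odds of RP are $\exp[-4\gamma_{50}(1 - MLD/D_{50})]$, and that the risk factor multiplies these odds by its OR:

```latex
\begin{align*}
\frac{NTCP}{1 - NTCP}
  &= OR \cdot \exp\!\left[-4\gamma_{50}\left(1 - \frac{MLD}{D_{50}}\right)\right] \\
  &= \exp\!\left[-4\gamma_{50}\left(1 - \frac{MLD + \ln(OR)\,D_{50}/(4\gamma_{50})}{D_{50}}\right)\right]
\end{align*}
```

The second line is the dose-only model evaluated at the shifted dose, so an OR above 1 acts as extra dose and a protective OR below 1 as a dose reduction.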
Comparing clinical data from a validation cohort with the QUANTEC model predictions entails quantification of dose metrics and toxicity data in a way that is comparable to the studies underlying QUANTEC. This is not straightforward. For example, most (but not all) of the 10 studies used to derive the QUANTEC dose-response relationship reported the physical dose to the lung, without correcting for fraction-size effects. Therefore the DVHs in the present study were not corrected for fractionation. Furthermore, the definition of 'symptomatic RP' in the studies included in the QUANTEC analysis varied, with a number of grading systems and 'cut-off' grades used. We deemed CTCAE v. 3.0 grade 3 and above RP to be the closest to the 'average cut-off' grade, and hence patients with RP below grade 3 were recorded as having not experienced RP, while patients with grade 3 and above were recorded as experiencing RP.
The statistical method used for the calculation and reporting of the incidence of RP varied across studies as well. The majority of studies reported the crude incidences of RP during the follow-up period, usually around 6-12 months; some of those studies limited the analysis to patients with a minimum of follow-up (e.g. six months). Three studies used Kaplan-Meier estimates for the incidence of RP. Consequently, there is not one uniform definition of 'events' and 'subjects at risk' that can be used for comparisons with the QUANTEC dose-response model. We chose to use survival data analysis for the validation study, as it provided us with the formalism and tools to make a self-consistent and informed choice on how to handle competing risks and define the subjects at risk. Tucker et al. [20] have discussed the effects of variation in follow-up on the estimated RP incidence in more detail. In general, it can be discussed what constitutes the optimal actuarial method for reporting the rate of late toxicities [21]. In this case, however, we believe that cumulative incidences are conceptually closest to the outcome data used for the QUANTEC model estimate. RP can then be considered one of several competing risks of failure for irradiated lung cancer patients, the others being death from lung cancer, death from other causes, and local progression. Here, the latter was included as a competing risk, as local recurrence will usually prevent reliable detection of RP.
The correction for the effect of chemotherapy on the risk of toxicity is based on comparisons of sequential chemotherapy with platinum-based concomitant chemotherapy, in concordance with the results of the large individual patient data meta-analysis conducted by Aupérin et al. [22]. However, a recent analysis of the risk of RP after chemoradiotherapy with concurrent chemotherapy in an international, multi-institutional dataset [2] saw a considerably increased risk of toxicity with the use of carboplatin/paclitaxel combinations as compared to cisplatin/etoposide. Hence caution should be exercised when applying the results of the present study to modern, intensified chemoradiotherapy regimens.
More recent preclinical studies have shown an effect of irradiating the whole volume of the heart [23]. Whether this effect plays a major role for typical clinical treatments remains uncertain [24,25]. However, such an effect could potentially contribute to the statistical association of RP with inferior tumor position, a factor that was included in our modeling.
The QUANTEC study reported the risk of RP as a function of the MLD, as no deviation of effective dose ($D_{eff}$) from mean dose was found. While MLD appears to be a simple and robust predictor, it is not necessarily the correct one from a physiological point of view; it may merely act as a crude measure of treatment intensity. A number of recent studies have questioned whether MLD is actually the optimal dose metric to consider [26,27], but without reaching a consensus. The method presented here does not depend on the dose metric, but can be employed irrespective of the dose metric used in a dose-response model.
The impact of clinical risk factors on the probability of RP after curative radiotherapy for lung cancer is a much-studied issue. However, a consistent quantification of the effect on the dose-response has proven challenging, partly because of the demands on data quantity and completeness required for a multivariate analysis taking all relevant factors into account. As an illustration of this problem, a logistic regression analysis including one dose and six clinical factors would, as a rule of thumb, require approximately 160 events and hence approximately 800 patients at an event probability of approximately 20%. The method presented here provides an alternative that allows for exploitation of already conducted studies and meta-analyses, such that already known clinical risk factors do not need to be re-fitted in each dose-response study. As a straightforward generalization of the method presented here, the known ORs of clinical risk factors can be applied to new dose-response studies, such that a dose-response model including clinical risk factors can be fitted without introducing more free parameters than a dose-only fit. Hopefully, such a strategy can improve the generalizability of results [3].
In conclusion we have presented a method to combine a published dose-response function with known clinical risk factors and demonstrated that the predictive power of the combined model is greater than a dose-only model in an independent dataset.

Declaration of interest:
The authors report no conflicts of interest. The authors alone are responsible for the content and writing of the paper.
ALA and IRV are supported by CIRRO - The Lundbeck Foundation Center for Interventional Research in Radiation Oncology and The Danish Council for Strategic Research. ALA acknowledges support from the Region of Southern Denmark. IRV is supported by the Global Excellence in Health program of the Capital Region of Denmark. SMB acknowledges support from the National Cancer Institute grant no. 2P30 CA 014520-34.