A Bayesian joint model for multivariate longitudinal and time-to-event data with application to ALL maintenance studies

ABSTRACT The most common type of cancer diagnosed among children is the Acute Lymphocytic Leukemia (ALL). A study was conducted by Tata Translational Cancer Research Center (TTCRC) Kolkata, in which 236 children (diagnosed as ALL patients) were treated for the first two years (approximately) with two standard drugs (6MP and MTx) and were then followed nearly for the next 3 years. The goal is to identify the longitudinal biomarkers that are associated with time-to-relapse, and also to assess the effectiveness of the drugs. We develop a Bayesian joint model in which a linear mixed model is used to jointly model three biomarkers (i.e. white blood cell count, neutrophil count, and platelet count) and a semi-parametric proportional hazards model is used to model the time-to-relapse. Our proposed joint model can assess the effects of different covariates on the progression of the biomarkers, and the effects of the biomarkers (and the covariates) on time-to-relapse. In addition, the proposed joint model can impute the missing longitudinal biomarkers efficiently. Our analysis shows that the white blood cell (WBC) count is not associated with time-to-relapse, but the neutrophil count and the platelet count are significantly associated with it. We also infer that a lower dose of 6MP and a higher dose of MTx jointly result in a lower relapse probability in the follow-up period. Interestingly, we find that relapse probability is the lowest for the patients classified into the “high-risk” group at presentation. The effectiveness of the proposed joint model is assessed through the extensive simulation studies.


Introduction
Acute Lymphocytic Leukemia (ALL), also known as Acute Lymphoblastic Leukemia, is quite rare for adults, but it is possibly the most common type of cancer diagnosed among children.It is usually developed from immature white blood cells (WBC) which are the key components to our immune system.ALL is "acute" in the sense that it progresses rapidly by creating immature blood cells.Although ALL is not inherited, it is known that it occurs due to mutations in the DNA (Yokota and Kanakura 2016).The most effective drugs used for the treatment of ALL were discovered in the 1960s when the first multi-drug chemotherapy regimens increased its survival rate quite significantly.
Globally, ALL is the main cause of death from cancer among children (Belson et al. 2007).In 2015, a total of 876,000 confirmed cases of ALL resulting in 110,000 death were globally reported.Pui and Evans (2013) reported that the survival rate for ALL has increased from 0.10 (in 1960) to 0.90 (in 2010) in most of the developed countries, in particular, in the United States.This is due to success in the maintenance therapy used for treating leukemia patients.Patients are treated for the first one or two years, and are then followed for a longer period (approximately three years) with an expectation that a relapse should not occur after the follow-up period.Pui et al. (2018) reported that the survival rates for ALL are still less than 0.60 in most of the African countries, and are less than 0.70 in most of the Asian countries.More recently, Abdelmabood et al. (2020) analyzed a dataset from Egypt, and estimated the survival rate for ALL as 0.63.They recommend an urgent need for the modification of the current chemotherapy regimens so that the treatments are suitable for local conditions.In fact, they also recommend a better government healthcare globally, and in particular, for Egypt.
Despite the fact that the survival rate for ALL is unsatisfactorily low in India, there is a lack of consensus on its estimate.Approximately, 75000 new cases of ALL are diagnosed (among children) every year in India, and more than 80% of these cases occur in families with low income.Arora and Arora (2016) mentioned that a treatment abandonment (which typically occurs due to the lack of financial support) may result in the lack of data for which the survival rate cannot be estimated accurately in India.Varghese et al. (2018) analyzed data from the Southern part of India and reported the overall survival rate for ALL as 0.40; with an event-free survival rate as 0.28.Note that "event-free survival" refers to survival without a relapse (recurrence of cancer) or other related complications.
Most of the existing works focus on the survival (no death) of the patients, which is of course an important event to note.However, for assessing the effectiveness of the treatment it is important to note the time-to-relapse (if any).Multiple recurrences definitely result in adverse effects on the patients' health conditions, and therefore it is extremely important to identify the biomarkers associated with the relapse time, and also to assess the effects of the drugs on those biomarkers.Tata Translational Cancer Research Center in Kolkata conducted a study in which the children diagnosed as ALL patients were treated with two standard drugs (i.e.6-mercaptopurine (6MP) and methotrexate (MTx)) for 100 weeks (on an average) and were then followed at most for the next 178 weeks.For a total of 236 patients data were collected longitudinally on three biomarkers (i.e.WBC count, neutrophil count and platelet count), and also on the drug doses.Dataset also contains a number of time-invariant covariates (details in Section 2).We develop a Bayesian joint model for the longitudinal biomarkers and the time-to-relapse based on the existing literature.
There is a vast literature on the joint modeling of longitudinal outcome(s) and time-to-event (typically the survival time).Henderson et al. (2000) proposed different likelihood-based models for such data.In a Bayesian setting (Wang and Taylor 2001) developed a joint model for the CD4 counts and time to progress into AIDS for HIV patients Guo and Carlin (2004) considered similar Bayesian models for comparing efficacy of two antiretroviral drugs (Chi and Ibrahim 2006;Fieuws and Verbeke 2004;Rizopoulos and Ghosh 2011;Zhu et al. 2011) developed joint models for multivariate longitudinal and survival data.Rizopoulos 2011;Rizopoulos et al. (2017) developed flexible Bayesian joint models which can automatically predict the subject-wise survival probability over time.All these authors considered (generalized) linear mixed models for the longitudinal outcomes and a proportional hazards (PH) or an accelerated failure time (AFT) model for the time-to-event, and then link the two models either by shared random effects or by correlated random effects.
We build our work on the existing literature, and develop a Bayesian joint model for jointly analyzing WBC count, absolute neutrophil count (ANC), platelet count; and time-to-relapse.The motivation for developing a joint model for our dataset is discussed in Section 2. We consider a number of joint stochastic models where different Gaussian correlated random effects are used to capture the longitudinal and the inter-biomarker dependence.By using some popular model selection criteria we choose the "best fit" model for our dataset.The missing (correlated) biomarkers are imputed efficiently using the joint model, and the probability of a relapse is predicted for each subject for the follow-up period (after the end of treatment).While different components of our work exist in the literature, our analysis combines them and addresses some clinically interesting research questions; for example, how the trajectories of different biomarkers will be different (on the average) for the patients receiving a lower dose of 6MP and a higher dose of MTx from the patients receiving a median dose of each drug.Thus, the combination of different components (of our modeling) is novel and our application is quite unique in the existing literature.
The rest of this paper is organized as follows.In Section 2, we describe the dataset under consideration, and discuss the motivation for the proposed modeling.In Section 3, we describe the proposed Bayesian joint models.The computational details are also discussed in this section.The results from the data analysis are summarized in Section 4. In Section 5, we report the results from simulation studies.Finally in Section 6, we discuss some of the limitations of our work and conclude.

ALL chemotherapy dataset
Tata Translational Cancer Research Center (TTCRC) in Kolkata conducted a study in which 236 children were treated with advanced chemotherapy.The study started in 2014 and ended in 2019.As a part of the Tata Medical Center (TMC), TTCRC develops advanced treatments for the cancer patients in the eastern part of India.In the current study, children (patients) were treated with two standard drugs, i.e. 6-mercaptopurine (6MP) and methotrexate (MTx).The median duration of the treatment was 92 weeks.Ethical approval was obtained from TMC-Institutional Review Board.
The starting time and the end time of the study were fixed, but all the 236 children did not join the program at the same time point, and the number of visits were also different for different children.Based on the total WBC count (at presentation) and the risk-group specified by the National Cancer Institute (NCI), patients were treated with different drug doses at the beginning.After that the subjects were tested for WBC count, ANC, and platelet count in the subsequent time points (bi-weekly, or once in a month).The median number of visits was 41.After the end of the treatment, patients were followed at most for the next 178 weeks.If there was a relapse either during treatment or in the followup period then the time-to-relapse was noted; otherwise the subject was censored at the time when the follow-up ended for that patient.
Among 236 patients, 36% were female and 64% were male; and the children belonged to the age interval [1-17.5]with a median age of 4.7 years (at presentation).Only 29% of the children had a bulky disease (i.e. the cancerous masses 10 cm or larger in diameter), and 31% had a disease related to the central nervous system (CNS).Almost 71% of the patients did not show the sign of a relapse when the study ended in May 2019.
For our modeling, drug doses are the time-varying covariates with fixed effects.Among the timeinvariant covariates, we have (i) age at diagnosis, (ii) gender (M/F), (iii) lineage (B cell/T cell), (iv) WBC count at presentation, (v) NCI risk group (high risk/standard risk), (vi) presence of bulky disease (Y/N), (vii) presence of CNS disease (Y/N), (viii) risk at presentation, (ix) day 8 risk, (x) day 35 risk, (xi) morphological remission (Y/N/Unknown), and (xii) minimal residual disease (MRD) status.We note that NCI "high-risk" group refers to the children with WBC counts higher than 50,000 (cells/ mm 3 ), and "standard-risk" group refers to the group with counts lower than 50,000 (cells/mm 3 ), at presentation.By the end of the treatment phase, if ANC and platelet count are in the normal range for a patient then the patient is considered in morphological complete remission (CR).The MRD refers to the remaining cancer cells after treatment.Table S.1 (web-appendix) provides the list of variables used in our modeling, and their roles in our analysis.The summary statistics of the time-invariant covariates are provided in Table S.2 (in the web-appendix).
In our dataset, there are some missing biomarker values for some patients.Overall percentage of missingness for WBC count, neutrophil count and platelet count are 32.04,4.83 and 4.80, respectively.We note that there is no withdrawal in the study, and hence these missing values are not because of the dropouts.In fact, we observe missingness only in the longitudinal biomarkers and not in the covariates.As informed by TTCRC, these biomarker values are missing for no valid reason and mostly because of the human error.Therefore, we need to improve our estimation method without really worrying about why these responses are missing.
Figure 1 shows the longitudinal trajectories of the three biomarkers for four randomly selected patients.We notice that for the patients with a relapse (in the follow-up period) the biomarkers are mostly lower than the respective grand means (shown by the solid lines) during the treatment.For the patients with no relapse, the biomarkers are mostly above the respective grand means.This indicates that a relapse is possibly associated with the observed values of the biomarkers.
Based on the three types of available data (i.e.time-to-relapse, longitudinal biomarkers, and the covariates) it is of interest to study (i) the progression of the biomarkers with time, and the effects of the covariates on it, (ii) effects of the biomarkers on time-to-relapse, and (iii) (direct) effects of the covariates on time-to-relapse.A joint model for the longitudinal biomarkers and the time-to-relapse can be used for such inference.Alternatively, one can use two separate models (one for the longitudinal biomarkers, and a second model for time-to-relapse with the biomarkers and the covariates as predictors).However, due to the measurement errors in the longitudinal biomarkers the second model will result in biased estimates of the model parameters (Guo and Carlin 2004;Rizopoulos 2011;Wang and Taylor 2001).A joint model provides improved estimates, and hence provides a more meaningful inference.In our analysis (Section 4.2) we also fit two separate models and see that a joint model gives a better fit to our data.This is the motivation for our proposed joint modeling.

Proposed model and estimation method
Recall that we have three biomarkers, i.e. (i) WBC count, (ii) neutrophil count (ANC), and (iii) platelet count, which we consider as longitudinal outcomes.For stabilizing the variances in the raw biomarker values, we consider log-transformed biomarker values for our analysis (refer to Figure S.1 in the webappendix).Let Y ijk be the (log transformed) k-th biomarker (k ¼ 1; 2; 3) from the i-th patient at time t ij , for j ¼ 1; 2; . . .; τ i .Note that the number of longitudinal measurements differs from one patient to the other, and hence we use the notation τ i to denote the number of measurements from the i-th patient.For each patient, we either observe the relapse time (T i ) when the cancer returns (during treatment or in the follow-up), or the censoring time (C i ) when the follow-up ends for the i-th patient.We define an indicator variable δ i ¼ 1, if T i < C i ; (and 0, otherwise), and define s i ¼ minðT i ; C i Þ as the time-to-event for the i-th patient.

Longitudinal submodel
We propose the following multivariate linear mixed model for the longitudinal biomarkers.Our model is similar to the models proposed in (Das 2016;Henderson et al. 2000;Rizopoulos and Ghosh 2011): where the continuous function f k ðt ij Þ is the general effect of time on the k-th biomarker, and we model it as a polynomial function of time, i.e. we consider We note that some unknown functions of time could be used for modeling f k ðt ij Þ, but the plots (in Figure S.1 in the web-appendix) suggest that a polynomial function should suffice for our dataset.For selecting the optimal order (r) of the polynomial function, we use the deviance information criteria (DIC) for a linear mixed model as proposed in (Celeux et al. 2006).The vectors β 1k and β 2k , respectively, denote the (fixed) effects of the time-varying covariates (x ij ), and the time-invariant covariates (z i ) on the k-th biomarker.The zero-mean Gaussian random effects W ik ðt ij Þ capture the longitudinal dependence as well as the dependence among the three biomarkers (Rizopoulos 2016).Note that these random effects are biomarker-specific since the between-patient variations might be different for different biomarkers.The random errors e ijk are assumed to be iid N(0,σ 2 ek ), and they are independent to W ik ðt ij Þ.We consider two different specifications for W ik ðt ij Þ in our analysis, following the existing literature.First, we consider model with random intercepts only, i.e.W ik ðt ij Þ ¼ a ik , where a i ¼ ½a i1 ; a i2 ; a i3 � T ,Nð0; D 1 Þ, and D 1 is a 3 � 3 covariance matrix.This specification assumes that the dependence among the biomarkers and the longitudinal dependence (for each biomarker) remain unchanged throughout the study.
As an alternative specification, we consider model with random intercepts and random slopes (of time), i.e.
Here D 2 is a 6 � 6 covariance matrix, and this specification assumes that the inter-biomarker dependence and the longitudinal dependence change with time.

Time-to-event submodel
Since the time-to-relapse is (possibly) associated with the longitudinal biomarkers, a joint model is meaningful in our analysis (refer to Figure 1).However, the nature of this association can be complicated, and therefore many different specifications of the joint model exist in the literature (Das 2016;Henderson et al. 2000;Rizopoulos and Ghosh 2011); and the references therein).We consider Cox proportional hazards (PH) model, which is the most commonly used model for time-toevent data.We consider two alternative specifications of PH model for linking it to the longitudinal submodel given in equation ( 1).
In our first choice, we use the expected longitudinal outcomes (conditional on the random effects) as time-varying predictors in the PH model.Note that the model in equation ( 1) can also be written as follows: For a fixed time point, say t, we define μ i ðtÞ ¼ ½μ i1 ðtÞ; μ i2 ðtÞ; μ i3 ðtÞ� T .Let λ i ðtÞ denote the hazard (instantaneous probability of relapse) for the i-th patient at time t.Assuming that the expected biomarker values (conditional on the Gaussian random effects) are associated with hazards, we consider the following PH model: where the vector of coefficients Ψ 1 measures the effects of the expected longitudinal biomarkers on the hazards (Das 2016).And θ 1 denotes the effects of the fixed covariates on hazards, λ 0 denotes the baseline hazard.This specification assumes that the time-to-relapse depends on the expected biomarker values, and on the fixed covariates.The effects of the two drugs on the time-to-relapse are rather indirect, i.e.only through the observed biomarkers.
As an alternative specification, we also consider the following PH model for the time-to-relapse: where x � i is a 2 � 1 vector of the median dose (during the treatment) for 6MP and MTx given to the ith patient, and the vector Ψ 2 measures the effect of the median doses on time-to-relapse.Note that we consider the median dose as a covariate since the median gives a robust summary of the patient's dose distribution.This specification assumes that the drugs and the time-invariant covariates directly affect the time-to-relapse.Similar to the model in equation ( 2), θ 2 denotes the effects of the fixed covariates on hazards.Here W � i ðtÞ are zero-mean Gaussian random effects; different specifications for W � i ðtÞ are considered to select the "best fit" model for our data (discussed in Section 4.2).The dependence between the longitudinal biomarkers and the time-to-relapse can be modeled by considering a joint distribution for W ik ðt ij Þ and W � i ðtÞ, as described in Section 4.2 (for Models III and IV).For modeling the base-line hazard function λ 0 ðtÞ in equations ( 2) and (3), we use flexible cubic B-splines following the JMbayes package in R, which we use for our computation.This package was written by (Rizopoulos 2016) and has been used for Bayesian joint modeling by several authors (Balan and Putter 2020;Papageorgiou et al. 2019).This package models λ 0 ðtÞ as follows: γ λ 0 ;q B q ðt; vÞ, where B q ðt; vÞ denotes the q-th basis function of B-spline with knots v 1 ; v 2 ; . . .; v Q .For detection of the optimal number (and the location) of knots the JMbayes package considers a relatively large number of knots (15, 20, 30 etc.), and then penalize the B-spline coefficients by considering suitable prior distributions (e.g.Laplace prior, Horseshoe prior).The non-zero coefficients are finally considered in the model.

Joint likelihood and estimation method
For any specification of the submodels, let α i denote the vector of subject-specific random effects.For example, if we specify and consider equation (2) as the submodel for the time-to -event data, then α i ¼ ½a T i ; b T i � T , and α denotes the vector of subject-specific random effects from all subjects.Additionally, let Θ denote the set of all fixed model parameters, and β denotes the set of fixed model parameters in the longitudinal submodel.The complete data likelihood is expressed as follows: where P denotes the probability distribution of the biomarker Y ijk conditional on α i from equation (1), and πðα i Þ denotes the probability distribution of α i .We use the likelihood given in equation ( 4), and by considering appropriate prior distributions for the model parameters we compute the joint posterior distribution.All our inferences are based on the joint posterior distribution.We sample from the joint posterior distribution using a hybrid combination of Gibbs sampler and Metropolis-Hastings algorithm, and estimate the model parameters by their respective sample means.All our computations are done using JMbayes package in R (Rizopoulos 2016).
In our dataset, there are some missing biomarker values for some patients.Since the missing observations (only in the biomarkers and not in the covariates) are due to human error, we assume an "ignorable" missingness.Instead of considering only the complete data (data for the patients with no missing values) or the available data, we impute the missing outcomes within MCMC iterations using the proposed joint model for improving the estimates of the model parameters.This imputation inherently assumes "missing at random" (MAR) as the missingness mechanism, and imputes the unobserved biomarkers as follows.
In the m-th iteration of MCMC, let Ω ðmÞ denote the set of updated (estimated) model parameters.Conditional on Ω ðmÞ , we first sample the subject-specific random effects α i for the i-th patient.Then, we sample the missing biomarker(s) from the current step's predictive distribution(s) conditional on the observed biomarkers, covariates, and the random effects.Since we have multivariate longitudinal biomarkers, we need to condition on the random effects for sampling correlated data (Schafer and Yucel 2002).This is done for all the T time points, and thus we get a complete dataset (with no missing biomarkers).This complete dataset is used for estimating model parameters in the ðm þ 1Þ-th iteration.Thus, in each iteration, we update the parameters and the missing biomarkers simultaneously.By considering m iterations, we thus get a set of m complete datasets based on which we get the final estimates of the model parameters.

Subject-wise prediction of relapse probabilities
The proposed joint model is used for dynamic prediction of the subject-wise relapse probability based on the historical longitudinal measurements and covariates (Rizopoulos 2016).Let H i ðtÞ denote historical measurements for the i-th subject who is event-free (no relapse) upto time point t, i.e.H i ðtÞ ¼ fY ijk ; x ij ; z i ; 0 � j � t; k ¼ 1; 2; 3g.Additionally, let D n denote the training dataset, i.e.D n ¼ fY ijk ; x ij ; z i ; s i ; δ i ; k ¼ 1; 2; 3; j ¼ 1; 2; . . .; τ i ; i ¼ 1; ::; 236g.Given that the subject is eventfree (no relapse) until time t, the probability that it will be event-free upto time u ¼ t þ Δt, for Δt > 0, is given as follows: where Θ denotes the set of all fixed model parameters in the joint model (equation 4).Additionally, where Hi ðu; α i ; ΘÞ denotes the estimated subject-specific longitudinal trajectories (with random effects) until time point u, and S i ð:Þ is the event time function conditional on Hi .Here, Prðα i jT i > t; H i ðtÞ; ΘÞ denotes the posterior distribution of the random effects, and Prðα i jT i > t; H i ðtÞ; ΘÞ / PðT i > tjα i ; ΘÞPðH i ðtÞjα i ; ΘÞPðα i jΘÞ.
A Monte Carlo estimate of π i ðujtÞ is obtained as follows.Sample Θ l , l ¼ 1; 2; . . .; L; from PrðΘjD n Þ as post burn-in iterations.For a fixed Θ l , we sample αi l q (l q ¼ 1; 2; . . .; L q ) from Prðα i jT i > t; H i ðtÞ; Θ l Þ; and compute And finally, we compute based on all L samples, and plot the survival (no-relapse) probabilities over time.

Prior specifications and computational details
We specify diffuse prior distributions for the model parameters following the existing literature (Das 2016;Rizopoulos 2016).For β 1k and β 2k in equation ( 1), we consider multivariate normal prior with mean vector = 0, and diagonal covariance matrices with the diagonal element = 1000.For the residual variances σ 2 ek we specify an Inverse Gamma(0.01,0.01)prior.For the coefficients η kl ; ð l ¼ 0; 1; . . .; r) in the polynomial functions f k , we specify Nð0; 100Þ prior distributions.In the time-to-event submodel, for Ψ 1 (and Ψ 2 ) and θ 1 (and also for θ 2 ), we specify multivariate normal prior with mean vector = 0, and a diagonal covariance matrix with diagonal element = 1000.We perform a sensitivity analysis, and the results (for some parameters) are summarized in Table S.3 (in the web-appendix).We notice that the hyperparameters have minimal effects on the final estimates.
We use MCMC iterations (based on Gibbs sampler and Metropolis-Hastings algorithm) for estimating the model parameters.We run 12,000 MCMC iterations, discard the first 2,000 as "burnin", and use the remaining 10,000 iterations for estimating the model parameters.The estimated posterior densities and trace plots for some of the coefficients are provided in Figures S.2-S.5 (in the web-appendix).We note that these plots indicate a good convergence of the chains.In addition, we compute scale reduction factors (Brooks and Gelman 1998) for assessing convergence of the chains, and all the computed scale reduction factors are smaller than 1.2, indicating a good convergence.We use sample means for estimating the respective model parameters.A 95% Bayesian credible interval is also computed for each coefficient based on the MCMC iterations.
For obtaining the optimal order (r) of the polynomial functions f k (in equation 1) we consider the "complete DIC" (Celeux et al. 2006) which is computed based on the observed and the imputed missing observations (based on MCMC iterations) and is defined as DIC= À 4E logLðY; SjαÞ ½ � þ 2logLðY; SjαÞ.Note that LðY; SÞ denotes the joint likelihood given in equation ( 4), and α is the estimated random effects.We consider only two choices, r ¼ 2; 3; and we compute DIC values (shown in Table S.4 in the web-appendix) for the four different specifications of the joint model as discussed in Section 4.2.The smallest (complete) DIC value is obtained for r ¼ 2 across all different joint models, and hence we consider r = 2 for our analysis.
All our computations for this analysis are performed in R. In a Windows 10, i5 processor machine it takes nearly 24 hours for the complete analysis (including the posterior predictive inference in Section 4.4).

Model selection
We compare the performance of several competing models, and select the one which provides the "best" fit and "best" prediction for our data.In the joint modeling literature, it is of great importance to evaluate the prediction accuracy and discriminative capability of a model.Hence, we also compute the prediction error (expected error of predicting future events) and the area under the receiver operating characteristic curve (AUC) for different models under consideration.Note that AUC measures how effectively a joint model can discriminate the patients for which a relapse occurred from the patients with no relapse.Both there measures are computed within JMbayes package (Rizopoulos 2016).
From equation (5) in Section 3.4, recall that π i ðt þ ΔtjtÞ denotes the probability that the i-th patient will be event-free (no relapse) upto time t þ Δt given that it is event-free until time t.For a randomly chosen pair of patients ½i; j� who are event-free until time t, the discriminative power of a model is assessed by computing AUC as given below: This means that in a fixed time interval ðt; t þ Δt� if a relapse occurs for the i-th patient but the j-th patient is event-free upto time t þ Δt, then the model must assign a higher non-relapse probability to the j-th patient.We use this criterion to compare different competing models along with the prediction error.
First, we consider random intercepts only in the longitudinal submodel, and use the expected longitudinal outcomes as predictors in the survival submodel (as shown in equation 2).In other words, we specify W ik ðt ij Þ ¼ a ik , where a i ¼ ½a i1 ; a i2 ; a i3 � T ,Nð0; D 1 Þ, as mentioned in Section 3.1, and specify the PH model given in equation ( 2) for the time-to-event.For the covariance matrix D 1 , we consider an Inverse Wishart (4, M 1 ) prior, where M 1 is a diagonal matrix whose diagonal elements are generated from Gamma (0.5,0.01) distribution.We refer to this model as Model I.
Second, we consider random intercepts and random slopes of time in the longitudinal submodel.Specifically, we consider ; D 2 Þ (refer to Section 3.1), and the model given in equation ( 2) is used for the time-to-event.For the covariance matrix D 2 , we consider an Inverse Wishart (7, M 2 ) prior, where M 2 is a diagonal matrix whose diagonal elements are generated from Gamma (0.5,0.01) distribution.We refer to this model as Model II.
Third, we specify W ik ðt ij Þ ¼ a ik in the longitudinal submodel given in equation ( 1), and consider the PH model given in equation ( 3) for the time-to-event.We specify W � i ðtÞ ¼ c i , and assume that the vector of random effects ½a i1 ; a i2 ; a i3 ; c i � T ,Nð0; D 3 Þ.Note that the association between the time-toevent and the biomarkers is captured by the covariance matrix D 3 .For D 3 we consider an Inverse Wishart prior similar to Models I and II.We refer to this model as Model III.
Finally, we specify in the longitudinal submodel given in equation ( 1), and consider the PH model given in equation ( 3) for the time-to-event assuming W � i ðtÞ ¼ c i þ d i t.Thus, we consider patient-specific random intercepts and random slopes of time in the PH model.Further, we assume that ½a T i ; b T i ; c i ; d i � T ,Nð0; D 4 Þ, and the covariance matrix D 4 captures the association between the biomarkers and the time-to-event.This model is referred to as Model IV.
Table 1 shows prediction error (PE), and AUC values for the four competing models (with r = 2 for all specifications) for t = 100 (duration of treatment) and for three different choices of Δt.We notice that Model II provides higher AUC values and lower PE values than the other three models, consistently.Hence, we select Model II as the "best model" for our dataset.
Next, we fit Model II to our data and compute the DIC as discussed in Section 4.1.Additionally, we fit two separate models.For the longitudinal biomarkers, we fit the linear mixed models given in equation ( 1), and for time-to-event we use the PH model (similar to equation ( 2)) with the biomarkers (observed) and the fixed covariates as predictors.We compute the DIC for these two models, and add them to get the combined DIC.The DIC value for the joint model is 371.5, where the combined DIC for the separate modeling is 493.8.Thus, we conclude that the proposed joint model gives a better fit to our data.

Findings
In Table 2, we summarize the estimated coefficients and 95% Bayesian credible intervals (based on MCMC iterations) for the longitudinal submodel.We consider a covariate as "significant" if the corresponding estimated 95% CI does not contain a zero (Das 2016).
We notice that two medicines are significant for all the three biomarkers.We also note that the effects of 6MP are negative and effects of MTx are positive for all the three biomarkers.The NCI risk, presence of bulky disease, presence of CNS disease, morphological remission, and Day 35 risk are significant for all the three biomarkers.Age and gender are significant for ANC and platelet count.Risk at presentation is significant only for WBC count.Lineage (T/B cells) and risk at day 8 are significant for WBC count and platelet count.MRD status is significant only for platelet count.
In Table 3, we summarize the effects of the time-invariant covariates, and the biomarkers on the relapse time.Interestingly, we notice that ANC and platelet count are significant for the relapse time but WBC count is not significant.Additionally, the effects of ANC and platelet count are negative, indicating that a higher (mean) ANC and a higher (mean) platelet count will result in a reduced hazard (i.e. a lower probability of relapse).Note that in Table 2 we noticed that 6MP has negative effects, and MTx has positive effects on the biomarkers.Combining the findings, we infer that a lower dose of 6MP and a higher dose of MTx can be recommended for reducing the relapse probability.Among the other covariates, except age, all the other covariates are significant for the relapse time.Lineage, risk at day 8, MRD status, and morphological remission have negative effects, and all the other covariates have positive effects.
In Figure S.6 (in the web-appendix), we plot the (time-varying) correlations among the biomarkers.We notice that the correlation between ANC and the platelet count is mostly higher than the other two correlations.However, the correlation between any two biomarkers decrease over time during treatment, possibly due to the effects of the medicine.

Posterior predictive inference
Next, we investigate the specific effects of some significant covariates on the longitudinal and the event time outcomes through posterior predictive distributions.Posterior predictive checks are used for assessing the model fit with missing and latent data (Gelman et al. 2005).Let ðY rep i ; S rep i Þ and ðY obs i ; S obs i Þ, respectively, denote the replicated and the observed datasets for both the longitudinal and event time outcomes, and let Θ denote the set of all model parameters (including random effects) in the proposed joint model.The posterior predictive distribution of the replicated data conditional on the observed dataset is given as follows: For our analysis, we first fix a specific covariate of interest (for example, fix gender as male), and then sample the other covariates from their respective empirical distributions (based on the given dataset).Model parameters are sampled from their respective posterior densities.We sample data for 50,000 patients over 277 time points, where the biomarkers are measured for the first 30 time points (the average number of visits in the actual dataset).However, we consider a longer follow-up period (than the data at hand) for a better prediction of relapse.Details on the posterior predictive computations are given in Section 3.2 of the web-appendix.
In Figure 2, we show the mean predicted (log transformed) ANC, and the mean predicted platelet count for two genders.We also show the predicted (median) non-relapse probabilities in this figure.We notice that the mean ANC and the mean platelet count for females are consistently higher than those for males.The jumps (for both these plots) at week 30 is due the fact that no treatment is given after that week.Additionally, we observe that the non-relapse probabilities are higher for females over time.This is consistent with the results shown in Table 3, where we saw that the effects of (mean) ANC and (mean) platelet count are negative on hazard.
Next, we investigate the effect of NCI risk group on the biomarkers and the relapse time.In Figure 3, we notice that ANC for "high-risk" (HR) group is uniformly higher than "standard-risk" (SR) group, but the trend is the reverse for the platelet count.Additionally, we see that the non-relapse probability for the HR group is higher than the SR group.It is possibly because the patients in the HR group (at presentation) are treated more carefully than those in the SR group.But this is indeed a very interesting finding from our analysis.
Next, we focus on the two drugs which are found to be significant for the biomarkers.The doses are grouped into high, medium, and low; based on the 0.75 quantile and the 0.40 quantile of their respective empirical distributions.We consider all the nine different dose combinations (e.g.high-high, medium-low etc.), and plot the estimated mean curves for each group.The dose combination "high-low" was rarely given (in the actual dataset), and hence we do not include it here.If a particular dose combination is given to a particular patient only for more than 15 weeks (i.e. more than 50% of the times), then we assign the patient to that specific dose group.The patients who are not treated with one specific dose combination at least for 15 weeks are discarded, and we end up getting nearly 38,000 patients who are given a particular dose combination at least for 15 weeks.In Figure 4, we notice that a low dose of 6MP and a high dose of MTx result in the highest ANC across the weeks.On the other hand, high doses for both the medicines result in the lowest ANC.For the platelet count, we notice a decreasing trend for all the groups.The curve for "low-medium" group is consistently higher while that for "high-high" group is consistently lower than the other groups.In Figure 4, it is noted that, in general, higher ANC and higher platelet counts are observed for low (or medium) dose of 6MP and high (or medium) dose of MTx.Thus, on the average, we recommend a lower dose of 6MP and a higher dose of MTx during treatment.Finally, we focus on the risk-at-presentation which was found to be significant for the relapse time (but not significant for ANC and platelet counts).There are three groups, i.e. high risk (HR), standard risk (SR), and intermediate risk (IR) groups.In Figure 5, we observe that the (median) non-relapse probability is higher for the HR group, and it is the lowest for the IR group.A similar trend was observed for NCI risk groups (in Figure 3), and this again reflects that the patients assigned to the HR group (at presentation) show higher non-relapse probabilities in the follow-up period.Rhein et al. (2011) reported a similar inference based on their analysis on ALL patients.This interesting (and counter-intuitive) inference needs further investigation.

Subject-wise prediction of relapse probability
We compute the relapse probability for each patient (for which no relapse is observed during treatment) at different time points in the follow-up period.This is computed using JMbayes package in R. In Figure 6, we show the (predicted) relapse probabilities for some randomly selected patients.For each patient, the trajectory starts when the treatment (for that patient) ends.Conditional on the data (on biomarkers and the predictors) available until the end of the treatment, we compute the relapse probabilities for each patient.The relapse probability increases over time, as expected, and we group the patients based on their maximum (predicted) relapse probability corresponding to the last time point in the follow-up period.For the "low-risk" group, the predicted relapse probabilities are all smaller than 0.10.For the "moderate-risk" group, those are between 0.10 and 0.40.All the other patients belong to the "high-risk" group.The thresholds (0.10 and 0.40) are chosen based on doctors' recommendation.We note that for 85% of the children belonging to the "high-risk" group relapse occurred (during follow-up) in the actual dataset.For the "moderate risk" group and the "low-risk" group, the occurrences are 25% and 7%, respectively, indicating a good predictive power of our model.

Simulation study
We perform a simulation study for assessing the effectiveness of Bayesian imputation of the missing responses in joint modeling.We simulate a dataset which is quite similar to the ALL dataset.We consider three biomarkers measured from a set of 200 subjects over twenty evenly spaced time points.We consider ten covariates, two of them are time-varying.Biomarkers are simulated using equation (1).The details for the simulation of the covariates and the longitudinal biomarkers are given in the web-appendix.At time T = 20, the treatment phase is over, and the follow-up period starts.
We simulate time-to-event for the subjects assuming that the subjects are censored at T = 50, when the study ends.Thus, the patients are followed for the next 30 time points after the end of treatment.We use equation ( 2) for generating the time-to-event, the values of the model parameters (used for data generation) are given in the web-appendix.Once the complete dataset is generated, we randomly create some missing values in the three biomarkers.We consider 32% missing values in one biomarker, 5% missing values in the second biomarker, and 4% missing values in the third one (similar to the ALL dataset).For each biomarker, we first randomly select some subjects for which we create some missing observations again at some randomly selected time points.We use "missMethods" library in R for creating missing biomarkers in a complete dataset.We consider three alternative approaches.First, we consider only the subjects with no missing values, and we refer to this approach as "Completers only".Second, we consider the available dataset where the missing observations are treated as missing values and we do not impute those.This approach is referred to as "Available data only".Finally, we consider the Bayesian imputation using the joint model.In all the three approaches, the model parameters are estimated using 10,000 MCMC iterations.
Next, based on 200 replications we estimate the average bias and average width of the 95% Bayesian credible intervals (and the coverage probabilities) of the regression coefficients.In Table 4 we show the  results for a selected set of coefficients.We notice that the "Completers only" approach performs the worst since it results in a larger average bias and a wider credible interval almost for all the coefficients.Bayesian imputation of the missing biomarkers using the joint model is worth since this approach results in the smallest average bias, and the shortest credible intervals with reasonable coverage probabilities.In Figure S.7 (in the web-appendix), we show the true survival (no event) curve, and the estimated (mean) survival curves from the three competing approaches.We note that the Bayesian imputation method provides an estimated (mean) survival curve which is closest to the true curve.There is some overlaps between the estimated curves from "Available data only" and "Bayesian imputation" approach.As expected, "Completers only" approach provides "not so good" estimate for the survival curve.Thus, our simulation study illustrates that for our setting it is indeed worth of imputing the missing biomarkers using the joint model.

Discussion
Despite the significant improvement in its survival rate over the years, ALL is still globally considered as the main cause of death from cancer among children.ALL typically occurs more often in Caucasians, Hispanics, and Latin Americans than in Africans (Renbarger et al. 2008), and therefore, there exists a vast literature on ALL for the United States and for the European countries.However, there is a lack of similar significant work for India, and in general for most of the Asian and the African countries.In this paper, we present a comprehensive study for the Indian children (diagnosed as ALL patients) and address some interesting research questions.Specifically, we find that the female patients show a higher ANC and platelet counts; and consequently the (median) non-relapse probabilities are higher for them.We also infer that the WBC count is not directly associated with the relapse time.Additionally, we find that a relapse in the follow-up period is less probable for the children in "high-risk" group (at presentation).This particular finding definitely needs further investigations.
However, we note that our current analysis ignores the genetic factors associated with the development and progression of ALL.Additionally, assessing the effects of different covariates at different quantile levels (instead of the mean level) of the biomarkers can provide some interesting findings.We keep these as our future research works.

Figure 1 .
Figure1.Longitudinal trajectories of the three biomarkers for four randomly selected children.Solid lines represent trajectories for the patients with a relapse (in the follow-up period), and the dotted lines are for those with no relapse in the follow-up period.

Figure 2 .
Figure 2.Estimated mean curves for the neutrophil count, platelet count; and the estimated (median) non-relapse probabilities for the two genders (M/F) in ALL data analysis.

Figure 3 .
Figure 3.Estimated mean curves for the neutrophil count, platelet count; and the estimated (median) non-relapse probabilities for the two NCI risk groups [High-Risk (HR), and Standard-Risk (SR)] in ALL data analysis.

Figure 4 .Figure 5 .
Figure 4.Estimated mean curves for the neutrophil count and platelet count for different medicine doses in ALL data analysis.The curve with label (ij) refers to the i-th level of 6MP and the j-th level of MTx, where i; j=high (H), medium (M) or low (L).

Figure 6 .
Figure6.Subject-wise predicted relapse probabilities for some randomly selected patients in ALL dataset.

Table 1 .
Model selection in ALL data analysis.Prediction Error (PE), and AUC values (for t = 100, and Δt ¼ 50; 100; 150) are given for the four competing models, described in Section 4.2.

Table 3 .
Estimated coefficients and corresponding 95% CIs for the covariates in the time-to-event submodel in ALL data analysis.

Table 4 .
Average bias, width of 95% CIs, and the estimated coverage probability (C.P.) for a set of regression coefficients for the three competing approaches in the simulation study.