Time-to-Event Analysis with Unknown Time Origins via Longitudinal Biomarker Registration

Abstract In observational studies, the time origin of interest for time-to-event analysis, such as the time of disease onset, is often unknown. Existing approaches to estimating time origins are commonly built on extrapolating a parametric longitudinal model, and rely on rigid assumptions that can lead to biased inferences. In this paper, we introduce a flexible semiparametric curve registration model. It assumes the longitudinal trajectories follow a flexible common shape function, with the person-specific disease progression pattern characterized by a random curve registration function that is further used to model the unknown time origin as a random start time. This random time serves as a link to jointly model the longitudinal and survival data; the unknown time origins are integrated out in the joint likelihood function, which facilitates unbiased and consistent estimation. Since the disease progression pattern naturally predicts the time-to-event, we further propose a new functional survival model using the registration function as a predictor of the time-to-event. The asymptotic consistency and semiparametric efficiency of the proposed models are proved. Simulation studies and two real data applications demonstrate the effectiveness of the new approach. Supplementary materials for this article are available online.


Introduction
In time-to-event analysis, a well-chosen time origin should be a comparable start point for all individuals (Cox and Oakes 1984, chap. 1.2). In a randomized clinical trial, the start time of the trial is a natural choice. In observational studies, however, the time of study entry is often arbitrary in terms of disease progression, and may not be a comparable time origin. Time-to-event analysis in the absence of a well-defined time origin is difficult to interpret and prone to biases (e.g., Cnaan and Ryan 1989; Korn, Graubard, and Midthune 1997; Cain et al. 2011). A natural time origin in this case would be the time at which an individual's symptoms meet certain criteria of severity, such as the time of disease onset. But such a time is usually unknown due to delayed entry. Instead, it is common to observe a longitudinal biomarker trajectory for each subject. By study design, this biomarker is typically a proper surrogate of the disease progression that can be used to define a comparable time origin. Most existing methods for this problem rely on a rigid parametric longitudinal model to estimate the time origins (e.g., Berman 1990; DeGruttola, Lange, and Dafni 1991; Taffé and May 2008; Drylewicz et al. 2010; Sommen et al. 2011), and then impute the estimates into a survival model in a separate step (e.g., Munõz et al. 1992; Geskus 2001). This two-step approach is potentially biased because the estimation errors in the first step are typically ignored in the second step.

CONTACT Tianhao Wang, tianhao_wang@rush.edu, Rush Alzheimer's Disease Center, Department of Neurological Sciences, Rush University Medical Center, Chicago, IL 60612.

In this article, we
Supplementary materials for this article are available online. Please go to www.tandfonline.com/r/JASA.
introduce a flexible semiparametric curve registration model by which the unknown time origins are modeled as random times. This random time origin serves as a shared parameter between the longitudinal and survival models, and is integrated out in the joint likelihood function. We propose to estimate the two models simultaneously by the joint likelihood function, which facilitates unbiased inferences.
Our research was motivated by the vaginal birth after cesarean (VBAC) study (Macones et al. 2005), a large multicenter retrospective cohort study of women who had had a prior cesarean in 1996 through 2000. A case-control study for uterine rupture was nested within this cohort study and used to provide the cervical dilation measurement as a marker for labor progression. The major objective of this study was to investigate the risk factors for prolonged laboring time. The study entry was defined as arrival at the hospital, and the time in minutes from study entry is the observational time scale. The cervical dilations were measured repeatedly, every few hours or minutes as needed, up to delivery. The survival outcome of interest was the time to vaginal delivery. Rupture cases all resulted in C-section and were treated as right-censored. Because only women with the potential for successful vaginal delivery were allowed to attempt VBAC, we assume there is a latent vaginal delivery time for the rupture cases. Figure 1 shows a sample of the observed longitudinal dilation trajectories. Most trajectories follow a similar pattern that is not generally known to follow any a priori specified functional form. A flexible longitudinal model is needed to accurately capture the dynamics of the dilation trajectories. More importantly, since the trajectories started at different dilation levels, the study entry is not a well-defined time origin for analyzing the overall laboring time. It is desirable to take advantage of the dilation data to define a biologically meaningful time origin that is comparable across subjects.
The unknown time origins are typically located before the time of study entry, known as delayed entry (e.g., Keiding and Moeschberger 1992). There are two scenarios associated with delayed entry as classified by Cain et al. (2011): (a) left-censoring, where subjects enter the study with unknown time origins before study entry, and (b) left-truncation, where subjects are not included in the study because they had events before study entry. The left-censoring defined here is for the time origins, which is different from the left-censoring defined in Kalbfleisch and Prentice (2002, chap. 1.3, p. 14) for the time-to-event. In this paper, we focus on the case of left-censoring, where the key problem is how to handle the unknown time origins. Our framework also allows the unknown time origin to be within the observational window, but assumes there is no left-truncation. This is consistent with the VBAC study described above, where successful VBAC (event) before hospitalization (study entry), that is, left-truncation, was extremely rare. The left-truncation scenario inherently involves a difficult selection bias issue even if the time origins are known (see, e.g., Brookmeyer and Gail 1987; Wang, Brookmeyer, and Jewell 1993; Asgharian, M'Lan, and Wolfson 2002; Shen, Ning, and Qin 2009; Su and Wang 2012), which is beyond the scope of this article.
In the medical literature, a popular approach is to still use the study entry as the time origin but adjust for the biomarker as a covariate, assuming that it properly controls the variation due to arbitrary entry. This method dates back to Farewell and Cox (1979). Korn, Graubard, and Midthune (1997) and Lee and Betensky (2018) showed that under a proportional hazards model this method can recover the true regression coefficient if the baseline hazard is an exponential function and the biomarker trajectory follows a linear function of time. Thiébaut and Bénichou (2004) and Pencina, Larson, and D'Agostino (2007) showed that the estimation would be biased if the baseline hazard is not of exponential form in a Cox model. In our simulation, we show biased results of this covariate adjustment method in an accelerated failure time (AFT) model.
Since biomarkers are typically chosen to reflect the disease progression, a more natural way of using the biomarker information is to define a comparable time origin based on the longitudinal biomarker profile. In this article, we propose a new semiparametric random-effect curve registration model. We assume that there exists a standard time domain on which every trajectory is comparable and follows a common shape function. The standard domain is linked to the observational time domain by a subject-specific curve registration function. The unknown time origins in the observational domain are assumed to correspond to a fixed time point in the standard domain, which ensures they are comparable. Both the curve registration functions and the unknown time origins are modeled as random. In contrast to the two-step approaches, where time origins are modeled as fixed missing data to be imputed in the survival model, the random time origins proposed here are used as the link for joint modeling of the curve registration model and the survival model, and are integrated out in the joint likelihood function. This ensures unbiased and consistent estimation, similar to the classical joint modeling methods (e.g., Tsiatis and Davidian 2004; Zeng and Cai 2005; Tseng et al. 2015). Because the disease progression pattern is a natural predictor of the overall time to event, we further propose a functional survival model using the registration function as a predictor of the time-to-event. In addition, our framework allows the observation times to be related to the past history of biomarker values, following the same condition as Lipsitz et al. (2002).
To implement our model, we approximate the common shape function and the curve registration function by different B-spline expansions. The registration function must be monotonic to ensure a one-to-one correspondence between the standard and observational domains. This can be achieved by constraining the B-spline coefficients to be strictly increasing if equally spaced knots are used (De Boor 2001). However, ensuring this constraint in semiparametric registration/warping is a difficult problem. For example, Gervini and Gasser (2004) added an extra ad hoc step in each iteration of their algorithm to ensure monotonicity. In the context of parametric nonlinear mixed-effects modeling, Brumback and Lindstrom (2004) proposed to use the Jupp transforms to convert the strictly increasing coefficients into an unconstrained vector. But their method requires fixed starting and termination time points for all trajectories, which is not directly applicable to data with unknown time origins. In this article, we introduce a novel transformation method that ensures monotonicity with unconstrained random parameters in curve registration, is numerically easier to implement, and at the same time accommodates unknown time origins and flexible termination times of observation. The asymptotic consistency and semiparametric efficiency of the proposed method are proved.
The rest of this article is organized as follows. In Section 2 we describe the curve registration and survival models. In Section 3, a sieve EM algorithm is proposed to estimate the parameters based on B-spline approximations and a joint likelihood function. In Section 4, we discuss the asymptotic properties of the proposed estimator. The finite sample performance of the estimator is assessed by two simulated examples in Section 5. In Section 6 the proposed method is applied to the VBAC data. In addition, we present another application to the Medfly data (Carey et al. 1998; Tseng, Hsieh, and Wang 2005). This dataset contains complete longitudinal data from birth to death, so we can create artificial left and right censoring to examine the performance of the proposed method. We conclude the article with some final remarks in Section 7.

Model
Suppose that we observe biomarker measurement y_ij at time t_ij, where i = 1, ..., n indexes subjects and j = 1, 2, ..., m_i indexes longitudinal observations within subject i. Let y_i = (y_i1, ..., y_im_i) denote the vector of the longitudinal biomarker trajectory for the ith subject, and t_i = (t_i1, ..., t_im_i) denote the vector of measurement times, which may be unequally spaced. We assume time t_ij is defined in an observational time domain T that comes naturally from the study design, for example, time from study entry, calendar time, etc. Without loss of generality, we assume t_i1 is the time of study entry and y_i1 is the baseline biomarker value.
Let t_{i,E} denote the time point in T when the event happens for subject i, and t_{i,C} denote the time point of right-censoring. Only r_i = min{t_{i,E}, t_{i,C}} and the event indicator δ_i = I(t_{i,E} ≤ t_{i,C}) are observed. The traditional time to event is defined as the time from study entry to event, that is, t_{i,E} − t_{i1}. When the time of study entry t_{i1} is arbitrary in terms of disease progression, as in many observational studies, this traditional time to event may not be comparable across subjects, and the results are often difficult to interpret. In time-to-event analysis, a well-chosen time origin should align subjects such that they share the same risk profile at this point (Cox and Oakes 1984). Define such a time origin as t_{i,0}. The realigned time to event is defined as S_i = t_{i,E} − t_{i,0}. This realigned time to event is a more comparable target for survival analysis than the observed survival time from the arbitrary study entry. However, t_{i,0} is usually not available, or not even properly defined, in observational studies.

Curve Registration for Longitudinal Data
To leverage the longitudinal data to define a comparable time origin, we assume there exists a standard pattern of disease progression μ(τ) defined on a standard τ-domain that is the same for all subjects. We then assume the individual longitudinal trajectories follow this pattern with a subject-specific timing of progression, determined by a monotonic registration function g_i(τ) that maps the τ-domain into the observational T domain. Specifically, we assume that the longitudinal data follow the curve registration model

y_ij = μ(τ_ij) + e_ij, with τ_ij = g_i^{-1}(t_ij), (1)

where μ(τ) is an unknown smooth function, τ_ij is a warped time point related to the calendar time point t_ij via the unknown, smooth, and monotonic function g_i(·), and e_i is the vector of measurement errors. Without loss of generality, we map t_ij into (0, 1) and assume τ_ij ∈ (0, 1). To estimate μ(·) and g_i(·), we employ B-spline approximations; details are postponed to Section 3.1.
A similar model has been studied in the literature for the purpose of aligning functional data with phase variabilities; see, for example, Gervini and Gasser (2004). Identifiability is a central problem for this type of model. We prove the identifiability of model (1) following a similar idea as Gervini and Gasser (2004); details are given in the supplementary material. Moreover, ensuring monotonicity of g_i(·) in this model is nontrivial; we propose a novel method for this problem, as detailed in Section 3.1.

Curve Registered Accelerated Failure Time Model
The common curve μ(τ) and the standard τ-domain offer a comparable anchor for all subjects. Each τ represents a stage of disease that is comparable across subjects. We can choose a τ_0 in the τ-domain to represent a typical stage of the disease, and define a new time origin by the subject-specific time of attaining this stage, that is, t_{i,0} = g_i(τ_0). Based on the curve registration model, all subjects would have the same underlying biomarker value μ_0 = μ(τ_0) at their new time origin t_{i,0}, which makes t_{i,0} more comparable than the arbitrary time of study entry t_{i1}.
We assume the realigned time to event S_i follows a semiparametric linear model:

H(S_i) = z_i^T ξ + ε_i, (2)

where H is a predefined monotonic transformation function, z_i is a q-dimensional vector of time-independent covariates, and the ε_i are independent and identically distributed with an unknown distribution. When H(·) is chosen to be log(·), we call model (2) the Curve Registered Accelerated Failure Time (CRAFT) model, which extends the classical AFT model (Cox and Oakes 1984; Kalbfleisch and Prentice 2002; Ding and Nan 2011) to the problem of unknown time origins.
Remark 1. The time origin t_{i,0} = g_i(τ_0) is modeled as a random time point because g_i(·) is a random registration function, as assumed in model (1). This is the key difference between the proposed CRAFT model and the traditional two-step methods (Munõz et al. 1992; Geskus 2001), which treat the unknown time origins as missing data and impute them as fixed time points.
In that traditional setting, the unknown time origins should be observable for the imputation methods to be valid, and these two-step methods may still be biased. When modeled as random, the time origin t_{i,0} need not be observable for model (2) to be estimable. Instead, we only need the conditional distribution of t_{i,0} given the observed data to be bounded and bounded away from 0. This is ensured by a proper distribution of g_i(·), detailed in Section 3.1. Given a proper distribution of t_{i,0}, we use it as the link joining model (1) and model (2), where the random time origin t_{i,0} is integrated out in the joint likelihood function. Details are presented in Section 3.2. Similar to the classical joint modeling approaches (see, e.g., Tsiatis and Davidian 2004; Zeng and Cai 2005; Tseng et al. 2015), the proposed joint model offers unbiased and consistent inferences.
Remark 2. Although model (2) can be unbiasedly estimated with any given τ_0, a reasonable choice of τ_0 is crucial for the efficiency of the analysis. If τ_0 corresponds to a disease stage that happened long before the observational time window for most subjects, the distribution of the random time origin would rely on extrapolation and may not be very accurate. In this case, one can select a more recent disease stage that is not very far from the observational window or is partially observable. This stage can also be defined in terms of the biomarker value, such as dilation at 2 cm in our VBAC example or the first time a fly laid 10 eggs in the Medfly example.
Remark 3. The CRAFT model is a semiparametric model with a nonparametric distribution of ε_i (see Sections 3.1 and 4.1 for details). One may also wish to perform the semiparametric analysis using the Cox model based on the partial likelihood or nonparametric maximum likelihood estimation (NPMLE). However, because g_i(·) is modeled as a random function, the realigned time to event involves a random component g_i(τ_0), which leads to a random risk set in the Cox model. This imposes a great challenge on the model estimation. Details of the challenge are discussed further in Section 3, after we propose the EM algorithm.

Functional Curve Registered Accelerated Failure Time Model
Because the longitudinal profile of the disease marker is often associated with the time-to-event, it is also of great interest to incorporate the longitudinal trajectory μ(g_i^{-1}(t)) into the CRAFT model. Note that the subject-specific information of the longitudinal trajectory is fully captured by the registration function g_i(τ). So it makes sense to use either μ(g_i^{-1}(t)) or g_i(τ) as a covariate in the model. The registration function g_i(τ) is preferred here because it is computationally more tractable than the complicated term μ(g_i^{-1}(t)). In addition, the domain of g_i(τ) is well-defined on [0, 1] and comparable across subjects.
Taking advantage of the standard τ-domain shared by all g_i(τ), we propose a functional survival model that adds a functional linear regression term to model (2):

H(S_i) = z_i^T ξ + ∫_0^1 β(τ) g_i(τ) dτ + ε_i, (3)

where β(τ) is a fixed unknown coefficient function. When H(·) is chosen to be log(·), we call model (3) the Functional CRAFT model. In model (3), the subject-specific registration function g_i(·) appears on both sides of the equation. Note that τ_0 is either predetermined or converges to a fixed point if defined through μ_0. The identifiability of the model follows from the identifiability of g_i(·). Equations (3) and (1) together form a joint model linked by the random registration function g_i(·), which is integrated out in the joint likelihood.
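As a small numerical illustration of the functional term ∫_0^1 β(τ) g_i(τ) dτ in model (3), the sketch below evaluates it by the trapezoidal rule, the rule used elsewhere in this article; the particular β and g_i here are stand-ins with a known exact integral, not estimates from the model.

```python
import numpy as np

# Functional linear predictor: \int_0^1 beta(tau) * g(tau) d tau, trapezoidal rule.
tau = np.linspace(0.0, 1.0, 1001)
beta = np.sin(np.pi * tau)   # stand-in coefficient function beta(tau)
g = tau ** 2                 # stand-in monotone registration g_i(tau)
y = beta * g
functional_term = np.sum((y[1:] + y[:-1]) / 2 * np.diff(tau))
```

For this stand-in pair the exact value is 1/π − 4/π³, so the trapezoidal approximation can be checked directly.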

Estimation
Since model (3) includes model (2) as a special case, we focus on the estimation of models (1) and (3).

B-spline Approximations
We assume the common shape function μ(τ) in the curve registration model (1) is a smooth function and approximate it using a B-spline expansion. Specifically, we assume μ(τ) = B_μ(τ)^T v, where B_μ(·) is a vector of B-spline basis functions and v is the vector of corresponding B-spline coefficients. Let V_i(θ_e) = σ_e^2 I_{m_i} denote the covariance matrix of e_i. By model (1), the observed longitudinal trajectory of the ith subject is a fragment of the common shape function, projected from the standard τ-domain to the observational T domain by the registration function g_i(·). To ensure the monotonicity of g_i(τ), we assume g_i(τ) = B(τ)^T a_i, where B(τ) = (B_0(τ), ..., B_{K+1}(τ))^T is a vector of B-spline basis functions with equally spaced interior knots. The B-spline coefficients a_i = (a_{i,0}(u_i), ..., a_{i,K+1}(u_i))^T are strictly increasing transforms of an unconstrained random vector u_i = (u_{i1}, ..., u_{iK})^T, so there are only K free parameters in a_i.
Under the above parameterization, the registration function g_i(·) is a bijection between [0, 1] and [0, 1] with g_i(0) = 0 < t_{i1} < t_{i m_i} < 1 = g_i(1). Notably, the monotonicity of g_i(·) is guaranteed by the monotonicity of a_{i,k}, k = 0, ..., K+1, without putting any restriction on u_i. We assume u_i ∼ N(0, Σ_u). The zero mean ensures identifiability of the warping function g_i(·).
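To illustrate the idea of generating a monotone registration function from an unconstrained vector, the sketch below builds strictly increasing B-spline coefficients from u_i via normalized cumulative sums of exponentials; this particular transform is an illustrative stand-in, not necessarily the transformation proposed in this article.

```python
import numpy as np
from scipy.interpolate import BSpline

def monotone_warp(u, degree=3):
    """Map an unconstrained vector u (length K) to a monotone B-spline
    g: [0, 1] -> [0, 1] with g(0) = 0 and g(1) = 1. Increasing coefficients
    come from normalized cumulative sums of exp(.), an illustrative transform
    (not necessarily the paper's)."""
    K = len(u)
    inc = np.exp(np.r_[0.0, u])                   # K + 1 positive increments
    a = np.r_[0.0, np.cumsum(inc) / inc.sum()]    # a_0 = 0 < ... < a_{K+1} = 1
    n_knots = (K + 2) + degree + 1                # ncoef + degree + 1
    interior = n_knots - 2 * (degree + 1)         # equally spaced interior knots
    knots = np.r_[np.zeros(degree + 1),
                  np.linspace(0.0, 1.0, interior + 2)[1:-1],
                  np.ones(degree + 1)]
    return BSpline(knots, a, degree)

g = monotone_warp(np.array([0.3, -0.5, 0.8, 0.1]))
tau = np.linspace(0.0, 1.0, 200)
vals = g(tau)
```

Because the coefficient differences are all positive, the spline is strictly increasing for any real u, which is the point of such a reparameterization.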
For the survival model (3), the coefficient function β(·) is approximated by the same basis functions as g_i(τ): β(τ) = B(τ)^T b. Without loss of generality, we define b_0 = b_{K+1} ≡ 0 to ensure identifiability. As to the residual term ε_i, following the idea of Ding and Nan (2011), we propose to approximate the log-hazard rate function of ε_i, denoted by ψ(·) = log λ_ε(·), by another B-spline expansion B_ψ(t)^T γ, where B_ψ(·) is another vector of B-spline basis functions. Then we have λ_ε(t) = exp{B_ψ(t)^T γ} and Λ_ε(t) = ∫_0^t exp{B_ψ(s)^T γ} ds, which ensures the positivity of λ_ε(t) and the monotonicity of Λ_ε(t) without putting any restrictions on γ.
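The point of the log-hazard parameterization is that positivity of λ_ε and monotonicity of Λ_ε hold for any unconstrained γ. A minimal sketch, with an illustrative basis and made-up γ:

```python
import numpy as np
from scipy.interpolate import BSpline

# Log-hazard psi(t) = B_psi(t)'gamma, so lambda(t) = exp{psi(t)} is positive and
# Lambda(t) = \int_0^t lambda(s) ds is increasing, for ANY real gamma.
# Basis order and knots below are illustrative, not the paper's exact choice.
degree, n_interior = 3, 1
knots = np.r_[np.zeros(degree + 1),
              np.linspace(0.0, 1.0, n_interior + 2)[1:-1],   # interior knots
              np.ones(degree + 1)]
gamma = np.array([-0.2, 0.5, -1.0, 0.3, 0.1])                # unconstrained coefficients
assert len(gamma) == len(knots) - degree - 1                 # ncoef = nknots - degree - 1

s = np.linspace(0.0, 1.0, 401)
lam = np.exp(BSpline(knots, gamma, degree)(s))               # hazard, positive by construction
# cumulative hazard via the trapezoidal rule, as used in the paper
Lam = np.r_[0.0, np.cumsum((lam[1:] + lam[:-1]) / 2 * np.diff(s))]
```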
The parameters of interest in the curve registration model are summarized as θ_y = (μ(·), Σ_u, θ_e), where we fix the number of knots for g_i(τ) while allowing the number of interior knots for μ(τ) to go to infinity as n goes to infinity. The parameters of interest in the survival model are denoted by θ_S = (b, ξ, λ_ε(·)), where we fix the number of knots for β(τ) to be the same as for g_i(τ) while allowing the number of interior knots for λ_ε(·) to go to infinity as n goes to infinity; see Section 4.1 for details.
Remark 4. In practice, we need to specify the numbers of knots for μ(τ), g_i(τ), and λ_ε(·), which can in principle be chosen using model selection criteria such as AIC and BIC. However, joint estimation can be computationally challenging. We select the number of knots for μ(τ) by a preliminary analysis of the longitudinal data using the AIC method proposed by Elmi et al. (2011). Brumback and Lindstrom (2004) and Ding and Nan (2011) showed that g_i(τ) and λ_ε(·) in general cannot be too flexible, and two or three interior knots for g_i(τ) and one interior knot for λ_ε(·) are usually sufficient.

The Joint Likelihood Function
Model (1) and model (3) are linked by the registration function g_i(·), or equivalently u_i. In this section, we derive the joint likelihood function for simultaneous estimation of the two models.
In clinical practice, individuals may have different measurement schedules depending on previous health outcomes. For example, in a labor study, women having a faster rate of dilation could have a higher frequency of examinations. We employ a working assumption similar to that of Lipsitz et al. (2002), which allows us to ignore the distribution of t_i in the joint likelihood function under such outcome-dependent measurement schedules; details are given in the supplementary material. Under this working assumption and the curve registration model (1), conditional on g_i(·) and t_i, the density function of the longitudinal data y_i is a multivariate normal density with mean vector μ(g_i^{-1}(t_i)) and covariance matrix V_i(θ_e). For the survival model (3), we further assume the censoring time t_{i,C} is independent of ε_i and g_i(·) conditional on z_i. Conditional on z_i and g_i(·), the density function of {r_i, δ_i} takes the standard hazard and cumulative-hazard form λ_ε(·)^{δ_i} exp{−Λ_ε(·)}, evaluated at the residual implied by model (3), up to a factor f_M(r_i, δ_i, z_i), where Λ_ε(·) is the cumulative hazard function for the residual term ε_i and λ_ε(·) is the hazard rate function. The factor f_M(r_i, δ_i, z_i) only depends on the marginal distribution of z_i and the conditional distribution of t_{i,C} given z_i, and is free of θ_S = (b, ξ, λ_ε(·)). Suppose e_i and ε_i are independent conditional on g_i(·) and z_i. Then the observed likelihood function of (θ_y, θ_S) from n iid observations is proportional to the product over subjects of the integrals, with respect to the distribution of the random effects u_i, of the product of these two conditional densities. The working likelihood used for estimation is obtained by substituting the B-spline approximations of μ(·), g_i(·), β(·), and λ_ε(·) into this expression.

EM Algorithm
Maximizing (6) directly is a difficult nonconvex problem. To avoid this problem, we take advantage of the B-spline approximations and propose an EM algorithm (Dempster, Laird, and Rubin 1977), treating u_i as the missing data. The complete-data log-likelihood function is the log-likelihood of the complete data (O_i, u_i), i = 1, ..., n. Let θ^[k] denote the parameter estimates at step k and O_i = (y_i, t_i, r_i, δ_i, z_i) denote the observed data.
In the E-step, the conditional density of u_i given O_i and the current parameter estimates θ^[k] is obtained, up to a normalizing constant, from the complete-data likelihood by Bayes' rule.
For any integrable function G(u_i), the E-step requires the conditional expectation of G(u_i) given O_i and θ^[k]. We approximate these integrals by Monte Carlo integration and importance sampling, where the importance distribution is multivariate normal. Specifically, we center the importance distribution at the posterior mode û_i of the conditional density of u_i, while the scaling matrix H_i is computed as the Hessian matrix of the negative log conditional density evaluated at û_i. The importance distribution for Monte Carlo integration is then N(û_i, H_i^{-1}). In the M-step, parameters are updated by maximizing the expected complete-data log-likelihood function. Specifically, we update θ_y and θ_S by solving two separate optimization problems. The update for θ_y can be solved explicitly, whereas (b^[k+1], ξ^[k+1], γ^[k+1]) does not have an explicit solution. We solve for θ_S^[k+1] by updating (b^[k+1], ξ^[k+1]) and γ^[k+1] iteratively. Specifically, we start from (b^[k,0], ξ^[k,0], γ^[k,0]) = (b^[k], ξ^[k], γ^[k]) and, at each iteration l ≥ 1, alternate between the two blocks, repeating until convergence. The parameters at convergence are the updated estimates. The integrals involved in the algorithm are all approximated by the trapezoidal rule.
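A minimal sketch of this E-step approximation, with a toy stand-in for the conditional log-density of u_i: find the posterior mode, form the Hessian-based scale, draw from the normal importance distribution, and compute a self-normalized importance-sampling estimate of E[G(u_i) | O_i]. The toy log-density is exactly Gaussian so the answer is known; in the actual algorithm it would be the (non-Gaussian) conditional density from the joint model.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Toy stand-in for the conditional log-density of u_i given O_i (up to a constant).
center, var = np.array([0.5, -0.3]), np.array([0.2, 0.5])
def log_post(u):
    return -0.5 * np.sum((u - center) ** 2 / var)

# 1) Center the importance distribution at the posterior mode u_hat.
u_hat = minimize(lambda u: -log_post(u), x0=np.zeros(2)).x

# 2) Scale matrix H_i: finite-difference Hessian of -log_post at u_hat.
eps, H = 1e-5, np.zeros((2, 2))
for a in range(2):
    for b in range(2):
        ea, eb = np.eye(2)[a] * eps, np.eye(2)[b] * eps
        H[a, b] = -(log_post(u_hat + ea + eb) - log_post(u_hat + ea)
                    - log_post(u_hat + eb) + log_post(u_hat)) / eps ** 2
H = (H + H.T) / 2
cov = np.linalg.inv(H)

# 3) Draw M samples from N(u_hat, H^{-1}) and self-normalize the weights.
M = 5000
U = rng.multivariate_normal(u_hat, cov, size=M)
log_q = -0.5 * np.einsum('ij,jk,ik->i', U - u_hat, H, U - u_hat)
log_w = np.array([log_post(u) for u in U]) - log_q
w = np.exp(log_w - log_w.max())
w /= w.sum()

# 4) E-step expectation of any integrable G(u_i); here G(u) = u.
Eu = (w[:, None] * U).sum(axis=0)
```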

From an initial value θ^[0], the E-step and M-step are repeated until convergence. The estimate at convergence is denoted by θ̂ = (θ̂_y, θ̂_S) = (v̂, Σ̂_u, σ̂_e^2, b̂, ξ̂, γ̂). In our implementation, the initial value θ^[0] is obtained by a two-step estimator in which we first estimate model (1) as a semiparametric nonlinear mixed-effects model (Elmi et al. 2011).

Remark 5. Because of the nonlinearity in the longitudinal model, we do not have an explicit analytical solution for the E-step, where a numerical integration method must be used. This numerical integration, together with the random risk sets, imposes a major challenge on the Cox model if the partial likelihood/NPMLE is used for estimation. Note that for each subject i, we generate M random samples u_i^[j], each of which can give a different subject-specific time origin. Across the n subjects, each combination of the u_i^[j] (i = 1, ..., n; j = 1, ..., M) may result in a different sequence of events. Thus, to calculate the expectation, in total we would need to deal with M^n sequences of risk sets in the E-step. The computational burden is formidable. The challenge here mainly originates from the fact that the partial likelihood method builds on the relative order of the survival times. When the time origin is random, all subjects need to be considered simultaneously, rather than separately as in a typical E-step. In contrast, by leveraging the full likelihood expression of the semiparametric linear model, we avoid explicit consideration of these risk sets in our algorithm.

Time-to-Event Estimation
We can estimate the time to event based on model (3) by plugging in the estimated parameters. For example, if H(·) = log(·), the time to event S_i can be estimated by its conditional expectation given the observed data, evaluated at the estimated parameters and computed by Monte Carlo integration using the random importance sample obtained in the last iteration of the EM algorithm. The integrals ∫_0^1 β̂(τ) g_i(τ) dτ and E[exp{ε_i}] also need to be calculated by numerical integration. Our limited experience shows that the trapezoidal rule is sufficient for these calculations.
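For instance, E[exp{ε_i}] can be computed from the fitted hazard by the trapezoidal rule. The sketch below uses a toy constant hazard (so the exact answer is known); a spline-based hazard exp{B_ψ(t)^T γ̂} plugs into `lam` the same way.

```python
import numpy as np

# E[exp{eps}] when eps has hazard lambda_eps and cumulative hazard Lambda_eps,
# via density f = lambda * exp(-Lambda) and the trapezoidal rule.
# Toy constant hazard lambda = 2, so eps ~ Exp(2) and E[exp{eps}] = 2 exactly.
t = np.linspace(0.0, 30.0, 200001)
lam = np.full_like(t, 2.0)                   # stand-in for exp{B_psi(t)'gamma}
dt = np.diff(t)
Lam = np.r_[0.0, np.cumsum((lam[1:] + lam[:-1]) / 2 * dt)]
f = lam * np.exp(-Lam)                       # density of eps
integrand = np.exp(t) * f
E_exp_eps = np.sum((integrand[1:] + integrand[:-1]) / 2 * dt)
```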

Extension to Partially Observed Time Origins
In some studies, the time origins may be partially observed. This situation can be incorporated into the proposed framework. For subjects with observed time origins t_i^[0], we propose not to estimate them but to directly use the observed time origins in the survival model. To put both the missing and observed cases into a unified framework, we assume the observed time origins add a restriction on the registration function, g_i(τ_0) = t_i^[0]. Under this restriction, the model and estimation are essentially the same as proposed in Section 3.3. Specifically, we only need to modify the importance sampling step involved in Equation (7) with a constrained optimization of û_i over the set {u_i : g_i(τ_0; u_i) = t_i^[0]}, and draw the importance sample of u_i on this constrained support.

Asymptotic Properties
We assume the true parameters μ and ψ can take values in two infinite-dimensional function spaces spanned by continuously differentiable functions; see conditions (C.6) and (C.7) in the Appendix. In estimation, however, μ and ψ are approximated by B-spline expansions. The proposed estimator can therefore be regarded as a sieve estimator (Shen and Wong 1994) whose sieve space is spanned by polynomial splines (De Boor 2001). We prove that the sieve estimator is consistent, and that the parametric part is asymptotically normal and achieves the semiparametric efficiency bound.
Let K_μ and K_ψ be the numbers of B-spline basis functions in B_μ and B_ψ, which are used to approximate the functions μ(·) and ψ(·), respectively. In this section, we present the asymptotic properties of the estimated parameters as K_μ and K_ψ go to infinity together with the sample size n. For ease of exposition, the results do not include the parameter σ_e^2 in model (1), but similar results hold for its estimator. Without loss of generality, we assume σ_e^2 = 1. The parameter of interest is now denoted by θ = (μ(·), Σ_u, ξ, β(·), ψ(·)). Denote by p_μ and p_ψ the orders of the two B-splines, respectively. The following theorem states the convergence rate of the sieve estimator θ̂_n to the true parameter θ_0 = (μ_0(·), Σ_{u,0}, ξ_0, β_0(·), ψ_0(·)), where β_0(τ) = B(τ)^T b_0.
Theorem 1 is proved by applying Theorem 1 of Shen and Wong (1994), which was previously used by Ding and Nan (2011) for the consistency of their B-spline based time-independent AFT model. Detailed proofs are given in the supplementary material. The optimal convergence rate implied by Theorem 1 is O_P(n^{−p_ψ/(1+2p_ψ)}), which is achieved when p_μ = p_ψ and K_μ = K_ψ = n^{1/(1+2p_ψ)}. This is the benchmark rate for nonparametric estimation. Note that the common shape function μ(τ) can be consistently estimated over τ ∈ [0, 1] even with fixed and finite design points t_ij, as long as the g_i(·) have proper variability such that the τ_ij = g_i^{-1}(t_ij) spread over the whole domain.
Although the overall convergence rate for θ̂_n is slower than the optimal parametric rate, the next theorem states that the estimators of ξ, β(τ) = B(τ)^T b, and Σ_u are actually asymptotically normal and semiparametrically efficient. Let ζ = (ξ, b, vech(Σ_u)), where vech(Σ_u) denotes the half-vectorization of the lower triangular portion of Σ_u.
Theorem 2 is proved by constructing and checking the six conditions of Theorem 2.1 in Ding and Nan (2011). Details are given in the supplementary material. The theorem implies that β̂(τ) − β_0(τ) = B(τ)^T(b̂ − b_0) asymptotically follows a Gaussian process with mean zero. This suggests approaches for calculating pointwise confidence intervals of β(τ), as described in detail in the next section.

Confidence Intervals
For classical time-independent AFT models, the efficient score function of ξ can be explicitly derived, and the parameter variance matrix can be consistently estimated by plugging in the estimated parameters (Ding and Nan 2011). However, because of the nonlinear components, the efficient score function for our joint model no longer has an explicit formula. An alternative approach proposed by Ding and Nan (2011) is to invert the observed information matrix from the last iteration. We propose a similar method based on the EM algorithm. Let Ê_i[·] denote the Monte Carlo numerical integration in the last iteration of the EM algorithm. The observed Fisher information matrix for all parameters can be estimated by the sum of the outer products of the per-subject expected complete-data score vectors Ê_i[∂ℓ_i(θ̂; u_i)/∂θ] (see, e.g., Ruud 1991). The resulting matrix Ĵ is by construction positive-definite, that is, Ĵ^{-1} is a well-defined covariance matrix. This estimator is preferred over the estimator proposed by Louis (1982), since Louis's estimator does not guarantee a positive-definite matrix in our setting, which may lead to negative standard error estimates and an inaccurate inverse. The confidence intervals for b and ξ can be obtained by extracting the corresponding rows and columns of Ĵ^{-1} and resorting to the normal distribution quantiles, as suggested by Theorem 2. For example, denote the variance matrix of b̂ by Σ̂_b. Then for any τ ∈ [0, 1], we have var{β̂(τ)} = B(τ)^T Σ̂_b B(τ), and the pointwise 95% confidence interval for β(τ) can be constructed as β̂(τ) ± 1.96 [var{β̂(τ)}]^{1/2}.
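A schematic of the outer-product information estimate and the resulting pointwise interval, with randomly generated stand-ins for the per-subject expected scores and hypothetical basis values B(τ):

```python
import numpy as np

rng = np.random.default_rng(1)
# Per-subject score vectors s_i are stand-ins drawn at random here; in the EM
# they are E-hat_i[d log-likelihood / d theta] at convergence. The outer-product
# estimator J = sum_i s_i s_i' is positive (semi-)definite by construction.
n, p = 500, 4
S = rng.normal(size=(n, p))              # hypothetical n x p score matrix
J = S.T @ S                              # observed-information estimate
cov = np.linalg.inv(J)                   # estimated parameter covariance

# Pointwise 95% CI for beta(tau) = B(tau)'b at one tau (illustrative values).
Btau = np.array([0.2, 0.5, 0.2, 0.1])    # hypothetical basis values B(tau)
b_hat = np.zeros(p)                      # placeholder estimate of b
se = np.sqrt(Btau @ cov @ Btau)
ci = (b_hat @ Btau - 1.96 * se, b_hat @ Btau + 1.96 * se)
```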

Simulation Study
To assess the finite-sample performance of our model estimation, two simulation examples were considered. In the first example, we compare the proposed approach with the popular baseline covariate adjustment method. In the second example, we check the stability and consistency of our model estimation when the data are generated from a full functional CRAFT model as proposed in Section 2.3.

Example 1
For the longitudinal data, we start with a common shape function μ(τ) such that exp{μ(τ)} = b_0 − exp{−α}τ for τ ∈ [0, 1]. To reflect the heterogeneity of subjects at baseline, we assume the subject-specific longitudinal trajectory starts at a random point τ_s ∈ [0, 1]. This τ_s is treated as unknown in the estimation. We design a time-independent covariate Z, which has an accelerating effect on the longitudinal trajectory. Specifically, the longitudinal data were generated by a model where ξ is the coefficient and e(t) is a white-noise Gaussian process. For instance, the longitudinal trajectory of a subject with Z = 1 progresses faster along calendar time than that of a subject with Z = 0, by a factor exp{ξ}. We assume that observation starts from t = 0, which denotes the baseline time (study entry); the registration function for y(t) is defined correspondingly. The follow-up time T is assumed to follow a baseline AFT model in which the covariate Z accelerates the failure time (reduces the survival time) by the same coefficient ξ as in the longitudinal model, μ(τ_s) is observed with noise as y(0) at baseline, and ε is the residual term. Let T[τ_0] = T − g(τ_0), where τ_0 ∈ (0, 1).
If we temporarily set ε ≡ 0, it follows from direct calculation that, under this ideal condition, the model (9) can also be written as a CRAFT model with intercept α + μ(τ_0). In fact, it is easy to prove that the baseline AFT and CRAFT models share this ideal connection if and only if ε ≡ 0 and exp{μ(τ)} = b_0 − exp{−α}t. When ε ≢ 0, the baseline AFT model can no longer be transformed into a standard CRAFT model. But in this simulated example we show that CRAFT is robust to this type of model misspecification.
We considered two data scenarios. In scenario 1, Z_i was independent of τ_s. In scenario 2, Z was correlated with τ_s with correlation 0.25. In both scenarios, Z followed a binomial distribution with success probability 0.5, the coefficient ξ = 0.1, μ(t) = log(exp{2} − (exp{2} − 1)t), and α = −log(exp{2} − 1). The residual ε was distributed as N(0, 0.05²), sharing the same standard deviation as ξZ. The censoring times were generated from a uniform distribution on [0.5, ζ], where ζ was chosen to give the required censoring rate.
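The data-generating mechanism of scenario 1 can be sketched as follows. Since the paper's displayed equations are not reproduced in this excerpt, the exact registration form (internal time advancing as τ_s + e^{ξZ}t) and the noise scale on log T are our assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)
n, xi = 200, 0.1
alpha = -np.log(np.exp(2) - 1)              # so exp{-alpha} = e^2 - 1

def mu(tau):                                # exp{mu(tau)} = e^2 - (e^2 - 1) * tau
    return np.log(np.exp(2) - np.exp(-alpha) * tau)

Z = rng.binomial(1, 0.5, n)                 # scenario 1: Z independent of tau_s
tau_s = rng.uniform(0, 0.5, n)              # random (unknown) start points
t_grid = np.linspace(0, 0.4, 6)             # calendar time from study entry t = 0

# assumed registration: internal disease time advances by factor exp(xi * Z)
y = np.stack([mu(np.clip(ts + np.exp(xi * z) * t_grid, 0, 1))
              + rng.normal(0, 0.05, t_grid.size)
              for ts, z in zip(tau_s, Z)])

logT = -xi * Z + rng.normal(0, 0.05, n)     # simplified stand-in AFT model
C = rng.uniform(0.5, 2.0, n)                # censoring; upper bound tunes the rate
r = np.minimum(np.exp(logT), C)             # observed follow-up time
delta = (np.exp(logT) <= C).astype(int)     # event indicator
```

Subjects with Z = 1 traverse the common shape μ faster by a factor e^{0.1}, mirroring the acceleration described in the text.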
Four separate simulation settings were conducted, varying (a) the sample size, n = 200 and 400, and (b) the censoring rate, 10% and 30%. In each setting three estimation methods were used: (a) the proposed CRAFT model, where the common shape function μ was approximated by a cubic B-spline with no interior knot and the registration functions g_i(·) were parameterized as cubic B-splines with two equally spaced interior knots, with τ_0 = 0.4; (b) the unadjusted AFT model log(T) = −ξZ + ε; and (c) the baseline-adjusted AFT model log(T) = −ξZ + ξ y(0) + ε. In all three methods, the log-hazard function log λ_ε(·) was approximated by a cubic B-spline with one interior knot. In each setting, 100 replications were generated.
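The B-spline parameterizations used by the three methods can be constructed explicitly. A minimal sketch using SciPy, with interior knots assumed equally spaced on [0, 1]:

```python
import numpy as np
from scipy.interpolate import BSpline

def cubic_bspline_basis(x, interior_knots):
    """Evaluate a clamped cubic B-spline basis on [0, 1] at points x."""
    k = 3                                            # cubic
    t = np.r_[[0.0] * (k + 1), interior_knots, [1.0] * (k + 1)]
    n_basis = len(t) - k - 1
    # one column per basis function: spline with a unit coefficient vector
    return np.column_stack([
        BSpline(t, np.eye(n_basis)[j], k)(x) for j in range(n_basis)
    ])

tau = np.linspace(0, 1, 50)
B_mu = cubic_bspline_basis(tau, [])                  # shape fn: no interior knot -> 4 bases
B_g = cubic_bspline_basis(tau, [1 / 3, 2 / 3])       # registration: 2 knots -> 6 bases
```

The clamped knot vector makes the basis a partition of unity on [0, 1], which is a useful check on the construction.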
Figure 2 summarizes the estimation results for ξ and the estimation error of the survival distribution of ε_i, measured by Kullback-Leibler distance, for the two data scenarios respectively. In scenario 1, all three methods give unbiased estimates of ξ. As expected, the adjusted AFT model, being the true model, improves on the unadjusted model in estimating both ξ and the survival distribution of ε_i. The proposed CRAFT model, estimated by either the two-step or the joint modeling approach, is comparable to the true model in estimating ξ and has smaller estimation errors for the survival function. Because the survival time is assumed to be independent of the longitudinal trajectory, the two-step method is as good as joint modeling in estimating ξ, but it is much worse in estimating the survival distribution of ε_i, because the two-step method ignores the uncertainty in the time-origin estimation. In scenario 2, where the time shift τ_s and covariate Z are correlated, the CRAFT model provides the best bias correction for ξ; both the unadjusted and adjusted AFT models suffer nonnegligible biases. Again, the proposed joint modeling method provides the best estimation of the survival distribution of ε_i by a clear margin. Last but not least, in all settings the spreads of the estimated coefficient ξ and of the Kullback-Leibler distances for n = 400 are narrower than for n = 200, indicating improved estimation efficiency as the sample size increases, in accordance with the theoretical asymptotic consistency of the proposed estimator.
Example 2
The time-to-event data were generated from a functional CRAFT model; we assume only r_i = min{T_i, C_i}, δ_i = I(T_i ≤ C_i), and Z_i are observed.
We used a functional CRAFT model to fit the time-to-event data. Three estimation methods were considered, differing in how the two submodels were estimated and how the common shape function was modeled. Method 1 is the proposed joint modeling method with a flexible nonlinear common shape function. Method 2 assumes a linear common shape function. Method 3 corresponds to the traditional two-step time-origin estimation method (e.g., DeGruttola, Lange, and Dafni 1991; Munõz et al. 1992), but with a more flexible common shape function for fair comparison. For the survival model, the log-hazard function log λ_ε(s) was approximated by a cubic B-spline with one interior knot.
To check the robustness of the proposed method to non-Gaussian random effects u_i, we considered two distributions: (a) a multivariate normal distribution and (b) a multivariate t-distribution. For each distribution, four separate simulation settings were conducted, varying (a) the sample size, n = 200 and 400, and (b) the censoring rate, 10% and 30%. In each setting, 100 replications were generated. The criterion used to evaluate the performance of the estimator β̂ is the mean squared error, MSE(β̂) = ∫∫ {β̂(τ) − β_0(τ)} cov{g_i(τ), g_i(ν)} {β̂(ν) − β_0(ν)} dτ dν, where cov{g_i(τ), g_i(ν)} is estimated empirically from a large sample of u_i. The performance of the fitted survival times is evaluated through their biases and variances. The results are shown in Figure 3. Overall, the proposed CRAFT model estimated by joint modeling (method 1) performed best in all estimations: unbiased estimates of ξ (top panel), the smallest estimation error of β(τ) (second panel), and the most accurate estimates of the survival times (third and bottom panels). In contrast, the two-step estimator (method 3) gave the worst estimates in all scenarios, showing the great added value of joint modeling over two-step estimation. The estimator with a misspecified linear common shape function (method 2) also gave slightly biased estimates of ξ. The median estimation errors of the coefficient function β(τ) obtained from methods 1 and 2 are comparable, but method 2 produced many extremely inaccurate estimates (the upper outliers in the boxplots) due to model misspecification. The slightly biased ξ̂ and unstable β̂(τ) in method 2 then translated into more biased estimates of the survival times (third panel) with larger uncertainties (bottom panel) than the proposed method 1. The results for the multivariate t-distribution case are essentially the same as those for the normal distribution, showing the robustness of our method to non-Gaussian random effects.
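The MSE criterion above can be evaluated by discretizing the double integral on a grid. A sketch, assuming the registration curves g_i have been sampled on the same τ grid as rows of `g_samples` (a hypothetical array for illustration):

```python
import numpy as np

def mse_beta(beta_hat, beta_0, g_samples, tau):
    """MSE(beta-hat) = ∫∫ d(τ) cov{g(τ), g(ν)} d(ν) dτ dν, where
    d = beta_hat - beta_0 and cov is estimated empirically across curves."""
    d = np.asarray(beta_hat) - np.asarray(beta_0)   # (m,) pointwise error
    C = np.cov(g_samples, rowvar=False)             # (m, m) empirical covariance
    w = np.gradient(tau)                            # quadrature weights on the grid
    # discretized double integral: (w ∘ d)' C (w ∘ d)
    return float((d * w) @ C @ (d * w))
```

Because the empirical covariance matrix is positive semi-definite, the criterion is nonnegative and vanishes when β̂ = β_0, matching its role as an error measure.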

Medfly Data
The medfly data were obtained from a medfly rearing facility in Mexico to study the associations of age patterns of fecundity with mortality, longevity, and lifetime reproduction (Carey et al. 1998). The survival outcome is death. The longitudinal biomarker is the daily number of eggs laid by each fly. Tseng, Hsieh, and Wang (2005) showed, based on Schoenfeld residuals, that the proportional hazards assumption fails for the medfly data, and proposed to fit the data by a time-dependent AFT model with known time origins. In this paper, we investigate the performance of the proposed CRAFT model on these data with artificially created unknown time origins.
Following Tseng, Hsieh, and Wang (2005), we used data from 245 flies that produced more than 1150 eggs in their lifetime. The mean survival time is 46.3 days (SD = 12.6, range = 24–101 days). For the ith medfly, denote the age at death in days by t^[0]_{i,E}, and let m^[0]_i be the number of daily egg counts. The daily number of eggs is denoted by y_i(t^[0]_{ij}), j = 1, ..., m^[0]_i, where t^[0]_{ij} denotes days after birth. To create unknown time origins, we generated a random left-censoring variable t_{i,L} from a truncated exponential distribution with mean 6 days and maximum 20 days. To mimic an observational study, the time t_{i,L} is treated as the "time of study entry". In the analysis, we pretend the daily number of eggs before t_{i,L} is not observed, and define the "observational" time of the daily number of eggs by t_{ik} = t^[0]_{ij_k} − t_{i,L}, k = 1, ..., m_i, where j_1 = argmin_j {t^[0]_{ij} ≥ t_{i,L}}, j_k = j_{k−1} + 1 for k = 2, ..., m_i, and m_i = m^[0]_i − j_1 + 1. The corresponding longitudinal outcome is y_{ik} = y_i(t^[0]_{ij_k}). Define the "observed" time to death as t_{i,E} = t^[0]_{i,E} − t_{i,L}. Because the minimum survival time was 24 days in the left-censored data, every medfly has a positive t_{i,E}; that is, no fly is excluded from the sample because of left-censoring. Further, we generated a random right-censoring time t_{i,C} from a uniform distribution on [0, 120], which gives roughly a 20% censoring rate. We then suppose we only observe r_i = min(t_{i,E}, t_{i,C}) and δ_i = I(t_{i,E} ≤ t_{i,C}).
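The artificial censoring scheme can be reproduced as follows. The truncated exponential is drawn by inverse-CDF sampling; here `scale` is the rate parameter of the untruncated exponential, an assumption on our part, since the text does not state whether the 6-day mean refers to the truncated or untruncated law:

```python
import numpy as np

rng = np.random.default_rng(1)

def trunc_exp(n, scale=6.0, upper=20.0):
    """Exponential(scale) truncated to (0, upper], via inverse CDF."""
    u = rng.uniform(0, 1, n)
    return -scale * np.log(1 - u * (1 - np.exp(-upper / scale)))

def left_censor(days, eggs, t_L):
    """Drop egg counts observed before t_L and restart the clock at t_L."""
    days, eggs = np.asarray(days), np.asarray(eggs)
    keep = days >= t_L
    return days[keep] - t_L, eggs[keep]

t_L = trunc_exp(245)                 # left-censoring ("study entry") times
t_C = rng.uniform(0, 120, 245)       # right-censoring times (~20% rate)
```

The inverse-CDF draw maps u ∈ [0, 1) monotonically onto (0, 20), so no rejection step is needed.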
In summary, we created left-censored data {O_i = (y_i, t_i, r_i, δ_i), i = 1, ..., 245}, where y_i = (y_{i1}, ..., y_{i,m_i}) and t_i = (t_{i1}, ..., t_{i,m_i}). Notably, the time of birth was before the first day of observation for every medfly in the left-censored data. As discussed in Section 2.2, for robust survival analysis it is suboptimal to still use the birth time as the time origin, as it requires out-of-sample extrapolation. We propose to use the first day a medfly laid more than 10 eggs as the time origin. We chose 10 eggs rather than 1 egg as a confirmation that a medfly is mature, because we observed that some medflies laid a few eggs (< 10) on early days, quickly dropped back to zero for a while, and only later increased above 10 (see Subject B in Figure 5 for an example). The time origin defined by 10 eggs is therefore more robust than one defined by 1 egg. This new time origin is treated as unknown for all medflies, even those that started with fewer than 10 eggs at t_{i1}.
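The 10-egg time-origin rule can be written as a small helper (a hypothetical function, for illustration only):

```python
import numpy as np

def time_origin(days, eggs, threshold=10):
    """First observed day on which a fly laid more than `threshold` eggs;
    returns None if the threshold is never exceeded."""
    days, eggs = np.asarray(days), np.asarray(eggs)
    idx = np.nonzero(eggs > threshold)[0]
    return days[idx[0]] if idx.size else None
```

For a fly that lays a few eggs early, drops back, and then exceeds 10 later (the Subject B pattern), the rule correctly picks the later day rather than the first nonzero count.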
Similar to Tseng, Hsieh, and Wang (2005), we first applied a logarithmic transformation to the daily number of eggs. The longitudinal outcome to be registered is log(1 + y_{ij}). Figure 4(a) shows spaghetti plots of the log-transformed daily egg-laying trajectories on the "observational" time scale of the left-censored data. We assume log(1 + y_{ij}) satisfies model (1). The model for survival analysis is a simple version of the CRAFT model (2) with no covariates. The time origin is formally defined by t_{i,0} = g_i(τ_0), where τ_0 = μ⁻¹(log(11)).
To fit the curve registration model (1), we scaled the observational times t_{ij} into a sub-interval of [0, 1] by mapping t_min = min_{i,j}{t_{ij}} to 0.1 and t_max = max_{i,j}{t_{ij}} to 0.9; the time points t_{ij} (i = 1, ..., n, j = 1, ..., m_i) were mapped proportionally as interior points of [0.1, 0.9] via t̃_{ij} = 0.1 + 0.8 × (t_{ij} − t_min)/(t_max − t_min). The common shape function μ(τ) is approximated by a B-spline expansion with one equally spaced interior knot, where the number of knots is pre-selected by AIC following Elmi et al. (2011). The τ used in model (1) was defined on [0, 1], but for ease of interpretation, Figure 4(b) shows the registered daily egg-laying trajectories with τ transformed from [0, 1] back to the original daily scale by reverting the aforementioned mapping. We can see that the daily egg-laying of medflies follows a common pattern, which reflects an average timing of egg-laying: on average, egg-laying starts at day 10, increases quickly to its peak over about 20 days, declines stably for the next 40 days, and finally drops precipitously over the last 20 days of life. These are of course only average numbers. Each individual medfly can have its own timing of egg-laying along this common pattern, as characterized by the registration functions g_i(τ). Figure 5 shows three examples of g_i(τ) estimated by BLUP, and the registered curves μ(g_i⁻¹(t_{ij})) plotted alongside the observed longitudinal marker log(1 + y_{ij}), for the left- and right-censored data. The registered curves fit the observed data very well.
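The affine rescaling of observation times described at the start of this paragraph can be sketched as a one-line map:

```python
import numpy as np

def rescale_time(t, lo=0.1, hi=0.9):
    """Affine map of observation times so that min -> lo and max -> hi."""
    t = np.asarray(t, dtype=float)
    return lo + (hi - lo) * (t - t.min()) / (t.max() - t.min())
```

For example, `rescale_time([0, 5, 10])` sends the endpoints to 0.1 and 0.9 and the midpoint to 0.5, matching the formula t̃ = 0.1 + 0.8(t − t_min)/(t_max − t_min).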
Figure 6 shows the estimated baseline hazard function and survival function. For comparison, we considered two data scenarios and in total four estimators: (1) full data, with the true time a medfly first laid more than 10 eggs set as the time origin and no right censoring, which is the gold-standard benchmark for comparison; and (2) left- and right-censored data with unknown time origins, analyzed by three estimators: a two-step estimator, a baseline-adjusted estimator, and the proposed CRAFT model with joint modeling. All the survival models except CRAFT used a traditional time-independent AFT model similar to model (2) without the random g_i(τ_0) term. As shown in Figure 6, among the three estimators for the left- and right-censored data, only the CRAFT model provided accurate estimates of the baseline hazard and survival functions. Both the baseline-adjusted estimator and the two-step estimator underestimated the hazard function at younger ages and overestimated it at older ages. As a result, the estimated survival functions from these two estimators missed the gold-standard benchmark (full data) by a clear margin.

VBAC Data
The VBAC study collected information on major risk factors for VBAC, as well as records of labor progression (Macones et al. 2005). In practice, both the overall laboring time and the remaining time to vaginal delivery after arrival at the hospital are of great clinical interest. A prolonged laboring time increases the risk of uterine rupture and usually results in an emergent C-section. Therefore, it is important to estimate the expected laboring time given an observed history of dilations.
To perform the time-to-event analysis, we defined the event as successful vaginal delivery without rupture. Both ruptures and related C-sections were treated as right-censored. Of note, C-sections can be thought of as a competing risk to successful vaginal delivery; by treating C-section as right-censoring, we model the subhazard distribution for successful vaginal delivery, which is the primary interest in this analysis. Originally 629 subjects were involved in the case-control study. In this analysis, 442 subjects were selected to increase the homogeneity of the sample based on the following four criteria: (a) at least one observation within 500 min (≈ 8 hr) of delivery or C-section; (b) the delivery process took less than 1500 min (≈ 1 day); (c) observed dilation measurements greater than or equal to 3; and (d) maternal age available. Among the 442 subjects, we have 356 (80.5%) successful vaginal deliveries (events) and 86 (19.5%) ruptures or C-sections (right-censored).
We assume the dilation trajectories follow the curve registration model (1), and the times to vaginal delivery follow the functional CRAFT model (3). To fit the curve registration model (1), we scaled the observational times t_{ij} into a sub-interval of [0, 1] by mapping t_min = min_{i,j}{t_{ij}} to 0.2 and t_max = max_{i,j}{t_{ij}} to 0.9; the time points in between were mapped proportionally as interior points of [0.2, 0.9]. The common shape function μ(τ) is approximated by a B-spline expansion with one equally spaced interior knot, where the number of knots is pre-selected by AIC following Elmi et al. (2011). The registration function g_i(τ) is approximated by another B-spline expansion with two equally spaced interior knots.
For the functional CRAFT model, a natural definition of the time origin would be the first time that "cervical dilation > 0 cm". However, this time typically occurred many hours (even days) before hospitalization, when a participant was first observed; estimating it requires extensive extrapolation and is numerically unstable. Therefore, we define the time origin by "cervical dilation = 2 cm", which can be estimated more accurately. In our data, all subjects reached at least 3 cm dilation, so no women were lost by using this time origin. Specifically, we have τ_0 = μ⁻¹(2). Because μ is updated in each EM iteration, this τ_0 is also updated accordingly and converges to around 0.4 in the estimation. Maternal age is used as a covariate in the functional CRAFT model. The estimated coefficient for maternal age is 0.0059 (SE = 0.0019, p = 0.002), implying that each additional year of maternal age is associated with a 0.6% increase in the total time to vaginal delivery.
The observed and aligned dilation trajectories are shown in Figure 7: panel (a) shows the raw observed dilation trajectories using hospital entry as time zero; panel (b) shows the estimated registration functions; panel (c) shows the registered trajectories aligned to the common shape function, where the between-subject variability of the dilation trajectories is greatly reduced after registration. The fitted common standard delivery curve captures the two-stage nature of the delivery process, which does not follow a simple parametric curve.
Although the functional CRAFT model is proposed for the overall laboring time S_i = T_i − g_i(τ_0), we can also use it to estimate the remaining time to vaginal delivery from hospital entry as T̂_i = ĝ_i(τ_0) + Ŝ_i, where ĝ_i(τ_0) = E_{u_i}[g_i(τ_0)|O_i, θ] as defined in Equation (7) in the E-step of the EM algorithm, and Ŝ_i is defined by (8) in Section 3.4. We can use the fitted remaining times to check the goodness of fit of the functional CRAFT model by plotting them against the observed follow-up times; Figure 8(b) shows that model (3) fits the observed times to vaginal delivery from hospital entry well. As to the censoring times, we purposely ignored the fact that the actual remaining times are greater than the censoring times, by not using the truncated distribution. Nevertheless, the fitted remaining times still reflect this trait properly, estimating longer remaining times than censoring times for most of the censored subjects. This demonstrates the robustness of the functional CRAFT model (3) in time-to-event estimation.
As a sensitivity analysis, we repeated the analysis using 0 cm to define the time origin. The results remained largely the same (data not shown). However, because observations of dilations between 0 cm and 2 cm are very sparse, the estimation of the longitudinal curves and survival times is less accurate with the 0 cm time origin than with 2 cm.

Discussion
We have introduced a new joint modeling framework for time-to-event analysis with unknown time origins when a longitudinal biomarker trajectory is available during follow-up. The unknown time origins are modeled as random time points that serve as the link in jointly modeling the longitudinal and survival data, and are integrated out of the joint likelihood function. The unknown time origins are therefore not required to be observable in our method, but rather serve as conceptual reference points for comparable survival analysis. We further introduce a functional survival model that can be used to estimate the time to event based on the registration function. Alternatively, one could use the original subject-specific biomarker trajectory as the predictor, if the direct association of the biomarker trajectory with the time to event is of more interest. However, the interpretations of the two functional coefficients are different, and such a model would involve more complex computations, for which future research is required.
We demonstrate by simulation that modeling the unknown time origins explicitly via the CRAFT model is superior to the traditional baseline biomarker adjustment method, in terms of better efficiency and more adequate bias correction. In particular, the baseline adjustment method implicitly assumes the biomarker trajectory follows a log-linear function of time, and is unbiased only when the left-censoring time is independent of the covariate of interest. Moreover, we have shown that the two-step estimation method, and model misspecification with restrictive parametric assumptions on the longitudinal data, can also lead to biased inference.
In this article, we have not included covariates in the curve registration, for ease of exposition, as the main focus of this paper is the time-to-event analysis. It is possible to incorporate covariates into the curve registration by means of nonlinear mixed-effects modeling (Pinheiro and Bates 2000), where u_i can be modeled as a combination of fixed and random effects. Similarly, covariates can be added to the model of the time origin t_{i,0}. The computational details and theoretical aspects of these extensions deserve careful further investigation.

Figure 1. Cervical dilations plotted against the number of minutes from arrival at the hospital for 20 women from the VBAC study who experienced uterine rupture (dashed) or had a successful vaginal delivery (solid).

Figure 2. Estimation results for Example 1. Left panels: scenario 1; right panels: scenario 2. KL(S, Ŝ): the estimation error of the survival distribution of ε_i by Kullback-Leibler distance. UA: unadjusted AFT model; BA: baseline-adjusted AFT model; 2S: CRAFT model estimated by the two-step method; JM: CRAFT model estimated by joint modeling.

Figure 3. Estimation results for Example 2. CR: the proposed functional CRAFT model estimated by joint modeling (method 1); LN: the functional CRAFT model with a linear common shape function estimated by joint modeling (method 2); 2S: the functional CRAFT model estimated by the two-step method (method 3).

Figure 4. Longitudinal daily egg trajectories of the left-censored medfly data: (a) trajectories aligned at the left-censored first "observational" time; (b) registered trajectories: the thick curve in the center represents the fitted common standard daily egg curve.

Figure 5. Three examples of the BLUP estimates of the registration function g_i(τ) (observed fragments; left panels) and the registered curves μ(g_i⁻¹(t_{ij})) (right panels). The "observational time" is obtained by shifting the original time from birth by the artificial left-censoring time, where time 0 represents the left-censoring time on the original time scale. This mimics delayed entry in observational studies.
(a) A two-step estimator following the idea of DeGruttola, Lange, and Dafni (1991). Step 1: recover the time origin by a reduced model (1) with a linear registration function, and define a new time to event based on the estimated time origin. Step 2: estimate the survival probability based on the new time to event. This is the estimator in the literature closest to the proposed CRAFT model. (b) A traditional estimator using the observed time to event (r_i − t_{i1}) as the survival outcome, adjusting for the baseline number of eggs observed at t_{i1} as a covariate. This corresponds to the method most prevalently used in the medical literature. (c) The proposed CRAFT model with joint modeling.

Figure 7. Dilation trajectories and the estimated registration functions of the VBAC data. (a) Observed dilation trajectories; each curve pertains to a study participant. (b) Estimated registration functions. (c) Aligned dilation trajectories; the thick curve in the center represents the fitted common standard labor curve.

Figure 8. Estimation results of the functional CRAFT model for the VBAC data. (a) Estimated coefficient function β(τ) (solid curve) and pointwise 95% confidence intervals (dotted curves). (b) Observed follow-up times plotted against the fitted T̂_i. Crosses: subjects with observed time to delivery; circles: ruptures (right-censored).
