Causal inference with a mediated proportional hazards regression model

Abstract The natural direct and indirect effects in causal mediation analysis with survival data having one mediator is addressed by VanderWeele. He derived an approach for (1) an accelerated failure time regression model in general cases and (2) a proportional hazards regression model when the time-to-event outcome is rare. If the outcome is not rare, then VanderWeele did not derive a simple closed-form expression for the log-natural direct and log-natural indirect effects for the proportional hazards regression model because the baseline cumulative hazard function does not approach zero. We develop two approaches to extend VanderWeele’s approach, in which the assumption of a rare outcome is not required. We obtain the natural direct and indirect effects for specific time points through numerical integration after we calculate the cumulative baseline hazard by (1) applying the Breslow method in the Cox proportional hazards regression model to estimate the unspecified cumulative baseline hazard; (2) assuming a piecewise constant baseline hazard model, yielding a parametric model, to estimate the baseline hazard and cumulative baseline hazard. We conduct simulation studies to compare our two approaches with other methods and illustrate our two approaches by applying them to data from the ASsessment, Serial Evaluation, and Subsequent Sequelae in Acute Kidney Injury (ASSESS-AKI) Consortium.


Introduction
Mediation analysis has become increasingly popular in biomedical and public health research.It is a powerful tool to explain the mechanism of direct and indirect effects of a treatment or exposure on an outcome, such as death or disease onset.A mediating variable lies in the pathway from exposure to outcome and it can have one or more links.Rather than studying a direct causal relationship between the independent variable and the dependent variable, mediation analysis proposes that the independent variable has an effect on the mediator variable, and then the mediator variable has an effect on the dependent variable (MacKinnon, Fairchild, and Fritz 2007).There are two main approaches for the study of mediation.In the social sciences, such as psychometrics, researchers define direct and indirect effects with a given model (Duane and Robert 1975;James and Brett 1984;Baron and Kenny 1986).There are two ways to estimate the indirect effect.Judd and Kenny (1981) suggested the "difference method," whereas the other method to compute the indirect effect is the "product method" (Sobel 1982;Baron and Kenny 1986).The "product method" is the approach used in structural equation modeling (SEM), which is a very powerful multivariate technique (Fiske, Kenny, and Taylor 1982;Ten Have and Joffe 2012;Gunzler et al. 2013).In order to render the analysis more valid, causal inference of direct and indirect effects are introduced.Causal inference with a counterfactual model does not depend on the specification of a particular statistical model (Robins and Greenland 1992;Pearl 2001;VanderWeele and Vansteelandt 2009;Imai, Keele, and Tingley 2010;Ten Have and Joffe 2012).Pearl (2001) provides definitions of direct and indirect effects that are appropriate even when there are interactions between the exposure and mediator on the outcome.Pearl (2001) defined "direct effects" as the influence that is not mediated by other variables in the model."Indirect effects" cannot be defined specifically like "direct effects," but are the part of the exposure effect which is mediated by a given set of potential mediators.Pearl (2001) also provided the definition of the controlled effect and the natural effect.
In addition to the direct and indirect effects, there are alternative approaches for the decomposition in mediation analysis.VanderWeele (2014) describes a 4-way decomposition which covers the mediation and interaction.The first component is the effect of the exposure in the absence of the mediator, which represents the effect of neither mediation nor interaction.The second component is the interactive effect when the mediator is left to what it would be in the absence of exposure, which represents the effect of just interaction.The third component is a mediated interaction, which represents the effect of both mediation and interaction.The fourth component is a pure mediated effect, which represents just mediation.The assumptions about no confounding, conditional on the covariates, are provided in VanderWeele (2014).The 4-way decomposition can cover the interaction and mediation, which the basic SEM cannot accommodate.
There are a number of papers discussing mediation from a counterfactual perspective with continuous outcomes.However, since time-to-event outcomes can offer more information than simply the binary outcome of whether or not an event occurred, they are common in medical research.To deal with time-to-event outcomes, including censored observations where the event was not observed during the follow-up period, survival analysis methods are the optimal choice.VanderWeele (2011) discussed different effect measures of direct and indirect effects in survival analysis, and he showed an approach with a rare outcome using proportional hazards regression models or with general situations using accelerated failure time regression models.
VanderWeele (2011) provided the decomposition with the accelerated failure time (AFT) regression model and the proportional hazards regression model, and he considered both the mediation and interaction.The assumptions about no confounding, conditional on the covariates, are made by Lange andHansen (2011). VanderWeele (2011) derived the log-natural direct effect and log-natural indirect effect for the proportional hazards regression model as approximations based on a closed-form equation of the model parameters when the outcome is rare.Also, in VanderWeele (2011), the equations of the log-natural direct effect and the log-natural indirect effect are not a function of time T, so the effects are constant at different time points.However, if the outcome is not rare, then the proportional hazards regression model does not have a simple closed-form expression for the log-natural direct and the log-natural indirect effects.Therefore, we develop a computer-intensive approach for such a situation.When the cumulative baseline hazard condition with exposure A ¼ 0, mediator M ¼ 0, and covariates is not sufficiently small, and the expressions of the natural direct and indirect effects should include K T t j 0, 0, 0 ð Þ : Tchetgen Tchetgen (2011) proposed a theory of estimation of natural direct and indirect effects in a proportional hazards regression model.His approach does not require the rare outcome theory.However, he assumed that the natural direct hazard ratio and the indirect hazard ratio both agree with the proportional hazards assumption of the total effect, which is a strong assumption.Although the approach provided the estimation for both the direct and indirect effects, the additional assumption in this approach is difficult to satisfy in practice.Wang and Albert (2017) provided a mediation formula approach in which parametric models are invoked to approximate the baseline log cumulative hazard function and it does not rely on the rare outcome assumption.Those parametric models include fractional polynomials model and restricted cubic splines model.In order to estimate the cumulative baseline hazard based on the method from VanderWeele (2011), we develop two approaches.Because the baseline hazard function in the Cox proportional hazards regression model is not estimated, we can (1) apply the Breslow method in the Cox proportional hazards regression model to estimate the unspecified cumulative baseline hazard; or (2) assume a piecewise constant baseline hazard model to estimate the baseline hazard and cumulative baseline hazard for a parametric model.Then we obtain the natural direct and indirect effects by calculating k We not only provide estimates for natural direct and indirect effects, but also the confidence intervals which are obtained from a bootstrap algorithm.
We also compare our estimates for natural direct and indirect effects to VanderWeele's estimates in the plots.Our results agree with VanderWeele's estimates when the cumulative baseline hazard is very small, and they differ as the cumulative baseline hazard increases as time increases.

Concepts and definitions
Similar to VanderWeele (2011), let A denote the exposure of interest, T the time-to-event outcome, M the mediator, and C a set of covariates.We suppose A is a binary exposure and it could have two possible levels, designated as a and a Ã : For mediators, we suppose the mediator is continuous, and it is normally distributed.Let M a be the counterfactual value of the mediator when the exposure variable has been set to a: Similarly, M a Ã is the counterfactual value of the mediator when the exposure variable has been set to a Ã : Let T aM a denote an individual's event time if the exposure has been set to value a and the mediator has been set to the level when the exposure is at value a: Let T aM a Ã denote an individual's event time if the exposure has been set to value a and the mediator has been set to the level when the exposure is at value a Ã :

Assumptions
The assumptions about no confounding relationships, conditional on covariates, are the same as those in Lange and Hansen (2011), which are also used in VanderWeele ( 2011): (1) The effect of the exposure A on the time-to-event variable T is not confounded when it is conditioned on the set of covariates C: (2) The effect of the mediator M on the time-to-event variable T is not confounded when it is conditioned on the exposure variable A and the set of covariates C: (3) The effect of the exposure variable A on the mediator M is not confounded when it is conditioned on the set of covariates C: (4) All the confounders with the mediator M and the time-to-event T variable are not affected by the exposure variable A: The regression model of the mediator is M is normally distributed with mean and variance (l, r 2 Þ, The proportional hazards regression model for the time-to-event outcome with exposure, A, mediator, M, and confounder variables, C, is where k T t j 0, 0, 0 ð Þ is the baseline hazard in which the binary exposure variable, the mediator, and the set of covariates are set to 0.
Then we have that where f T aM a Ã t j c ð Þ and S T aM a Ã t j c ð Þ denote the conditional density and survival functions, respectively, for T aM a Ã : This leads to where the cumulative baseline hazard function is VanderWeele (2011) simplified these expressions by assuming that the outcome event is rare so that the cumulative baseline hazard function K T t j 0, 0, 0 ð Þapproaches 0, leading to By definition, the natural indirect effect and the natural direct effect on the log-hazard function from VanderWeele (2011) are Then, the natural indirect effect and natural direct effect on the hazard function are, respectively,

Breslow estimator of cumulative baseline hazard in cox proportional hazards model
In many circumstances, the outcome in epidemiological research is common and the assumption of a "rare outcome" may not be realistic.However, it is not straightforward to simplify the expressions for the natural direct and indirect effects in the proportional hazards regression model.Nevertheless, we can calculate an exact value for the hazard function in specific cases.
Although the baseline hazard is not specified in a Cox proportional hazards regression model, we can use the Breslow method to estimate the baseline hazard function and the cumulative baseline hazard function to calculate the natural indirect and direct effects.
Then the model we have for the mediator is M is normally distributed with mean and variances (l, r 2 Þ, The Cox proportional hazards regression model for the time-to-event outcome with exposure, A, mediator, M, and confounder variables, C, is For parameter estimation in the Cox proportional hazards regression model, we use the Breslow approximation in the maximum likelihood function to deal with the tied event time points. Because the Cox proportional hazards regression model is semi-parametric, k T t j 0, 0, 0 ð Þis not a parameter that can be estimated.However, we still can estimate it after we have all the parameter estimates.According to Breslow (1972), the estimator of the baseline hazard is where s i is the i th event time point, d i is number of events at s i and k represents the k th person who is at risk at time s i : For each event time, it has an estimate kT t ð Þ and KT ðtÞ, We obtain the value for k T aM a Ã t j c ð Þ by computing the integral for the numerator and denominator in the formula above.k T a Ã M a Ã t j c ð Þ and k T aMa t j c ð Þ can be obtained in the same way.By definition, the indirect effect on the hazard function is Piecewise constant baseline hazard model Natural direct and indirect effects obtained from the Breslow estimator of the cumulative baseline hazard only are available for event times because kT t ð Þ ¼ 0, if T ¼ t is not an event time.In order to estimate the natural direct and indirect effects for both event times and censored times, we use a piecewise constant baseline hazard model.Compared to the standard proportional hazards regression model, the piecewise constant baseline hazard model is parametric and has more than one constant baseline hazard level, which depends on the time-to-event outcome, T : k T t j 0, 0, 0 ð Þ¼ k j , if t jÀ1 t < t j , with j ¼ 1, :::J and t 0 ¼ 0 (7) where j is j th piece of time.
Then the model we have for the mediator is M is normally distributed with mean and variance (l, r 2 Þ, The proportional hazards regression model for the time-to-event outcome with exposure, A, mediator, M, confounder variables, C, and piecewise constant baseline hazard is where k j t j 0, 0, 0 ð Þis the baseline hazard in which the binary exposure variable, the mediator, and the set of covariates are set to 0 when time T falls in the interval ½t jÀ1 , t j : Because the model is fully parametric, we can apply maximum likelihood estimation to estimate the parameters and where D j ðtÞ ¼ and j is j th piece of time. For the number of intervals, Friedman (1982) recommended that a model start with 5-7 intervals and then examine the results for sharp changes in the coefficients for baseline hazard in each interval and their estimated standard error.We construct the model with 6 intervals first and attempt different numbers of intervals around 6. We also use the model selection criteria, such as the Akaike information criterion (AIC) to guide in the selection of the appropriate model.There are no obvious ways to choose break points for parameterizing the baseline function in terms of a piecewise constant function.In the SAS procedure, PROC PHREG in the presence of right-censored data chooses a set of points such that the resulting time intervals contain approximately equal numbers of event times.Here, we follow suit to make each time interval have the same number of event times.
For example, when we calculate k T aM a Ã t j c ð Þ, we have We obtain the value for k T aM a Ã t j c ð Þ by computing the integral for the numerator and denominator in the formula above.k T a Ã M a Ã t j c ð Þ and k T aMa t j c ð Þ can be obtained in the same way.The natural indirect effect on the hazard function is

Simulations
Here we conduct simulation studies to compare our two approaches with VanderWeele's method and Wang and Albert's method on hazard scale.Event times were generated from different parameterized Weibull distributions.For the censored rate, we consider 10%, 60% and 85%.We also consider different parameters to generate the mediator variable.There are 7 different sets of parameters and each set of parameters generates 500 datasets.Each dataset contains 1000 observations.In each dataset, we estimate the natural indirect and direct effects at three event time points, which are 5 percentile event time, median event time and last event time, and we can compare the performance of each method when the time is increasing.
The true natural indirect and natural direct effects on hazard are calculated by the definition with the true parameters and true parametric baseline hazard functions.The simulation results are given in Tables 1-3.Table 1 shows simulation results for the 5th percentile event time under 7 different scenarios.At this time point, the cumulative baseline hazard is relatively small and the rare outcome assumption is not violated.The bias of VanderWeele's estimate is not high.Tables 2 and 3 are the results for the median event time and last event time.The bias of VanderWeele's estimate is increasing when time is increasing.At the last event time, VanderWeele's estimate is not valid because of the violation of rare outcome assumption.Wang and Albert's method works well at the 5 th percentile event time and median event time, but it does not have a good estimate of the natural direct effect at the last event time point.The bias of our two approaches is stable and small at 3 different time points.

Application to a real data set
We now apply the two approaches described above to a real clinical data set.There are 1538 patients recently discharged from the hospital who enrolled in the multi-center, prospective study from the ASsessment, Serial Evaluation, and Subsequent Sequelae in Acute Kidney Injury (ASSESS-AKI) Consortium, with one-half of the adults experiencing AKI during the index hospitalization matched the other one-half not experiencing AKI (Ikizler et al. 2021).The matching criteria, defined in the aforementioned reference, renders the matched AKI and non-AKI individuals as similar as possible and adjusts for confounders.AKI is a common complication in hospitalized patients, and has been associated with de novo or worsening kidney function in terms of adverse clinical outcomes such as end-stage renal disease (ESRD) and also heightened risk of death.Previous studies have shown that patients who experience one episode of AKI are at particularly high risk for experiencing repeat episodes and mortality.Here we used recurrent AKI as the event of interest and the time of the first recurrent AKI is either event occurred or right-censored.Having, or not having, AKI during the index hospitalization is the exposure variable.Since urine neutrophil gelatinase-associated lipocalin (uNGAL) has shown promise as a biomarker for the early detection of AKI in fixed models of injury, uNGAL is selected as the mediation variable.To render the variables normally distributed, we applied a logarithmic transformation of uNGAL.Female status, diabetes and chronic kidney disease status before entry into the study (baseline CKD) are covariates.There were five missing values for uNGAL, so our final data set consists of 1,533 adults.Figure 1 shows the causal mechanism.

Breslow estimator of cumulative baseline hazard in cox proportional hazards models
We fit the model for the mediator and the Cox proportional hazards regression model.After we achieve the parameter estimates, we calculate the baseline hazard and cumulative baseline hazards for each event time and then obtain k Analogous to the setting for the piecewise constant baseline hazard model, the three covariates are set to baseline CKD (c 1 ¼ 1), female (c 2 ¼ 1) and diabetes (c 3 ¼ 1).
We select an event time T ¼ 2, and at this time point, Then by integration we have at 2 months We have the natural indirect effect and natural direct effects on the hazard at 2 months are When the event time T ¼ 85 months, Þby integration, we have the natural indirect effect and natural direct effect on hazard are By VanderWeele (2011), the natural indirect effect and the natural direct effect on the hazard for all the time points, assuming that the cumulative baseline hazard function is close to zero: We also compared our estimate and VanderWeele's estimate.Similar to the results from the piecewise constant baseline hazard model, when T ¼ 2, our estimate of the natural indirect and direct effects on the hazard function are very close to VanderWeele's.However, when T ¼ 85 months, there is a discrepancy between our estimate and VanderWeele's, due to the violation of the assumption that K T t j 0, 0, 0 ð Þapproaches 0. The estimates of the natural indirect and direct effects on the hazard function from VanderWeele (2011) are consistent in the piecewise constant baseline hazards regression model and the Cox proportional hazards regression model when the assumption that K T t j 0, 0, 0 ð Þ approaches 0 is satisfied.Also, when T ¼ 85 months, the assumption is not satisfied, the estimates from the piecewise constant baseline hazard model and the Cox proportional hazards model are consistent.
To make the comparison more accurate and clear, we use a bootstrap method to generate confidence regions.We constructed 500 bootstrap samples by sampling with replacement from the data and each sample included 1533 observations (there were five missing values for uNGAL).In the original data set, we have 344 event time points, but each bootstrap sample may not include all the time points because of replicates in each sample and each sample may have different event time points.We have the estimates of the natural indirect and direct effects for every event time point within each sample.After we have all the estimates in the 500 bootstrap samples, we calculate the means of the indirect and direct effects for all the 334 event time points.We also construct the confidence intervals using the percentile method.The 95% confidence interval is the range of points that cover the middle 95% of bootstrap sampling distribution.The statistical computing code is given in eAppendix, supplementary material.Unlike the plots of piecewise constant baseline hazard model, the estimates of natural indirect and direct effects on hazard of the Cox proportional hazards model are not continuous and smooth.It is because the Cox proportional hazards models are semi-parametric, and we only can estimate the baseline hazard kT t ð Þ for event time points.For those censored time points, kT t ð Þ ¼ 0 and there is no value for k T aM a Ã t j c ð Þ, k T aMa t j c ð Þ and k T a Ã M a Ã t j c ð Þ: Therefore, the indirect and direct effect estimates shown on the plots are event time points.Figures 2 and 3 compare our method with VanderWeele's method.At the early time, since the cumulative hazard is very small, the assumption K T t j 0, 0, 0 ð Þapproaching 0 can be satisfied and the approximation is valid.There is no significant discrepancy between the two methods estimates at the beginning of the time axes from the plots.As K T t j 0, 0, 0 ð Þ increases, the difference between our estimates and VanderWeele's estimates increases.

Piecewise constant baseline hazard model
Next, we fit the model for the mediator and piecewise constant baseline hazard model.We partition the time axis such that each piece has roughly the same number of events.We cut the time into pieces by percentiles of the event times and each piece of time share the same constant baseline hazard.The median follow-up time is 57 months and the maximum follow-up time is 92 months.Here we selected 8 pieces of time, namely, [0, 4.1], (4.1, 7.65], (7.65, 12.7], (12.7, 18.15], (18.15, 22.75], (22.75, 35.55], (35.55, 47.6] and (47.6, þ1) and correspondingly, we have 8 baseline hazards denoted as k 1 , k 2 , k 3 , k 4 , k 5 , k 6 , k 7 and k 8 : After estimating all the parameters, we calculate k T aM a Ã t j c ð Þ, k T aMa t j c ð Þ and k T a Ã M a Ã t j c ð Þ with setting a value to time T: The exposure "baseline AKI" is a binary variable with a ¼ 1 and a Ã ¼ 0: To calculate k T aM a Ã t j c ð Þ, the value of the exposure variable is set to a in the piecewise constant baseline proportional hazards model and it is set to a Ã in the model for the mediator.To calculate k T aMa t j c ð Þ, the value of the exposure variable is set to a in the piecewise constant baseline proportional hazards model and the model for the mediator.To calculate k T a Ã M a Ã t j c ð Þ, the value of the exposure variable is set to a Ã in the piecewise constant baseline proportional hazards model and the model for the mediator.The three covariates are set to baseline CKD (c 1 ¼ 1), female (c 2 ¼ 1) and diabetes (c 3 ¼ 1).
In order to compare with the Cox proportional hazards model, when the time T ¼ 2 months, it falls in the first piece of time.The baseline hazard k T 2 j 0, 0, 0 ð Þ¼ k 1 ¼ 0:0023, and the cumulative baseline hazard is 0:0023Ã2 ¼ 0:0046: Then by integration we have Next, we have the natural indirect effect and natural direct effect on the hazard ratio are natural indirect effect ¼ When the time T ¼ 85 months, it falls in the 8 th piece of time The baseline hazard k T 85 j 0, 0, 0 ð Þ¼ k 8 ¼ 0:0012, and the cumulative baseline hazard is We have that the natural indirect effect and natural direct effect on the hazard ratio are By VanderWeele (2011), the natural indirect effect and natural direct effect hazard ratio for all the time points, assuming that the cumulative baseline hazard function is close to zero, are: We compared our method and VanderWeele's method.When T ¼ 2 months, our estimate of the natural indirect and direct effect on the hazard ratio are very close to VanderWeele's.However, when T ¼ 85 months, there is a discrepancy between our estimate and VanderWeele's.This occurs because the cumulative baseline hazard increases as time increases and K T t j 0, 0, 0 ð Þ approaching 0 is violated.The natural indirect effect and the natural direct effect on the hazard function, calculated based on the approximation, are not valid.Next, we constructed 500 bootstrap samples by sampling with replacement from the data and each sample included 1533 observations.For each sample, we run the model and have an estimate of the natural indirect and direct effects on the hazard for 1 to 92 months.We obtain the mean of the natural indirect and direct effects on the hazard function from the 500 bootstrap samples, and we also construct the confidence region using the percentile method.The 95% confidence interval is the range of points that cover the middle 95% of bootstrap sampling distribution.Also, we display VanderWeele's estimate for each bootstrap sample and construct its confidence interval.The statistical computing code is given in eAppendix, supplementary material.Figures 4 and 5 compare our method with VanderWeele's method.At an early time point, since the cumulative hazard is very small, the assumption K T t j 0, 0, 0 ð Þapproaching 0 can be satisfied and the approximation of is valid.Similar to the plots displayed in the Cox proportional hazards model with Breslow estimator of cumulative baseline hazard, under this assumption there is no significant difference between our method and VanderWeele's method and they both show a consistent estimate.As time increases, the difference in the effect means from the two methods is increasing, and it indicates that VanderWeele's estimate does not work very well when K T t j 0, 0, 0 ð Þis increasing.

Discussion
VanderWeele (2011) assumed rare outcomes for mediation analysis with the proportional hazards regression model.However, VanderWeele's approach in the general setting, for example when the time-to-event outcome is common, does not have a clear causal interpretation as a measure of effect.In our two approaches, we do not need the assumption of a rare outcome, and we can perform statistical inference with a specific time point.In the methodology section, we use Breslow estimates in the Cox proportional hazards regression model and the piecewise constant baseline hazard model and to estimate the baseline hazard and all the other parameters for calculating the hazard for a specific time point.We use the Breslow method to estimate k T t j 0, 0, 0 ð Þ and K T t j 0, 0, 0 ð Þ after the parameter estimates are obtained.Then the natural direct and indirect effects for different time points can be calculated by integration.The piecewise constant baseline hazard model is parametric, so the baseline hazard k T t j 0, 0, 0 ð Þfor each time piece is a parameter which can be estimated.Since k T t j 0, 0, 0 ð Þin each time piece is a constant, the cumulative baseline hazard for each time point also can be estimated.The Cox proportional hazards regression model is semi-parametric, and k T t j 0, 0, 0 ð Þ is not estimated as a parameter.In the simulation study, our two approaches performed well at different time points and the bias is more stable and smaller than VanderWeele's method and Wang and Albert's method.We also applied the two approaches described to a real medical data set which has 1533 patients recently discharged from the hospital who enrolled in the multi-center, prospective ASSESS-AKI study, with one-half experiencing AKI during the index hospitalization.In order to compare our two approaches to VanderWeele's method, we construct graphs to display the hazard ratios as time increases.It shows that when the cumulative hazard is very small, the assumption K T t j 0, 0, 0 ð Þapproaching 0 can be satisfied and there is no significant difference between our two approaches and VanderWeele's method.As the cumulative hazard increases as time increases, the difference in the effect means from our method and VanderWeele's approach is increasing, and it indicates that VanderWeele's estimate does not work very well when K T t j 0, 0, 0 ð Þis increasing.Compared to Tchetgen Tchetgen (2011), our methods do not require the additional assumption that the natural direct hazards ratio and the indirect hazards ratio both agree with the proportional hazards assumption, which is very restrictive.Compared to Wang and Albert (2017), we not only provide new approaches for the estimates of the natural direct and indirect effects, but also make a comparison with VanderWeele (2011), and show the difference between the estimates as the time increases.
An R function including these two approaches is provided at (https://github.com/huizeng-7/mediation_cox_piecewise), provides the estimate of natural indirect and natural direct effects for all event time points and also the plots with confidence interval.In future research, we will develop models that not only include one mediator, but also two mediators under different scenarios.When the model extends to having two mediators, we need to consider the relationship between these two mediators and also how these two mediators interact with the exposure variable.The two situations we plan to investigate are as follows: (1) the two mediators are independent; (2) the two mediators are not independent.When the situation becomes more complex, it will yield more parameters that need to be estimated and it introduces the difficulty for both of our two approaches.Obviously, the numerical integration will be much more complicated.
according to the definition.Without the rare outcome assumption, we calculate these values by integration with a specific time point.The expressions for kT aM a Ã t j c ð Þ, k T aMa t j c ð Þ and k T a Ã M a Ã t j c ð Þinclude the baseline hazard function and also the cumulative hazard function, which can be determined by the baseline hazard function (SAS Institute Inc 2009):

Figure 1 .
Figure1.Mediation with exposure variable (having or not having AKI at baseline), outcome variable (recurrent AKI), mediator variable (biomarker uNGAL), and a set of covariates (gender, diabetes and chronic kidney disease status before entry into the study).

Figure 2 .
Figure 2. Indirect effect from Breslow estimator of cumulative baseline hazard in Cox proportional hazards models compared with VanderWeele's approach.

Figure 3 .
Figure 3. Direct effect from Breslow estimator of cumulative baseline hazard in Cox proportional hazards models compared with VanderWeele's approach.

Figure 4 .
Figure 4. Indirect effect from piecewise constant baseline hazard model compared with VanderWeele's approach.

Figure 5 .
Figure 5. Direct effect from piecewise constant baseline hazard model compared with VanderWeele's approach.

Table 1 .
Simulation statistics of the estimated natural indirect and natural direct effects at 5th percentile event time on hazard for survival data under different scenarios, n ¼ 1000 per group.

Table 2 .
Simulation statistics of the estimated natural indirect and natural direct effects at median event time on hazard for survival data under different scenarios, n ¼ 1000 per group.

Table 3 .
Simulation statistics of the estimated natural indirect and natural direct effects at last event time on hazard for survival data under different scenarios, n ¼ 1000 per group.