Bayesian Approaches for Handling Hypothetical Estimands in Longitudinal Clinical Trials With Gaussian Outcomes

Abstract The International Council for Harmonisation (ICH) recently published an E9(R1) addendum that requires the estimand associated with the study objective in clinical trials to be clearly defined. One of the challenges in defining an estimand is the estimand’s handling of intercurrent events (ICEs) that affect the collection or interpretation of the data for the study. Among the strategies for handling ICEs, sponsors may prefer to examine hypothetical strategies that assess the theoretical or attributable efficacy of test drugs or biologic products. We first look at several estimands under different hypothetical treatment conditions of interest. For these estimands, the data after ICEs are ignored and treated as missing. Analyses are carried out with missing data assumptions under missing at random, control-based imputation, and return-to-baseline imputation. With the explicit forms of these hypothetical estimands derived, we investigate Bayesian approaches to obtain corresponding point and interval estimates and propose a Bayesian sensitivity analysis which avoids the information positive problem. The methods are illustrated with applications to three clinical trials.


Introduction
For longitudinal clinical trials with continuous outcomes, a common objective is to estimate treatment effects at the end of these trials while accounting for missing observations. Naïve deletion of patients whose data contain missing values can lead to biased inferences (Little and Rubin 2020). In 2010, a National Research Council panel (NRC 2010) made 18 recommendations that highlight the need to reduce missing data in trial design and to apply proper statistical methods in conducting statistical analysis. The panel also recommended that sponsors carefully define the estimands (the population parameters to be estimated) and their corresponding statistical analysis methods in the study protocol. In an effort to provide more clarity and guidance to address missing data, the recent International Council for Harmonisation (ICH) E9(R1) addendum titled "Estimands and Sensitivity Analysis in Clinical Trials" proposed a structured framework to define estimands and to align planning, design, conduct, analysis, and interpretation (ICH 2019). The addendum provides a description of five attributes that are used to construct an estimand:
• The treatment condition of interest to which the comparison is made; for example, the comparison may include treatment interruption and/or addition of other therapy as part of the condition of interest.
• The population of patients targeted by the scientific question.
• The variable (or endpoint) to be obtained for each patient that is required to address the scientific question.
• Specification of how to account for intercurrent events (ICEs) to reflect the scientific question of interest.
• A population-level summary for the variable that provides the basis for a comparison between treatment conditions.
Considerations for choosing an estimand should depend on the study objectives. These considerations involve coping with missing data and specifying the strategy to address ICEs that occur during the trial and result in uncollected data or influence the interpretation of the data collected after the ICEs. ICH E9(R1) also distinguishes between ICEs and missing data. The handling of the treatment condition and of data following ICEs should be addressed in the estimand's definition (e.g., collection and use of data after treatment discontinuation or use of rescue medication). Missing data that occur when data should have been collected based on the definition of the estimand but could not be collected (e.g., an analysis conducted before all subjects have completed follow-up) are addressed in the analysis models (e.g., under a missing at random (MAR) assumption). ICH E9(R1) proposed five strategies for handling ICEs in the estimand definition: treatment policy, hypothetical, composite, principal stratum, and while-on-treatment. The choice of a strategy depends on the stage of development and the interests of sponsors, regulators, patients, physicians, and payers (Keene et al. 2020). From a sponsor's perspective, the hypothetical strategy under several envisioned scenarios can provide biologically important estimates for the active treatment that are not confounded with other treatments taken after ICEs. Our research focuses on two common hypothetical estimands under the conditions: (1) patients who experience ICEs continue the assigned treatment to the end of the study; and (2) patients who experience ICEs discontinue the active treatment and do not take any other treatment (such as rescue medication). Under the first condition, the treatment effect may be evaluated by using a mixed model for repeated measures (MMRM) analysis under the MAR assumption (Mallinckrodt et al. 2008).
For the second condition, a multiple imputation (MI) approach is often used to handle missing data after ICEs under different missing not at random (MNAR) assumptions, including control-based imputation (CBI) (Carpenter, Roger, and Kenward 2013; Liu and Pang 2017; Mehrotra, Liu, and Permutt 2017) and a return-to-baseline (RTB) approach (Zhang, Golm, and Liu 2020).
With the advancement of Bayesian software packages such as PROC MCMC in SAS (SAS 2017), Stan (STAN 2019), and WinBUGS (Lunn et al. 2000), developing Bayesian methods and MI approaches to address missing data is less complicated. Although there are multiple approaches to the hypothetical strategy, this article focuses on the CBI and RTB assumptions and on implementing those assumptions using a Bayesian framework. In addition, we compare this approach to likelihood-based methods and MI. The article proceeds as follows. Section 2 defines the hypothetical estimands explicitly and discusses conventional analysis methods. Section 3 describes Bayesian methods to obtain point and interval estimates. A Bayesian sensitivity analysis is proposed that avoids the information-positive problem as discussed in Cro, Carpenter, and Kenward (2019). Section 4 provides analyses of three clinical trials to illustrate the applications of these methods. Section 5 presents discussion and conclusions.

Hypothetical Estimands, Missing Data Assumptions, and Estimators
Consider a longitudinal trial with two treatment groups. Let $Y_{ijk}$ be the outcome of interest for patient $i$ receiving treatment $j$ at time $k$, where $i = 1, \ldots, n$; $j = 0, 1$; $k = 1, \ldots, T$; and $T$ is the end-of-study time point for the primary treatment comparison. Let $X_i = (x_{i1}, \ldots, x_{iL})$ be a collection of $L$ baseline covariates (e.g., the baseline measure $Y_{ij0}$ before the trial starts). We assume that the expectation of $Y_{ijk}$ is

$$E(Y_{ijk} \mid X_i) = \mu_{ijk} = \alpha_{jk} + X_i \beta_k, \tag{1}$$

where $\alpha_{jk}$ is the intercept at time $k$ for treatment group $j$ and $\beta_k = (\beta_{k1}, \ldots, \beta_{kL})$ is a set of slope coefficients at time $k$.
In addition, we assume that the repeated measures $Y_{ij} = (Y_{ij1}, \ldots, Y_{ijT})$ follow a multivariate normal distribution, $Y_{ij} \sim N(\mu_j, \Sigma)$, where $N$ denotes the normal distribution and $\Sigma$ is the $T \times T$ covariance matrix. We further assume that the covariates $X_i$ are centered, so the intercept $\alpha_{jk}$ becomes the treatment effect at the mean level of the baseline covariates ($X = 0$). All the parameters $\gamma = \{\alpha_{jk}, \beta_k, \Sigma\}$ are defined for an ideal trial in which all patients complete the study according to the protocol. When there are patients who deviate from the protocol, we use these parameters to define the hypothetical estimands.
In longitudinal clinical trials, we encounter a mixture of two types of missing data patterns: intermittent and monotone missingness. Intermittent missingness refers to missing data that occur between observed data while patients stay in the study with their assigned therapy. Monotone missingness occurs when data are observed at all time points until a patient drops out of the study and all follow-up observations for that patient are missing. Common causes of intermittent missing data include missed visits and data collection or processing errors. Frequently, the probability of missing values is likely to be independent of the missing data, making the MAR assumption plausible. Therefore, the intermittent missing values could be handled in analysis models that rely only on observed data. Under the MAR assumption, for example, the intermittent missing data can be imputed first to produce datasets with monotone missingness (see chap. 4 in O'Kelly and Ratitch 2014). The monotone missing data mechanism can be more challenging because patients may discontinue the assigned treatment (an ICE) and their unobserved outcomes may differ from the ones that would have been observed if they stayed on the assigned treatment. Here, we examine hypothetical estimands associated with the monotone missingness pattern.

Hypothetical Estimands
We consider two hypothetical estimands under different envisaged treatment conditions and outcome distributions following ICEs. We are interested in these two hypothetical estimands because they evaluate the effects of the test treatment without confounding from rescue medications that may be given after ICEs in practical clinical trials. For both estimands, the outcomes after ICEs, even if collected, are ignored and treated as missing data.

Hypothetical Theoretical Estimand
Consider a hypothetical estimand that measures the theoretical or pure drug effect, assuming all patients complete the trial on their assigned treatment. We refer to this hypothetical estimand as the theoretical estimand, which corresponds to the de jure estimand described in White, Joseph, and Best (2020). For this estimand, the data after ICEs are ignored and treated as missing in the analysis models. To estimate this estimand, one might assume the post-ICE data are MAR. For a patient $i$ who has data observed up to time $p$ ($1 < p < T$) and drops out after that, the mean response both before and after time $p$ is modeled as $\mu_{ijk} = \alpha_{jk} + X_i \beta_k$, $j = 0, 1$; $k = 1, \ldots, T$ (see the first row of Table 1).
Note that the MAR assumption here is just one potential option. In general, the missing data mechanism cannot be verified from the observed data. For example, the MAR assumption may be violated if the ICE is a discontinuation of assigned treatment because of adverse events which influence the outcomes.

Hypothetical Attributable Estimands
The hypothetical theoretical estimand addresses a "pure" treatment effect under a hypothetical scenario in which patients who drop out of the study continue with their assigned treatment. This scenario is unrealistic in many practical situations, because patients who drop out would not continue taking the assigned therapy. A more realistic attributable treatment effect is the effect for patients who drop out under a "what if" condition where the patients discontinue their assigned therapy and do not start any other therapies. These estimands correspond to the attributable de facto estimands described in White, Joseph, and Best (2020). In placebo-controlled trials, it is reasonable to assume that patients who drop out of the placebo group would have similar outcomes to patients who stayed in the study because there is no biological difference between taking and not taking a placebo. Thus, the MAR assumption can be applied to patients who drop out of the control group. Carpenter, Roger, and Kenward (2013) proposed three approaches to handle the missing data in the active drug group, based on different CBI assumptions. The three approaches are (i) copy reference (CR), which replaces the expectation profile of patients receiving the active drug who drop out with the expectation profile of the control group at all time points, including times before the ICEs; (ii) jump-to-reference (J2R), which replaces the expectation profile after a patient drops out with that of the control group only for post-dropout times; and (iii) copy increments in reference (CIR), where the incremental mean change after dropout is the same as the incremental mean change of the control group.
Using the notation of Model (1), the missing data assumptions for CR, J2R, and CIR on the mean profile $\mu_{i1k}$ at each time point $k$ for a patient in the active drug group who drops out after time $p$ (i.e., with observed data up to and including time $p$) are defined in Table 1. RTB is a different possible assumption for handling missing data. RTB assumes that the mean response after a patient drops out will be the same as their mean at baseline (Zhang, Golm, and Liu 2020). The key assumption of RTB is that all treatment effects (in both the active drug and control groups) that occur before discontinuation would disappear by the primary analysis time point. For example, this may be reasonable in therapeutic areas where standard care medication is used as control and discontinued patients may start rescue medication, so that the assumed attributable effects without any rescue medication may return to baseline after discontinuation for both groups. In situations where $Y_{ijk}$ represents the change from baseline, its expected value is equal to 0 after dropout. Table 1 summarizes the assumptions about treatment condition and the mean profiles for missing data for these hypothetical estimands.
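The verbal definitions above can be turned into a small lookup of marginal mean profiles. The following is a minimal sketch, not the authors' code: the function name `dropout_mean_profile` and its arguments are hypothetical, covariates are assumed centered at 0, the RTB branch assumes the outcome is a change from baseline, and only marginal means are shown (an actual imputation would also condition on the observed values through the covariance matrix).

```python
def dropout_mean_profile(alpha1, alpha0, p, method):
    """Marginal visit means for an active-arm patient observed through visit p.

    alpha1, alpha0: visit means (visits 1..T) for the active and control arms.
    p: last observed visit, with 1 <= p < T.
    method: "MAR", "CR", "J2R", "CIR", or "RTB".
    """
    T = len(alpha1)
    if len(alpha0) != T or not 1 <= p < T:
        raise ValueError("need equal-length profiles and 1 <= p < T")
    if method not in {"MAR", "CR", "J2R", "CIR", "RTB"}:
        raise ValueError(f"unknown method: {method}")

    profile = []
    for k in range(1, T + 1):
        a1, a0 = alpha1[k - 1], alpha0[k - 1]
        if method == "MAR":
            mean = a1                      # active-arm mean throughout
        elif method == "CR":
            mean = a0                      # control mean at every visit
        elif k <= p:
            mean = a1                      # J2R, CIR, RTB: own arm while observed
        elif method == "J2R":
            mean = a0                      # jump to the control mean after dropout
        elif method == "CIR":
            # keep the level reached at visit p, then follow control increments
            mean = alpha1[p - 1] + (a0 - alpha0[p - 1])
        else:  # RTB with change-from-baseline outcomes
            mean = 0.0
        profile.append(mean)
    return profile
```

For instance, with `alpha1 = [-2, -4, -6, -8]`, `alpha0 = [-1, -2, -3, -4]`, and `p = 2`, the J2R profile keeps the active means at visits 1 and 2 and switches to the control means afterward, whereas CIR continues from the visit-2 active mean with the control arm's visit-to-visit changes.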

MI-Based Methods
MI techniques provide an intuitive approach to coping with missing data by explicitly specifying imputation models under various assumptions. Their ability to allow separate imputation and analysis models offers flexibility that is highly valued in defining and handling ICEs for hypothetical estimands. Based on Model (1), an MCMC-based imputation may apply a data augmentation sampling procedure iteratively and draw samples of (i) the missing data conditional on the parameters and (ii) the model's parameters conditional on the complete data (Tanner and Wong 1987). Upon convergence of the MCMC algorithm, this process produces posterior samples for the parameters $\{\alpha, \beta, \Sigma\}$ in Model (1) under MAR. The missing data are imputed multiple times to obtain complete datasets. The conventional MI-based estimators are obtained by combining the analysis results from those imputed complete datasets using Rubin's rule (Rubin 1987).
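As a reminder of what the combination step does, here is a minimal sketch of Rubin's rule for a scalar estimate. The function name `rubin_combine` is hypothetical; the inputs are the per-imputation point estimates and their squared standard errors.

```python
from statistics import mean, variance


def rubin_combine(estimates, variances):
    """Combine m per-imputation results with Rubin's rule.

    Returns (pooled estimate, total variance), where the total variance is
    the within-imputation variance plus (1 + 1/m) times the
    between-imputation variance.
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations and matching lengths")
    pooled = mean(estimates)
    within = mean(variances)
    between = variance(estimates)   # sample variance, divisor m - 1
    total = within + (1 + 1 / m) * between
    return pooled, total
```

The between-imputation term is what drives the variance overestimation under uncongenial imputation models discussed next.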
Under the multivariate normal model, the response vector $Y_{ij}$ can be partitioned into the observed part, $Y^o_{ij}$, and the missing part, $Y^m_{ij}$, with the covariance matrix $\Sigma$ partitioned conformably into the blocks $\Sigma_{oo}$, $\Sigma_{om}$, $\Sigma_{mo}$, and $\Sigma_{mm}$. For the hypothetical attributable estimands under the CBI or RTB missing data approaches, the imputation proceeds by first drawing posterior samples of the parameters $\{\alpha, \beta, \Sigma\}$ under MAR and then imputing the missing data using the mean profiles of CR, J2R, CIR, or RTB as specified in Table 1. The MI-based estimators are obtained by combining the results from the imputed complete datasets using Rubin's rule. For the MI-based estimators under the MAR, CR, J2R, and CIR assumptions, SAS macros are available from the DIA missing data working group (https://www.lshtm.ac.uk/research/centres-projects-groups/missing-data).
The conventional MI approach with common combination rules, however, can produce biased variance estimates when the imputation and analysis models are uncongenial (Meng 1994). MI is congenial under the MAR assumption; therefore, the conventional MI analysis for the hypothetical theoretical estimand under MAR produces appropriate sampling error and interval estimates with nominal coverage. MI is not congenial for the CBI and RTB methods. The conventional MI can overestimate the variances compared to the variability that would be expected if the trial and the analysis were performed again. This overestimation may lead to overly conservative analyses (Ayele et al. 2014; Zhang, Golm, and Liu 2020). One approach to variance estimation for estimators where uncongeniality occurs is to use bootstrapping (Bartlett and Hughes 2020). Alternative approaches based on maximum likelihood and delta methods were also previously proposed (Lu 2014; Tang 2015; Liu and Pang 2016).

Likelihood-Based Method
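In a likelihood-based approach, the hypothetical parameters as defined in Model (1) are first estimated by a likelihood-based method, and then the treatment effects for the hypothetical attributable estimand under different missing data assumptions are derived from the parameters of Model (1). The treatment effect for the MAR-based analysis is estimated from the parameters of Model (1) as $\theta^{MAR} = \alpha_{1T} - \alpha_{0T}$. For CBI-based analyses, let $f_{jp}$ be the proportion of patients in group $j$ who drop out at time $p$, and let $f_{jT} = 1 - \sum_{p=1}^{T-1} f_{jp}$ be the proportion of patients in group $j$ who complete the trial. The average effects at time point $T$ for the active treatment group (evaluated at $X = 0$), under different CBI assumptions, are derived by Liu and Pang (2016) as follows:

$$
\begin{aligned}
\mu^{CR}_{1T} &= f_{1T}\,\alpha_{1T} + \sum_{p=1}^{T-1} f_{1p}\left\{\alpha_{0T} + \Sigma_{mo}\Sigma_{oo}^{-1}\left(\alpha^{o}_{1p} - \alpha^{o}_{0p}\right)\right\},\\
\mu^{J2R}_{1T} &= f_{1T}\,\alpha_{1T} + \sum_{p=1}^{T-1} f_{1p}\,\alpha_{0T},\\
\mu^{CIR}_{1T} &= f_{1T}\,\alpha_{1T} + \sum_{p=1}^{T-1} f_{1p}\left(\alpha_{1p} + \alpha_{0T} - \alpha_{0p}\right),
\end{aligned}
\tag{2}
$$

where $\alpha_{1k}$ and $\alpha_{0k}$ are treatment effects at time $k$ for the active drug and control groups, respectively; $\alpha^{o}_{1p}$ and $\alpha^{o}_{0p}$ are the subvectors of the first $p$ elements of $\alpha_1 = \{\alpha_{1k}\}$ and $\alpha_0 = \{\alpha_{0k}\}$; and $\Sigma_{mo}\Sigma_{oo}^{-1}(\alpha^{o}_{1p} - \alpha^{o}_{0p})$ is taken at the element corresponding to time point $T$. The treatment differences are

$$
\begin{aligned}
\theta^{CR} &= f_{1T}\left(\alpha_{1T} - \alpha_{0T}\right) + \sum_{p=1}^{T-1} f_{1p}\,\Sigma_{mo}\Sigma_{oo}^{-1}\left(\alpha^{o}_{1p} - \alpha^{o}_{0p}\right),\\
\theta^{J2R} &= f_{1T}\left(\alpha_{1T} - \alpha_{0T}\right),\\
\theta^{CIR} &= f_{1T}\left(\alpha_{1T} - \alpha_{0T}\right) + \sum_{p=1}^{T-1} f_{1p}\left(\alpha_{1p} - \alpha_{0p}\right).
\end{aligned}
\tag{3}
$$

Note that the dimensions of the covariance matrices $\Sigma_{mo}$ and $\Sigma_{oo}$ vary with the missing data pattern (depending on the length of the observed vector) for $p = 1, \ldots, T - 1$ in Equations (2) and (3).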
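The plug-in calculation can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' implementation: the names `solve` and `cbi_effects` are hypothetical, covariates are centered at 0, the control arm is handled under MAR, and the averaged forms used are $\theta^{J2R} = f_{1T}(\alpha_{1T} - \alpha_{0T})$, $\theta^{CIR} = \theta^{J2R} + \sum_p f_{1p}(\alpha_{1p} - \alpha_{0p})$, and $\theta^{CR} = \theta^{J2R} + \sum_p f_{1p}\,\Sigma_{To}\Sigma_{oo}^{-1}(\alpha^o_{1p} - \alpha^o_{0p})$, which follow from the verbal definitions of CR, J2R, and CIR.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def cbi_effects(alpha1, alpha0, sigma, f1):
    """Plug-in treatment differences at the last visit T under CR, J2R, CIR.

    alpha1, alpha0: visit means for the active and control arms (length T).
    sigma: T x T covariance matrix as a list of lists.
    f1: active-arm proportions by last observed visit; f1[T-1] = completers.
    """
    T = len(alpha1)
    diff = [a1 - a0 for a1, a0 in zip(alpha1, alpha0)]
    j2r = f1[T - 1] * diff[T - 1]
    cir = j2r + sum(f1[p - 1] * diff[p - 1] for p in range(1, T))
    cr = j2r
    for p in range(1, T):
        sigma_oo = [row[:p] for row in sigma[:p]]
        # w = Sigma_oo^{-1} (alpha1^o - alpha0^o); Sigma_To w is the shift at T
        w = solve(sigma_oo, diff[:p])
        cr += f1[p - 1] * sum(sigma[T - 1][k] * w[k] for k in range(p))
    return {"CR": cr, "J2R": j2r, "CIR": cir}
```

In this sketch the CR and CIR differences exceed the J2R difference in magnitude whenever the active arm separates from control before dropout, because those assumptions retain part of the effect accrued while on treatment.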
Similarly, we can construct estimands under the RTB assumption.
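One way to write the RTB case explicitly, offered as a sketch by analogy with the CBI case and not quoted from the source: assume $Y_{ijk}$ is the change from baseline, so that the mean after dropout is 0 in both groups, and that covariates are centered. The superscript $RTB$ is notation introduced here for illustration.

```latex
\begin{aligned}
\mu^{RTB}_{jT} &= f_{jT}\,\alpha_{jT} + (1 - f_{jT}) \cdot 0 = f_{jT}\,\alpha_{jT}, \qquad j = 0, 1,\\
\theta^{RTB} &= \mu^{RTB}_{1T} - \mu^{RTB}_{0T} = f_{1T}\,\alpha_{1T} - f_{0T}\,\alpha_{0T}.
\end{aligned}
```

Under these assumptions only the completer proportions $f_{0T}$ and $f_{1T}$ enter the estimand, and no terms cancel between groups because the return-to-baseline mean applies to dropouts in both arms.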
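The point estimates for the MAR, CBI, and RTB estimators can be calculated from the MMRM estimates $\{\hat\alpha_{jk}, j = 0, 1; k = 1, \ldots, T\}$ and $\hat\Sigma$, and from the observed proportions of dropouts over time $\{\hat f_{jk}, j = 0, 1; k = 1, \ldots, T\}$. The corresponding sampling variance for an estimate $\hat\theta$ under CBI or RTB can be obtained using the variance formula

$$\mathrm{var}(\hat\theta) = E\{\mathrm{var}(\hat\theta \mid \hat f)\} + \mathrm{var}\{E(\hat\theta \mid \hat f)\}, \tag{4}$$

where $\hat f = (\hat f_{01}, \ldots, \hat f_{0T}, \hat f_{11}, \ldots, \hat f_{1T})$ is the vector of observed proportions of dropouts over time for both treatment groups. The conditional variance in the first term, $\mathrm{var}(\hat\theta \mid \hat f)$, can be computed using variance estimates from the MMRM (e.g., using the estimated covariance obtained from the LSMEANS statement in the SAS PROC MIXED analysis output). The second term can be calculated using the point estimate $\hat\theta$ and $\mathrm{var}(\hat f) = \{v_{jkl}\}$, the multinomial covariance with $v_{jkk} = \hat f_{jk}(1 - \hat f_{jk})/n_j$ and $v_{jkl} = -\hat f_{jk}\hat f_{jl}/n_j$ for $k \neq l$, where $n_j$ is the sample size of group $j$.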
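For J2R the two variance components have a simple closed form, which makes a useful worked case. The sketch below assumes the J2R difference $\hat\theta = \hat f_{1T}\,\hat\delta$ with $\hat\delta = \hat\alpha_{1T} - \hat\alpha_{0T}$, independence between $\hat f_{1T}$ and the MMRM estimates, a binomial variance for the completer proportion, and plug-in evaluation at the observed values. The function name `j2r_variance` is hypothetical.

```python
def j2r_variance(delta_hat, var_delta, f_hat, n1):
    """Plug-in sampling variance of f_hat * delta_hat via the total-variance split.

    delta_hat: MMRM estimate of alpha_1T - alpha_0T.
    var_delta: its estimated variance from the MMRM.
    f_hat: observed completer proportion in the active arm.
    n1: number of patients in the active arm.
    """
    var_f = f_hat * (1 - f_hat) / n1          # binomial variance of f_hat
    conditional = f_hat ** 2 * var_delta      # var(theta_hat | f_hat) at f_hat
    between = delta_hat ** 2 * var_f          # var of E(theta_hat | f_hat)
    return conditional + between
```

With `delta_hat = -2`, `var_delta = 1`, `f_hat = 0.8`, and `n1 = 100`, the first component is 0.64 and the second is 0.0064, so the uncertainty in the dropout proportion contributes little here.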

Bayesian Methods for MAR, CBI, and RTB Estimators
The Bayesian approach treats unobserved values as parameters and provides a natural path to estimate the model parameters while accounting for the uncertainty that arises from the missing values. A major difficulty in using Bayesian methods is their computational complexity. This difficulty has been reduced by the advancement of efficient Markov chain Monte Carlo (MCMC) techniques. The abundance of available Bayesian-focused software, such as WinBUGS, PROC MCMC in SAS, and Stan, reduces many implementation difficulties in sampling from posterior distributions under the MAR or MNAR assumption. The Bayesian paradigm enables many ways of combining the missing data imputation with sampling from the posterior distribution of the parameters under the various hypothetical assumptions.
Using the notation of Section 2, we partition the response vector $Y_{ij}$ into its observed part, $Y^o_{ij}$, and its missing part, $Y^m_{ij}$, with $\Sigma$ partitioned conformably into $\Sigma_{oo}$, $\Sigma_{om}$, $\Sigma_{mo}$, and $\Sigma_{mm}$, for the MMRM analysis under the MAR assumption. For Bayesian inference on Model (1), we assigned independent, conjugate, diffuse prior distributions. Specifically, we used diffuse normal priors for $\alpha = \{\alpha_{jk}\}$ and $\beta = \{\beta_k\}$, and assigned an inverse Wishart prior distribution with $T$ degrees of freedom and an identity scale matrix for $\Sigma$.
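Formally, $p(\alpha, \beta, \Sigma) = p(\alpha)\,p(\beta)\,p(\Sigma)$, with $\alpha_{jk} \sim N(0, 10^6)$, $\beta_{km} \sim N(0, 10^6)$, and $\Sigma \sim \text{Inv-Wishart}(T, I)$. Sampling from the joint posterior distribution of the missing values and the parameters can be accomplished by iterating through the following data augmentation steps (Tanner and Wong 1987):
Step 1. Draw the missing values $Y^m_{ij}$ from their conditional distribution given the observed data $Y^o_{ij}$ and the current values of $(\alpha, \beta, \Sigma)$.
Step 2. Draw $(\alpha, \beta, \Sigma)$ from their posterior distribution given the completed data.
At each iteration, we can compute the parameter of interest $\theta^{MAR} = \alpha_{1T} - \alpha_{0T}$ to obtain its posterior distribution. Note that Step 1 requires the MAR assumption, so that the conditional distribution of the missing data depends only on the observed data and the model parameters.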
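The two data augmentation steps can be illustrated on a deliberately simplified problem. The sketch below is not the model used in the article: it assumes a single arm, two visits, unit variances with a known correlation `rho`, no covariates, and a flat prior on the two means, so that Step 2 has a closed form. The function name `data_augmentation` is hypothetical. The sampler described in the text additionally updates $\beta$ and $\Sigma$.

```python
import math
import random


def data_augmentation(y1, y2, rho, n_iter=2000, burn_in=500, seed=0):
    """Posterior draws of the visit-2 mean when some visit-2 values are missing.

    y1: visit-1 outcomes (fully observed).
    y2: visit-2 outcomes, with None where the value is missing.
    rho: known correlation between visits (unit variances assumed).
    """
    rng = random.Random(seed)
    n = len(y1)
    mu1, mu2 = 0.0, 0.0
    cond_sd = math.sqrt(1 - rho ** 2)
    ybar1 = sum(y1) / n
    draws = []
    for it in range(n_iter):
        # Step 1: impute each missing visit-2 value from its conditional
        # normal distribution given visit 1 and the current means.
        filled = [
            v if v is not None else rng.gauss(mu2 + rho * (u - mu1), cond_sd)
            for u, v in zip(y1, y2)
        ]
        # Step 2: draw the means given the completed data,
        # (mu1, mu2) ~ N(sample means, Sigma / n), via the Cholesky factor.
        ybar2 = sum(filled) / n
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        mu1 = ybar1 + z1 / math.sqrt(n)
        mu2 = ybar2 + (rho * z1 + cond_sd * z2) / math.sqrt(n)
        if it >= burn_in:
            draws.append(mu2)
    return draws
```

In this simplified setting the posterior mean of the visit-2 mean equals the regression-type estimate, the observed visit-2 average plus `rho` times the difference between the full and observed-case visit-1 averages, which gives a convenient check on the sampler.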

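For the CBI and RTB estimators, calculate $\theta$ at each iteration from the posterior samples of the model parameters and of the dropout proportions.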
These procedures assume that the model parameters $\{\alpha_1, \alpha_0, \Sigma\}$ and the proportions of patients who drop out at each time point are independent. This is reasonable because the missing data mechanism is ignorable under the MAR assumption and because $\{\alpha_1, \alpha_0, \Sigma\}$ and $\{f_{11}, \ldots, f_{1T}\}$ or $\{f_{0T}, f_{1T}\}$ are a priori independent (Rubin 1976). Sampling of $\{f_{11}, \ldots, f_{1T}\}$ or $\{f_{0T}, f_{1T}\}$ can be done in the same MCMC process as for the MMRM, or it can be done in a separate step by using matrix manipulation. For example, the matrix call functions in SAS PROC MCMC can be used to implement the computation by using the posterior samples of $\{\alpha_1, \alpha_0, \Sigma\}$ obtained from the MMRM. A SAS macro and Stan code for implementing CR, J2R, and CIR are provided in the supplemental material online.
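The separate-step option can be sketched as follows for J2R. This is an illustration under assumptions not stated in the source: the dropout proportions are given a Dirichlet posterior based on a uniform Dirichlet prior (the article does not specify the prior on the proportions), the posterior draws of $\alpha_{1T} - \alpha_{0T}$ are supplied by the caller from an MMRM fit, and the function names are hypothetical.

```python
import random


def dirichlet_draw(counts, rng, prior=1.0):
    """One draw from Dirichlet(counts + prior) using normalized gamma variates."""
    gammas = [rng.gammavariate(c + prior, 1.0) for c in counts]
    total = sum(gammas)
    return [g / total for g in gammas]


def j2r_posterior(delta_draws, dropout_counts, seed=0):
    """Posterior draws of the J2R difference f_1T * (alpha_1T - alpha_0T).

    delta_draws: posterior draws of alpha_1T - alpha_0T from the MMRM fit.
    dropout_counts: active-arm counts by last observed visit; the last
        entry is the number of completers.
    """
    rng = random.Random(seed)
    draws = []
    for delta in delta_draws:
        # proportions are drawn independently of the model parameters
        f = dirichlet_draw(dropout_counts, rng)
        draws.append(f[-1] * delta)
    return draws
```

The point estimate and credible interval are then the mean and percentiles of the returned draws.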

Bayesian Sensitivity Analysis
In the CBI estimators described earlier, the expected profile after dropout for a patient in the active treatment group is defined using the parameters of the control group. Therefore, some of the terms cancel out when the treatment difference is calculated, as shown in Equations (2) and (3). The treatment effects are estimated using a plug-in approach based on estimates from the MMRM. The sampling variance for these plug-in estimators is obtained from Equation (4) or by using Bayesian posterior samples as described in Section 3.1. Research and simulations have shown that these sampling variance estimates are less biased than the variances from MI that are calculated using common combination rules (Ayele et al. 2014; Lu 2014; Tang 2015; Liu and Pang 2016).
Under the CBI assumption, the estimates and their sampling variance from this plug-in approach or from Bayesian methods are more efficient than those from MI combining rules. However, the statistical literature cautions that sampling variances of these plug-in estimators can decrease as the proportion of dropout in the active treatment group increases. Cro, Carpenter, and Kenward (2019) proposed the concept of information-anchored sensitivity analysis and demonstrated that the plug-in J2R analysis is not an information-anchored sensitivity analysis for the MAR estimand. They showed that the J2R analysis based on the MI procedure is an information-anchored sensitivity analysis and linked the J2R analysis to a δ-adjusted imputation method. Their approach assumes that the true expected parameters in the active drug group after dropout are "different" from the expected parameters in the control group, and shows that the analysis is similar to using a δ-adjustment for the imputed values under MAR.
Following Cro, Carpenter, and Kenward's approach, we propose a Bayesian sensitivity approach for the CBI estimators. The general idea is to use a prior distribution for the assumed expected parameter for dropouts in the active drug group instead of assuming that this expected parameter equals that in the control group. For simplicity, we describe the method for J2R first. We assume that $\alpha^m_{1T} \sim N(\alpha_{0T}, \tau^2)$ instead of $\alpha^m_{1T} = \alpha_{0T}$ as in Equation (2). Then the Bayesian J2R estimator becomes

$$\theta^{J2R}_B = f_{1T}\left(\alpha_{1T} - \alpha_{0T}\right) + \left(1 - f_{1T}\right)\left(\alpha^m_{1T} - \alpha_{0T}\right).$$

Compared to the plug-in J2R estimator, this Bayesian J2R estimator has an extra term that accounts for the potential difference between the expected mean for patients who drop out and the mean of the control group. The estimator can be obtained from the Bayesian MCMC samples as described earlier, with an additional draw from the prior distribution $\alpha^m_{1T} - \alpha_{0T} \sim N(0, \tau^2)$. When $\tau = 0$, this analysis is equivalent to the plug-in J2R analysis, which is considered the primary analysis for the hypothetical attributable estimand under J2R. Examining different values of $\tau > 0$ results in a series of sensitivity analyses for this primary estimator. The expectation of $\theta^{J2R}_B$ is the same as the expectation of $\theta^{J2R}$, but its variance increases with the additional variation from the prior distribution. This sensitivity analysis avoids the information-positive problem that is discussed in Cro, Carpenter, and Kenward (2019). When the plug-in J2R analysis (i.e., $\tau = 0$) is statistically significant, a tipping point can also be found by increasing $\tau$ until the sensitivity analysis is no longer statistically significant.
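The additional prior draw amounts to one extra line per MCMC iteration. The sketch below assumes that posterior draws of $\alpha_{1T} - \alpha_{0T}$ and of the completer proportion $f_{1T}$ are already available from the primary analysis and that the sensitivity estimator adds $(1 - f_{1T})$ times a $N(0, \tau^2)$ draw to the plug-in J2R difference. The function name `j2r_sensitivity_draws` is hypothetical.

```python
import random


def j2r_sensitivity_draws(delta_draws, f_draws, tau, seed=0):
    """Draws of the Bayesian J2R sensitivity estimator for a given tau.

    delta_draws: posterior draws of alpha_1T - alpha_0T.
    f_draws: posterior draws of the active-arm completer proportion f_1T.
    tau: prior standard deviation of the dropout mean around the control mean.
    """
    rng = random.Random(seed)
    out = []
    for delta, f in zip(delta_draws, f_draws):
        # extra draw for alpha^m_1T - alpha_0T; tau = 0 recovers plug-in J2R
        shift = rng.gauss(0.0, tau) if tau > 0 else 0.0
        out.append(f * delta + (1 - f) * shift)
    return out
```

Because the added term has mean zero, the center of the posterior is unchanged while its spread grows with both $\tau$ and the dropout proportion.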
Because the focus of the treatment comparison is at the last time point, we take a simplified approach to conducting sensitivity analysis for CR and CIR, in which $\alpha^m_{1k} = \alpha_{0k}$ for $k = 2, \ldots, T - 1$, and we consider the prior distribution $\alpha^m_{1T} \sim N(\alpha_{0T}, \tau^2)$ only for the expected parameter at the last time point, $T$. It can be shown that the Bayesian estimators for CR and CIR are $\theta^{CR}_B = \theta^{CR} + (1 - f_{1T})(\alpha^m_{1T} - \alpha_{0T})$ and $\theta^{CIR}_B = \theta^{CIR} + (1 - f_{1T})(\alpha^m_{1T} - \alpha_{0T})$, respectively, where $\alpha^m_{1T} - \alpha_{0T} \sim N(0, \tau^2)$ with a given variance $\tau^2$.
A few special values of $\tau$ may be of interest. When $\tau = 0$, the Bayesian analysis corresponds to the plug-in CBI analysis. Another reference value for $\tau$ is $\sqrt{V(\hat\alpha_{1T})}$, which is the estimated standard error (SE) for the expected response from the MMRM analysis. This assumes that the mean profile for patients who drop out of the active treatment varies around the expected control group mean with a variation equal to the estimated variation for the mean from the MMRM analysis. Because the prior distribution is independent of the data, the variance of the estimated mean effect for the active treatment group increases by $(1 - f_{1T})^2\tau^2$. Thus, we consider a third reference value for $\tau$ such that $V(\theta^{CBI}_B) = V(\theta^{CBI}) + (1 - f_{1T})^2\tau^2 = V(\theta^{MAR})$. Solving this equation, we get

$$\tau = \frac{\sqrt{V(\theta^{MAR}) - V(\theta^{CBI})}}{1 - f_{1T}}. \tag{5}$$

We can estimate this $\tau$ by using the variance estimates under MAR and CBI, and the observed proportion $\hat f_{1T}$. With this value of $\tau$, the variance for the treatment difference of the Bayesian sensitivity analysis would be similar to the one from the CBI analysis based on MI with common combining rules, and it has the same interpretation as a δ-adjusted tipping point analysis (Liu and Pang 2017). We illustrate these methods in three case studies in Section 4.
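The variance-matching value of $\tau$ is a one-line calculation once a target variance is chosen. The sketch below assumes that the prior draw adds $(1 - f_{1T})^2\tau^2$ to the plug-in variance and takes the target variance (for example, the MAR-based variance) as an input. The function name `matching_tau` and its argument names are hypothetical.

```python
import math


def matching_tau(target_var, plugin_var, f_complete):
    """Tau such that plugin_var + ((1 - f_complete) * tau)**2 equals target_var.

    target_var: variance the sensitivity analysis should reproduce.
    plugin_var: variance of the plug-in CBI treatment difference.
    f_complete: observed completer proportion in the active arm.
    """
    gap = target_var - plugin_var
    if gap < 0:
        raise ValueError("target variance is smaller than the plug-in variance")
    if not 0 <= f_complete < 1:
        raise ValueError("need at least some dropout in the active arm")
    return math.sqrt(gap) / (1 - f_complete)
```

As an illustration with made-up inputs, a target variance of 1.21, a plug-in variance of 0.72, and 76% completers give a value of about 2.9. The value grows as the dropout proportion shrinks, because fewer patients carry the extra prior variation.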

Antidepressant Trial Data (DIA Missing Data Working Group)
This publicly available dataset from the DIA missing data working group (https://www.lshtm.ac.uk/research/centres-projects-groups/missing-data) is based on a longitudinal study of an antidepressant drug. The study randomized 171 patients (one patient with intermittent missing values was removed) to an active test drug (n = 83) and a placebo (n = 88). The primary efficacy endpoint is the Hamilton Depression 17-item total score (HAMD-17) in terms of change from baseline at week 6. The HAMD-17 was collected at baseline and weeks 1, 2, 4, and 6. Overall, approximately 24% of patients in the active drug group and 26% of patients in the placebo group dropped out before week 6. We consider an MMRM model to define the hypothetical parameters of the expected change from baseline over time for each treatment group and adjust for the baseline values at each time point. The results from MMRM and CBI using the Bayesian and MI approaches for the treatment difference are presented in Table 2. As expected, the MMRM results from the likelihood-based method are very similar to those from the Bayesian analysis with a non-informative prior. For CBI estimands, the variance estimates from the Bayesian approach are smaller than those obtained from MI with common combination rules. Sensitivity analysis is conducted with a few choices of τ values, as discussed in Section 3.2. The estimated SE for the mean HAMD-17 change from baseline at the last time point for the active drug group in MMRM is approximately 0.8. The solved values are τ = 2.19, 2.97, and 2.05 for CR, J2R, and CIR, respectively, using Equation (5). The corresponding Bayesian sensitivity analysis results are shown in the last section of Table 2. The point estimates remain similar to those in the other CBI analyses, but the SE is close to the SE obtained for the MI analysis.
For J2R, the Bayesian sensitivity analysis had a slightly smaller variance than that from MI, which made the upper bound of the 95% CI slightly less than 0, compared with a positive upper bound from the MI analysis. Figure 1 shows the mean and credible intervals for values of τ from 0 to 4 to explore the sensitivity of the CBI assumptions. The results are no longer statistically significant when τ is between 3.0 and 3.6 for J2R, CR, and CIR. These tipping points are higher than the reference value of 0.8 and those calculated from Equation (5), implying that the results are robust against additional variation in the CBI assumptions.
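A tipping point of this kind can be approximated without rerunning the sampler. The sketch below is a normal approximation, not the posterior-sample procedure used for the figures: it assumes a negative treatment difference indicates benefit, that the sensitivity analysis leaves the point estimate unchanged, and that the variance grows as the plug-in variance plus $(1 - f_{1T})^2\tau^2$. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def upper_bound(theta, plugin_var, f_complete, tau, level=0.95):
    """Approximate upper interval limit of the sensitivity estimator at tau."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return theta + z * math.sqrt(plugin_var + ((1 - f_complete) * tau) ** 2)


def tipping_tau(theta, plugin_var, f_complete, level=0.95):
    """Smallest tau at which the upper limit reaches 0.

    Returns 0.0 when the plug-in analysis (tau = 0) already fails to
    exclude 0 or when theta is not negative.
    """
    if not 0 <= f_complete < 1:
        raise ValueError("need at least some dropout in the active arm")
    z = NormalDist().inv_cdf(0.5 + level / 2)
    needed = (theta / z) ** 2 - plugin_var   # extra variance the interval can absorb
    if theta >= 0 or needed <= 0:
        return 0.0
    return math.sqrt(needed) / (1 - f_complete)
```

Comparing the returned value with reference values of τ gives a quick indication of how much additional variation the conclusion can tolerate.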

Schizophrenia Trial Data
This dataset was created from a multicenter, randomized, double-blind clinical trial involving patients who were diagnosed as having schizophrenia, and missing values follow a monotone missing data pattern. For simplicity, only data from the active treatment and placebo groups were used, because the test drug showed no efficacy. The study randomized 44 and 76 patients to the active treatment and placebo groups, respectively. The primary efficacy endpoint is the Positive and Negative Syndrome Scale (PANSS) total score, which was measured at baseline, on day 4, and in weeks 1, 2, 3, and 4 after randomization. Overall, approximately 18% of the patients in the active treatment group and 25% of those in the placebo group dropped out before week 4. We consider the conventional MMRM model to define the hypothetical parameters of mean change from baseline over time for each group and the slopes for baseline at each time point. The results from MMRM and CBI using the Bayesian and MI approaches for the treatment difference are presented in Table 3. For CBI estimands, the variance estimates from the Bayesian approach are smaller than those obtained from the regular MI.
Using τ as calculated in Equation (5), we present the results for the sensitivity analysis for CBI in the last section of Table 3. Because no results for the CBI analyses are significant, no graph is presented for the sensitivity analysis in this example.

A Subset from an Antidepressant Study
In this example, we took a random subset of 200 patients (100 each in the active drug group and the placebo group) from another antidepressant study. The primary efficacy endpoint is the MADRS in terms of the change from baseline at the last time point. The MADRS was collected at seven post-baseline time points. Overall, approximately 15% of the patients in the active drug group and 19% of those in the placebo group dropped out before the end of the study. We consider the MMRM model to define the hypothetical parameters of the mean change from baseline over time for each group and the slopes for baseline at each time point. The results from MMRM and CBI using the Bayesian and MI approaches for the treatment difference are presented in Table 4. As in the previous examples, the CBI analyses based on the Bayesian approach have smaller variance estimates than those using regular MI. The estimated SE for the mean MADRS change from baseline at the last time point for the active drug group in MMRM is approximately 0.65. The solved values are τ = 1.52, 1.95, and 1.38 for CR, J2R, and CIR, respectively, using Equation (5). The corresponding Bayesian sensitivity analysis results are shown in the last section of Table 4. The point estimates remain similar to those in the other CBI analyses, but the SEs are close to the SEs from MI with common combination rules. The results are still statistically significant for CR, J2R, and CIR with the τ values calculated from Equation (5). Figure 2 shows the mean and CI for values of τ ranging from 0 to 13. The results are no longer statistically significant when τ is between 10 and 12.5 for J2R, CR, and CIR. These values are higher than the reference value of 0.65 and those calculated from Equation (5), implying that the analysis results are robust against the additional variation in the CBI assumptions.

Discussion
ICH E9(R1) provides a framework for clarifying trial objectives and estimands when handling ICEs and missing data in longitudinal clinical trials. In this article, we focused on two estimands under the hypothetical strategy proposed in ICH E9(R1). These hypothetical estimands may help trial sponsors understand the effects of an active treatment when there is no confounding from rescue medications. Each hypothetical estimand corresponds to a population parameter of interest under different assumptions about the values for patients who drop out before the primary analysis time point. The first estimand corresponds to the pharmacologic effect of the active drug under the hypothetical condition that all patients in the study continue the treatment up to the primary analysis time point. To estimate this estimand, one might assume that the post-ICE data are MAR. The second estimand evaluates the attributable treatment effect of the active treatment under one of two assumptions: either patients who drop out of the study would continue in the trial on the control treatment and have outcomes similar to those of patients in the control group (for CBI), or patients who drop out would not be taking any medication (e.g., no alternative therapy is available) and the treatment effects prior to discontinuation would be eliminated, so that the patients would return to their baseline status (for RTB). This estimand aims to assess the effect of the test drug without confounding from other medications. Data after ICEs are ignored and treated as missing.
This estimand differs from the treatment policy estimand, for which data after ICEs may be collected and used in the analysis. When partial post-ICE data are used for the treatment policy estimand, the SE is larger than that of the primary J2R estimator, because J2R assumes that the "true" mean profile after dropout in the active treatment group equals the mean profile of the placebo group. This assumption is strong but possibly conservative. Note that all of the missing data assumptions (i.e., MAR, CBI, and RTB) are untestable with the observed data.
We describe Bayesian methods for analyzing these estimands and handling the associated missing data using available software, such as SAS PROC MCMC and Stan. The methods are applied to three case studies. The results show that the variance estimates from the Bayesian approach are smaller than those obtained from conventional MI with common combination rules. Another advantage of the Bayesian approach is the ability to conduct sensitivity analyses for the CBI analysis. Introducing a prior distribution for the mean parameters of patients who drop out relaxes the assumption that the means for patients on active treatment after dropout are equal to those of control-group patients. The sensitivity analysis can be carried out by increasing the variance of this prior distribution, so that the mean for dropouts can vary around the control-group mean. A few reference values may be considered for the additional variability parameter τ in the prior: for example, the square root of the estimated variance from the MMRM analysis, or a value for which the Bayesian sensitivity analysis produces a variance estimate for the treatment effect similar to the variance obtained from MI with common combination rules. This Bayesian sensitivity analysis also ensures that the variance of the treatment difference increases as τ increases, avoiding the information-positive problem discussed in Cro, Carpenter, and Kenward (2019).
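The monotone relationship between τ and the variance of the treatment difference can be illustrated with a deliberately simplified sketch. It is not the longitudinal model fitted in this article. It assumes a single final time point, independent normal posteriors for the active-arm and control-arm means, a fixed dropout proportion, and a normal prior for the dropout mean that is centered at the control mean with standard deviation τ. All function names, parameter names, and numerical inputs are hypothetical and chosen only for illustration.

```python
import random
import statistics


def analytic_variance(p_drop, sd_active, sd_control, tau):
    """Variance of theta = (1 - p) * mu_a + p * mu_d - mu_c in the toy model.

    Toy assumptions (not the article's full model): mu_a and mu_c are
    independent, and mu_d given mu_c is Normal(mu_c, tau**2). Then
    theta = (1 - p) * (mu_a - mu_c) + p * e with e ~ Normal(0, tau**2).
    """
    return (1 - p_drop) ** 2 * (sd_active ** 2 + sd_control ** 2) + (p_drop * tau) ** 2


def simulate_effect(n_draws, p_drop, m_active, sd_active, m_control, sd_control, tau, seed=0):
    """Monte Carlo draws of the toy treatment difference for a given tau."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        mu_c = rng.gauss(m_control, sd_control)  # control-arm mean
        mu_a = rng.gauss(m_active, sd_active)    # active-arm mean if treatment continued
        mu_d = rng.gauss(mu_c, tau)              # active-arm dropout mean; tau = 0 gives mu_d == mu_c
        draws.append((1 - p_drop) * mu_a + p_drop * mu_d - mu_c)
    return draws


if __name__ == "__main__":
    # Hypothetical inputs: 30% dropout, posterior SD of 0.5 in each arm.
    for tau in (0.0, 0.5, 1.0, 2.0):
        draws = simulate_effect(20000, 0.3, -2.0, 0.5, -1.0, 0.5, tau, seed=1)
        print(tau, statistics.variance(draws), analytic_variance(0.3, 0.5, 0.5, tau))
```

In this toy setting the variance equals (1 − p)²(s_a² + s_c²) + p²τ², so it cannot decrease as τ grows, and τ = 0 corresponds to the case in which the dropout mean is fixed at the control mean.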
A possible limitation of the Bayesian analysis is the need to define prior distributions for model parameters. These distributions could have significant influence when the number of units in the trial is small. However, this limitation exists with any "Bayesianly proper" MI procedure. We have illustrated the Bayesian method for both the primary and sensitivity analyses of the CBI-related hypothetical attributable estimand. A similar approach can be applied to the RTB-related estimand and to other methods, such as carried last expected value forward (Carpenter, Roger, and Kenward 2013). In addition, a δ-adjustment can be considered for the mean of the prior distribution, that is, replacing $a_{0T}$ with $\delta + a_{0T}$. This allows the assumed mean for patients who drop out to be worse or better than the mean of the control group, depending on the conditions of the trial.
In conclusion, the proposed Bayesian methods are flexible tools for handling missing data in longitudinal clinical trials. In this article, we have limited our discussion to continuous endpoints. Further research is needed for other types of endpoints, such as binary, categorical, and time-to-event endpoints.