Multiply Robust Weighted Generalized Estimating Equations for Incomplete Longitudinal Binary Data Using Empirical Likelihood

Abstract In clinical trials, missing data may lead to serious misinterpretation of trial results. To address this issue, it is important to collect post-randomization data (such as efficacy measurement data and adverse event onset data). Such post-randomization data are called auxiliary variables and they can be useful for constructing missingness and imputation models. A multiply robust estimator using an empirical likelihood method was previously proposed by Han and Wang and by Han. However, that estimator was developed for cross-sectional data and situations in which no auxiliary variables are missing. This is contrary to actual clinical trial settings, in which some auxiliary variables will invariably be missing. Consequently, to apply Han’s method to longitudinal data, missing auxiliary variables need to be imputed. This article proposes a new method that extends Han’s method to a longitudinal outcome model by applying weighted generalized estimating equations with new weights. Monte Carlo simulations of a repeated binary response with missing at random dropouts demonstrated that the proposed estimator is multiply robust and exhibits better performance than that of augmented inverse probability weighted complete-case estimating equations under several simulation scenarios. We also successfully applied the proposed method to plaque psoriasis study data.


Introduction
Consider a randomized, placebo-controlled, double-blind, parallel-group, phase III trial that collects longitudinal data and assesses the drug efficacy at a planned last time point.The estimand is the adjusted odds ratio between the improvement proportions of the treatment groups when all participants complete the study, which corresponds to a parameter of the treatment group in a fitted regression model.In such clinical trials, some of the participants are lost to follow-up, resulting in missing primary endpoints.If the missing data mechanism is missing at random (MAR) or missing not at random, completecase analysis may lead to a biased estimate of the treatment effect (Little and Rubin 2002).
To address this issue, Han and Wang (2013) and Han (2014) proposed a multiply robust estimator using an empirical likelihood (EL) method (Owen 2001).Han and Wang (2013) and Han (2014) derived a weight based on the EL method and solved the weighted estimating equations to obtain a multiply robust estimator.This method has two advantages.First, it prevents the weight values from being extreme.In the inverse probability weighting (IPW) method (Horvitz and Thompson 1952), the probability of measurements being observed is modeled, and the complete-case data are weighted by the inverse of that probability; however, this method has low estimation efficiency, especially when the estimated probability of being observed is close to zero (i.e., an extreme weight) (Kang and Schafer 2007).Second, the multiply robust estimator allows multiple missingness and imputation models to mitigate the impact of model misspecification; if a correct model is included among these models, the treatment effect estimator is consistent.In the missingness and imputation models, post-randomization data (e.g., an efficacy measurement and an onset of adverse event) are sometimes included as explanatory variables, that is, auxiliary variables.These variables are possibly influenced by treatment, correlated with missingness on an outcome and/or outcome itself, and improving efficiency (Mallinckrodt 2013).Han and Wang (2013) and Han (2014) assumed the response variable to be a scalar and that all auxiliary variables are observed.However, in actual clinical settings, some auxiliary variables are usually missing.
In this article, we propose a new method that extends Han's method to a longitudinal outcome model by applying weighted generalized estimating equations (wGEE) with new weights.The resulting estimator is multiply robust, in the sense that the treatment effect estimator is consistent if any one of the specified missingness or imputation models is correct.Our method accounts for the within-subject correlation structure of the repeated measurement.Thus, the proposed estimator achieves better performance than that of Han in situations where some of the auxiliary variables are missing (i.e., in coarsened longitudinal data analysis).
A few doubly robust methods for application to longitudinal data with dropout have been proposed.Bang and Robins (2005) and Seaman and Copas (2009) proposed an approach that uses augmented inverse probability weighted complete-case estimating equations (AIPW).The estimator is constructed by the inverse probability weighted GEE and augmentation term imputed with Paik's imputation method (1997), and allows the use of only one missingness model and one imputation model, thus, achieving double robustness.Others (Han, Song, and Wang 2015;Han 2016) also proposed estimators using the EL method, which considers longitudinal data with dropouts.The estimator proposed by Han, Song, and Wang (2015) is based on the numerical implementation of the conditional EL method, and allows the use of only one missingness model and one imputation model, thus, achieving double robustness.The estimator proposed by Han (2016) is based on a calibrating method for missingness probability using data from past visits, and allows the use of only one missingness model and multiple imputation models.When the missingness model is correctly specified, intrinsic efficiency is guaranteed.However, the method targets estimation of a mean of a response at the end of a longitudinal study instead of estimation of a parameter (such as a treatment effect); thus, the estimator most recently proposed by Han (2016) was developed under different conditions compared to Han and Wang (2013), Han (2014), and our proposed method.
The remainder of this article is organized as follows.In Section 1.1, we describe the plaque psoriasis trial used as a case study.In Section 2, we introduce the proposed method and relevant notation.In Section 3, we outline Monte Carlo simulation studies conducted to evaluate the performance of the proposed method.In Section 4, we describe the application of the proposed method to the plaque psoriasis study data.Finally, we provide a brief discussion and the conclusions of the study in Section 5.

Case Study
The efficacy and safety of the study drug M518101 were evaluated in a randomized, placebo-controlled, double-blind, parallel-group, multi-center, phase III study in participants with plaque psoriasis (registered at https://clinicaltrials.gov as NCT01878461).Efficacy assessments were taken at baseline and Weeks 2, 4, 6, and 8.The primary efficacy endpoint was the success rate based on the Investigator Global Assessment (IGA) score at Week 8.The proportions of missing data at Week 8 were 14.5% (53/365) and 13.0% (23/177) in the M518101 and placebo groups, respectively.If, for example, only the participants who showed improvement completed the trial, the complete-case analysis could be biased.In this study, because the estimand was the treatment effect supposing that all subjects adhered to the treatment and completed the study, it was also necessary to consider the influence of the dropout mechanism.

Notation
Let y ij denote the binary efficacy outcome of a participant i(= 1, . . ., N) at a time point j(= 1, . . ., T), and Y i = (y i1 , . . ., y iT ) T , where stochastic independence of y i 1 j and y i 2 j , i 1 = i 2 , is assumed.Let X i and R ij denote the design variables (e.g., treatment, baseline covariates) and the indicator variable of observing y ij , respectively.The post-randomization data (i.e., auxiliary variables) are represented as Z ij , and To simplify an explanation, we assume that Z ij includes only y ij .
If both y ij and Z ij are observed, R ij = 1; otherwise, R ij = 0.The situation is reasonable because it is common that these data are observed simultaneously at each subject-visit.We assume a monotone missing data pattern and R i1 = 1.Thus, if R ij = 0, then R ik = 0 (j < k).Let n i denote the number of observed measurements of participant i.

Models and Estimation
Let E y ij |X i = μ ij β 0 denote the outcome model with an arbitrary link function (e.g., logit link), where β 0 represents the u-dimensional coefficients of the mean regression of y ij on X i .
Our main interest is to estimate the parameter of the treatment effect at the last visit of the longitudinal study (j = T).The proposed method allows multiple missingness and imputation models, derives a weight including the multiple model information based on the EL method, and then solves the empirical likelihood weighted estimating equations for β 0 .Let β EL denote the proposed estimator for β 0 .If any one of the multiple models is correct, β 0 can be consistently estimated.In this study, S missingness models and K imputation models are considered, respectively, where α s and γ k are the corresponding parameters.Note that the imputation models can utilize the postrandomization data (i.e., auxiliary variables Z i,(j−1) ) and are specified at each time point; thus, they differ from the outcome model μ ij β 0 .In the proposed method, two types of (w)GEE for β 0 were performed, other than β EL , to derive an EL weight.
Let β s wGEE and β k GEE denote the parameters of the (w)GEE models.The parameters β s wGEE , β k GEE , and β EL are identical for the correct missingness or imputation model.A detailed explanation of each model can be found in Sections 2.3 and 2.4.
Using π s ij αs , the wGEE analysis (Robins, Rotnitzky, and Zhao 1995) is performed.The estimating equations for β s wGEE for s = 1, . . ., S are as follows: where as a diagonal matrix factor, V i is the working covariance matrix of Y i which is assumed as i , where A i is a T × T diagonal matrix whose jth diagonal element is the variance functions of μ ij βs wGEE , and R(ρ) is a T × T working correlation matrix with ρ jj = 1 and ρ jj as a correlation between time points j and j'.The predicted value μ i,(j−1) βs wGEE is used to derive a weight ŵij (see Section 2.5).When the missingness model includes y i,(j−1) as Z i,(j−1) , a missing y i,(j−1) is imputed with the predicted value μ i,(j−1) βs wGEE .

Imputation Model
For the imputation model, to derive the expected value of y ij for each participant and time point, we use the sequential imputation method of Paik (1997), in which it is assumed that (2) Following (2), under the MAR assumption, a missing y ij can be imputed using a regression model for This model is fitted to y ij , which is either observed or has already been imputed.Because a monotone missing data pattern is assumed, the maximum number of missing patterns is T. As shown in Figure 1, a T × T table is constructed, where the rows are the missing patterns and the columns are the time points j.Each y ij is assigned to each cell, and the shaded cells are missing y ij .Sequential imputation is performed as in the following procedure.
Let φ k ijd (γ k jd ) (k = 1, . . ., K) denote a set of multiple outcome regression models for estimating a predicted value of y ij , where d(= 1, . . ., D) is the notation for a diagonal pattern.
Using φ k ijd γ k jd , the GEE analysis is also performed.The estimating equations for β k GEE for k = 1, . . ., K are as follows: Note that βk GEE is estimated as if the entire data were available, except for the fact that missing y ij are replaced by φ k ijd γ k jd .The estimate βk GEE differs from βs wGEE , which is estimated using information on missingness model π s ij α s .Thus, βs wGEE and βk GEE are derived from different informations (i.e., missingness models and imputation models).After βk GEE is estimated, the function An observed value y i1 is replaced with ŷi1 , which is the estimate of the expected value of y i1 conditioned on X i .In addition, each observed value y ij (for j = 2,…, T) is replaced with φ k ij1 γ k j1 , which is derived from the imputation model described by (3) for the first diagonal cell.

Derivation of Weight ŵij Using EL Method
The proposed method assigns a weight w ij to each observation at all time points (using the element-wise EL method) (Wang, Qian, and Carroll 2010).To derive a weight ŵij for the proposed estimator, the following constraints are imposed: αs , βs wGEE / (NT) , and The equation can be also described as follows: When j ≥ n i + 1, w ij is defined as zero.Vector ĝij includes S functions for missingness model π s ij αs and K functions for GEE S k ij βk GEE , γ k jd ; that is, the missing data are imputed using a set of K imputation models.Similar to Han (2014), our proposed method employs the calibration method (Lumley, Shaw, and Dai 2011) are estimated by the unweighted sample averages over all the participants at all the time points (j = 1, …, T).The weight ŵij : i = 1, . . ., N, j = 1, . . ., n i is a set of weights assigned to the biased sample y ij : i = 1, . . ., N, j = 1, . . ., n i .Through (5), w ij corrects for the selection bias so that certain population quantities may be consistently estimated based on the biased sample.When a missingness model includes y i,(j−1) as Z i,(j−1) , the missing y i,(j−1) are imputed using μ i,(j−1) βs wGEE derived from (1) ŷi,(j−1) = R i,(j−1) y i,(j−1) + 1 − R i,(j−1) μ i,(j−1) βs wGEE .
Let m = N i=1 T j=1 R ij denote the total number of observed measurements over the time points.By maximizing N i=1 n i j=1 w ij subject to the constraints in (5) using Lagrange multipliers, the following set of equations can be obtained: where λ is an (S + uK)-dimensional Lagrange multiplier vector, corresponding to the functions ĝij α, βGEE , βwGEE , γ jd (Owen 2001).To estimate λ, we define Here, the existence and uniqueness of λ are guaranteed by the strict convexity of F m (λ) on D m .Let π 1 ij α 1 denote a correctly specified missingness model and α 1 0 denote the true value of α 1 .As in the method of Han (2014), λ = λ1 , . . ., λS+uK T can be rewritten where κ is an (S + uK)-dimensional Lagrange multiplier vector defined in Appendix A and κ converges to 0 in probability when any one of the missingness models is correctly specified.Thus, ŵij can be also described as follows: The details are provided in Appendix A

Proposed Estimator: EL wGEE
Using the weights ŵij described in the previous section, we define the EL wGEE as As in (1), W EL,i is a T × T diagonal matrix with the jth diagonal element being R ij ŵij .The solution to (8) can be obtained using the Newton-Raphson method.As in Han (2014), the proposed estimator has multiple robustness property.
Theorem 1.The proposed estimator βEL is consistent if any one of a set of S missingness models or K imputation models is correctly specified (see Appendixes A and B).
As several models need to be defined and estimates need to be prepared to execute the proposed method, the procedure is summarized as follows: 1  Han (2014), the first-order Taylor expansion of the EL wGEE around 0 T , α T * , β T GEE, * , γ T jd, * , β T 0 is a possible approach (Tsiatis 2006), where T , and the parameter β 0 is defined as the true value.We derived two formulas for the asymptotic variance of βEL under the assumption that κ, α, and γ jd are known/unknown parameters.The detailed explanation is presented in Web Appendix A of supplementary materials.
Under the assumption that each parameter is an unknown parameter and all auxiliary variables are observed, we derived the asymptotic variance estimator V emp1 and confirmed that the comparison of efficiency between βEL and βwGEE is inconclusive (see Web Appendix A of supplementary materials).
As the asymptotic variance estimator V emp1 is valid only when any one of the missingness models is correctly specified, we additionally derived the asymptotic variance estimator V emp2 under the assumption that κ, α, andγ jd are known parameters.
Theorem 2. Under the assumption that κ, α, and γ jd are known parameters, where We also suggest a bootstrapping method as alternative to calculate the variance of βEL .

Simulation Studies
We performed Monte Carlo simulations to evaluate the performance of the proposed method and estimated the bias, standard error (SE), mean square error (MSE), and Type-I error rate.We set the sample size n to 75 or 150 in each treatment group with 1000 Monte Carlo simulations.In the simulation study, we primarily investigated the following considerations: (a) whether EL wGEE exhibits multiple robustness compared to the bias of logistic regression and GEE; (b) whether the efficiency of EL wGEE is comparable to that of wGEE when the missingness model is correctly specified; and (c) whether EL wGEE exhibits adequate performance compared to that of AIPW when multiple models are specified.

Simulation Datasets
We generated the full dataset for correlated binary outcomes y ij (j = 1, 2, 3, 4) using Lee's method (1997).The marginal model for y ij was defined as logit P(y ij = 1) = β 0 + β TRT X TRT,i + β 1j + β 2j X TRT,i + β 3 X BL,i .The model had four independent design variables: the treatment X TRT,i (group A: 1, group B: 0), time points, interactions between the treatment and time points, and baseline covariate X BL,i , which was generated from N(0, 2).The parameters β TRT = 1.5, β 14 = β 24 =0, β 3 = 0.2, and the other parameters were obtained from the given marginal response rates over time, and were considered under two scenarios as follows:

Scenario
Response rate at time points 1, 2, 3, 4 Low-response 0.25 0.31 0.38 0.48 for group A 0.14 0.15 0.17 0.17 for group B High-response 0.27 0.43 0.73 0.82 for group A 0.18 0.21 0.50 0.50 for group B The within-subject correlation structure was set as the autoregressive (1) type, and the correlation parameter ρ was set to 0.5, which was set based on the plaque psoriasis study data.In this simulation, our primary objective was to evaluate the estimation efficiency (bias, SE, and MSE) of the treatment effect at j = 4, which corresponds to β TRT = 1.5.In addition, a Type-I error rate was evaluated by additionally generating nullhypothesis datasets for β TRT = 0, β 21 = β 22 = β 23 = β 24 = 0.
For comparison with the proposed estimator (EL wGEE), β TRT was also estimated using a logistic regression model (CLUDE), GEE (Liang and Zeger 1986), wGEE (Robins, Rotnitzky, and Zhao 1995), and the AIPW method (Seaman and Copas 2009).In CLUDE, the outcome model was defined as

Simulation Results
For each of the 48 ways that data were generated (Low/Highresponse rate × six missing data mechanisms × 15%/30% missing data proportion × two pattern of sample size for 75/150 per treatment group), we compared the five methods of analysis.Of these six missing data mechanisms, all analysis methods using the coarsened datasets from (pt5) α = (α 0 , 1, 1, 1) and (pt6) α = (α 0 , 0, 1, 1) had unbiased estimates of β TRT , thus, the results for these two scenarios are not shown.
When estimating the parameters of missingness/imputation models, quasi-complete separation occurred in some simulation datasets.Those datasets were therefore excluded from the analysis.The number of quasi-complete separations is shown in Web Appendix B of supplementary materials.We consider that the impact of excluding these datasets on the performance evaluation of each analysis method is small.We checked the distribution of the remaining datasets (i.e., this excluded the datasets in which quasi-complete separation occurred) and found that the mean values of βTRT in the complete datasets were similar among the 1000 Monte Carlo datasets and the remaining datasets (the difference in the mean values of βTRT ranged from 0.0001 to 0.001).This implies that there are no characteristic differences among the 1000 Monte Carlo datasets and the remaining datasets.In addition, the quasi-complete separations in each scenario (except missing mechanism of pattern 2) were approximately less than 10 out of 1000 datasets; thus, we consider that the performance of each estimator could be evaluated with the remaining datasets.We report the simulation results for the scenario of 30% missing data proportion and n = 150 because the pattern of bias, SE, MSE, and Type-I error rate in the scenario of 15% missing data proportion and n = 75 were similar, but the bias was smaller in the scenario of 15% missing data proportion.These results for the scenario of 15% missing data proportion and n = 75 are summarized in the Web Appendix.As discussed by Emerson and Owen (2009) and Han (2014), the empirical likelihood method may be problematic when the sample size is small, in which case 0 may not be in the convex hull of ĝij α, βGEE , βwGEE , γ jd .In the simulation studies, there was no difficulty with estimating parameter λ.
Figure 2 reports on the bias of βTRT for n = 150 and 30% missing data proportion.The notation of these estimators includes the names of the estimators and the model specification, which have the form "method-0000. " Each digit in the four-digit number, from left to right, indicates whether or not jd is included, where "1" indicates "included" and "0" indicates "not included." In the simulation results, EL wGEE exhibited better performance when a correct model was included among the missingness or imputation models compared to when only misspecified models were included, excluding pt3 of the low-response rate datasets.
wGEE exhibited similar performance to that of EL wGEE under the correctly specified missingness model.
AIPW exhibited similar performance to that of EL wGEE in most of the scenarios, especially when one missingness model and one imputation model are specified and both models are correctly specified (i.e., EL wGEE-1010 and AIPW-1010).The similarity of the estimators is explained in Appendix C.
jd is included, where "1" indicates "included" and "0" indicates "not included." When estimating the parameters of the missingness/imputation models, quasi-complete separation occurred in some of the simulation datasets.Those datasets were excluded from the analysis.The detailed values are shown in Web Appendix B of supplementary materials.
Interestingly, the bias for AIPW was larger than that for EL wGEE when only the imputation model was correctly specified (i.e., AIPW-0110) in the pt1, pt3, and pt4 of the high-response rate datasets.In addition, the bias for AIPW was larger than that for EL wGEE when both missingness and imputation models were misspecified (AIPW-0101).When there is a higher risk of misspecification for the missingness model, EL wGEE may be more robust than AIPW and wGEE.
Although GEE resulted in biased estimates in most of the scenarios, as expected, GEE interestingly exhibited better performance than EL wGEE in pt3 of the low-response rate datasets and pt2 of the high-response rate datasets.The pt2 (α = (α 0 , −1, 1, 1)) describes a situation in which subjects who are treated with an active drug are vulnerable to dropout because of an adverse event and an outcome is assumed to be an indicator of improvement (i.e., if y ij = 0, the subject is at a higher risk of dropout).The pt3 (α = (α 0 , 1, −1, −1)) describes a situation in which subjects who are treated with a placebo is vulnerable to dropout because of lack of efficacy and an outcome is assumed as an indicator of worsening of symptoms (i.e., if y ij = 1, the subject is at a higher risk of dropout).In addition, GEE also exhibited better performance than wGEE when the missingness model was misspecified.Thus, in some scenarios, GEE might be applied in missing longitudinal data analysis.
Web Tables 5-8 of supplementary materials report on the SEs of βTRT 1000 i=1 SE βTRT,i /1000 .The SEs of EL wGEE were calculated based on (9).The SEs of AIPW were also calculated using the sandwich estimator under the assumption that both the missingness and imputation models are correct (Seaman and Copas 2009, p. 944).In addition, the empirical SE, which is the standard deviation in a 1000 replication parameter estimate βTRT , is shown to confirm the validity of these SEs.Although the SEs of EL wGEE, AIPW, and wGEE were similar to or slightly larger than those of CLUDE and GEE, the difference became smaller in the scenario of 15% missing data proportion (Web Tables 6 and 8).We expected that the efficiency (SEs) of the EL wGEE method would be improved owing to use of the longitudinal response variable; however, adding weights to the estimating equations may have a greater influence on the estimation efficiency.Overall, although the results of MSEs were similar to that of SEs, the MSEs of EL wGEE were comparative to those of CLUDE even under 30% missing data proportion.Other results are summarized in Web Tables 9 to 12 of supplementary materials.
The results indicate that the degree of inefficiency of EL wGEE relative to the CLUDE or GEE estimator depends on the proportion of missing data.At least, the impact on efficiency was small in applying the proposed methods in this setting.Figure 4 reports on the Type-I error rate (using nullhypothesis datasets for β TRT = 0, β 21 = β 22 = β 23 = β 24 = 0).In general, the Type-I error rates of EL wGEE approximately ranged from 4% to 6% and were similar to those of other methods in each scenario.

Application: Plaque Psoriasis Phase III Study
In this section, we describe the application of the proposed method to the Plaque Psoriasis Phase III Study.In total, 542 participants were randomized in a 2:1 ratio to the M518101 or placebo group.Efficacy assessments were taken at baseline, Weeks 2, 4, 6, and 8.The primary endpoint was the success rate based on the IGA score at Week 8.In other efficacy outcomes, we evaluated the modified Psoriasis Area and Severity Index (mPASI, in the range 0-no disease to 64.8-maximal disease) and itch score (on a scale from 0-None to 3-Severe).The treatment effect βTRT was estimated using the CLUDE for the complete case, GEE, wGEE, AIPW and the proposed estimator (EL wGEE).Let X TRT denote the treatment (M518101 = 1, placebo = 0) and X BL the baseline mPASI score.We used three auxiliary variables Z = (Z 1 , Z 2 , Z 3 ), where Z 1 , Z 2 , and Z 3 indicate the success rate according to the IGA score, mPASI score, and itch score at a previous visit, respectively.
It is desirable to include all the components of the auxiliary valuables and the interactions in one missingness model and one imputation model; however, this was not feasible in this case, due to the quasi-complete separation for the binary outcome model.Nevertheless, the proposed estimator was able to address this issue via multiple modeling.The missingness model was , where the two missingness models (s = 1, 2) were set as follows: π 1 ij α 1 with auxiliary variables Z 1 , Z 2 , and X TRT × Z 2 , and π 2 ij α 2 with an auxiliary variables Z 3 , and X TRT × Z 3 .The imputation model for y ij was logit P( As with the missingness model, the two imputation models (k = 1, 2) were set as follows: φ 1 ijd (γ 1 jd ) with auxiliary variables Z 1 , and Z 2 , and φ 2 ijd (γ 2 jd ) with an auxiliary variable Z 3 .The GEE analysis was performed using the same settings as in Section 3.
As running example, the proposed method was performed as follows: 1.Estimated the parameter of missingness models for ν 1 ij α 1 and calculated values of π 1 ij α1 in each participant and time point i = 1, . . ., N, j = 1, . . ., n i .The same analysis was performed with ν 2 ij α 2 , and calculated values of π 2 ij α2 .

Performed wGEE analysis with a combination of weight
and outcome Z 1 , where the outcome model was defined as logit imputation was performed using the imputation model φ 2 ijd γ 2 jd .Similar to φ k ijd γ k jd , missing Z 2 were imputed with the predicted values of Z 2,ij using the imputation model for Z 2,ij with auxiliary variables y i,(j−1) and Z 2,i,(j−1) ; missing Z 3 were imputed with the predicted values of Z 3,ij using the imputation model for Z 3,ij with auxiliary variables y i,(j−1) and Z 3,i,(j−1) .7. Performed the GEE analysis, where the outcome model was defined as logit P( 11.Estimated λ using (7), and calculated weight ŵij using (6).12. Performed an EL wGEE analysis with weight ŵij , and obtained βEL .
GEE analysis (Liang and Zeger 1986) and wGEE analysis (Robins, Rotnitzky, and Zhao 1995) correspond to the EL wGEE analysis in step ( 12), where the weight ŵij is replaced by identity The results are summarized in Table 1.In Table 1, the odds ratio of CLUDE was 4.62, while that of EL wGEE was 3.83 to 4.08, and that of AIPW was 3.93 to 4.05, respectively.We speculated that these estimators mitigated the bias of CLUDE.In addition, these findings were more deeply evaluated by the multiple modeling of EL wGEE-1111, which included all components of auxiliary valuables.Therefore, the proposed method was useful for evaluating the stability of the conclusion.

Discussion and Conclusions
The proposed method extends Han's method to the longitudinal outcome model.The simulation results indicate that the EL wGEE method achieves better performance if any one of the specified missingness or imputation models is correct (multiple robustness).
In these simulation studies, we also evaluated the performance of this method when multiple missingness and imputation models were used.Greenland and Pearce (2015, p. 99) state that "Methods that model both outcome and exposure (including doubly robust methods) avoid having to make a choice, but at the cost of more modeling effort.They have the The names of the estimators have the form "method-0000, " with each digit of the four-digit number, from left to right, indicating whether or not jd is included, where "1" indicates "included" and "0" indicates "not included." In the results of the proposed method and AIPW method, only two or three models are shown; EL wGEE-1111 for the multiply robust model and EL wGEE-1010, and EL wGEE-0101 (or AIPW-1010, AIPW-0101) for the doubly robust model.
option of using more data information with potential accuracy gains as a result." Although diagnostic measures have been widely used in model assessment, it is difficult to apply them to incomplete outcome and auxiliary data; consequently, there is significant concern regarding model misspecification in such a situation.In addition, when a dimension of auxiliary variables is high (i.e., there are many auxiliary variables), there is a risk of (quasi-) complete separation for the binary outcome model.In such cases, multiple lower dimensional models can be used to resolve the issues.However, there have been only a few studies on the performance evaluation of the proposed method when misspecified models are included; thus, it might not be preferable to use multiple models without a strong reason.We recommend specifying one missingness model and/or one imputation model for the primary analysis and additional models for the sensitivity analysis.The need for sensitivity analysis regarding missing data is established, and its importance is emphasized in the ICH E9 (R1) (2019).Notably, the multiple modeling method allows for a wide range of sensitivity analyses.
Although the imputation approach in the proposed method was under MAR assumption, the approach can be modified to be under missing not at random (MNAR), such as "jumpto-reference (J2R)" to perform a sensitivity analysis.In J2R (Mallinckrodt, Molenberghs, and Rathmann 2017), missing data for reference group are imputed assuming MAR; missing data for drug-treated group are imputed assuming MNAR such that the benefit from the drug immediately disappears after discontinuation of the study drug.Therefore, parameter γ jd was estimated using data only for the reference group.Note that since this analysis is under MNAR, specifying missingness models π s ij αs under MAR is not reasonable.In some simulation datasets, especially those with lowresponse rate and n = 75, quasi-complete separation occurred.In the scenario of pt2 (α = (α 0 , −1, 1, 1)) and low-response rate, the phenomenon was often observed because most subjects drop out after y ij = 0.In this situation, where it is difficult to estimate the parameters of the missingness/imputation models, we may choose logistic regression or the GEE method-for which there is no need to specify missingness/imputation models.
In Web Appendix C, we additionally compare the performance of EL wGEE to that of Han's method, in which we specify an imputation and missingness model similar to that of the proposed method and use the imputation method of Paik.If explanatory variables y i,(j−1) and z i,(j−1) were missing in the missingness model of Han's method, we imputed them using last observation carried forward (LOCF) method to calculate θ 1 = N i=1 π 1 i4 α1 /N .The EL wGEE method achieved better performance with Bias, as compared to that of Han's method with LOCF in such an eventuality.It appears that the coarsened explanatory variables Z ij (i.e., y i3 and z i3 ) affect the accuracy of βTRT .When a subject i drops out at or before time point 3, the π 1 i4 α1 cannot be estimated, because the explanatory variable Z i3 (i.e., y i3 and z i3 ) is missing.In addition, the observation probability π 1 i4 α1 is estimated for all of the participants to calculate θ 1 = N i=1 π 1 i4 α1 /N .To cope with this problem, π 1 i4 α1 was estimated substituting y i2 or y i1 in place of y i3 in Han's method with LOCF in our simulation studies (the same applies to z i3 ), but that might introduce bias.In contrast, the proposed method can impute the missing Z i3 (i.e., y i3 and z i3 ) with the predicted values derived from wGEE.The proposed method can therefore be used for longitudinal data analysis.
In the psoriasis clinical trial data analysis, although it was not feasible to include all the components of the auxiliary valuables as well as the interactions caused by the quasi-complete separation of the binary outcome model, such as in wGEE and AIPW.The proposed estimator was able to address this issue via multiple modeling and enable the further evaluation of the validity of the conclusions derived.
Although we assumed that the outcome is binary data and that Z ij includes only y ij to simplify the explanation presented in Section 2, the proposed method can easily be extended to continuous and count data outcomes and used to handle multiple auxiliary variables.In terms of multiple robustness, the proposed method is expected to be more robust and useful than existing methods when the dimension of the auxiliary variables is high, and/or reasonable multiple candidate models are available.Future work should further investigate the relative performance of the other methods under a wider range of settings.Equation (C.1) is similar to (4) of the AIPW described in Han, Song, and Wang (2015).
Calibration weights are defined by matching the values of the calibration variables based on the sampled participants to the corresponding known population values.In this setting, probability limits of αT = α1 T , . . .,

Figure 2 .
Figure 2. Bias of βTRT in each estimator obtained using 1000 Monte Carlo datasets (n = 150 in each treatment group; proportion of missing data at j = 4 is 30%).CLUDE: logistic regression; wGEE: weighted generalized estimating equations; AIPW: augmented inverse probability weighted complete-case estimating equations (Seaman and Copas 2009); EL wGEE: proposed estimator.The results were multiplied by 100 to reduce the number of decimal places.The notation of these estimators in the table includes the names of the estimators and the model specification, which have the form "method-0000. " Each digit in the four-digit number, from left to right, indicates whether or not π1 ij α 1 , π 2 ij α 2 , φ 1 ijd γ 1 jd , or φ 2 ijd γ 2jd is included, where "1" indicates "included" and "0" indicates "not included." When estimating the parameters of the missingness/imputation models, quasi-complete separation occurred in some of the simulation datasets.Those datasets were excluded from the analysis.The detailed values are shown in Web Appendix B of supplementary materials.

Figure 3
reports on the MSEs of βTRT for n = 150 and

Figure 3 .
Figure 3. Mean square errors (MSE) of βTRT in each estimator obtained using 1000 Monte Carlo datasets (n = 150 in each treatment group; proportion of missing data at j = 4 is 30%).CLUDE: logistic regression; wGEE: weighted generalized estimating equations; AIPW: augmented inverse probability weighted complete-case estimating equations (Seaman and Copas 2009); EL wGEE: proposed estimator.The results were multiplied by 100 to reduce the number of decimal places.The detailed values are shown in Web Appendix B of supplementary materials.

Figure 4 .
Figure 4. Type-I error rates in each estimator obtained using 1000 Monte Carlo datasets (n = 150 in each treatment group; proportion of missing data at j = 4 is 30%).Full dataset: logistic regression with full dataset; CLUDE: logistic regression; wGEE: weighted generalized estimating equations; AIPW: augmented inverse probability weighted complete-case estimating equations (Seaman and Copas 2009); EL wGEE: proposed estimator.The results were multiplied by 100 to reduce the number of decimal places.The detailed values are shown in Web Appendix B of supplementary materials.
Calculated predicted values of Z 1 using the wGEE model.4. The procedures described in (2) and (3) performed again with a combination of R ij /π 1 ij α1 and Z 2 as well as R ij /π 2 ij α2 and Z 3 to calculate predicted values of Z 2 and Z 3 .5. After missing Z 1 , Z 2 , and Z 3 were imputed with the predicted values of (3) and (4), π s ij αs , βs wGEE , s = 1, 2 for all the participants at all the time points i = 1, . . ., N, j = 1, . . ., T were calculated.These estimates π s ij αs , βs wGEE , s = 1, 2 were used in (10).6. Performed Paik's imputation method using the imputation model φ 1 ijd γ 1 jd and calculated predicted values φ 1 ijd γ 1 jd of y ij in each participant and time point.The same Paik's y ij ⊥ R ij X ij , Z i,(j−1) and θ 1 α1 ,

Table 1 .
Parameter estimates βTRT in the plaque psoriasis study.