A Semiparametric Approach for Analyzing Nonignorable Missing Data

In missing data analysis, there is often a need to assess the sensitivity of key inferences to departures from untestable assumptions regarding the missing data process. Such sensitivity analysis often requires specifying a missing data model, which commonly assumes parametric functional forms for the predictors of missingness. In this paper, we relax the parametric assumption and investigate the use of a generalized additive missing data model. We also consider the possibility of a non-linear relationship between missingness and the potentially missing outcome, whereas the existing literature commonly assumes a more restricted linear relationship. To avoid the computational complexity, we adopt an index approach for local sensitivity. We derive explicit formulas for the resulting semiparametric sensitivity index. The computation of the index is simple and completely avoids the need to repeatedly fit the semiparametric nonignorable model. Only estimates from the standard software analysis are required, with a moderate amount of additional computation. Thus, the semiparametric index provides a fast and robust method to adjust the standard estimates for nonignorable missingness. An extensive simulation study is conducted to evaluate the effects of misspecifying the missing data model and to compare the performance of the proposed approach with the commonly used parametric approaches. The simulation study shows that the proposed method helps reduce bias that might arise from misspecification of the functional forms of predictors in the missing data model. We illustrate the method using a wage offer dataset.


Introduction
Missing data arise frequently in studies across different disciplines, including public health, medicine, economics, and social sciences. Missingness can be due to nonresponse in household surveys, attrition in longitudinal studies, or patient noncompliance in experimental studies and clinical trials. In missing data analysis, ignorability has been a standard assumption regarding the missing data mechanism (Rubin 1976). Under ignorability, a valid likelihood/Bayesian inference can ignore modeling the missing data mechanism. In the likelihood/Bayesian framework, ignorability holds under the assumptions of missing at random (MAR) and parameter distinctness (Rubin 1976, Heitjan and Rubin 1991).
Although ignorability is a convenient and useful assumption, it is usually an approximation to reality when missingness is not by design. There are important situations where this assumption can be questionable. In the analysis of potentially nonignorable missing data, the selection model is a popular class of models, where one augments the model for the complete data with a missing data model. In practice, a parametric binary regression model has been commonly employed for modeling the missing data process. In situations where there are continuous missingness predictors, misspecifying the functional forms of these predictors in the parametric model can lead to severe bias in the inference of the primary parameters of interest. In this paper, we propose a data-driven procedure to adaptively choose the functional forms of the continuous predictors. Specifically, we propose using the generalized additive model (GAM) to relax the linearity assumptions in the missing data model.
One can perform a direct estimation of such a semiparametric joint selection model (Chen and Ibrahim 2006). Direct estimation can yield valid inferences when the model is correctly specified, although its computation is heavy and requires specialized programming. More importantly, such a joint selection model is often weakly or non-identified (Little 1995, Troxel 1998, Chen and Ibrahim 2006), and the results can be highly sensitive to untestable model assumptions (Kenward 1998). To tackle this problem, the use of a nonignorable selection model has been proposed as a tool for assessing sensitivity of inference to nonignorable missingness (Little 1995, Vach and Blettner 1995, Copas and Li 1997, Scharfstein et al. 1999, Copas and Eguchi 2001). A global sensitivity analysis usually involves repeatedly fitting the nonignorable model for a range of magnitudes of nonignorability, which can be computationally burdensome. To avoid this computational burden, we utilize an index approach to local sensitivity (Troxel et al. 2004) that uses a Taylor series expansion to approximate the estimates in a neighborhood of the MAR model. The index method has been applied in various settings (Heitjan 2004, 2009; Ma et al. 2005; Xie 2008, 2009; Qian and Xie 2010; Zhang and Heitjan 2006, 2007).
The local sensitivity method proposed in the literature utilizes the linear logistic regression for modeling the missing data process. In this article we relax the linearity assumption and investigate the usage of a generalized additive model for a more robust and flexible modeling of the missing data mechanism. Furthermore, the proposed method is computationally less complex than the alternative global sensitivity method. Specifically, our approach avoids fitting any complicated semiparametric joint selection model. Only estimates from a MAR analysis of the outcome model and a MAR GAM for the missing data process are required to evaluate sensitivity. Both estimates can be obtained using standard software packages such as SAS or S-Plus/R. For instance, the MAR GAM for the missing data process can be fitted using PROC GAM in SAS or the S-Plus/R function gam. In summary, the proposed approach renders sensitivity analysis for nonignorable missingness (1) simple to perform by avoiding excessive additional computation and (2) robust to model misspecification by automatically adjusting for potentially complex missing data mechanisms.
The rest of the paper is organized as follows. In Section 2 we describe the semiparametric joint selection model. In Section 3 we review the ISNI (index of local sensitivity to nonignorability) methodology. In Section 4, we investigate the use of GAM to model the missing data process. We also consider the possibility of a nonlinear relationship between missingness and the potentially missing outcome, and present specific formulas for the sensitivity index in the Appendix for the case where the relationship follows a quadratic form. In Section 5, we conduct simulation studies to compare the performance of the parametric and semiparametric approaches for modeling the missing data process with respect to their ability to reduce bias of the MAR estimates. In Section 6, we apply the methodology to an application estimating a wage offer function. We conclude with a discussion in Section 7.

Selection Model for Nonignorable Missingness
Consider that we have data (Y_i, X_i, Z_i, G_i) from unit i, i = 1, ..., n. The underlying ideal outcome, Y_i, arises independently from a distribution with probability density function f_θ(Y_i | X_i), where X_i contains a set of fully observed covariates. We are interested in drawing inferences on θ or a subset of it. For the purpose of this presentation, we restrict our attention to the case where Y_i is univariate. For various reasons, Y_i is subject to missingness. Let G_i be 1 if y_i is observed and 0 if y_i is missing. We assume the following missing data model:

P(G_i = 1 | y_i, z_i) = h( η_{γ0}(Z_i) + η_{γ1}(y_i) ),   (1)

where h is the inverse of a monotonic link function; Z_i is a set of fully observed predictors for missingness; and η_{γ0}(·) and η_{γ1}(·) are smooth functions whose forms are left unspecified for now. Let γ = (γ_0, γ_1), where γ_0 is a vector of parameters that associates the probability of missingness with observed data, and γ_1 associates the probability of missingness with potentially unobserved data. In this model, η_{γ1}(y) represents the form of nonignorable missingness: when η_{γ1}(y) is constant in y, the missing data mechanism is MAR; when η_{γ1}(y) depends on y, it becomes missing not at random (MNAR).
Let (Y, X, Z, G) be the stacked data over all the units. We rewrite Y as (Y_obs, Y_mis), where Y_obs refers to the observed components of Y and Y_mis refers to the missing components of Y. The covariates X and Z are considered as fixed, and the conditioning on them in f_θ(y_i | x_i) and f_γ(g_i | y_i, z_i) will be suppressed for notational simplicity. The data to be modeled are thus (Y_obs, G). The correct log-likelihood of the model parameters is

L_C(θ, γ; y_obs, g) ∝ ln f_{θ,γ}(y_obs, g) = ln ∫ f_γ(g | y_obs, y_mis) f_θ(y_obs, y_mis) dy_mis.   (2)

Under the MAR condition, f_γ(g | y_obs, y_mis) = f_γ(g | y_obs) and can be moved out of the integral. With the parameter distinctness, this results in a simpler log-likelihood for θ:

L_I(θ; y_obs) ∝ ln f_θ(y_obs) = ln ∫ f_θ(y_obs, y_mis) dy_mis.
In practice, the simpler log-likelihood L_I is often used because it avoids modeling the missing-data mechanism. However, in the general case of nonignorable missingness, L_I is not proportional to L_C, and thus inference based on L_I is potentially biased.
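To see numerically why L_I suffices under MAR but not under MNAR, the following minimal illustration (in Python; the normal outcome, logistic link, and all parameter values are hypothetical choices for the sketch, not taken from the paper) evaluates a missing unit's likelihood contribution ∫ P(G = 0 | y) f_θ(y) dy by quadrature. When γ_1 = 0 the contribution does not involve the outcome mean μ, so it can be dropped from the likelihood for θ; when γ_1 ≠ 0 it does.

```python
import math

def expit(t):
    return 1.0 / (1.0 + math.exp(-t))

def norm_pdf(y, mu, sigma):
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def lik_missing(mu, eta0, gamma1, sigma=1.0, lo=-10.0, hi=10.0, steps=4000):
    """Trapezoid-rule evaluation of the missing unit's likelihood contribution
    integral of P(G=0 | y) * f_theta(y) dy, with logit P(G=1|y) = eta0 + gamma1*y."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        y = lo + k * h
        w = 0.5 if k in (0, steps) else 1.0
        total += w * (1.0 - expit(eta0 + gamma1 * y)) * norm_pdf(y, mu, sigma)
    return total * h

# Under MAR (gamma1 = 0) the contribution is free of mu ...
mar_a = lik_missing(mu=0.0, eta0=0.3, gamma1=0.0)
mar_b = lik_missing(mu=2.0, eta0=0.3, gamma1=0.0)
# ... but under MNAR (gamma1 != 0) it depends on mu.
mnar_a = lik_missing(mu=0.0, eta0=0.3, gamma1=1.0)
mnar_b = lik_missing(mu=2.0, eta0=0.3, gamma1=1.0)
print(mar_a, mar_b, mnar_a, mnar_b)
```

Under MAR the contribution equals 1 − h(η_0) regardless of μ, confirming that the missing-data factor separates from the θ-likelihood; under MNAR it varies with μ and cannot be ignored.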

ISNI Methodology
As indicated above, for a dataset with missingness, the correct log-likelihood is L_C(θ, γ; y_obs, g). For a fixed value of γ_1, let θ̂(γ_1) denote the MLE of θ obtained by maximizing L_C with γ_1 held at that value.

One can then vary γ_1 over a plausible range and investigate how the other parameter estimates in the model are affected. In the existing literature, such sensitivity analysis commonly assumes a linear binary regression model for the missing data process, that is, the restriction that η_{γ0}(z_i) = γ_0^T z_i and η_{γ1}(y_i) = γ_1 y_i. In this selection model, the likelihood L_C(θ; γ_1) is proportional to the likelihood L_I(θ) for all θ ∈ Ω_θ when γ_1 = 0, so θ̂(0) is the MLE of θ in the ignorable model. The difference between θ̂(0) and θ̂(γ_1) is a sensible measure of the sensitivity of the MLE when γ_1 is perturbed around the ignorable model. The idea of a local sensitivity analysis is to approximate θ̂(γ_1) by a Taylor series expansion:

θ̂(γ_1) ≈ θ̂(0) + γ_1 · ∂θ̂(γ_1)/∂γ_1 |_{γ_1 = 0},   (3)

where ∂θ̂(γ_1)/∂γ_1 |_{γ_1 = 0} measures the rate of change of θ̂ as a function of γ_1; this quantity is referred to as the index of local sensitivity to nonignorability (ISNI) (Troxel et al. 2004). In this approximation, θ̂(0) is obtained by maximizing the simpler log-likelihood L_I. As shown in Troxel et al. (2004), a simple formula for the index is

ISNI = ∂θ̂(γ_1)/∂γ_1 |_{γ_1 = 0} = −(∇²L_{θ,θ})^{−1} ∇²L_{θ,γ_1},   (4)

evaluated at (θ̂(0), 0). The first term, ∇²L_{θ,θ}, is the observed Hessian matrix of the ignorable model, which is usually readily available. The second term evaluates the orthogonality of θ and γ_1. One limitation of ISNI, as a local sensitivity method, is that the above local approximation might not be sufficiently accurate under extreme nonignorability (i.e., when |γ_1| is large). Thus, ISNI is most useful for moderate nonignorability (e.g., when a rich set of observed predictors for missingness has been conditioned on, so that the remaining nonignorability is not extreme).
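To make the ISNI computation concrete, the following sketch (in Python, on simulated data; the sample size, parameter values, and seed are illustrative choices, not from the paper) carries out the whole pipeline for a normal linear outcome with a linear logistic missing data model. For this model the general formula reduces to ISNI = τ̂ (X_obs^T X_obs)^{−1} Σ_{i: g_i = 0} ĥ_i x_i, a form consistent with the simulation-study formula given later; everything needed comes from two standard MAR fits.

```python
import math, random

random.seed(1)

def expit(t):
    return 1.0 / (1.0 + math.exp(-t))

# --- simulate a complete-data model and an MAR missingness mechanism ---
n, rho = 500, 0.5
x = [random.gauss(0, 1) for _ in range(n)]
y = [rho * xi + math.sqrt(1 - rho ** 2) * random.gauss(0, 1) for xi in x]
g = [1 if random.random() < expit(0.5 + x[i]) else 0 for i in range(n)]

obs = [i for i in range(n) if g[i] == 1]
mis = [i for i in range(n) if g[i] == 0]

# --- MAR analysis of the outcome: OLS of y on x over observed cases ---
m = len(obs)
sx = sum(x[i] for i in obs); sy = sum(y[i] for i in obs)
sxx = sum(x[i] ** 2 for i in obs); sxy = sum(x[i] * y[i] for i in obs)
b1 = (m * sxy - sx * sy) / (m * sxx - sx ** 2)
b0 = (sy - b1 * sx) / m
tau = sum((y[i] - b0 - b1 * x[i]) ** 2 for i in obs) / m  # MAR residual variance

# --- MAR missing-data model: logistic regression of G on x (Newton/IRLS) ---
c0 = c1 = 0.0
for _ in range(25):
    p = [expit(c0 + c1 * xi) for xi in x]
    u0 = sum(gi - pi for gi, pi in zip(g, p))
    u1 = sum((gi - pi) * xi for gi, pi, xi in zip(g, p, x))
    w = [pi * (1 - pi) for pi in p]
    a = sum(w)
    bq = sum(wi * xi for wi, xi in zip(w, x))
    d = sum(wi * xi * xi for wi, xi in zip(w, x))
    det = a * d - bq * bq
    c0 += (d * u0 - bq * u1) / det
    c1 += (-bq * u0 + a * u1) / det

# --- ISNI for (b0, b1): tau * (X_obs' X_obs)^{-1} * sum_mis h_i x_i ---
det_xx = m * sxx - sx * sx
s0 = sum(expit(c0 + c1 * x[i]) for i in mis)           # sum of h_i
s1 = sum(expit(c0 + c1 * x[i]) * x[i] for i in mis)    # sum of h_i * x_i
isni0 = tau * (sxx * s0 - sx * s1) / det_xx
isni1 = tau * (m * s1 - sx * s0) / det_xx

# first-order adjusted slope for a hypothesized nonignorability gamma1
gamma1 = 0.5
b1_adj = b1 + gamma1 * isni1
print(round(b1, 3), round(isni1, 3), round(b1_adj, 3))
```

Note that no nonignorable model is ever fitted: the index is assembled from the ignorable OLS fit and the MAR logistic fit, which is precisely the computational saving the method offers.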

Extending ISNI using a Generalized Additive Model
When some components of Z are continuous, a linear predictor, as considered above, may not be adequate, and the misspecification of the functional forms for Z may lead to severe bias in the estimation of θ, the parameters of primary interest. It is thus desirable to use extended models that describe a wider range of selection mechanisms. In this section, we investigate the use of GAM for a robust and flexible modeling of the missingness probability.
That is, rather than prespecifying η_{γ0}(Z_i) as a linear form, we let η_{γ0}(Z_i) follow a GAM (Hastie and Tibshirani 1990):

η_{γ0}(Z_i) = γ_00 + Σ_{j=1}^m η_{0j}(Z_ij),   (5)

where Z_i is composed of m missingness predictors, (Z_1, ..., Z_m), and η_{0j} is an arbitrary smooth, mean-zero function of the jth covariate Z_j, j = 1, ..., m. Because the nonparametric component is additive, the model is termed a generalized additive model.
Another important modeling decision is the specification of η_{γ1}(y_i). Note that y_i is missing whenever G_i = 0, and thus any attempt to estimate η_{γ1}(y_i) would inevitably require imposing untestable assumptions or using external data. Because of the lack of information in the data at hand for identification, a feasible approach is to perform sensitivity analysis with respect to η_{γ1}(y_i). In the existing literature, it is common to assume a linear form for η_{γ1}(y_i), i.e., η_{γ1}(y_i) = γ_1 y_i. One benefit of this parametrization is the ease of interpreting the sensitivity analysis result. Although the linearity assumption is reasonable in many practical applications, it is by no means universally applicable. Thus, we will base our development on a more general polynomial form, η_{γ1}(y_i) = Σ_{q=1}^Q γ_{1q} y_i^q, where Q is a user-specified order. For sensitivity analysis, one can consider the following penalized log-likelihood of the resulting semiparametric nonignorable selection model:

PELL(θ, γ) = L_C(θ, γ; y_obs, g) − Σ_{j=1}^m λ_j ∫ {η_{0j}''(t)}² dt,

where λ_j ≥ 0 is a smoothing parameter whose value can be adjusted to avoid overfitting.
The optimization can be heavy (e.g. taking a long time to converge) and it also requires specialized programming. Moreover, this optimization needs to be repeatedly performed for a range of γ 1 values, which further compounds the computational burden.
In contrast, the ISNI method substantially reduces the computational workload. Because the parameters γ_0 = (γ_00, η_01, ..., η_0m) in the missing data model are orthogonal to the parameter θ in the complete data model when γ_1 = 0 (i.e., missingness is MAR), one can show that for the vector γ_1 = (γ_11, ..., γ_1Q),

ISNI = ∂θ̂(γ_1)/∂γ_1^T |_{γ_1 = 0} = −(∇²L_{θ,θ})^{−1} ∇²L_{θ,γ_1},   (6)

where the qth column of ∇²L_{θ,γ_1} is Σ_{i: g_i = 0} ĥ_i ∂E_θ(Y_i^q)/∂θ evaluated at θ = θ̂(0). Here ĥ_i = h(γ̂_00 + Σ_{j=1}^m η̂_{0j}(Z_ij)) is the predicted probability of being observed under the MAR model, and h(·) is the inverse of the logit link. The above calculation requires only the MAR estimates: θ̂(0) is obtained by maximizing the ignorable log-likelihood L_I(θ; y_obs). In particular, the calculation of the extended ISNI requires the estimation of η_{γ0}(Z) under the MAR model, i.e., with γ_1 = 0. The fit maximizes the penalized log-likelihood

Σ_{i=1}^n { g_i ln h(η_{γ0}(Z_i)) + (1 − g_i) ln[1 − h(η_{γ0}(Z_i))] } − Σ_{j=1}^m λ_j ∫ {η_{0j}''(t)}² dt.

The conventional algorithm for the estimation of a GAM is the local scoring procedure (Hastie and Tibshirani 1990), which maximizes the above PELL. Equation (6) shows that a missing observation is given more weight in the calculation of the ISNI if its predicted probability of nonmissingness, ĥ_i, is large (i.e., the observation is unexpectedly missing). The quantity ĥ_i is related to the missing data mechanism and plays an important role in assessing the sensitivity.
Using a generalized additive model to describe the missing-data mechanism is useful here because we need accurate and robust estimates of these probabilities of being observed.
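To illustrate how a flexible estimate of ĥ_i enters the index, the sketch below (Python, simulated data; all settings are illustrative) generates missingness from a mechanism that is quadratic in x, and plugs a Nadaraya–Watson smoother of G on x into the ISNI formula. The smoother is only a dependency-free stand-in for a fitted GAM; in practice one would obtain ĥ_i from PROC GAM or the gam function as described above.

```python
import math, random

random.seed(2)

def expit(t):
    return 1.0 / (1.0 + math.exp(-t))

n = 600
x = [random.gauss(0, 1) for _ in range(n)]
y = [0.5 * x[i] + math.sqrt(0.75) * random.gauss(0, 1) for i in range(n)]
# MAR mechanism that is quadratic in x: a linear-logistic fit would misspecify it
g = [1 if random.random() < expit(1.0 - x[i] ** 2) else 0 for i in range(n)]
obs = [i for i in range(n) if g[i]]
mis = [i for i in range(n) if not g[i]]

def h_hat(x0, bw=0.3):
    """Nadaraya-Watson estimate of P(G=1 | x = x0): a crude stand-in for a GAM fit."""
    num = den = 0.0
    for xi, gi in zip(x, g):
        w = math.exp(-0.5 * ((xi - x0) / bw) ** 2)
        num += w * gi
        den += w
    return num / den

# MAR least-squares fit of y on x over observed cases
m = len(obs)
sx = sum(x[i] for i in obs); sy = sum(y[i] for i in obs)
sxx = sum(x[i] ** 2 for i in obs); sxy = sum(x[i] * y[i] for i in obs)
b1 = (m * sxy - sx * sy) / (m * sxx - sx ** 2)
b0 = (sy - b1 * sx) / m
tau = sum((y[i] - b0 - b1 * x[i]) ** 2 for i in obs) / m

# ISNI for the slope, weighting each missing unit by its flexibly estimated h_i
det_xx = m * sxx - sx * sx
s0 = sum(h_hat(x[i]) for i in mis)
s1 = sum(h_hat(x[i]) * x[i] for i in mis)
isni1 = tau * (m * s1 - sx * s0) / det_xx
print(round(b1, 3), round(isni1, 3))
```

The only change relative to a parametric analysis is how ĥ_i is produced; the index formula itself is untouched, which is why upgrading the missing data model costs little extra computation.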
For ease of interpretation, one can reparameterize η_{γ1}(y) = γ_11 Σ_{q=1}^Q r_q y^q, where r_q = γ_1q/γ_11, and then define

ISNI_r = ∂θ̂(γ_11, r)/∂γ_11 |_{γ_11 = 0},   (7)

where r = (r_1, ..., r_Q)^T. Given a user-specified r, one can approximate the potential change of θ̂(γ_1) when γ_11 is perturbed from 0 to a given value as

θ̂(γ_11, r) ≈ θ̂(0) + γ_11 · ISNI_r.

Using the extended ISNI method, it is convenient for a data analyst to entertain plausible choices of Q and r to explore the sensitivity with respect to the functional form of η_{γ1}(y).
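Because the index is linear in the cross-derivative columns, ISNI_r is simply the r-weighted combination of the per-order sensitivities, ISNI_r = Σ_q r_q ∂θ̂/∂γ_1q |_0, by the chain rule. The following small sketch (Python; the numeric sensitivity columns are purely hypothetical) shows the bookkeeping for combining the columns and forming the first-order adjusted estimate.

```python
def isni_r(isni_cols, r):
    """Combine per-order sensitivity columns into ISNI_r = sum_q r_q * ISNI_q.
    isni_cols[q] holds the sensitivity of theta-hat to gamma_{1,q+1}; r[0] is fixed at 1."""
    return [sum(rq * col[k] for rq, col in zip(r, isni_cols))
            for k in range(len(isni_cols[0]))]

def adjusted(theta_mar, isni_r_vec, gamma11):
    """First-order approximation: theta-hat(gamma11, r) ~ theta-hat(0) + gamma11 * ISNI_r."""
    return [t + gamma11 * s for t, s in zip(theta_mar, isni_r_vec)]

# hypothetical sensitivity columns for (beta0, beta1), polynomial orders q = 1, 2
cols = [[0.10, 0.25], [0.02, 0.05]]
s = isni_r(cols, r=[1.0, 0.3])          # r = (1, r2) with r2 = 0.3
est = adjusted([0.0, 0.48], s, gamma11=0.5)
print(s, est)
```

Varying r (and γ_11) in this combination step is what lets the analyst explore different shapes of η_{γ1}(y) without refitting anything.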
In the Appendix, we derive explicit ISNI_r formulas for Q = 2 (i.e., a quadratic function) when the outcome is modeled by a generalized linear model.

A Comparison Using Simulated Data
In this section, we conduct simulation studies to compare the performance of the parametric and semiparametric approaches for modeling the missing data process. Specifically, we simulate data from both linear and nonlinear missing data models, and then investigate if the ISNI based on a GAM missing data model provides a more faithful and robust adjustment of an MAR estimate than those based on various linear logistic missing data models. We follow the steps below to perform the simulation studies.
Step 1: Generate the hypothetical complete data, (Y_i, X_i), independently from the bivariate normal distribution

(Y_i, X_i)^T ~ N_2( (0, 0)^T, Σ ),  with unit variances and correlation ρ,   (8)

where i = 1, ..., n and the sample size n = 500. The parameter ρ takes the value −0.5, 0, or 0.5. We are interested in the conditional distribution

Y_i | X_i = x_i ~ N(β_0 + β_1 x_i, σ²),

where β_1 = ρ is the parameter of interest (for this standard bivariate normal, β_0 = 0 and σ² = 1 − ρ²).
Step 2: Generate the missingness pattern. Y_i is subject to missingness, with the probability of nonmissingness given by the missing data model

logit( P(G_i = 1 | x_i, y_i) ) = η_{γ0}(x_i) + γ_1 y_i.   (9)

According to the exact form of η_{γ0}(x), we consider several configurations (Cases 1 and 2), in which η_{γ0}(x) takes a linear, quadratic, cubic, or sine form.
Step 3: With each generated dataset, we compute the MAR estimate, β̂_1(0), by least-squares fitting of the regression model using only the cases with observed y_i. Based on Equation (4), we calculate three ISNI-adjusted estimates:

β̂_1L(γ_1) = β̂_1(0) + γ_1 · ISNI_L,  β̂_1P(γ_1) = β̂_1(0) + γ_1 · ISNI_P,  β̂_1G(γ_1) = β̂_1(0) + γ_1 · ISNI_G.   (10)

The three ISNIs, ISNI_L, ISNI_P, and ISNI_G, are listed in order of increasing generality in modeling the missing data process. The most constrained, ISNI_L, assumes a priori that η_{γ0}(x_i) in Equation (9) is linear in x_i, i.e., η_{γ0}(x_i) = γ_00 + γ_01 x_i. A more general one, ISNI_P, increases modeling flexibility by manually adding higher-order polynomial terms in x_i (quadratic term, cubic term, and so on). This process stops when adding the next higher-order term of x_i to the missing data model does not significantly improve the model fit at the 0.05 level, where the improvement in fit is measured by the change in model deviance. This strategy represents a common parametric approach to seeking more acceptable models for the missing data mechanism. The most general one, ISNI_G, uses the GAM method to estimate the missing data model. It uses a nonparametric scatterplot smoother, such as a smoothing spline, for the estimation of η_{γ0}(x_i), and lets the data determine the functional form of x_i. Compared with ISNI_P, ISNI_G enjoys two advantages. First, GAM is more general, as it applies to arbitrary smooth functions, whereas the parametric additive model applies to an a priori specified parametric family (e.g., a polynomial family). Second, the procedure is automated, which avoids manually increasing the model complexity. For our simulation model, the formula for ISNI is derived to be

ISNI = τ̂(0) ( Σ_{i: g_i = 1} x_i x_i^T )^{−1} Σ_{i: g_i = 0} ĥ_i x_i,

where τ̂(0) is the MAR estimate of the residual variance; β̂_0(0) and β̂_1(0) are the MAR estimates of β_0 and β_1, respectively; x_i = [1, x_i]^T is the vector of predictors for unit i; and ĥ_i is the predicted probability of G_i = 1 under the MAR model.
For ISNI_P, ĥ_i = h(γ̂_00 + Σ_{j=1}^J γ̂_0j x_i^j), where J is the selected order of the polynomial function of x_i. For ISNI_G, ĥ_i = h(γ̂_00(0) + η̂_01(x_i)), where η̂_01(x_i) is adaptively estimated by a smoothing spline under the MAR assumption.
The gam function in S-Plus with a default degree of freedom of 4 is used for smoothing.
In practical applications, one can calculate the ISNI-adjusted estimates in Equation (10) for a plausible range of γ 1 values, and investigate the sensitivity of the MAR estimates to nonignorable missingness. In the simulation studies, we plug in the true value of γ 1 . The performance of these ISNI-adjusted estimates can then be evaluated in terms of their ability to reduce bias of the MAR estimates, for various scenarios of missing data mechanisms.
Step 4: Repeat Steps 1 to 3 300 times for the same values of ρ and γ_1. Using the resulting sample of estimates, we compute the mean squared error (MSE), bias, and standard deviation (SD) for each of the four estimators of β_1: β̂_1(0), β̂_1L(γ_1), β̂_1P(γ_1), and β̂_1G(γ_1). We then repeat Step 4 for other configurations of ρ and γ_1. When the missing data model is correctly specified, the adjusted estimates effectively remove the bias of the MAR estimate, indicating that ISNI is an accurate sensitivity index under a correctly specified missing data mechanism.
Though all three adjusted estimates remove the bias of the MAR estimator when η_{γ0}(x) is linear in x, their effectiveness can differ markedly for other forms of η_{γ0}(x). We examine the simulation results in three key aspects. (1) If η_{γ0}(x) is actually quadratic or cubic, β̂_1L(γ_1) has a significant amount of bias under Case 2. This can be seen from the V-shaped bias function of β̂_1L(γ_1) for the quadratic and cubic forms in Figure 1(b). In comparison, both β̂_1P(γ_1) and β̂_1G(γ_1) perform much better in removing the bias, as shown by their flat, close-to-zero bias functions in these figures. This shows that misspecification of the missing data model can lead to large bias in the adjusted estimates of the complete data model parameters, and that it can be important to choose a proper missing data model. (2) Interestingly, in Case 1, the bias of β̂_1L(γ_1) is almost the same as those of β̂_1P(γ_1) and β̂_1G(γ_1), even for the quadratic and cubic forms of η_{γ0}(x). This shows that β̂_1L(γ_1) has a certain degree of robustness with respect to misspecification of the missing data model. (3) When η_{γ0}(x) follows a sine form, both β̂_1L(γ_1) and β̂_1P(γ_1) have sizable biases, and β̂_1G(γ_1) performs best. This shows that ISNI_G is the most general, and that it can be important to use a data-driven approach, such as a GAM, to model the continuous predictors in the missing data model.
These findings from the simulation studies suggest that the adjusted estimator based on ISNI L , which assumes a linear logistic regression, has a certain degree of robustness to misspecification of missing data mechanism. There can be, however, situations where ISNI L is seriously affected by the misspecification of functional forms in the missing data model.
In this case, both ISNI_P and ISNI_G are useful to protect one from a misleading assessment of the potential change in the estimates. In particular, ISNI_G performs better due to its modeling generality and its more robust, automated discovery of a complex missing data process. Moreover, because standard software is available for fitting a GAM, the success of ISNI_G in reducing the bias of the MAR estimates depends less on the analyst's experience in detecting model misspecification than does that of ISNI_P.
An Application

Mroz (1987) used a wage offer dataset to demonstrate the sensitivity of empirical econometric analysis to various economic and statistical assumptions. Many of these assumptions, though useful, are often untestable, and thus it is insufficient to base conclusions solely on a single analysis. A more prudent approach is to compare the analysis with those obtained under alternative assumptions. If the conclusions are reasonably robust, one can have more confidence in them. To demonstrate our method, we focus mainly on the potential misspecification of functional forms in the missing data mechanism.
The interest of the empirical application is in estimating the wage offer outcome as a function of education level and experience, after controlling for other observed characteristics of a woman. That is, one is interested in estimating the linear regression model

lwage_i = β_0 + β_1 educ_i + β_2 exper_i + β_3 expersq_i + β_4 age_i + β_5 nwifeinc_i + β_6 kidslt6_i + β_7 kidsge6_i + ε_i,

where i = 1, ..., 753; lwage is the logarithm of the wife's wage; educ is the wife's years of schooling; exper is the wife's labor market experience and expersq is its square; age is the wife's age; nwifeinc is the non-wife family income; kidslt6 is the number of children less than 6 years old; and kidsge6 is the number of children between 6 and 18, inclusive.
Not every married woman in the sample had her wage outcome observed. Among the 753 married women in the sample, 325 did not participate in the labor force, and as a result their wage outcomes, had they been employed, were missing. It is possible that the participation of a married woman in the labor force depends on her potential wage outcome, even after conditioning on the other observed variables. To account for this potential nonignorability, we assume a semiparametric model for self-selection into employment in which logit(P(G_i = 1)) follows a GAM in the observed predictors (educ, exper, age, nwifeinc, kidslt6, kidsge6), plus the term lwage_i γ_1, where G_i is the indicator variable for participation in the labor force. As a comparison, we also consider the linear logistic labor participation model

logit(P(G_i = 1)) = (intercept, educ, exper, age, nwifeinc, kidslt6, kidsge6)_i^T γ_0 + lwage_i γ_1.

To summarize sensitivity, we compute for each regression coefficient the statistic c = |S.E./(ISNI/σ)|, where S.E. is the standard error of the parameter estimate under the MAR model and σ is the standard deviation of Y. The c value implies that for the ISNI-based adjustment to equal one S.E., |γ_1| needs to be at least c/σ, which under the logit link corresponds to a magnitude of nonignorability such that a change of σ/c in Y is associated with a change of e^1 = 2.7 in the odds of being observed. Thus, the c statistic represents the critical magnitude of nonignorability, above which the bias due to nonignorable missingness is larger than the sampling error and therefore causes concern. The smaller the c statistic, the larger the sensitivity to nonignorability. Following Troxel et al. (2004), we suggest using c = 1 as a cutoff value for important sensitivity, as this implies that the bias will be of the same size as the sampling error under a moderate nonignorability in which a change of one σ in Y is associated with a change of 2.7 in the odds of being observed.
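The c statistic is a one-line computation once the MAR standard error, the index, and the outcome scale are in hand. A small sketch (Python; the numeric inputs below are hypothetical, not values from Table 1):

```python
def c_statistic(se, isni, sigma):
    """c = |S.E. / (ISNI / sigma)|: the sigma-scaled magnitude of gamma1 needed
    to move the estimate by one MAR standard error."""
    return abs(se / (isni / sigma))

# hypothetical values for illustration only
se, isni, sigma = 0.05, 0.12, 0.72
c = c_statistic(se, isni, sigma)
flag = c < 1          # c below 1 signals important sensitivity
print(c, flag)
```

Reading the output: a small c means a weak degree of nonignorability already shifts the estimate by a full standard error, so the MAR estimate should be treated with caution.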
The c statistics summarized in Table 1 show that the MAR estimates of both educ and exper are sensitive to nonignorable missingness in the outcome. Both MAR estimates have c statistics less than 1, and this holds whether the missing data model is the linear logistic model or the GAM.
Thus, the conclusions regarding the sensitivity of these two important estimates are robust to the choice of missing data model. The conclusions regarding expersq and nwifeinc, however, depend on the choice of the missing data model. With the linear logistic model, we find that the MAR estimates for expersq and nwifeinc have c statistics less than 1, indicating that both MAR estimates are sensitive to nonignorable selection into labor force participation. With the GAM model, these MAR estimates have c statistics larger than 1, indicating that these two MAR estimates are not sensitive to nonignorability.
Using ISNI, we can also calculate the adjusted estimates when γ_1, the parameter for nonignorable selection, is perturbed from zero. A positive value of γ_1 is plausible, because it is highly unlikely that one will decline a job offer when the offered wage is high. Here we consider γ_1 = 1/σ, which corresponds to a magnitude of nonignorability in which a change of one standard deviation in lwage multiplies the odds of labor force participation by 2.7. In the wage offer dataset, the MAR estimate of σ is 0.72. Therefore, as the offered wage changes by a factor of e^0.72 = 2.1, the odds of labor force participation change by 2.7. This seems to be a moderate nonignorability. The resulting adjusted estimates for this moderate nonignorability are reported under the column "MAR Est. + ISNI/σ" in Table 1. With this γ_1 value, we see that the adjusted estimates for educ and exper become larger than the corresponding MAR estimates, which implies that the MAR estimates likely underestimate these effects. The GAM fit also reveals a nonlinear relationship between exper and the probability of labor force participation; a chi-square test shows that this nonlinear trend is statistically significant (p-value = 0.01). It is plausible that this nonlinear relationship between experience and labor force participation drives the difference in the ISNI values.
The above ISNI analysis assumes that the logit of the probability of missingness depends on lwage linearly. In Section S.1 of the online Supplement, we conduct additional analyses in which the missingness depends on lwage in a quadratic form. These analyses show a somewhat weaker assessment of sensitivity for some parameter estimates.

Discussion
It has been recognized that measuring the sensitivity of the inference to alternative missing data assumptions is an important component of data analysis. Such analysis often requires positing a missing data model. There typically exists little information to test the assumptions in the missing data model. Thus, it is desirable to utilize a model that covers a wide range of selection mechanisms. In this article, we propose using a semiparametric approach to adaptively choose the functional form of the continuous predictors for missingness.
We have investigated the consequences of misspecifying a nonignorable missing data model through a simulation study and a real data analysis. Specifically, we examined the performance of ISNI, a recently proposed local sensitivity index for nonignorability, under misspecification of the missing data model. We found that ISNI has some robustness to misspecification of the functional forms of the predictors for missingness. There exist, however, important situations where the consequences of misspecifying the missing data mechanism can be significant. In these cases, using more flexible missing data models can help protect the analysis from such misspecification. We recommend the semiparametric sensitivity index, which uses a GAM for modeling the missing data process, due to its modeling generality and the automated nature of the procedure. The semiparametric index enables us to model a larger class of missing data mechanisms than the usual linear logistic model or a parametric nonlinear additive model does. The automation of the procedure is also an important benefit, especially when there are many continuous predictors for missingness and how they affect missingness is not well understood. In these situations, it is cumbersome, if not infeasible, to manually choose appropriate higher-order terms and/or transformations for each continuous predictor. The more automated fitting of the missing data mechanism with a GAM substantially reduces the time and effort invested in such a modeling exercise.
This is particularly helpful in light of the fact that modeling the missing data mechanism is usually not of primary interest for a study, but has to be properly dealt with in order to draw correct conclusions about the main interest of the study.
The sample sizes in our analyses are reasonably large and are commonly seen in practice.
When data are sparse, a GAM, as a nonparametric method, might not perform well. In this scenario, one may consider recently developed sparse additive model techniques (Ravikumar et al. 2009), which combine ideas from sparse modeling and additive nonparametric regression.
The proposed semiparametric index is substantially easier to compute than the alternative global sensitivity method because there is no need to fit any nonignorable model. Thus, it can be ideal for quickly and robustly measuring the sensitivity of a standard analysis to nonignorable missingness. If the sensitivity is small, then the standard analysis is considered trustworthy. Otherwise, one might need to collect more data to better understand the missing data mechanism (Hirano et al. 2001, Qian 2007). The semiparametric index can be useful for robustly identifying the situations where one may need to take this route.
In this article, we have also extended ISNI to situations where missingness depends on the missing outcome through a polynomial function. We have derived explicit ISNI formulas when the nonignorable missingness follows a quadratic form, and illustrated its use in the wage offer dataset. This extension makes the index applicable to a broader range of applications where investigators suspect that the nonignorable missingness might follow a complex relationship and would like to investigate the sensitivity under such a belief.
The proposed method can be generalized to multivariate outcomes with nonignorable missingness. Qian and Xie (2010) develop local sensitivity methods for various types of longitudinal data with both dropout and intermittent missingness, resulting in a general pattern of missingness. In their application, the predictors for the missingness are all categorical variables. In other longitudinal applications where the missingness predictors contain continuous variables, a linear logistic missing data model may lead to erroneous conclusions. In this case, the proposed semiparametric index method can be extended to provide a more robust method to measure the impact of nonignorable missingness in longitudinal data analysis.
Appendix

In this appendix, we derive explicit ISNI_r formulas for the quadratic specification η_{γ1}(y) = γ_11 y + γ_12 y². Specifically, we develop these formulas when the outcome Y_i follows a generalized linear model (GLM), which assumes that Y_i is independently distributed with density

f_θ(y_i) = exp{ [y_i λ_i − b(λ_i)] / a(τ) + c(y_i, τ) },

where λ_i is the canonical parameter; the functions b(·) and c(·, ·) determine a particular distribution in the exponential family; and a(τ) = τ/w, where τ is the dispersion parameter and w is a known weight. Note that the quadratic form of η_{γ1}(y) does not apply to binary outcomes.

Normal Distribution
For a normal linear model, Y_i ~ N(x_i^T β, τ), where τ = σ². Then E(Y²) = E²(Y) + τ, and according to Equation (7), for a given value of r_2, the index for the regression parameter is

ISNI_r = τ̂ ( Σ_{i: g_i = 1} x_i x_i^T )^{−1} Σ_{i: g_i = 0} (1 + 2 r_2 μ̂_i) x_i ĥ_i,

where μ̂_i = x_i^T β̂, and β̂ and τ̂ are the MAR estimates of β and τ, respectively.

Poisson Distribution
For a Poisson outcome, we have E(Y²) = E²(Y) + E(Y). Assuming the canonical log link, ln E(Y_i) = ln μ_i = x_i^T β, and the dispersion parameter τ = 1, then according to Equation (7), for a given value of r_2, the index for the regression parameter is

ISNI_r = ( Σ_{i: g_i = 1} μ̂_i x_i x_i^T )^{−1} Σ_{i: g_i = 0} (1 + r_2 + 2 r_2 μ̂_i) exp(x_i^T β̂) x_i ĥ_i,

where μ̂_i = exp(x_i^T β̂) and β̂ is the MAR estimate of β.
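The two appendix indices can be coded directly. In the sketch below (Python; two-column designs [1, x] for brevity, toy inputs hypothetical), the weighted summands over the missing units follow the fragments above, (1 + 2r_2 μ̂_i) x_i ĥ_i for the normal case and (1 + r_2 + 2r_2 μ̂_i) exp(x_i^T β̂) x_i ĥ_i for the Poisson case; the leading inverse-information factors (τ̂ (X_obs^T X_obs)^{−1} and (Σ_obs μ̂_i x_i x_i^T)^{−1}) are our reconstruction from the GLM Hessians and should be checked against the derivation.

```python
import math

def isni_r_normal(X_obs, X_mis, h_mis, beta, tau, r2):
    """ISNI_r for a normal linear model, eta_{gamma1}(y) = gamma11*(y + r2*y^2):
    tau * (X_obs' X_obs)^{-1} * sum_mis (1 + 2*r2*mu_i) * x_i * h_i."""
    a = bq = d = 0.0
    for x0, x1 in X_obs:                       # accumulate X_obs' X_obs
        a += x0 * x0; bq += x0 * x1; d += x1 * x1
    s0 = s1 = 0.0
    for (x0, x1), h in zip(X_mis, h_mis):      # weighted sum over missing units
        mu = beta[0] * x0 + beta[1] * x1
        w = (1 + 2 * r2 * mu) * h
        s0 += w * x0; s1 += w * x1
    det = a * d - bq * bq
    return (tau * (d * s0 - bq * s1) / det, tau * (a * s1 - bq * s0) / det)

def isni_r_poisson(X_obs, X_mis, h_mis, beta, r2):
    """Poisson analogue with canonical log link: the observed information is
    sum_obs mu_i x_i x_i', and the summand is (1 + r2 + 2*r2*mu_i)*mu_i*x_i*h_i."""
    a = bq = d = 0.0
    for x0, x1 in X_obs:
        mu = math.exp(beta[0] * x0 + beta[1] * x1)
        a += mu * x0 * x0; bq += mu * x0 * x1; d += mu * x1 * x1
    s0 = s1 = 0.0
    for (x0, x1), h in zip(X_mis, h_mis):
        mu = math.exp(beta[0] * x0 + beta[1] * x1)
        w = (1 + r2 + 2 * r2 * mu) * mu * h
        s0 += w * x0; s1 += w * x1
    det = a * d - bq * bq
    return ((d * s0 - bq * s1) / det, (a * s1 - bq * s0) / det)

# toy inputs: a few observed design rows and one missing unit with its h_i
norm_idx = isni_r_normal([(1, 0), (1, 1), (1, 2)], [(1, 0.5)], [0.8],
                         beta=(0.0, 1.0), tau=1.0, r2=0.0)
pois_idx = isni_r_poisson([(1, 0), (1, 1)], [(1, 0.5)], [0.5],
                          beta=(0.0, 0.0), r2=0.0)
print(norm_idx, pois_idx)
```

Setting r2 = 0 recovers the linear-nonignorability index, which provides a quick internal consistency check when implementing the quadratic extension.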