Empirical likelihood and estimation in a partially linear varying coefficient model with right censored data

In this paper, we study empirical likelihood inference and estimation for the parameters of interest in a partially linear varying coefficient model with right censored data. Two cases are considered: censoring is independent of the covariates, and censoring depends on the covariates. Bias-corrected empirical log-likelihood ratios for the regression parameters are presented, Wilks' theorem is proved, and confidence regions for the regression parameters are thereby constructed. Estimators of the parametric and nonparametric components are constructed, their asymptotic distributions are obtained, and consistent estimators of the asymptotic variances are given. Furthermore, partial profile empirical log-likelihood ratios for each component of the regression parameters and of the coefficient functions are constructed and shown to be asymptotically chi-squared. The results can be used directly to construct confidence regions/intervals for the regression parameters and pointwise confidence intervals for the coefficient functions. Our approach calibrates the empirical log-likelihood ratio directly so that the resulting ratio is asymptotically chi-squared; undersmoothing of the coefficient functions is avoided, and existing data-driven methods can be used to select the optimal bandwidth. Simulation studies compare the empirical likelihood method with one based on the normal approximation, and a real data analysis is presented.


Introduction
The semiparametric varying coefficient model is an important class of statistical models. A striking feature of such models is that they avoid the curse of dimensionality because the varying coefficients are all univariate functions. Consider the partially linear varying coefficient model
$$Y = X^{T}a(U) + Z^{T}\beta_0 + \varepsilon, \qquad (1)$$
where $Y$ is a scalar response variable, which may be a known monotone transformation of the survival time of interest, $X$ and $Z$ are $p \times 1$ and $q \times 1$ vectors of covariates, respectively, $U$ is a random variable taking values in the closed interval $I$, $a(u)$ is a $p \times 1$ vector of unknown coefficient functions on $I$, $\beta_0$ is a $q \times 1$ vector of unknown regression parameters, and $\varepsilon$ is a random error with $E(\varepsilon \mid U, X) = 0$ almost surely. Without loss of generality, we assume that $I = [0, 1]$.
This paper assumes that the response $Y$ is right censored. That is, instead of $Y$, we observe $V = \min\{Y, C\}$ and the indicator $\delta = I(Y \le C)$ of the event $(Y \le C)$, where $C$ is a censoring variable. Suppose that, given $W = (U, X^T, Z^T)^T$, $C$ is independent of $Y$. If $C$ is independent of the covariate $W$, the distribution function of $C$ is $G(t) = P(C \le t)$. If $C$ depends on the covariate $W$, the conditional distribution function of $C$ given $W$ is $G(v \mid w) = P(C \le v \mid W = w)$. Let $Y$ have the distribution function $F(y) = P(Y \le y)$, and let $F(y)$, $G(t)$ and $G(v \mid w)$ be continuous. For the distribution function $G(t)$, let $\bar G(t) = 1 - G(t)$ and $b_G = \sup\{t : G(t) < 1\}$. The same notation is used for other distribution functions. We suppose throughout that $b_F < b_G$.
Model (1) is one of the most widely used classes of regression models in statistical analysis. It has both parametric and nonparametric components, thus providing strong explanatory power and flexibility, and it is widely used in fields such as biology, medicine and economics. For uncensored data, a number of statistical methods and theoretical results are available in the literature. Here are some of the main results. Li et al. 
[1] proposed a local least-squares method with a kernel weight function. Zhang et al. [2] proposed an estimation procedure based on local polynomial fitting. Ahmad et al. [3] presented an efficient estimation method for model (1). Fan and Huang [4] proposed a profile least-squares technique to estimate the parametric components in model (1) and applied generalized likelihood techniques to the testing problem for the nonparametric components. Zhao and Xue [5] studied variable selection in the partially linear varying coefficient errors-in-variables model. Zhang et al. [6] studied profile inference for partially linear varying coefficient errors-in-variables models under restricted conditions.
It is of interest to model right censored data using model (1). Zhao [7] presented an adjusted empirical likelihood (EL) ratio for the parametric components of model (1) with censored data and proved that the ratio is asymptotically standard chi-squared. The basic idea of his approach is to multiply an EL ratio by an adjustment factor to obtain an adjusted EL ratio whose asymptotic distribution is the standard chi-squared distribution. Bravo [8] considered estimation and testing for model (1) when the response variable is subject to random censoring. He proposed an estimator and some test statistics for the parametric and nonparametric components and proved their asymptotic properties. Chen et al. 
[9] studied quantile regression analysis of right censored and length-biased data and presented a semiparametric varying-coefficient partially linear model. To estimate the regression parameters, they developed a three-stage procedure using the inverse probability weighting technique and established the asymptotic properties of the resulting estimators.
Model (1) with right censored data contains a vector of unknown coefficient functions, and the distribution function $G(t)$ and the conditional distribution function $G(v \mid w)$ of the censoring variable $C$ are also unknown, so all of them must be replaced with estimators when constructing the EL ratio of $\beta_0$. This creates a bias in the ratio. If this bias is not removed, the constructed EL ratio is asymptotically a weighted sum of independent chi-squared variables, each with 1 degree of freedom and an unknown weight. Thus, this ratio cannot be directly used for statistical inference on $\beta_0$. There are two common ways to solve this problem: one is to estimate the unknown weights and simulate the distribution of the ratio; the other is to multiply the ratio by an adjustment factor to obtain an adjusted EL ratio whose asymptotic distribution is the standard chi-squared distribution. However, neither of these methods embodies the essential nature of EL. A new method for constructing the EL ratio is therefore needed, and this is the motivation for our study.
This paper proposes a bias-corrected method for constructing the EL ratio and for estimating the parameters of interest in model (1) with right censored data. Two cases are considered: the censoring variable $C$ is independent of the covariate $W$, and the censoring variable $C$ depends on the covariate $W$. 
We propose bias-corrected empirical log-likelihood ratios and construct confidence regions for the regression parameters. We also construct estimators of both the parametric and nonparametric components, present their asymptotic distributions, and construct consistent estimators of the asymptotic variances. Furthermore, we construct a partial profile empirical log-likelihood ratio for each component of the parameters of interest and show that each ratio is asymptotically standard chi-squared. The results can be used directly to construct confidence regions for the regression parameters and pointwise confidence intervals for the coefficient functions.
Two features of the approach deserve mention. First, we directly calibrate the EL ratio so that the resulting EL ratio is asymptotically standard chi-squared. The ratio does not need to be multiplied by an adjustment factor. This avoids estimating the unknown adjustment factor and thus improves the accuracy of the confidence regions. Second, because bias correction is used in constructing the EL ratios and estimators, undersmoothing of the coefficient functions is avoided, so that existing data-driven algorithms can be used to select the optimal bandwidth of the estimator of the coefficient function $a(u)$.
The rest of this paper is organized as follows. Section 2 describes the methodology. For the regression parameters, we construct a bias-corrected EL ratio, a maximum EL estimator and a partial profile EL ratio. For the coefficient functions, we construct local linear estimates, residual-adjusted EL ratios and partial profile EL ratios. Section 3 presents the theoretical results. Section 4 presents simulation studies and a real data analysis. Section 5 gives concluding remarks. The proofs of the theorems are given in the Appendix.

Methodology
In this section, we investigate how to construct the EL ratio and the estimators of the regression parameter $\beta_0$ and the unknown function $a(\cdot)$. The proposed method covers two scenarios: the censoring variable $C$ is independent of the covariate $W$, and the censoring variable $C$ depends on the covariate $W$. Section 2.1 presents a bias correction method to construct the EL ratio of $\beta_0$. Section 2.2 constructs the estimators of $\beta_0$ and $a(\cdot)$. The EL ratio and the maximum EL estimate of $a(\cdot)$ are given in Section 2.3. In Section 2.4, we use the results of Sections 2.1-2.3 to construct confidence intervals for each component of $\beta_0$ and pointwise confidence intervals for each component of $a(u)$. Throughout this paper, we assume that the sample {

The bias-corrected EL ratio of $\beta_0$
In this section, we construct the bias-corrected EL ratio of $\beta_0$ in two cases: the censoring variable $C$ is independent of the covariate $W$, and the censoring variable $C$ depends on the covariate $W$.

The censoring variable C is independent of the covariate W
Our main goal is to construct the EL ratio of the regression parameter $\beta_0$ when the censoring variable $C$ is independent of the covariate $W$. For this purpose, we need estimators of the unknown functions $G(\cdot)$ and $a(\cdot)$. For $G(v)$, we use its Kaplan-Meier estimator
$$\hat G_n(v) = 1 - \prod_{i=1}^{n}\left(\frac{n-i}{n-i+1}\right)^{I(V_{(i)} \le v,\ \delta_{(i)} = 0)}, \qquad (2)$$
where $V_{(1)} \le \dots \le V_{(n)}$ are the order statistics of the $V$-sample, and $\delta_{(i)}$ is the indicator associated with $V_{(i)}$, $i = 1, \dots, n$.
We apply a local linear fitting approach to estimate the coefficient function $a(u) = (a_1(u), \dots, a_p(u))^T$. For any $U$ in a small neighbourhood of $u$, the function $a_j(U)$ can be locally approximated by a linear function, $a_j(U) \approx a_j(u) + \dot a_j(u)(U - u)$, $j = 1, \dots, p$. This leads to the following weighted least-squares problem: find where where $I_p$ is the identity matrix of order $p$ and $0_p$ is the $p \times p$ zero matrix. The estimator $\tilde a_1(u; \beta_0)$ can be written as $\tilde a_1$ Substituting (3) into (1), we can obtain the approximation form of model (1): We treat (7) as a linear model and introduce the auxiliary random vectors where
$$\varphi_n(s; \beta) = \int_{y \ge s}\int_{w \in \mathbb{R}^{p+q+1}} \phi_n(w, y; \beta)\, F(dw, dy), \qquad (10)$$
and The last two terms on the right side of (8) play a very important role in bias correction. They are mainly used to reduce the bias caused by $\hat G_n(\cdot) - G(\cdot)$; see the proof of Lemma A.1 in the supplementary material. If this bias correction is not used, the resulting EL ratio is asymptotically a weighted sum of independent chi-squared variables; see Theorem 1 of Zhao [7]. Because the bias correction method is used to construct the EL ratio $\hat l_1(\beta_0)$, the asymptotic distribution of $\hat l_1(\beta_0)$ is the standard chi-squared distribution. This result is given by Theorem 3.1 in Section 3. 
We should point out that Zhao [7] used the auxiliary random vectors to construct the empirical log-likelihood ratio $\tilde l(\beta_0)$ of $\beta_0$. He proved that $\tilde l(\beta_0)$ is asymptotically distributed as a mixture of chi-squared variables. It cannot be directly used for statistical inference on $\beta_0$ because the weights are unknown. Zhao [7] multiplied $\tilde l(\beta_0)$ by an adjustment factor so that the resulting EL ratio is asymptotically standard chi-squared. However, this method adjusts the EL ratio externally and does not reflect the nature of EL. Our method directly constructs a bias-corrected EL ratio of $\beta_0$ that converges in distribution to a standard chi-squared variable. Therefore, our method is fundamentally different from that of Zhao [7].
The bias-corrected EL ratio was first introduced by Zhu and Xue [10] for a partially linear single-index model. Since then, this approach has been used in other semiparametric settings. The key to constructing a bias-corrected EL ratio is the construction of the auxiliary random vectors. Bravo et al. [11] proposed a two-step semiparametric inference method. They give a general form of the auxiliary random vector, that is, where $Z$ is a random vector with values in $S_Z \subseteq \mathbb{R}^{d_Z}$, $\theta_0 \in \Theta \subset \mathbb{R}^p$ represents the finite-dimensional parameter of interest, $\eta_0$ denotes the possibly infinite-dimensional nuisance parameter, taking values in a semimetric space $E$, $g(\cdot): S_Z \times \Theta \times E \to \mathbb{R}^p$ is a known vector-valued measurable function with $E\{g(Z, \theta_0, \eta_0)\} = 0$, $\varphi(\cdot)$ is the so-called pathwise derivative, $m(\cdot)$ is called an influence function, and $h_0$ may include $\eta_0$ and other nonparametric objects that may appear in the influence function as a result of 'functional differentiation'. Bravo et al. 
[11] defined the bias-corrected EL ratio via $m(Z, \theta_0, \hat h)$, where $\hat h$ is a consistent estimator of $h_0$. The pathwise derivative $\varphi(\cdot)$ plays a fundamental role in their method, as it is used to construct $m(\cdot)$. The function $\varphi(\cdot)$ differs from model to model; see, for example, [10][11][12][13][14][15][16][17][18]. How to construct $\varphi(\cdot)$ is a question worth exploring. Bravo et al. [11] gave the form of $\varphi(\cdot)$ in four examples, but they did not give a general way to construct it. We give the specific forms of $\varphi(\cdot)$ and $m(\cdot)$ in (8). Bravo et al. [11] did not consider model (1) with right censored data, so the examples in their Section 4 do not cover our work.
Furthermore, Bravo et al. [11] imposed assumption A on $m(\cdot)$ to prove Wilks' theorem for the bias-corrected EL ratio. Their assumption A implies conditions (A0)-(A3) of Hjort et al. [19], so their Theorem 3.1 can be obtained from Theorem 2.1 of Hjort et al. [19]. However, it is difficult to verify that $\hat\eta_i(\beta)$ satisfies assumption A of Bravo et al. [11], because this condition is too strong and the auxiliary random vectors $\hat\eta_i(\beta)$ defined by (8) involve five nonparametric estimators, $\hat G(\cdot)$, $\hat\mu_1(\cdot)$, $\hat\mu_2(\cdot)$, $\hat H(\cdot)$ and $\hat H_0(\cdot)$. Our approach is to prove Lemmas A.1 and A.2 in the Appendix directly under relaxed regularity conditions, and then to use these lemmas to prove Theorems 3.1 and 3.2 in Section 3. The conditions of our Lemmas A.1 and A.2 are weaker than the corresponding conditions in [11]. It should be noted that the results of Lemmas A.1 and A.2 are similar to conditions (A1)-(A3) of Hjort et al. [19].
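The Kaplan-Meier step for the censoring distribution mentioned above can be illustrated with a short sketch. It assumes no ties among the observed times (consistent with the continuity assumption on $F$ and $G$), treats the censored observations as the "events" for $C$, and returns an estimate of $1 - G(t)$. The function name `km_censoring_survival` and its interface are hypothetical and are not taken from the paper.

```python
import numpy as np

def km_censoring_survival(v, delta):
    """Return a function t -> Kaplan-Meier estimate of 1 - G(t) = P(C > t).

    v     : observed times V_i = min(Y_i, C_i)
    delta : indicators, 1 if Y_i <= C_i (uncensored), 0 if censored
    Assumes no ties among the V_i (illustrative sketch only).
    """
    v = np.asarray(v, dtype=float)
    delta = np.asarray(delta, dtype=int)
    n = len(v)
    order = np.argsort(v, kind="stable")
    v_sorted, d_sorted = v[order], delta[order]
    i = np.arange(1, n + 1)
    # The i-th order statistic contributes (n - i)/(n - i + 1) only when it is
    # a censored observation, i.e. an observed value of C.
    factors = np.where(d_sorted == 0, (n - i) / (n - i + 1.0), 1.0)
    surv = np.cumprod(factors)

    def survival(t):
        # k = number of ordered observations that are <= t
        k = np.searchsorted(v_sorted, t, side="right")
        return np.where(k == 0, 1.0, surv[np.maximum(k - 1, 0)])

    return survival
```

For example, `S = km_censoring_survival(V, delta)` followed by `1 - S(t)` gives an estimate of $G(t)$; replacing `side="right"` with `side="left"` would give the left limit at an observed time.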

The censoring variable C depends on the covariate W
As the referee pointed out, unconditional independence between the censored response and the censoring variable is often restrictive; conditional independence (given the covariates) is often a more natural and appropriate assumption; see, for example, Kalbfleisch and Prentice (2002). In this case, it is of interest to derive the bias-corrected EL ratio of $\beta_0$. When the censoring variable $C$ depends on the covariates, the conditional distribution function $G(v \mid w)$ of $C$ can be estimated using the estimator in [20] or [21]. In practice, however, it is often difficult to specify a parametric model. An alternative is a nonparametric approach based on the Kaplan-Meier estimator. Specifically, the function $G(v \mid w)$ can be estimated by where $K(\cdot)$ is a kernel function on $\mathbb{R}^{p+q+1}$, and $b = b_n$ is a positive bandwidth sequence that converges to 0. When $nb^3/\log^3 n \to \infty$ and $\inf\{1 - H(v \mid w)\} > 0$, it follows from Theorem 3.2 of Du and Akritas [22] that uniformly for $w \in \mathbb{R}^{p+q+1}$ and $v \in \mathbb{R}$, (17) where The kernel estimators of $H(v \mid w)$ and $H_0(v \mid w)$ are defined as and respectively, where $W^*_{ni}(w)$ is defined in (16). By substituting $G(\cdot)$, $H(\cdot)$ and $H_0(\cdot)$ in (8) with $G(\cdot \mid w)$, $H(\cdot \mid w)$ and $H_0(\cdot \mid w)$, respectively, we obtain the auxiliary random vectors where and (3), (6) and (12), respectively. Therefore, the bias-corrected empirical log-likelihood ratio function of $\beta$ can also be defined as Similar to the proof of Theorem 3.1 in the Appendix, it can be proved that the asymptotic distribution of $\hat l_2(\beta_0)$ is a standard chi-squared distribution with $q$ degrees of freedom. This result is given by Theorem 3.1 in Section 3.
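To make the covariate-dependent case more concrete, the following is a minimal sketch of a kernel-weighted (Beran-type) conditional Kaplan-Meier estimate of $1 - G(t \mid w)$. It is an illustration under stated assumptions, not the paper's estimator: the weights here are plain Nadaraya-Watson weights built from a product Epanechnikov kernel, whereas the paper's weights $W^*_{ni}(w)$ are defined in its equation (16), which is not reproduced in this text. The function names are hypothetical.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel 0.75 (1 - u^2) on [-1, 1]."""
    return 0.75 * (1.0 - u**2) * (np.abs(u) <= 1.0)

def conditional_km_censoring_survival(v, delta, w, w0, b):
    """Return t -> kernel-weighted estimate of 1 - G(t | w0).

    v, delta : observed times and uncensoring indicators, length n
    w        : (n, d) covariate array; w0 : point of interest, length d
    b        : bandwidth (assumed common to all coordinates)
    """
    v = np.asarray(v, dtype=float)
    delta = np.asarray(delta, dtype=int)
    w = np.asarray(w, dtype=float).reshape(len(v), -1)
    w0 = np.asarray(w0, dtype=float).reshape(-1)
    # Product-kernel Nadaraya-Watson weights (an assumption of this sketch).
    k = np.prod(epanechnikov((w - w0) / b), axis=1)
    if k.sum() <= 0:
        raise ValueError("no observations within bandwidth b of w0")
    weights = k / k.sum()
    order = np.argsort(v, kind="stable")
    v_s, d_s, wt_s = v[order], delta[order], weights[order]
    # Weight mass still at risk at each ordered time: sum over V_j >= V_(i).
    at_risk = np.cumsum(wt_s[::-1])[::-1]
    # Censored points (delta = 0) act as events for the censoring distribution.
    with np.errstate(divide="ignore", invalid="ignore"):
        hazard = np.where((d_s == 0) & (at_risk > 0), wt_s / at_risk, 0.0)
    surv = np.cumprod(1.0 - hazard)

    def survival(t):
        idx = np.searchsorted(v_s, t, side="right")
        return np.where(idx == 0, 1.0, surv[np.maximum(idx - 1, 0)])

    return survival
```

When all observations receive equal weight, this sketch reduces to the ordinary Kaplan-Meier estimate of the censoring survival function, which gives a simple sanity check.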

The EL ratio and maximum EL estimate of a(u)
To construct the EL ratio of $a(u)$, we use the auxiliary random variables where $\hat G(\cdot)$ is defined in (2), $K(\cdot)$ is a kernel function and $h = h_n$ is a bandwidth. From these we obtain an empirical log-likelihood ratio function for $a(u)$, say $L(a(u))$. However, to prove that $L(a(u))$ is asymptotically standard chi-squared, we must impose the conditions $nh \to \infty$ and $nh^5 \to 0$. These conditions exclude the optimal bandwidth order $cn^{-1/5}$, where $c$ is a positive constant, which makes it difficult to choose the optimal bandwidth unless the coefficient function is undersmoothed. The problem is caused by the term $a(U_i) - a(u)$ in (30). To remove the bias caused by this term, we introduce the auxiliary random variables where $\hat G(\cdot)$, $K(\cdot)$ and $h$ are defined in (2) and (30), respectively, $\hat{\dot a}(u) = (\hat{\dot a}_1(u), \dots, \hat{\dot a}_p(u))^T$, $\hat{\ddot a}(u) = (\hat{\ddot a}_1(u), \dots, \hat{\ddot a}_p(u))^T$, and $\hat{\dot a}_j(u)$ and $\hat{\ddot a}_j(u)$ are the local quadratic polynomial estimators of $\dot a_j(u)$ and $\ddot a_j(u)$, respectively. Thus, the residual-adjusted empirical log-likelihood ratio function of $a(u)$ is defined as It can be shown that $R_1(a(u_0))$ is asymptotically standard chi-squared under the optimal bandwidth. This result is given in Section 3.
We can maximize $\{-R_1(a(u))\}$ to obtain an estimator of $a(u)$, say $\check a_1(u)$, called the maximum EL estimate. According to Qin and Lawless [23], $\check a_1(u)$ is the solution of the estimating equation where Similarly, when $G(V_i-)$ and $\hat\beta_1$ in (31) are replaced with $G(V_i \mid W_i)$ and $\hat\beta_2$, respectively, auxiliary random variables are obtained from which the residual-adjusted empirical log-likelihood ratio function of $a(u)$, $R_2(a(u))$, can be constructed. We can also maximize $\{-R_2(a(u))\}$ to obtain an estimator of $a(u)$, say $\check a_2(u)$. A formula for $\check a_2(u)$ can be obtained by using $G(V_i \mid W_i)$ and $\hat\beta_2$ instead of $G(V_i-)$ and $\hat\beta_1$ in (33); its expression is omitted.

Partial profile EL ratio
It is of interest to construct a confidence interval for each component of $\beta_0$, which requires the partial profile EL ratio. By (22), we can obtain the estimator of the $k$th component $\beta_{0k}$ of $\beta_0$, that is, $\hat\beta_k = e_k^T\hat\beta_1$ for $k = 1, \dots, q$, where $e_k$ denotes the unit vector of length $q$ with 1 at position $k$. Let for $r = 1, \dots, q$, where $\hat\eta_i(\beta)$ and $\hat A_1$ are defined in (8) and (23), respectively. Thus the partial profile empirical log-likelihood ratio function of $\beta_r$ is defined as Under some regularity conditions, we can prove that the asymptotic distribution of $\hat l_{1,r}(\beta_{0r})$ is the standard chi-squared distribution with 1 degree of freedom. This result is given by Theorem 3.3 in Section 3. Because we used $\hat\beta_k$ to replace $\beta_k$ in $\hat\eta_i(\beta)$, this creates the terms $\hat\beta_k - \beta_k$ for $k = 1, \dots, q$ and $k \ne r$. To reduce the bias generated by $\hat\beta_k - \beta_k$, we multiplied by $\hat A_1^{-1}$ on the right of (35), which ensures that this bias is reduced in $\hat\eta_{i,r}(\beta_r)$. See the proof of Theorem 3.3 in the Appendix.
If the pointwise confidence interval for a component of $a(u)$ is of particular interest, the partial profile EL ratio can also be applied. Let $\tilde e_r$ denote the unit vector of length $p$ with 1 at position $r$ for $r = 1, \dots, p$. The estimator of the $j$th component $a_j(u)$ of $a(u)$ is $\hat a_j(u) = \tilde e_j^T \hat a_1(u)$ for $j = 1, \dots, p$, where $\hat a_1(u)$ is defined in (3). Let where $\eta^{**}_i(\cdot)$ and $\hat\Upsilon(u)$ are defined in (31) and (34), respectively. Therefore, the residual-adjusted partial profile empirical log-likelihood ratio function of $a_r(u)$ is defined as Under some regularity conditions, we can prove that $R_{1,r}(a_r(u_0))$ is asymptotically standard chi-squared with 1 degree of freedom. This result is given by Theorem 3.6 in Section 3. We multiplied by $\hat\Upsilon^{-1}(u)$ on the right of (37) to ensure that the terms $\hat a_j(u_0) - a_j(u_0)$, $j \ne r$, $j = 1, \dots, p$, are eliminated in $\eta^{**}_{i,r}(a_r(u_0))$. This step is essential; see the proof of Theorem 3.6 in the Appendix.
Similarly, the partial profile EL ratio functions of $\beta_r$ and $a_r(u)$, say $\hat l_{2,r}(\beta_r)$ and $R_{2,r}(a_r(u))$, can be constructed by replacing $G(V_i-)$ and $\hat\beta_1$ in $\hat l_{1,r}(\beta_r)$ and $R_{1,r}(a_r(u))$ with $G(V_i \mid W_i)$ and $\hat\beta_2$, respectively. The details are omitted.

Asymptotic properties
The following regularity conditions are needed for our main results.
(C1) For all $j, l = 1, \dots, p$ and $k = 1, 2$, the function $a_j(u)$ has a continuous second derivative on $(0, 1)$, the function $\psi_{kjl}(u)$ is continuous at $u_0 \in (0, 1)$, and the functions $\varphi_{jl}(u)$ and $f_U(u)$ are bounded away from zero and satisfy the Lipschitz condition of order 1 on $(0, 1)$, where $a_j(u)$ is the $j$th component of $a(u)$, $\psi_{kjl}(u)$ and $\varphi_{jl}(u)$ are the $(j, l)$th elements of the matrices $\Psi_k(u)$ and $\Phi(u)$ defined in Theorem 3.5, respectively, and
(C2) The functions $H(v \mid w)$ and $H_0(v \mid w)$ are continuous in $(w, v)$, their $2(p + q + 1)$-order partial derivatives with respect to $w$ exist and are continuous and uniformly bounded in $(w, v)$, and $H(v \mid w)$ satisfies inf is the closed sphere with centre $w_0$ and radius $r_0$.
(C3) For $k = 1, 2$, sup
(C4) The kernels $K(u)$ and $K(w)$ are bounded and symmetric probability density functions with compact support, satisfying
Using conditions (C1)-(C7), we present the main results of this paper below. The asymptotic properties of $\hat l_1(\beta_0)$ and $\hat l_2(\beta_0)$ defined by (14) and (20) are as follows. $\chi^2_q$ is a chi-squared variable with $q$ degrees of freedom.
Because we use the bias correction method, the asymptotic distributions of $\hat l_1(\beta_0)$ and $\hat l_2(\beta_0)$ are the standard chi-squared distribution with $q$ degrees of freedom.
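The way an empirical log-likelihood ratio is evaluated and compared with a chi-squared quantile can be sketched as follows. This is the generic Owen-type computation for a set of auxiliary vectors with hypothesized mean zero, solved for the Lagrange multiplier by a damped Newton iteration; it is only an illustration and does not construct the paper's bias-corrected vectors in (8) or (18), which must be supplied by the user. The function name `el_log_ratio` is hypothetical.

```python
import numpy as np

def el_log_ratio(eta, max_iter=50, tol=1e-10):
    """-2 log empirical likelihood ratio for H0: E[eta_i] = 0.

    eta : (n, q) array of auxiliary vectors evaluated at a candidate
          parameter value (n observations, q components).
    Returns 2 * sum_i log(1 + lam' eta_i), where lam solves
    sum_i eta_i / (1 + lam' eta_i) = 0.
    """
    eta = np.asarray(eta, dtype=float)
    if eta.ndim == 1:
        eta = eta[:, None]
    n, q = eta.shape
    lam = np.zeros(q)
    for _ in range(max_iter):
        denom = 1.0 + eta @ lam
        scaled = eta / denom[:, None]
        grad = scaled.sum(axis=0)          # gradient of sum log(1 + lam' eta_i)
        if np.linalg.norm(grad) < tol:
            break
        hess = -scaled.T @ scaled          # Hessian (negative definite)
        step = np.linalg.solve(hess, -grad)
        # Halve the step until every implied weight stays valid,
        # i.e. 1 + lam' eta_i >= 1/n for all i.
        t = 1.0
        while np.any(1.0 + eta @ (lam + t * step) < 1.0 / n):
            t *= 0.5
        lam = lam + t * step
    return 2.0 * np.sum(np.log1p(eta @ lam))
```

A candidate value would then be placed inside the approximate $1-\alpha$ confidence region when the returned statistic does not exceed the $(1-\alpha)$th quantile of $\chi^2_q$ (for instance about 5.991 for $q = 2$ and $\alpha = 0.05$).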
Using Theorem 3.1, the approximate $1 - \alpha$ confidence regions of $\beta_0$ are defined as where $\chi^2_q(1 - \alpha)$ is the $(1 - \alpha)$th quantile of the $\chi^2_q$ distribution and $0 < \alpha < 1$.
The following theorem shows that the estimators $\hat\beta_1$ and $\hat\beta_2$ are asymptotically normal. By estimating the asymptotic covariance matrices of the estimators, these results can be used to construct confidence regions of $\beta_0$.
Theorem 3.2: Suppose that conditions (C1)-(C7) hold. Then is the first derivative of $G(s \mid w)$ with respect to $s$, and F(y|w
Note that the two terms in $B_1(\beta_0)$ and $B_2(\beta_0)$ arise as a result of censoring. If the data are uncensored and the model error $\varepsilon$ is homoscedastic with mean 0 and variance $\sigma^2$, the asymptotic variances of $\hat\beta_1$ and $\hat\beta_2$ reduce to a special case of the asymptotic variance expression derived by Fan and Huang [4]. That is, and is an asymptotic covariance matrix of the profile likelihood estimate $\hat\beta_P$. In particular, when $p = 1$ and $X = 1$, model (1) reduces to a partially linear model. In this case, this covariance matrix equals $\sigma^2[E\{\mathrm{var}(Z \mid U)\}]^{-1}$, which is the semiparametric information bound for the estimator $\hat\beta_P$ (see [24]). This result also applies to model (1) (see Chamberlain, 1992). Therefore, the profile likelihood estimate is semiparametrically efficient. This indicates that the proposed estimators are efficient in the uncensored homoscedastic case.
To apply Theorem 3.2 to construct the confidence region of $\beta_0$, we need consistent estimators of $A$, $B_1(\beta_0)$ and $B_2(\beta_0)$. When the censoring variable $C$ is independent of the covariate $W$, the consistent estimator $\hat A_1$ of $A$ is defined by (23). When the censoring variable $C$ depends on the covariate $W$, the consistent estimator $\hat A_2$ of $A$ is defined by (27). The consistent estimators of $B_1(\beta_0)$ and $B_2(\beta_0)$ are defined as and respectively, where $\hat\eta_i(\hat\beta_1)$ and $\hat\eta^*_i(\hat\beta_2)$ are defined in (8) and (18), respectively. From Theorem 3.2, we obtain that where $I_q$ is the identity matrix of order $q$. Therefore, asymptotic confidence intervals of $\beta_0$ can be constructed by (43). From (43) and Theorem 10.2d in Arnold [25], we obtain that Using (44), we can construct the asymptotic $1 - \alpha$ confidence regions of $\beta_0$, namely, For the ratios $\hat l_{k,r}(\beta_{0r})$, $k = 1, 2$, we have the following theorem, which can be used to construct confidence intervals for each component of $\beta_0$.

When the density $f_U(u)$ of $U$ has bounded support, say $(0, 1)$, the performance of a regression smoother at boundary points usually differs from its performance at interior points. Theoretically, the rate of convergence at boundary points is slower. For example, the Nadaraya-Watson and Gasser-Müller estimators have boundary effects (a bias of order $O(h)$ instead of $O(h^2)$) and require boundary modifications (see [26]). The local linear smoother (3), however, does not require such a modification; it has the same rate of convergence at the boundary points and at the interior points (see [27]).
In the following theorem, we state the asymptotic normality of $\hat a_k(u)$ and $\check a_k(u_0)$ defined by (24), (29) and (33), respectively, $k = 1, 2$. By estimating the asymptotic biases and covariance matrices of the estimators, these results can be used to construct pointwise confidence intervals for each component of $a(u_0)$.
Theorem 3.5: Suppose that conditions (C1)-(C7) hold. Then where k In Theorem 3.5, if the condition (C5) is replaced by $nh^2/\log n \to \infty$ and $nh^5 \to 0$, then
Theorem 3.5 shows that, to obtain the asymptotic normality of $\hat a_k(u_0)$, we need to undersmooth the coefficient functions, whereas $\check a_k(u_0)$ does not require undersmoothing to obtain its asymptotic normality. This is because the bias correction for the EL ratio plays a major role.
Using Theorem 3.5, we can construct pointwise confidence intervals for each component of $a(u_0)$. However, this requires plug-in estimators of the asymptotic bias and covariance of $\hat a_k(u_0)$. The asymptotic bias and covariance of $\hat a_k(u_0)$ depend on $b(u_0)$, $\Upsilon(u_0)$ and $\Psi_k(u_0)$, which therefore need to be estimated. The estimator $\hat\Upsilon(u_0)$ of $\Upsilon(u_0)$ is defined by (34), and the estimators of $\Psi_k(u_0)$ are defined as (3), (15), (22), (24), (26) and (29), respectively. We now consider the estimator of $b(u_0)$. Note that Hence, the estimators of $b(u_0)$ are defined as It is easy to prove that $\hat\Upsilon(u_0)$, $\hat\Psi_k(u_0)$ and $\hat b_k(u_0)$ are consistent estimators of $\Upsilon(u_0)$, $\Psi_k(u_0)$ and $b(u_0)$, respectively. Assume that $\Upsilon(u_0)$ is invertible. Then $\Upsilon^{-1}(u_0)$ can be consistently estimated by $\hat\Upsilon^{-1}(u_0)$. Finally, we can obtain a consistent estimator of the asymptotic covariance matrix of $\hat a_k(u_0)$. Using (51), we can construct the pointwise confidence intervals for a component $a_r(u_0)$ of $a(u_0)$, that is, for $r = 1, \dots, p$, where $\hat a_{k,r}(u_0)$ and $\hat b_{k,r}(u_0)$ are the $r$th components of $\hat a_k(u_0)$ and $\hat b_k(u_0)$, respectively, $\hat\gamma_{k,rr}(u_0)$ is the $(r, r)$th element of the square root of the estimated covariance matrix, and $z_{1-\alpha/2}$ is the $1 - \alpha/2$ quantile of the standard normal distribution.
If condition (C5) in Theorem 3.5 is replaced by $nh^2/\log n \to \infty$ and $nh^5 \to 0$, then by (48), we obtain that for $r = 1, \dots, p$, Formula (53) can also be used to construct the pointwise confidence intervals of $a_r(u_0)$.
The following theorem gives the asymptotic properties of $R_k(a(u_0))$ and $R_{k,r}(a_r(u_0))$ for $k = 1, 2$ and $r = 1, \dots, p$, where $R_1(a(u_0))$ and $R_{1,r}(a_r(u_0))$ are defined by (32) and (38), respectively, and $R_2(a(u_0))$ and $R_{2,r}(a_r(u))$ are constructed by using $G(v \mid w)$ replacing . ., p. The asymptotic chi-squared distribution of $R_k(a(u_0))$ and $R_{k,r}(a_r(u_0))$ is a consequence of the bias correction technique. Without bias correction for the EL ratio, the resulting EL ratio converges to a mixed chi-squared distribution.
By using Theorem 3.6, the approximate $1 - \alpha$ confidence regions of $a(u_0)$ are defined as where $\chi^2_p(1 - \alpha)$ is the $1 - \alpha$ quantile of $\chi^2_p$ for $0 < \alpha < 1$, and the approximate $1 - \alpha$ pointwise confidence intervals of $a_r(u_0)$ are defined as

Simulations and application
In this section, we present simulation studies to evaluate the finite sample performance of the proposed methods. We also present an analysis of a real data set.

Bandwidth selection
The choice of bandwidth is an important issue in kernel smoothing methods for regression functions. Various existing bandwidth selection techniques for nonparametric regression can be used (see, e.g., [28,29]). Here we use the modified multifold cross-validation criterion proposed by Cai et al. [28] to select the bandwidth $h$. That is, we choose the $h$ that minimizes the average mean squared (AMS) error where $m$ and $L$ are two given positive integers, (2) and (15), respectively, $\hat\beta_1$ and $\hat\beta_2$ are defined by (22) and (26), respectively, and $\{\hat a_{k,l}(\cdot)\}$ are computed from the sample $\{(U_i, X_i, Y_i), 1 \le i \le n - lm\}$ using formulas (24) and (29) with bandwidth $h[n/(n - lm)]^{1/5}$, $1 \le l \le L$. Note that we rescale the bandwidth $h$ for different sample sizes according to its optimal rate, i.e., $h \propto n^{-1/5}$.
In practical implementations, we may use $m = [0.1n]$ and $L = 4$. The selected bandwidth does not depend critically on the choice of $m$ and $L$, as long as $mL$ is reasonably large so that the evaluation of the prediction errors is stable.
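The multifold cross-validation idea above can be sketched in a few lines. The sketch makes simplifying assumptions that are not part of the paper: the response `y` is taken to be already adjusted for censoring and for the parametric part (so only the varying coefficient part $X^T a(U)$ remains), the Epanechnikov kernel is used, and the per-fold errors are averaged over the $L$ folds. The function names `local_linear_vc`, `ams` and `select_bandwidth` are hypothetical.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel 0.75 (1 - u^2) on [-1, 1]."""
    return 0.75 * (1.0 - u**2) * (np.abs(u) <= 1.0)

def local_linear_vc(u0, U, X, y, h):
    """Local linear estimate of a(u0) in the working model y = X' a(U) + error."""
    d = U - u0
    k = epanechnikov(d / h) / h
    D = np.hstack([X, X * d[:, None]])       # columns for a(u0) and a'(u0)
    A = D.T @ (D * k[:, None])
    rhs = D.T @ (k * y)
    coef = np.linalg.lstsq(A, rhs, rcond=None)[0]
    return coef[: X.shape[1]]                # keep the a(u0) part only

def ams(h, U, X, y, m, L):
    """Average over l = 1..L of the mean squared prediction error obtained by
    fitting on the first n - l*m points and predicting the next m points."""
    n = len(y)
    total = 0.0
    for l in range(1, L + 1):
        n_fit = n - l * m
        h_l = h * (n / n_fit) ** 0.2         # rescale h to the subsample size
        idx = range(n_fit, n_fit + m)
        pred = np.array([
            X[i] @ local_linear_vc(U[i], U[:n_fit], X[:n_fit], y[:n_fit], h_l)
            for i in idx
        ])
        total += np.mean((y[n_fit:n_fit + m] - pred) ** 2)
    return total / L

def select_bandwidth(grid, U, X, y, m=None, L=4):
    """Return the bandwidth in `grid` with the smallest AMS score."""
    n = len(y)
    m = int(0.1 * n) if m is None else m
    scores = [ams(h, U, X, y, m, L) for h in grid]
    return grid[int(np.argmin(scores))]
```

A call such as `select_bandwidth([0.1, 0.2, 0.3, 0.4], U, X, y)` then uses $m = [0.1n]$ and $L = 4$ by default, mirroring the choices suggested in the text.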
The kernel function $K(u)$ was taken as $0.75(1 - u^2)I(|u| \le 1)$, and $K(w)$ was taken as $K(w_1)K(w_2)K(w_3)K(w_4)K(w_5)$, where $w = (w_1, w_2, w_3, w_4, w_5)^T = (u, x_1, x_2, z_1, z_2)^T$. The AMS criterion in Section 4.1 was used to select the bandwidth $h$. The bandwidth $b$ was taken as 0.35 when computing the estimator of $G(v \mid w)$. The simulations were carried out in the following four cases.
(I) The confidence regions of $\beta_0$ were computed from 500 Monte Carlo random samples of size $n = 100$, based on the bias-corrected EL (BCEL), the normal approximation (NA) and the adjusted empirical likelihood (AEL) proposed by Zhao [7]. The simulation results are presented in Figure 1.
As can be seen in Figure 1, the confidence regions based on BCEL are smaller than those based on AEL and NA in both cases. For (a)-(d), the empirical coverage probabilities based on BCEL are 0.944, 0.932, 0.940 and 0.936, respectively; those based on AEL are 0.938, 0.928, 0.936 and 0.934, respectively; and those based on NA are 0.936, 0.926, 0.934 and 0.930, respectively. The confidence regions grow as the censoring rate increases, and the CIC-based confidence regions are smaller than the CDC-based confidence regions. In addition, under CIC, if (15) is used to estimate $G(t)$, the resulting confidence regions are larger than those obtained using (2). Similarly, under CDC, if (2) is used to estimate $G(t)$, the resulting confidence regions are also larger than those obtained using (15). This suggests that a misspecified estimator of the censoring distribution yields larger confidence regions.
(II) Based on BCEL, AEL and NA, the confidence intervals for $\beta_{01}$ and $\beta_{02}$ and their corresponding coverage probabilities were calculated from 200 simulations with sample sizes $n = 100, 150, 200$, respectively. The simulation results are presented in Tables 1 and 2.
From Tables 1 and 2 we can see that, under case 1 and CR = 0.2, the interval lengths based on BCEL are slightly longer than those based on NA, but with larger coverage probabilities; in the other cases, BCEL has shorter interval lengths and larger coverage probabilities than NA. BCEL also has shorter interval lengths and larger coverage probabilities than AEL. Furthermore, as $n$ increases, all average lengths decrease and the empirical coverage probabilities increase, and as CR increases, all average lengths increase and the empirical coverage probabilities decrease. When the sample size is large, the coverage probability based on BCEL is near the nominal level 0.95.
(III) The bias, standard deviation (SD) and mean squared error (MSE) of the estimators of $\beta_0$ were computed from 500 simulations with sample sizes 100, 150 and 200, respectively. The simulation results are presented in Tables 3 and 4. From Tables 3 and 4 we can see that, for the estimates of $\beta_{01}$ and $\beta_{02}$, the bias, SD and MSE decrease with increasing sample size and increase with increasing CR. In addition, under CIC, if (15) is used to estimate $G(t)$, the resulting bias, SD and MSE are larger than those obtained using (2). Similarly, under CDC, if (2) is used to estimate $G(t)$, the resulting bias, SD and MSE are also larger than those obtained using (15). This suggests that a misspecified estimator of the censoring distribution yields greater bias, SD and MSE.
(IV) The performance of the estimate of a(u) was assessed by 500 simulations with n = 100. Two methods, the residual-adjusted EL (RAEL) and the normal approximation (NA), were compared in terms of the average widths of the pointwise confidence intervals. The estimators âj (u) are assessed via the root mean squared error (RMSE), that is,

$$\mathrm{RMSE}_j = \Big[ n_{\mathrm{grid}}^{-1} \sum_{k=1}^{n_{\mathrm{grid}}} \{\hat a_j(u_k) - a_j(u_k)\}^2 \Big]^{1/2},$$

where âj (•), j = 0, 1, 2, is the jth component of â1 (u) or â2 (u) defined by (24) and (29), respectively, the number of grid points n grid is 15, and {u k , k = 1, . . ., n grid } are equidistant grid points. An overall RMSE over all components (All RMSE) is also considered. Approximate 95% pointwise confidence intervals and the boxplots of the 500 RMSEs are computed under CIC. The simulation results are presented in Figures 2 and 3. Figures 2 and 3 show that RAEL gives shorter pointwise confidence intervals than NA. The coverage probabilities range from about 0.935 to 0.945 in both cases. Furthermore, the RMSEs of the estimators {â j (t)} are very small. This means that the local linear estimates â1 (u) and â2 (u) are effective.
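The RMSE of an estimated coefficient function over 15 equidistant grid points can be computed as in the following sketch. The grid is assumed to include both endpoints of [0, 1], and the "true" and "estimated" curves are hypothetical stand-ins chosen so that the result is easy to verify; none of the names below come from the paper.

```python
import math
from typing import Callable, List


def equidistant_grid(n_grid: int, lower: float = 0.0, upper: float = 1.0) -> List[float]:
    """Equidistant grid on [lower, upper], endpoints included (an assumption)."""
    step = (upper - lower) / (n_grid - 1)
    return [lower + k * step for k in range(n_grid)]


def rmse_on_grid(
    estimate: Callable[[float], float],
    truth: Callable[[float], float],
    grid: List[float],
) -> float:
    """Root mean squared error between two curves evaluated on the grid."""
    squared = [(estimate(u) - truth(u)) ** 2 for u in grid]
    return math.sqrt(sum(squared) / len(grid))


def true_curve(u: float) -> float:
    # Hypothetical coefficient function.
    return math.sin(2.0 * math.pi * u)


def estimated_curve(u: float) -> float:
    # Hypothetical estimate: the truth shifted by a constant 0.1.
    return true_curve(u) + 0.1


grid = equidistant_grid(15)
rmse = rmse_on_grid(estimated_curve, true_curve, grid)
```

Because the hypothetical estimate differs from the truth by a constant 0.1 at every grid point, the RMSE equals 0.1 up to floating-point rounding.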

Application to AIDS data
In this section, we present the analysis of a data set derived from the AIDS clinical trial group (ACTG) study. The data set contains viral load, base ribonucleic acid (RNA) virus, CD4 and CD8 cell counts from 48 evaluable patients enrolled in the ACTG protocol 317. In this study, every patient was scheduled to be measured after initiation of antiviral therapy, with the number of observations per patient ranging from 2 to 91. Thus, there are a total of 317 observations, of which 20.19% are censored. This is a longitudinal data set. However, model (1) with censored data can be used to study this data set if there is no correlation. Here we ignored the correlation structure when computing the estimates, using the so-called working independence assumption. Working independence has some model-robustness advantages over other estimation methods [30].
One interest of clinical investigators is to study the effectiveness of antiviral medicines. The purpose of this study was to investigate the relationship between virologic and immunologic responses in AIDS clinical trials. We model the relationship among viral load, base RNA virus, CD4 and CD8 cell counts using model (1). Let Y i be the ith patient's log-transformed viral load, together with its censoring indicator, U i be the treatment time (day), X i1 be the ith patient's centred base RNA virus, and Z i1 and Z i2 be the ith patient's centred CD4 and CD8 cell counts, respectively. The logarithm transformation is commonly used in AIDS clinical trials. The advantage of using the centred covariates X i1 , Z i1 and Z i2 is that a 0 (u), a 1 (u), β 01 and β 02 have clear biological interpretations. Here a 0 (u) represents the baseline viral load, which can be interpreted as the mean viral load at u days after infection for a patient with average preinfection CD4 and CD8 cell counts and an average base RNA virus at infection. Then a 1 (u) describes the effect of base RNA virus at infection on the postinfection viral load at time u, and β 01 and β 02 describe the effects of preinfection CD4 and CD8 cell counts, respectively. We used the Epanechnikov kernel and the AMS criterion in Section 4.1 to select the bandwidth h. The bandwidth b was taken as 0.35. We computed the maximum EL estimate of β 0 and the local linear estimate of a(u), together with the corresponding approximate 95% pointwise confidence intervals using the RAEL and NA methods. We thus obtain an empirical regression model in which β̂1 , â1 (U), β̂2 and â2 (U) are calculated by (22), (24), (26) and (29), respectively.
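Two of the preprocessing ingredients named above, centring the covariates and weighting observations with the Epanechnikov kernel, can be sketched as follows. The kernel is the standard K(t) = 0.75(1 - t^2) on [-1, 1]; the helper names, the toy values and the bandwidth 0.5 are hypothetical, and the AMS bandwidth selection itself is not shown.

```python
from typing import List


def epanechnikov(t: float) -> float:
    """Epanechnikov kernel K(t) = 0.75 * (1 - t^2) for |t| <= 1, else 0."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0


def kernel_weights(u_obs: List[float], u0: float, h: float) -> List[float]:
    """Rescaled weights K_h(U_i - u0) = K((U_i - u0) / h) / h."""
    if h <= 0.0:
        raise ValueError("bandwidth h must be positive")
    return [epanechnikov((u - u0) / h) / h for u in u_obs]


def centre(values: List[float]) -> List[float]:
    """Subtract the sample mean so that the covariate has mean zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# Toy values (hypothetical): three observation times and one covariate.
weights = kernel_weights([0.0, 0.5, 2.0], u0=0.0, h=0.5)
centred = centre([1.0, 2.0, 3.0])
```

Observations farther than h from the target point u0 receive zero weight, which is what makes the local linear fit depend only on nearby data.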
From Figure 4, we find that the mean baseline viral load for the population decreases rather quickly after initial antiviral treatment, but the rate of decrease appears to slow down after 50 days; the base RNA virus level also decreases after initial antiviral treatment. This indicates that the treatment has an obvious effect. Furthermore, we can see from Figure 4 that RAEL provides narrower pointwise confidence intervals than the NA method. This also demonstrates the higher accuracy of RAEL relative to NA.
The determination coefficient R 2 new is used to evaluate the goodness of fit for nonlinear regression models with right censored data. For the case of CIC, R 2 new is defined through the estimator Ĝ(•) given in (2) and the predicted values Ŷ i of Y i , i = 1, . . ., n, where

$$\hat Y_i = X_i^{T}\hat\beta_1 + Z_i^{T}\hat a_1(U_i),$$

and β̂1 and â1 (•) are calculated by (22) and (24), respectively. For the case of CDC, just use Ĝ(V i |W i ), β̂2 and â2 (•) instead of Ĝ(V i −), β̂1 and â1 (•) in R 2 new . R 2 new and the determination coefficient R 2 in the linear model are identical in the sense of the goodness of fit. For the proposed prediction model (59), R 2 new is 0.9092 under CIC and 0.8731 under CDC. These high values of R 2 new indicate that our prediction model (59) is effective.

Concluding remarks
In this paper, we proposed the bias-corrected empirical log-likelihood ratio for the regression parameters and the residual-adjusted empirical log-likelihood ratio for the coefficient functions in model (1) with right censored data, and showed that each of the two ratios is asymptotically standard chi-squared. We constructed the estimators for both the parametric and the nonparametric components and obtained their asymptotic distributions and consistent estimators of the asymptotic variances. We also constructed the partial profile empirical log-likelihood ratios for each component of the regression parameters and coefficient functions, and showed that each of the two ratios is asymptotically standard chi-squared.
The obtained results can be directly used to construct the confidence regions/intervals of the regression parameters and the pointwise confidence intervals of the coefficient functions. The proposed method has two features. First, our method involves bias correction within the ratio rather than multiplying by an adjustment factor outside the ratio. We use the bias correction method to construct the EL ratios, and each of these ratios is asymptotically standard chi-squared. Second, the bias correction method is adopted to avoid undersmoothing of the coefficient functions. Simulation studies and real data analysis demonstrate the superiority and utility of the proposed method. Our approach can also be used to study other semiparametric regression models with right censored data, such as partially linear single-index models and single-index varying coefficient models.
It should be noted that Lemma A.1 gives the results when the censoring variable C is independent of the covariates. For the case where the censoring variable C depends on the covariates, the corresponding conclusions still hold. We have the following lemma.

Lemma A.2: Suppose that conditions

where $c_n = \{nh/\log(h^{-1})\}^{-1/2} + h$. Note that κ 0 = 1 and κ 1 = 0. From (A4), we can obtain that uniformly for u ∈ (0, 1),
where ⊗ is the Kronecker product. Using the same argument, we can also show that uniformly for u ∈ (0, 1), where Z = (Z 1 , . . ., Z n ) T . By a matrix identity valid for any invertible matrix A and any matrix B, it follows that uniformly for u ∈ (0, 1), Using (A5), (A6) and condition (C5), we obtain that uniformly for u ∈ (0, 1), This completes the proof of the first equation of Lemma A.4.
We now prove the second equation of Lemma A.4.
This completes the proof of the second equation of Lemma A.4 for k = 1. We now turn to the proofs of Theorems 3.1-3.5 using Lemmas A.1, A.2 and A.4.

Figure 2. For case 1, approximate 95% pointwise confidence intervals of a j (u), j = 0, 1, 2, based on RAEL (dashed curves) and NA (dot-dashed curves) when n = 100. The solid curve is the true curve, and the dotted curve is the estimated curve.

Figure 3. For case 2, approximate 95% pointwise confidence intervals of a j (u), j = 0, 1, 2, based on RAEL (dashed curves) and NA (dot-dashed curves) when n = 100. The solid curve is the true curve, and the dotted curve is the estimated curve.

Figure 4. Application to AIDS data. Estimated curves (solid curves) for the baseline viral load [(a) and (c)] and the effects of the base RNA virus on the viral load [(b) and (d)], and approximate 95% pointwise confidence intervals based on RAEL (dashed curves) and NA (dot-dashed curves). (a) and (b) are based on CIC; (c) and (d) are based on CDC.

Table 1. For case 1, average lengths and empirical coverage probabilities (in parentheses) of the confidence intervals for β 01 and β 02 under different sample sizes n when the nominal level is 0.95.

Table 2. For case 2, average lengths and empirical coverage probabilities (in parentheses) of the confidence intervals for β 01 and β 02 under different sample sizes n when the nominal level is 0.95.

Table 3. For case 1, the bias, SD and MSE of the estimators for β 01 and β 02 by 500 simulations under different CR and sample sizes n.

Table 4. For case 2, the bias, SD and MSE of the estimators for β 01 and β 02 by 500 simulations under different CR and sample sizes n.