Parametric hypothesis tests for exponentiality under multiplicative distortion measurement errors data

Abstract In this paper, we propose parametric hypothesis tests of exponentiality for the multiplicative distortion model, in which the exponentially distributed variable of interest is unobserved. The unobservable variable is distorted in a multiplicative fashion by an observed confounding variable. First, some new test statistics are proposed to check the exponential distribution assumption when there are no distortion effects. Next, we propose several test statistics for the case where the variable is distorted in the multiplicative fashion. For the latter, the proposed test statistics automatically eliminate the distortion effects involved in the unobserved variable. The proposed test statistics, with or without distortions, are all asymptotically distribution free, and their asymptotic null distributions are obtained with known asymptotic variances. We conduct Monte Carlo simulation experiments to examine the performance of the proposed test statistics. These methods are applied to four real datasets for illustration.


Introduction
In applications of data analysis, researchers are well aware that, among all variables of interest, some are measured imprecisely with various types of error, and some are in principle inaccessible so that only surrogates of them can be measured. For the estimation of parametric or nonparametric quantities, disregarding the measurement errors generally results in asymptotically biased, inconsistent estimators (Carroll et al. 2006; Fan 1991; Fuller 1987). For hypothesis testing problems, disregarding the measurement errors can produce erroneous conclusions due to incorrect Type-I and Type-II errors, even for large sample sizes. Various methods for measurement error correction have been offered in the literature to remove the bias of estimators or test statistics. These methods are based on model assumptions on the measurement errors. In this paper, we consider the multiplicative distortion errors-in-variables model

X̃ = ψ(U) X,   (1.1)

where X is an unobserved continuous variable of interest, further assumed to be a positive random variable. The variables (X̃, U) are both observable. In the literature, X̃ is usually called the distorted variable, and U ∈ R is called the confounding variable. The confounding variable U is assumed independent of X for the sake of identifiability of model (1.1). The distortion function ψ(u) is an unknown smooth function to be estimated. Note that ψ(·) distorts the unobserved X in a multiplicative fashion.
Recently, a large collection of works on mean or modal regression methodologies accounting for the distortion measurement error model (1.1) has appeared in various parametric or semiparametric settings, for example, linear or nonlinear regression with multiplicative distortions (Cui et al. 2009; Şentürk and Müller 2006; Wei, Fan, and Zhang 2016; Zhang, Yang, and Li 2020; Zhang and Zhou 2020), semiparametric regressions with multiplicative distortion measurement errors (Nguyen and Şentürk 2008; Zhang 2019, 2021; Zhang, Lin, et al. 2020; Zhang, Yang, Feng, et al. 2020; Zhang et al. 2016), and nonparametric regressions with multiplicative distortion measurement errors (Delaigle, Hall, and Zhou 2016; Feng, Gai, and Zhang 2019; Zhang and Cui 2021). Other topics on multiplicative distortion measurement errors include correlation coefficient estimation (Zhang 2022; Zhang, Feng, and Zhou 2014; Zhang, Xu, et al. 2023), symmetry detection for X (Zhang, Gai, et al. 2020), modal regression (Zhang et al. 2021), and model checking problems (Zhang, Li, and Feng 2015; Zhao and Xie 2018). To date, however, there has been no systematic study of testing the distributional assumption on X under the multiplicative distortion measurement error setting.
As the variable X is observed only through multiplicative distortion measurement errors, test statistics for checking the exponentiality of X cannot be used directly by simply substituting the observed X̃ for the unobserved X in the existing test statistics. There is no literature studying the exponentiality of a continuous variable X under the model setting (1.1); in this paper, we tackle this important problem. The exponential distribution is perhaps the one most used in the statistical literature on lifetime testing, reliability theory, and the theory of stochastic processes, where it is often assumed that the collected data are a random sample from the exponential distribution. Testing the exponential assumption has been a problem of continuing interest over time. For example, Epstein (1960a, 1960b) proposed several graphical and analytical procedures with numerical examples for testing the assumption that the underlying distribution of a random variable is exponential. Ebrahimi, Habibullah, and Soofi (1992) proposed a test of fit for exponentiality based on the estimated Kullback-Leibler information and Vasicek's estimator. Gail and Gastwirth (1978b) considered using the sample Lorenz curve for the goodness-of-fit test of exponentiality. Gail and Gastwirth (1978a) studied a scale-free test of exponentiality using the sample Gini index. Henze and Meintanis (2002b) used the empirical Laplace transform, and Henze and Meintanis (2002a) extended this approach by using the characteristic function for testing exponentiality. Grané and Fortiana (2009) proposed using Hoeffding's maximum correlation to test the composite hypothesis that the data come from the two-parameter exponential family. Volkova (2010) tested exponentiality based on Rossberg's characterization of the exponential law and studied the local Bahadur efficiency and local asymptotic optimality for some alternatives. Baratpour and Rad (2012) introduced a Kullback-Leibler divergence measure by using the
cumulative residual entropy. Torabi, Montazeri, and Grané (2018) proposed goodness-of-fit statistics based on a discrepancy measure between two distribution functions: the empirical one obtained from the sample under study and the hypothesized one. For other applications, Zamanzade (2019) used an empirical-distribution-function-based test statistic for testing exponentiality in the pair ranked set sampling design. Sankaran and Midhu (2016) proposed a nonparametric test of exponentiality based on the mean residual quantile function in reliability studies, and Zardasht, Parsi, and Mousazadeh (2015) used the asymptotic behavior of the empirical cumulative residual entropy based on a right-censored random sample from an unknown distribution to propose a goodness-of-fit test for the exponential distribution. Recently, Villaseñor and González-Estrada (2020) studied goodness-of-fit tests of exponentiality using a new estimator of the scale parameter: the scale parameter equals the covariance of a random variable X and its logarithmic transformation log(X) when X follows an exponential distribution (Villaseñor and González-Estrada 2015). With this property, Villaseñor and González-Estrada (2020) proposed a test statistic and showed that it asymptotically converges to a standard normal distribution when the distribution function of X is F_λ(x) = 1 − exp(−x/λ), x > 0.
Motivated by the work of Villaseñor and González-Estrada (2020), we propose several test statistics by using the fact that Cov(V, log(V)) = 1/4 when V follows the uniform distribution U[0, 1]. If X ~ F_λ(x), we have F_λ(X) ~ U[0, 1] and also 1 − F_λ(X) ~ U[0, 1]. Then Cov(F_λ(X), log(F_λ(X))) = Cov(1 − F_λ(X), log(1 − F_λ(X))) = 1/4. The latter equation is equivalent to Cov(X, exp(−X/λ)) = −λ/4. Based on this, we first propose test statistics ĵ_ns by considering no multiplicative distortion measurement errors (ψ(U) ≡ 1); in other words, we assume that the variable X is observable without multiplicative distortion effects. These test statistics ĵ_ns are constructed from moment estimators of Cov(X, exp(−X/λ)). Under the null hypothesis H_0, we consider using the maximum likelihood estimator of λ and also the covariance-based estimator λ̃ (Villaseñor and González-Estrada 2020) in these test statistics. We study the asymptotic behavior of the proposed test statistics and show that they are all asymptotically distribution free. Next, we construct test statistics under multiplicative distortion measurement errors; these are linked to the ĵ_ns through a calibrated version of X. An interesting finding is that the proposed test statistics under the multiplicative distortion measurement error setting are asymptotically as efficient as if there were no distortion effects in the unobserved X; moreover, the proposed test statistics based on the calibrated variables eliminate the multiplicative distortion effects asymptotically. Consequently, the proposed test statistics under multiplicative distortion measurement errors are also asymptotically distribution free. Usually, multiplicative distortion effects persist in the asymptotic expressions of test statistics or in the asymptotic variances of estimators; in this paper, however, the proposed test statistics are not affected by the multiplicative distortions. We conduct Monte Carlo simulation
experiments to study the performance of the proposed test statistics and compare them with some existing test statistics in the literature. In Sec. 2, we propose the test statistics without distortions and give their asymptotic results. In Sec. 3, we propose the test statistics under the multiplicative distortions and give their asymptotic results. In Sec. 4, simulation results are presented. In Sec. 5, four real datasets are analyzed with the proposed test statistics. Technical proofs of the theorems are given in the online supplementary material.
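The covariance identity that drives all of the proposed statistics can be checked numerically. The following Python snippet is an illustrative sketch only (it is not part of the original analysis; the sample size and seed are arbitrary choices): it verifies Cov(V, log(V)) = 1/4 for V ~ U(0, 1) and the transfer of this identity to an exponential sample via 1 − F_λ(X) = exp(−X/λ).

```python
import numpy as np

# Numerical check of the identity behind the proposed tests:
# if V ~ U(0, 1), then Cov(V, log V) = 1/4.
# Exact calculation: E[V log V] = -1/4, E[V] = 1/2, E[log V] = -1,
# so Cov(V, log V) = -1/4 - (1/2)(-1) = 1/4.
rng = np.random.default_rng(0)
v = rng.uniform(size=1_000_000)
cov_v = np.cov(v, np.log(v))[0, 1]
print(cov_v)  # close to 0.25

# For X ~ Exp(lam), 1 - F_lam(X) = exp(-X/lam) ~ U(0, 1), so the same
# identity transfers to the exponential sample.
lam = 2.0
x = rng.exponential(scale=lam, size=1_000_000)
cov_x = np.cov(np.exp(-x / lam), -x / lam)[0, 1]
print(cov_x)  # also close to 0.25
```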

Test statistics without distortions
The cumulative distribution function (cdf) of the exponential distribution with scale parameter λ > 0, denoted Exp(λ), is F_λ(x) = 1 − exp(−x/λ), x > 0. If X ~ Exp(λ), then V = 1 − F_λ(X) = exp(−X/λ) has a uniform distribution on the (0, 1) interval, denoted U(0, 1). Moreover,

Cov(1 − F_λ(X), log(1 − F_λ(X))) = 1/4,   (2.1)

which is equivalent to Cov(X, exp(−X/λ)) = −λ/4, and the corresponding correlation is

Corr(1 − F_λ(X), log(1 − F_λ(X))) = √3/2.   (2.2)

For a given i.i.d. sample {X_i}, i = 1, ..., n, of size n from a continuous cdf F(x) with support on the positive real line, we consider the problem of testing the hypothesis

H_0: F(x) = F_λ(x) for some λ > 0, versus H_1: F(x) ≠ F_λ(x) for all λ > 0.   (2.3)

Under the null hypothesis H_0, the maximum likelihood estimator of λ is the sample mean X̄ = n^{-1} Σ_{i=1}^n X_i, and our first test statistic ĵ_n1 is constructed from the moment estimator of Cov(X, exp(−X/λ)), with λ estimated by X̄. Moreover, under H_0 we also have E[exp(−X/λ)] = 1/2; then we propose the second test statistic ĵ_n2 based on the sample analogue of this moment.
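The moment quantity underlying these constructions can be illustrated numerically. The snippet below is a minimal sketch (not the paper's exact standardization); the seed, sample size, and the uniform alternative are hypothetical choices for illustration.

```python
import numpy as np

# A minimal numerical sketch (not the paper's exact standardization) of
# the moment quantity the tests are built from: under H0, with lambda
# estimated by the MLE X-bar, the sample covariance of exp(-X/lam_hat)
# and log(exp(-X/lam_hat)) = -X/lam_hat should be close to 1/4.
rng = np.random.default_rng(1)
n = 5000
x = rng.exponential(scale=2.0, size=n)
lam_hat = x.mean()                       # MLE of the scale under H0
u_hat = np.exp(-x / lam_hat)             # estimated 1 - F_lambda(X)
moment = np.cov(u_hat, np.log(u_hat))[0, 1]
print(moment)                            # close to 0.25 under exponentiality

# Under a non-exponential sample the same moment drifts away from 1/4,
# which is the deviation the test statistics detect.
y = rng.uniform(0.0, 4.0, size=n)        # a non-exponential alternative
u_alt = np.exp(-y / y.mean())
moment_alt = np.cov(u_alt, np.log(u_alt))[0, 1]
print(moment_alt)
```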
Theorem 1. Let X_1, ..., X_n be i.i.d. copies of a random variable X. Under the null hypothesis H_0, as n → ∞, the statistics ĵ_n1 and ĵ_n2 are asymptotically normal with known asymptotic variances. Using Theorem 1, for a sample of size n, α-level asymptotically distribution free tests of H_0 are obtained; at the significance level α, the rejection regions for the null hypothesis are determined by z_{1−α/2}, the 1 − α/2 quantile of the standard normal distribution, i.e., Φ(z_{1−α/2}) = 1 − α/2, α ∈ (0, 1), where Φ(·) is the distribution function of a standard normal distribution. Let c_n = c_0/√n for some 0 < c_0 < ∞ and let G(x) be a distribution function; consider the local alternative hypothesis

H_1n: the data come from (1 − c_n)F_λ(x) + c_n G(x), λ_0 > 0, x > 0.   (2.4)

Under H_1n, we have ĵ_n1 →_L N(ϑ_{G,λ}, 13/216), with ϑ_{G,λ} = c_0(0.25 − exp(−0.5)) ≈ −0.3565 c_0 when G(x) = Φ((x − λ)/λ), and ϑ_{G,λ} = c_0(0.5 − 2 exp(0.5)) ≈ −2.7974 c_0 when G(x) = Φ(x/λ). For the test statistic ĵ_n2, under the local hypothesis (2.4), as n → ∞, the corresponding shift is ν_{G,λ} = c_0(exp(−0.5) − 0.5) ≈ 0.1065 c_0 when G(x) = Φ((x − λ)/λ), and ν_{G,λ} = c_0(exp(−0.5) − 0.75) ≈ −0.1435 c_0 when G(x) = Φ(x/λ). Under the null hypothesis H_0, Villaseñor and González-Estrada (2020) and Villaseñor and González-Estrada (2015) obtained that λ = Cov(X, log(X)), and the moment-based estimator λ̃ of λ is the sample covariance of X_i and log(X_i).   (2.5)

Using this estimator λ̃, together with (2.2), we propose the third test statistic ĵ_n3. If we use the estimator λ̃ to estimate λ and substitute λ̃ for X̄ in the statistic ĵ_n2, we obtain the recently proposed test statistic d̂^VGE_n (Villaseñor and González-Estrada 2020). The asymptotic results of d̂^VGE_n have been well studied in Villaseñor and González-Estrada (2020), and we present the asymptotic results of λ̃ and ĵ_n3 as follows.
Theorem 2. Let X_1, ..., X_n be i.i.d. copies of a random variable X. Under the null hypothesis H_0, as n → ∞, λ̃ and ĵ_n3 are asymptotically normal, with asymptotic variance determined by ς² = π²/96 + (log 2)/8 + 53/432 ≈ 0.312137. Using Theorem 2, for a sample of size n, the proposed α-level asymptotically distribution free test of H_0 and its rejection region are obtained from ĵ_n3.
Let L(x) be a distribution function with L(x) > 0 when x > 0 and L(x) = 0 when x ≤ 0, and consider the corresponding local alternative hypothesis H_2n. The local shift of ĵ_n3 under H_2n has the form ϑ_{L,λ} = b_{L,λ}(1, 1) + λ(log(λ) − γ − 1) − (log(λ) − γ) b_{L,λ}(1, 0) − λ b_{L,λ}(0, 1), where γ is the Euler gamma constant. From Theorem 1 and Theorem 2, different estimators of λ result in different asymptotic variances; moreover, these asymptotic variances are all known constants under H_0. Next, we consider six further test statistics ĵ_ns, s = 4, ..., 9, obtained by combining the different estimators of λ with the moment identities above, and we present their asymptotic results in the following theorem.

Theorem 3. Let X_1, ..., X_n be i.i.d. copies of a random variable X. Under the null hypothesis H_0, as n → ∞, the statistics ĵ_ns, s = 4, ..., 9, are asymptotically normal with known asymptotic variances (one of which is ≈ 0.01385017). From Theorem 3, ĵ_n4 is asymptotically equivalent to ĵ_n6, ĵ_n5 is asymptotically equivalent to ĵ_n1, ĵ_n7 is asymptotically equivalent to ĵ_n9, and ĵ_n8 is asymptotically equivalent to ĵ_n3. Using Theorem 3, for a sample of size n, the proposed α-level asymptotically distribution free tests of H_0, denoted ĵ*_ns, and their rejection regions are obtained as before.
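Because all of the proposed statistics share one asymptotic structure (asymptotic normality with a known variance under H_0), the decision rule can be sketched generically. The function below is illustrative only: the centering and scaling details of the actual ĵ_ns are given by the theorems, and the input values here are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Sketch of the generic decision rule shared by the proposed tests: each
# statistic j_hat is asymptotically normal under H0 with a *known*
# variance sigma2 (e.g. 13/216 for j_n1 in Theorem 1), so the alpha-level
# two-sided test rejects H0 when sqrt(n)*|j_hat|/sqrt(sigma2) > z_{1-a/2}.
def normal_reject(j_hat, n, sigma2, alpha=0.05):
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # z_{1-alpha/2}
    return sqrt(n) * abs(j_hat) / sqrt(sigma2) > z

# Hypothetical input values, for illustration only:
print(normal_reject(j_hat=0.001, n=300, sigma2=13 / 216))  # False
print(normal_reject(j_hat=0.100, n=300, sigma2=13 / 216))  # True
```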

Conditional mean calibration
In this subsection, we consider the distortion measurement error setting. As the variable X is unobserved, we need to calibrate it using the observed i.i.d. sample {(X̃_i, U_i)}, i = 1, ..., n. To ensure identifiability of model (1.1), it is assumed that

E[ψ(U)] = 1.   (2.7)

The identifiability condition (2.7) is also called the "no average distortion" condition, introduced by Şentürk and Müller (2005). This condition plays a role similar to the zero-mean assumption on the error in the classical additive measurement error model, in which the observed variable equals an error-free variable plus a mean-zero error (de Castro and Vidal 2019; Li, Zhang, and Feng 2016; Tomaya and de Castro 2018; Yang, Tong, and Li 2019). For the conditional mean calibration (CMC), the following Assumption A(1) is imposed on the distorting function ψ(u). Assumption A(1): ψ(u) > 0 for all u ∈ [U_L, U_R], where [U_L, U_R] denotes the compact support of U.
Assumption A(1) was introduced in Şentürk and Müller (2005, 2006) and further used in the CMC estimation methods (Cui et al. 2009; Zhang, Gai, et al. 2020; Zhang, Li, and Feng 2015; Zhang, Zhu, and Liang 2012). Under the independence between U and X, the identifiability condition (2.7) and Assumption A(1) entail that E(X̃ | U) = ψ(U)E(X | U) = ψ(U)E(X) = ψ(U)E(X̃), so that ψ(U) = E(X̃ | U)/E(X̃). The Nadaraya-Watson estimator (Nadaraya 1964; Watson 1964) can be used to estimate E(X̃ | U = u), and the estimator of ψ(u) is defined as

ψ̂_C(u) = [Σ_{i=1}^n K_h(U_i − u) X̃_i / Σ_{i=1}^n K_h(U_i − u)] / (n^{-1} Σ_{j=1}^n X̃_j).

Here K_h(·) = h^{-1} K(·/h), K(·) denotes a kernel density function, and h is a bandwidth. Note that the CMC estimation is the same as the conditional absolute mean calibration (CAMC) proposed in Delaigle, Hall, and Zhou (2016); Zhang (2021); Zhang, Lin, et al. (2020); Zhao and Xie (2018): under the positivity of X and ψ(u) > 0 in Assumption A(1), we have E(|X̃| | U) = E(X̃ | U) and E(|X̃|) = E(X̃), so the CMC estimation method used here coincides with the CAMC estimation method. Using the estimator ψ̂_C(u), we obtain the CMC variables {X̂_{C,i}}, i = 1, ..., n, as X̂_{C,i} = X̃_i / ψ̂_C(U_i). When some values of the estimated distortion function ψ̂_C(U_i) are close to zero, one can add a small constant such as 1/n in practice, that is, X̂_{C,i} = X̃_i / (ψ̂_C(U_i) + n^{-1}) for such i; this is a commonly used remedy in nonparametric regression estimation. For the estimator ψ̂_C(u), an asymptotic bias of order O(h²) exists. Under Condition (C3) below, which requires nh⁴ → 0 (equivalently √n h² → 0), the nonparametric bias O(h²) in ψ̂_C(u) is asymptotically negligible, O(h²) = o(n^{-1/2}); thus the bias converges to zero faster than the parametric rate O(n^{-1/2}). Consequently, the nonparametric estimation error ψ̂_C(u) − ψ(u) does not affect the asymptotic results for the root-n consistent estimators and test statistics below.
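The calibration step above can be sketched in a few lines. The following Python sketch is illustrative only: the distortion ψ(u) = 1 + 0.3u, the bandwidth, the sample size, and the seed are hypothetical choices (not the paper's settings), and the small-value remedy mirrors the 1/n adjustment described above.

```python
import numpy as np

# Illustrative sketch of conditional mean calibration (CMC), assuming the
# model X_tilde = psi(U) * X with E[psi(U)] = 1.
def epanechnikov(t):
    return 0.75 * (1.0 - t ** 2) * (np.abs(t) < 1.0)

def cmc_calibrate(x_tilde, u, h):
    # Nadaraya-Watson estimate of E(X_tilde | U = U_i) at each sample point
    k = epanechnikov((u[:, None] - u[None, :]) / h)
    cond_mean = k @ x_tilde / np.maximum(k.sum(axis=1), 1e-12)
    psi_hat = cond_mean / x_tilde.mean()      # psi_hat(U_i) = Ehat(X~|U_i)/mean(X~)
    n = len(x_tilde)
    psi_hat = np.where(psi_hat < 1e-8, psi_hat + 1.0 / n, psi_hat)  # 1/n remedy
    return x_tilde / psi_hat                  # calibrated variables X_hat_{C,i}

rng = np.random.default_rng(2)
n = 2000
x = rng.exponential(scale=2.0, size=n)        # unobserved X ~ Exp(2)
u = rng.uniform(-1.0, 1.0, size=n)            # confounder, independent of X
x_tilde = (1.0 + 0.3 * u) * x                 # psi(u) = 1 + 0.3u, E[psi(U)] = 1
x_cal = cmc_calibrate(x_tilde, u, h=0.3)
print(x.mean(), x_tilde.mean(), x_cal.mean()) # all close to lambda = 2
```

After calibration, the dependence of the observed data on the confounder is largely removed, so the no-distortion statistics can be applied to the X̂_{C,i}.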
In the previous section, the test statistics ĵ_ns and ĵ*_ns involved different estimators of λ (or E(X)) under the null hypothesis H_0, although some of the test statistics are asymptotically equivalent. Under the multiplicative distortion measurement error setting, we first discuss, under H_0, the estimators of λ (or E(X)). The estimators λ̂_{Cs}, s = 1, ..., 4, are defined by applying the sample mean and the covariance-based estimator of Section 2 to the calibrated variables X̂_{C,i}. We now list the assumptions needed in the following theorems under the multiplicative distortion measurement error setting.
(C1) The distortion function ψ(u) has three continuous derivatives on u ∈ [U_L, U_R]. Moreover, the density function f_U(u) of the random variable U is bounded away from 0 and satisfies a Lipschitz condition of order 1 on [U_L, U_R].
(C2) The kernel function K(·) is a symmetric bounded density function supported on [−A, A] satisfying a Lipschitz condition. K(·) also has second-order continuous bounded derivatives, satisfying 0 < ∫ s²K(s) ds < ∞.
(C3) As n → ∞, the bandwidth h satisfies (log² n)/(nh²) → 0 and nh⁴ → 0.
These are mild conditions that are satisfied in most practical situations. Condition (C1) is usually imposed to estimate the unknown distortion function under some smoothness conditions; see, for example, Delaigle, Hall, and Zhou (2016); Zhang (2021, 2022); Zhang, Chen, et al. (2023); Zhang, Lin, et al. (2020); Zhang, Lin, et al. (2023); Zhang, Xu, et al. (2023); Zhang and Zhou (2020); Zhou and Zhang (2022). Condition (C2) is the usual condition on the kernel function K(·); the Epanechnikov kernel satisfies it. Condition (C3) is the condition on the bandwidth h in nonparametric kernel smoothing. The under-smoothing condition nh⁴ → 0 ensures that the bias of the nonparametric estimation is asymptotically negligible relative to the parametric convergence rate O(n^{-1/2}).

Theorem 4. Suppose conditions (C1)-(C3) and Assumption A(1) hold. Under the null hypothesis H_0, as n →
∞, the estimators λ̂_{Cs}, s = 1, ..., 4, are asymptotically normal with explicit asymptotic variances. From Theorem 4, it is seen that the estimators λ̂_{C1} and λ̂_{C2} are asymptotically equivalent, since their asymptotic variances are equal; the estimators λ̂_{C3} and λ̂_{C4} are asymptotically equivalent, as their asymptotic variances are also equal. Moreover, λ̂_{C1}, λ̂_{C2} are asymptotically better than λ̂_{C3}, λ̂_{C4}: the asymptotic variances of λ̂_{C1}, λ̂_{C2} are smaller than those of λ̂_{C3}, λ̂_{C4}. An estimator of Var(ψ(U)) is constructed from the fitted values ψ̂_C(U_i). Under H_0, the (1 − α) × 100% (0 < α < 0.5) asymptotic confidence intervals of λ based on the asymptotic normality (NA) results of λ̂_{Cs}, s = 1, ..., 4, can be constructed from the estimated asymptotic variances together with z_{1−α/2}. These asymptotic intervals have the disadvantage that the variance estimates V̂_{2,α} and Ŝ_{2,α} may be negative for a finite sample size.
To avoid this, we can use the variance-stabilizing transformation (VST) method, which yields asymptotic intervals of the same level based on transformed statistics. From Theorem 4, the estimators λ̂_{Ck}, k = 1, 2, are asymptotically better than λ̂_{Ct}, t = 3, 4. Under the null hypothesis H_0, Theorem 3 tells us that the test statistic ĵ_n1 is asymptotically equivalent to ĵ_n5, ĵ_n3 is asymptotically equivalent to ĵ_n8, ĵ_n4 is asymptotically equivalent to ĵ_n6, and ĵ_n7 is asymptotically equivalent to ĵ_n9. The different asymptotic behaviors of ĵ_ns, s = 1, ..., 9, are caused by the choice of estimator of λ. Under the multiplicative distortion measurement error setting, the estimators λ̂_{C1}, λ̂_{C2} are analogous to the "no distortions" estimator X̄, and the estimators λ̂_{C3}, λ̂_{C4} are analogous to the "no distortions" estimator λ̃. Analogous to ĵ_n1 and ĵ_n5 without distortions, we use the calibrated variables {X̂_{C,i}}, i = 1, ..., n, to construct the test statistics ĵ_{C,n1} and ĵ_{C,n5}.

Theorem 5. Suppose conditions (C1)-(C3) and Assumption A(1) hold. Under the null hypothesis H_0, as n → ∞, ĵ_{C,n1} and ĵ_{C,n5} have the same asymptotic null distributions as ĵ_n1 and ĵ_n5. Theorem 5 shows that the statistics ĵ_{C,n1}, ĵ_{C,n5} are both asymptotically equivalent to ĵ_n1 (or ĵ_n5), as if there were no distortion effects (ψ(U) ≡ 1). At the significance level α, the rejection regions for the null hypothesis H_0 follow as before.
Following the test statistic ĵ_n2 and the d̂^VGE_n statistic without distortions, we propose the corresponding two statistics ĵ_{C,n2} and d̂^VGE_{C,n}.

Theorem 6. Suppose conditions (C1)-(C3) and Assumption A(1) hold. Under the null hypothesis H_0, as n → ∞, ĵ_{C,n2} and d̂^VGE_{C,n} have the same asymptotic null distributions as their no-distortion counterparts. At the significance level α, the rejection regions for the null hypothesis H_0 follow as before.
Analogous to the test statistics ĵ_n3, ĵ_n4, ĵ_n6, ĵ_n7, ĵ_n8 and ĵ_n9, we propose the corresponding test statistics ĵ_{C,n3}, ĵ_{C,n4}, ĵ_{C,n6}, ĵ_{C,n7}, ĵ_{C,n8} and ĵ_{C,n9} under the multiplicative distortion measurement error setting.

Theorem 7. Suppose conditions (C1)-(C3) and Assumption A(1) hold. Under the null hypothesis H_0, as n → ∞, the statistics ĵ_{C,n3}, ĵ_{C,n4}, ĵ_{C,n6}, ĵ_{C,n7}, ĵ_{C,n8} and ĵ_{C,n9} have the same asymptotic null distributions as their no-distortion counterparts. Similar to Theorem 6, the theoretical results in Theorem 7 show that these statistics under multiplicative distortions are asymptotically equivalent to the statistics ĵ_n3, ĵ_n4, ĵ_n6, ĵ_n7, ĵ_n8 and ĵ_n9, whether or not the distortion effects exist. The proposed α-level asymptotically distribution free tests of H_0 and their rejection regions at the significance level α follow as before.

Implementation
In this section, we present some simulation results to show the performance of the proposed test statistics and estimators.
Example 1. In this simulation, the random variable X is generated from the exponential distribution with scale parameter λ = 2, F_0(x) = 1 − exp(−x/2), x > 0. 5000 realizations are generated, with sample sizes n = 50, n = 100, n = 300 and n = 500, respectively. The test statistics ĵ*_ns, s = 1, ..., 9, are conducted to check the exponentiality of X. The significance levels are chosen as α = 0.01, α = 0.05 and α = 0.10. For comparison, we conduct the Kolmogorov-Smirnov test statistics (Bai and Kalaj 2021; Durbin 1973; Zhang, Zhang, et al. 2020): the KS-T test statistic with the true parameter λ = 2 (R command ks.test({X_i}, "pexp", 1/2)), the KS-M test statistic with the maximum likelihood estimator X̄ of λ (R command ks.test({X_i}, "pexp", 1/X̄)), and the KS-C test statistic with the moment-based estimator λ̃ in (2.5) (R command ks.test({X_i}, "pexp", 1/λ̃)); the Anderson-Darling test statistics (Anderson and Darling 1952): AD-T with the true parameter λ = 2 (R command ad.test({X_i}, "pexp", 1/2)), AD-M with the maximum likelihood estimator X̄ of λ (R command ad.test({X_i}, "pexp", 1/X̄)), and AD-C with the moment-based estimator λ̃ in (2.5) (R command ad.test({X_i}, "pexp", 1/λ̃)); and the Cramér-von Mises test statistics (Csörgő and Faraway 1996; Zhang, Zhang, et al. 2020): CVM-T with the true parameter λ = 2 (R command cvm.test({X_i}, "pexp", 1/2)), CVM-M with the maximum likelihood estimator X̄ of λ, and CVM-C with the moment-based estimator λ̃ in (2.5). The simulation results in Table 1 show that the test statistics KS-T, AD-T and CVM-T perform well only when the true value λ = 2 is used, even when the sample size n is small. If these test statistics are conducted using the estimators X̄ or λ̃ computed from the generated dataset instead of the true value λ = 2, they cannot attain the correct rejection probabilities. In detail, KS-M, AD-M and CVM-M (using the estimator X̄) have much lower rejection probabilities, while KS-C, AD-C and CVM-C (using the estimator λ̃) have much larger rejection probabilities. This suggests that the Kolmogorov-Smirnov, Anderson-Darling and Cramér-von Mises test statistics cannot be used directly with estimated parameters. For the d̂^VGE*_n test statistic, the simulation results in Table 1 show that it performs similarly to ĵ*_n5 and can attain correct Type-I errors. In Figure 1, we present the histogram plots of the p-values of ĵ*_nk, k = 1, ..., 9, under the null hypothesis H_0 when the sample size n is 500. The histogram plots show that these p-values follow the uniform distribution Unif(0, 1), which further implies that the test statistics ĵ*_nk can be used to test the null hypothesis of exponentiality.
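The phenomenon described above for KS-T versus KS-M can be reproduced with scipy's analogue of the quoted R calls. The sketch below is illustrative only (sample sizes, replication count, and seed are arbitrary choices, not the paper's settings); scipy parametrizes the exponential by (loc, scale) rather than by the rate used in ks.test.

```python
import numpy as np
from scipy.stats import kstest

# Reproduce the Example 1 phenomenon: with the true scale lambda = 2
# (KS-T) the Kolmogorov-Smirnov test holds its level, but plugging in
# the MLE X-bar (KS-M) makes the usual critical values invalid.
rng = np.random.default_rng(3)
alpha, n, reps = 0.05, 100, 2000
rej_true = rej_mle = 0
for _ in range(reps):
    x = rng.exponential(scale=2.0, size=n)
    rej_true += kstest(x, "expon", args=(0, 2.0)).pvalue < alpha      # KS-T
    rej_mle += kstest(x, "expon", args=(0, x.mean())).pvalue < alpha  # KS-M
print(rej_true / reps)  # near the nominal 0.05
print(rej_mle / reps)   # far below 0.05: conservative, as in Table 1
```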

Example 2
In this example, we study the power functions of the test statistics in Example 1. The alternative hypothesis is considered as follows: the data are generated from

(1 − C_0)F_0(x) + C_0 F_1(x),   (5.1)

where F_0(x) is the same as in Example 1, F_1(x) = 1 − exp(−0.15x), x > 0, and C_0 = 0.2. 5000 realizations are generated in this example. In Example 1, we found that the test statistics KS-M, KS-C, AD-M, AD-C, CVM-M and CVM-C cannot attain the correct Type-I errors, so their power functions under H_1 are meaningless when we use the Neyman-Pearson decision rule. The simulation results are shown in Table 2. Some of the test statistics perform the worst in general, since their rejection probabilities are all smaller than the others' and are not even better than those of KS-T, CVM-T and d̂^VGE*_n. In Table 3, we consider the alternative hypothesis (5.1) with C_0 = 0.01, and the sample size n is chosen as n = 1000, n = 5000, n = 10000 and n = 20000. From this table, it is seen that the test statistics ĵ*_n7 and ĵ*_n9 outperform the rest of the test statistics ĵ*_nk as well as KS-T, AD-T, CVM-T and d̂^VGE*_n. The power functions of ĵ*_n7 and ĵ*_n9 have the largest values, close to one when the sample size n is 20000. For such a small value C_0 = 0.01, the test statistics ĵ*_n7 and ĵ*_n9 can detect the alternative hypothesis when the sample size n is very large, while KS-T and CVM-T fail: their power functions are only around 0.35 even when n = 20000. From Tables 1 to 3, the simulation results coincide with the theoretical result in Theorem 3 that ĵ*_n4 is asymptotically equivalent to ĵ*_n6 when the sample size n is large. The power functions of ĵ*_n1 and ĵ*_n5 are similar, ĵ*_n7 and ĵ*_n9 are similar, and ĵ*_n3 and ĵ*_n8 are similar in general.

Example 3. In this example, we consider the estimation and the 95% confidence interval of λ under the multiplicative distortion measurement error setting. The variable X is generated in the same way as in Example 1. 5000 realizations are generated, with sample sizes n = 300, n = 500 and n = 1000, respectively. In this example, the Epanechnikov kernel K(t) = 0.75(1 − t²)I{|t| < 1} is used to obtain the estimator ψ̂_C(u). Condition (C3) requires under-smoothing of h (nh⁴ → 0) for the nonparametric estimates. The asymptotic variances of the estimators λ̂_{Ck} depend on neither the bandwidth h nor the kernel function K(t).
Hence, we can use a rule-of-thumb bandwidth; this choice of h is fairly effective and easy to implement in practice.
In Table 4, we report the mean (M), the standard error (SD), the mean squared error (MSE), and the 95% confidence intervals of λ̂_{Ct}, t = 1, 2, 3, 4. From Table 4, we find that the estimators λ̂_{Ct} estimate the true λ = 2 well. When the sample size n increases, the MSE values decrease to zero. The estimators (λ̂_{C1}, λ̂_{C2}) perform similarly, and (λ̂_{C3}, λ̂_{C4}) also perform similarly, since their SD and MSE values differ only slightly. Moreover, (λ̂_{C1}, λ̂_{C2}) perform better than (λ̂_{C3}, λ̂_{C4}), as the MSE values of (λ̂_{C3}, λ̂_{C4}) are almost two and a half times those of (λ̂_{C1}, λ̂_{C2}). The simulation results coincide with Theorem 4, in that the estimators (λ̂_{C1}, λ̂_{C2}) are asymptotically better than (λ̂_{C3}, λ̂_{C4}).

Table 4. Simulation results of Mean (M), Standard Error (SD), Mean Squared Error (MSE) and confidence intervals of λ based on the estimators λ̂_{Cs}, s = 1, ..., 4. "Lower" stands for the lower bound, "Upper" for the upper bound, "AL" for average length, and "CP" for the coverage probabilities. MSE is in the scale of 10.

For the 95% confidence intervals, the asymptotic confidence intervals based on NA and VST also have good performance when the sample size n ≥ 500. Generally, the VST confidence intervals perform slightly better than the NA confidence intervals, since their average lengths are slightly shorter. It is seen that the lower bounds increase and the upper bounds decrease as the sample size increases; moreover, all the coverage probabilities are around 95%. For the NA or VST confidence intervals, those based on (λ̂_{C1}, λ̂_{C2}) perform better than those obtained from (λ̂_{C3}, λ̂_{C4}): the lower bounds based on (λ̂_{C1}, λ̂_{C2}) are larger than those of (λ̂_{C3}, λ̂_{C4}), the upper bounds are smaller, and the average lengths are almost 0.625 times those based on (λ̂_{C3}, λ̂_{C4}). The simulation results in this table suggest using the estimators (λ̂_{C1}, λ̂_{C2}) in practice. The simulation results in Table 4 also confirm the asymptotic results in Theorem 4.
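The coverage probabilities ("CP") and average lengths ("AL") reported in Table 4 come from a standard Monte Carlo confidence-interval experiment, which can be sketched as follows. This sketch is illustrative only: it uses the simplest normal-approximation interval based on the raw sample mean of exponential data, whereas the paper's λ̂_{Cs} intervals use the calibrated variables and the estimated variance of ψ(U).

```python
import numpy as np
from statistics import NormalDist

# Sketch of the Monte Carlo CI experiment behind Table 4, using the
# simplest normal-approximation (NA) interval for lambda based on X-bar:
# for Exp(lambda), sd(X) = lambda, so se(X-bar) is estimated by X-bar/sqrt(n).
rng = np.random.default_rng(4)
lam, n, reps = 2.0, 500, 2000
z = NormalDist().inv_cdf(0.975)               # z_{1 - alpha/2} for alpha = 0.05
cover, lengths = 0, []
for _ in range(reps):
    x = rng.exponential(scale=lam, size=n)
    lam_hat = x.mean()
    half = z * lam_hat / np.sqrt(n)           # half-width of the NA interval
    cover += lam_hat - half <= lam <= lam_hat + half
    lengths.append(2.0 * half)
print(cover / reps)                           # coverage probability ("CP"), near 0.95
print(float(np.mean(lengths)))                # average length ("AL")
```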
In Table 5, we report the simulation results of the statistics ĵ*_{C,ns} and d̂^VGE*_{C,n} for testing model (5.1) with C_0 = 0, that is, under the null hypothesis H_0. The sample size n is chosen as n = 300, n = 500, n = 1000, n = 5000, n = 10000 and n = 20000. In this table, the proposed tests work well when the sample size n gets large. When the sample size n is 300, the test statistics ĵ*_{C,n2}, ĵ*_{C,n8} and d̂^VGE*_{C,n} have slightly larger Type-I errors, and they perform better when n ≥ 500. The simulation results in this table show that these test statistics attain correct Type-I errors when the sample size n is large, which coincides with the theoretical results in Theorems 5-7.

In Table 6, we report the simulation results of the test statistics ĵ*_{C,ns} and d̂^VGE*_{C,n} for testing model (5.1) with C_0 = 0.01, that is, under the alternative hypothesis H_1. For such a small value C_0 = 0.01, when the sample size n ≥ 500, the test statistics ĵ*_{C,n7} and ĵ*_{C,n9} perform the best, since the values of their power functions are larger than the others'. When the sample size n increases to 20000, the power functions of ĵ*_{C,n4} and ĵ*_{C,n6} are still around 0.3 when α = 0.01; ĵ*_{C,n3} and ĵ*_{C,n8} perform better, but their power functions are still around 0.6. The simulation results in this table coincide with Table 3, in that ĵ*_{C,n7} and ĵ*_{C,n9} are the best for the alternative hypothesis with the small value C_0 = 0.01. In Table 7, the test statistics ĵ*_{C,ns} and d̂^VGE*_{C,n} all work well, since their power functions are all close to one even when the sample size is n = 300. The simulation results in Tables 6 and 7 show that the test statistics ĵ*_{C,n7} and ĵ*_{C,n9} are the most powerful against the alternative hypothesis (5.1) with F_1(x) = 1 − exp(−0.15x), x > 0. Because of space limitations, we only study the alternative hypothesis with this choice of F_1(x) in this paper. The best choice of test statistic should be analyzed case by case, since different sample sizes or different choices of the alternative F_1(x) will result in different performances of the test statistics ĵ*_{C,ns}; see also the similar phenomenon in Table 2.
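The power experiments above can be sketched generically: draw data from the contamination alternative (5.1) and record the Monte Carlo rejection rate of a correctly calibrated test. The sketch below is illustrative only; KS-T (with the true λ = 2) stands in for the paper's statistics, and C_0, n, the replication count, and the seed are arbitrary choices.

```python
import numpy as np
from scipy.stats import kstest

# Sketch of the power experiments: data are drawn from the contamination
# alternative (1 - C0)*F0 + C0*F1 with F0 = Exp(2) and
# F1(x) = 1 - exp(-0.15 x); the rejection rate estimates the power.
rng = np.random.default_rng(5)
c0, n, reps, alpha = 0.2, 300, 1000, 0.05
rej = 0
for _ in range(reps):
    from_f1 = rng.random(n) < c0                              # mixture labels
    x = np.where(from_f1,
                 rng.exponential(scale=1.0 / 0.15, size=n),   # F1 component
                 rng.exponential(scale=2.0, size=n))          # F0 component
    rej += kstest(x, "expon", args=(0, 2.0)).pvalue < alpha
print(rej / reps)  # estimated power of KS-T at C0 = 0.2
```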

Data analysis
In this section, we present four examples to illustrate the proposed test statistics. The first three examples are cases without distortion measurement errors, and the last one involves multiplicative distortion measurement errors.
For the d̂^VGE*_n test statistic, its value is 0.3941 and the associated p-value is 0.6935. This again suggests that the exponential distribution fits the data well.

Example 4
In this example, we analyze the Pima Indian Diabetes data as an illustration. This dataset is available online (https://www.kaggle.com/datasets/uciml/pima-indiansdiabetes-database). In the literature, the triceps skin fold thickness (SFT) is usually treated as the confounding variable U; see, for example, Nguyen and Şentürk (2008); Zhang (2022); Zhang, Yang, and Li (2020). We test the exponentiality of the unobserved insulin-X (2-hour serum insulin, mu U/ml) for this dataset. In Figure 5, we present the estimated pattern of ψ(u); the estimator ψ̂_C(u) and its 95% confidence bands are obtained by the local constant estimation method (Fan and Gijbels 1996). The plot in Figure 5 indicates that the distortion function is not a constant function, suggesting that the confounding variable SFT has an effect on the observed insulin-X̃ in this dataset. For this dataset, if X followed an exponential distribution, λ̂_{C1} = 155.5482, with NA-based 95% confidence interval (141.2502, 173.0669) and VST-based 95% confidence interval (140.5736, 172.1180); however, λ̂_{C3} = 67.6855, with NA-based 95% confidence interval (58.2429, 80.7824) and VST-based 95% confidence interval (57.5554, 79.5987). The estimators of λ and their 95% confidence intervals differ substantially, which further suggests that the unobserved insulin-X cannot follow an exponential distribution (p-values < 1 × 10^{-16}). These test statistics reject the exponential distribution of the unobserved insulin-X, since their p-values are all close to zero and much smaller than 0.01 at the significance level α = 0.01. In Figure 6, we present the histograms of the observed insulin-X̃ and the calibrated insulin-X̂_C. Although the features of the two histograms show exponential patterns, the test statistics ĵ*_{C,ns}, d̂^VGE*_{C,n} and the values of λ̂_{C1}, λ̂_{C3} suggest that the unobserved insulin-X does not follow an exponential distribution.

Discussions and further research
In this paper, we consider the estimation and hypothesis testing for the exponentiality of a positive random variable with or without multiplicative distortion measurement errors. The asymptotic properties, such as the local properties of these test statistics without distortions, are established, and some comparisons of their rejection probabilities are made through simulation studies. The proposed test statistics are all asymptotically efficient under the multiplicative distortion settings, as the distortion function is asymptotically eliminated from the asymptotic variances when the null hypothesis holds. Our results in this paper enrich the literature on multiplicative distortion measurement errors in several regards. In future work, other test procedures for the goodness-of-fit of distribution functions can be considered, such as hypothesis tests for the uniform or normal distributions under multiplicative distortions. This paper focuses on the standard condition that the confounding variable U is independent of the unobserved variable X. If this independence condition fails, other techniques should be considered to solve the problem without imposing the independence condition; see, for example, Zhang, Lin, et al. (2023). This is merely a starting point for the goodness-of-fit testing of distribution functions, and research on this topic is ongoing.

Disclosure statement
No potential conflict of interest was reported by the authors.

Figure 3. The histogram for variable X (the failure times) in Example 2.

Figure 4. The histogram for variable X (the remission times) in Example 3.

Figure 6. Histograms of the observed insulin-X̃ (left panel) and the calibrated insulin-X̂_C (right panel).
The statistics ĵ_{C,n1}, ĵ_{C,n5} under the multiplicative distortion measurement errors setting are asymptotically distribution free. Using Theorem 5, for a sample of size n, the proposed α-level asymptotically free tests of H₀ reject when the corresponding statistic exceeds its asymptotic critical value. The theoretical results in Theorem 6 show that the statistics ĵ_{C,n2} and V̂GE_{C,n} under multiplicative distortions are also asymptotically equivalent to the statistics ĵ_{n2} and V̂GE_n without distortions, so the corresponding α-level asymptotically free tests of H₀ are obtained in the same way.

In Table 1, the simulation results for sample sizes n = 50, 100, 300, and 500 are reported. The benchmarks are the Kolmogorov-Smirnov, Anderson-Darling, and Cramér-von Mises test statistics with the moment-based estimator λ̂ in (2.5) (e.g., the R command cvm.test({X̃_i}ⁿ_{i=1}, "pexp", 1/λ̂) for the CVM-C statistic), and the V̂GE⋆_n test statistic (Villaseñor and González-Estrada 2020). The performances of the test statistics ĵ⋆_{ns} are summarized as follows. (I) When the sample size n is 50, the test statistic ĵ⋆_{n6} generally performs the best, since its rejection probabilities are stable and all close to 0.01, 0.05, and 0.10. The other statistics do not perform better than ĵ⋆_{n6}: for example, the test statistic ĵ⋆_{n8} has rejection probability 0.0238, much larger than 0.01, and the rejection probabilities of ĵ⋆_{n1}, ĵ⋆_{n2}, ĵ⋆_{n3}, and ĵ⋆_{n9} are around 0.007, smaller than 0.01. The rejection probabilities of ĵ⋆_{n3}, ĵ⋆_{n7}, and ĵ⋆_{n9} are around 0.04, smaller than 0.05, and the rejection probability of ĵ⋆_{n7} is 0.0746, smaller than 0.10. (II) When the sample size n is 100, all the test statistics perform better. The test statistics ĵ⋆_{n5} and ĵ⋆_{n6} perform the best, and ĵ⋆_{n1} and ĵ⋆_{n3} perform the second best. (III) When the sample size n is 300 or 500, most of the test statistics ĵ⋆_{ns} attain the nominal Type-I errors.
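The classical benchmark tests referred to above are also available in SciPy; the sketch below is a hypothetical Python analogue of the R call mentioned in the text (cvm.test), with the exponential scale estimated by the method of moments on a stand-in sample.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.exponential(scale=2.0, size=300)   # stand-in sample; scale plays the role of lambda

lam_hat = x.mean()                          # moment-based estimator of the exponential mean

# Kolmogorov-Smirnov test against the fitted Exp(scale = lam_hat)
ks = stats.kstest(x, 'expon', args=(0, lam_hat))

# Cramér-von Mises test against the same fitted exponential
cvm = stats.cramervonmises(x, 'expon', args=(0, lam_hat))

# Anderson-Darling test for exponentiality (parameters estimated internally)
ad = stats.anderson(x, dist='expon')

print(ks.pvalue, cvm.pvalue, ad.statistic)
```

One caveat consistent with the paper's findings: plugging an estimated scale into kstest or cramervonmises makes the reported p-values conservative, since those routines assume fully specified null parameters; this is one source of the instability across estimators of λ noted above.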
For different estimators of λ, the Anderson-Darling test statistic and the Cramér-von Mises test statistic are non-robust and unstable.

Table 1. Under the null hypothesis H₀, simulation results of rejection probabilities of the test statistics ĵ⋆_{ns}, s = 1, …, 9, the Kolmogorov-Smirnov test statistics KS-T, KS-M, KS-C, the Anderson-Darling test statistics AD-T, AD-M, AD-C, the Cramér-von Mises test statistics CVM-T, CVM-M, CVM-C, and Villaseñor and González-Estrada (2020)'s test statistic V̂GE.

In Table 2, the simulation results with sample sizes n = 50, 100, 300, and 500 are reported, respectively. (I) When the sample size n is 50, the test statistic ĵ⋆_{n4} performs the best and ĵ⋆_{n6} performs the second best, but they are not better than AD-T.
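A Type-I-error experiment of the kind summarized in these tables can be sketched with a small Monte Carlo loop. The example below uses only the Kolmogorov-Smirnov test with a known scale as a stand-in, not the authors' full design:

```python
import numpy as np
from scipy import stats

def rejection_rate(n, n_rep=2000, alpha=0.05, seed=2):
    """Monte Carlo rejection probability of the KS exponentiality test under H0."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_rep):
        x = rng.exponential(scale=1.0, size=n)          # data generated under H0
        # Known-parameter KS test (scale fixed at its true value 1.0)
        p = stats.kstest(x, 'expon', args=(0, 1.0)).pvalue
        rejections += (p < alpha)
    return rejections / n_rep

rate = rejection_rate(n=100)
print(rate)   # should be close to the nominal level 0.05
```

Replacing the fixed scale with an estimator, and the KS statistic with any of the ĵ⋆-type statistics, reproduces one cell of such a rejection-probability table.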

Table 5. Under the null hypothesis H₀, simulation results of rejection probabilities of the test statistics ĵ⋆.

The V̂GE⋆_{C,n} statistic performs the worst in general, since its rejection probabilities stay close to the null level (C₀ = 0) when n ≤ 1000. Even when n = 20000, the values of the power function of V̂GE⋆_{C,n} are only around (0.2, 0.4, 0.5), the smallest values compared with the test statistics ĵ⋆_{C,ns}. In Table 7, when the value of C₀ increases to 0.1, ĵ⋆_{n4} fails to reject the exponential distribution.
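The power comparisons under H₁ can be mimicked with a similar simulation. Since the paper's C₀-indexed alternative family is not reproduced here, the sketch below uses a Weibull alternative as a hypothetical stand-in: shape = 1 recovers the exponential null, and shapes away from 1 play the role of increasing C₀.

```python
import numpy as np
from scipy import stats

def power_ks(shape, n, n_rep=1000, alpha=0.05, seed=3):
    """Rejection probability of the KS exponentiality test under a Weibull alternative.

    shape = 1 is the exponential null; shape != 1 is the alternative.
    """
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_rep):
        x = rng.weibull(shape, size=n)
        # KS test against an exponential with moment-estimated scale
        p = stats.kstest(x, 'expon', args=(0, x.mean())).pvalue
        rejections += (p < alpha)
    return rejections / n_rep

print(power_ks(1.0, n=200), power_ks(1.5, n=200))
```

As expected, the rejection probability stays at or below the nominal level under the null shape and rises sharply under the alternative, which is the qualitative pattern the power tables above report for the ĵ⋆-type statistics.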

Table 6. Under the alternative hypothesis H₁ (C₀ = 0.01), simulation results of rejection probabilities of the test statistics ĵ⋆_{C,ns}, s = 1, …, 9, and the test statistic V̂GE.

Figure 2. The histogram for variable X (times) in Example 1.

Its value is 5.4518 and the associated p-value is 4.9864 × 10⁻⁸. The Kolmogorov-Smirnov test statistic, the Anderson-Darling test statistic, the Cramér-von Mises test statistic, and the V̂GE⋆ … The p-values of the test statistics ĵ⋆ are larger than 0.05 at the significance level 0.05. If we choose the significance level α = 0.01, all the p-values of the test statistics ĵ⋆ …