Some simultaneous progressive monitoring schemes for the two parameters of a zero-inflated Poisson process under unknown shifts

Abstract The zero-inflated Poisson (ZIP) distribution has been extensively studied in the literature in recent years. It is one of the most appropriate models for overdispersed data with an excessive number of zeros. Data of this type frequently arise in manufacturing processes with a low fraction of defective items. A ZIP model has two parameters: one represents the probability of extra zeros and the other is the expected Poisson count. In this article, we propose and study three new single control charts for detecting changes in either of the two parameters of a ZIP process. The performance of the schemes is studied via Monte-Carlo simulation. We note that all the existing control charts for monitoring ZIP processes are based on individual observations and assume that the shift sizes in either or both parameters are known. Our proposed schemes do not need any prior information about the shift size and can be used both for individual observations and subgroup samples. The results reveal that they are very effective in detecting small and moderate shifts in the process parameters. Practical implementation of the proposed schemes is also illustrated using a real industrial data set on light-emitting diodes (LEDs).


Introduction
Statistical process control (SPC) is a collection of statistical methods for measuring and controlling process behavior. From item-quality inspections in manufacturing industries to health-care surveillance, applications of process monitoring techniques are now found everywhere. Control charts are the most widely used technique for monitoring a process and identifying changes in it. In the monitoring of high-yield or health-related processes, the quality characteristic of interest cannot always be conveniently represented numerically. In such cases, the common practice is to classify each inspected item (or unit) as conforming or nonconforming, according to its specifications. Examples include the number of nonconformities in a unit of production, the number of errors in a service, or the number of surgical deaths in a given time interval. In most cases, it is assumed that the distribution of the underlying random variable (r.v.), such as the number of errors or the number of deaths, is Poisson. Several control charts for monitoring Poisson observations are available in the literature (see, e.g., Borror et al. 1998; Lucas 1985; and Montgomery 2009).
Automation and progress in the manufacturing industry and in the health-care sector have led to processes that are characterized by a large number of zero counts. Usually, these counts represent the absence of nonconformities or the absence of events from a rare disease. Woodall (2006) and Sim and Lim (2008), among others, noted that this excess of zeros results in an overdispersed distribution. This, in turn, leads to an underestimation of the rate parameter of the Poisson process. Thus, attribute control charts that are based on the assumption of the standard Poisson distribution cannot be used efficiently because of an increased rate of false alarms. Consequently, the development of control charts under a more appropriate probability model is necessary. The zero-inflated Poisson (ZIP) distribution (Johnson et al. 2005, 351-356) has been considered by several authors as an appropriate model for that kind of process. Control charts for ZIP processes have been proposed and studied by Xie and Goh (1993), Xie et al. (1995, 2001), Chen et al. (2008), Sim and Lim (2008), Fatahi et al. (2012a, 2012b), and He et al. (2012, 2014). Applications of the ZIP distribution from several fields of applied research can be found in Bohning (1998) or Xu et al. (2014).
The ZIP distribution has two parameters: the first is the zero-inflation parameter $\phi$, where $\phi \in [0, 1]$, and the second is the rate $\lambda$ ($\lambda > 0$) of the standard Poisson distribution. Usually, the monitoring schemes for the ZIP distribution are based on the number $X_t$ of nonconforming items at time $t$. Moreover, in most of the works, the performance of the examined schemes is evaluated when shifts occur only in the parameter $\lambda$. He et al. (2012, 2014) studied univariate CUSUM control charts that are suitable for the joint monitoring of changes in the parameters of a ZIP distribution. More specifically, He et al. (2012) considered separate CUSUM control charts, one for changes in $\lambda$ and the other for changes in $\phi$, which run simultaneously. An out-of-control signal is given when at least one of these schemes triggers an out-of-control signal. They also proposed and studied a CUSUM scheme for detecting simultaneous changes in $(\phi, \lambda)$. Recently, He et al. (2014) proposed another joint CUSUM scheme that consists of two CUSUM control charts running simultaneously; one is a conforming run length (CRL) CUSUM chart and the other is a zero-truncated Poisson (ZTP) CUSUM chart. The CRL-CUSUM is used for detecting changes in $\phi$, while the ZTP-CUSUM is used for detecting changes in $\lambda$. Their numerical analysis and comparison between the various CUSUM schemes for ZIP processes demonstrated that the scheme based on CRL-CUSUM and ZTP-CUSUM is better in the detection of changes in $\phi$.
When we are interested in monitoring a process that has two or more parameters, such as the ZIP process, a joint monitoring scheme is usually applied. Broadly speaking, there are two types of such schemes available in the literature. In the first type, separate charts, one for each parameter, are used for process monitoring. For example, a traditional joint monitoring scheme with separate charts is the one that consists of the $\bar{X}$ and $R$ (or $S$) charts, which are used for jointly monitoring the mean and the variance of a normally distributed process. All the above-mentioned charts for jointly monitoring the parameters of a ZIP process are of this type. However, in recent years, various authors have opposed the use of a scheme with two simultaneous control charts and proposed the use of a single chart for jointly monitoring all the process parameters. This approach is often more elegant and more efficient. For joint monitoring via a single control chart, we refer to the works of Chen and Cheng (1998), Razmy (2005), and Mukherjee et al. (2015), among others.
Motivated by these works, we propose and study new control schemes suitable for simultaneously monitoring both parameters of a ZIP process. The proposed schemes are essentially based on a single charting statistic. Specifically, we consider an appropriate distance statistic and use the maximum likelihood (ML) estimates of the parameters $\phi$ and $\lambda$ at every sampling stage $t$ ($t \geq 1$). In order to obtain stable estimates at each sampling stage without collecting samples of very large size, we adopt a popular and well-known approach, similar to the one employed for self-starting control charts. See, for example, Hawkins (1987), Hawkins and Olwell (1998), Quesenberry (1997), Sullivan and Jones (2002), Hawkins and Maboudou-Tchao (2007), and Li et al. (2010). For a nice introduction to the concept of self-starting control charts, we refer to the recent book of Qiu (2013), subsections 4.5.1 and 5.4.1. Various interesting self-starting schemes are discussed in Chatterjee and Qiu (2009), Zou et al. (2007, 2012), Zou and Tsung (2010), and Shen et al. (2016). To the best of our knowledge, no self-starting type scheme for jointly monitoring the two parameters of a ZIP process has been considered so far in the literature. We call our approach progressive monitoring because we use a known parameter setting, unlike common self-starting schemes, which consider an unknown parameter setting. Given the importance of monitoring ZIP processes, in this article we attempt to bridge this long-standing research gap.
This article is organized as follows: In the next section, we briefly review the basic properties of the ZIP distribution, while in Section 3 we present the proposed charts. We extensively study the design issues and performance of the proposed schemes in Section 4, where various numerical results obtained via Monte-Carlo simulation are also presented. An illustrative example on the practical application of the proposed schemes is presented in Section 5, while conclusions and aspects of further research are summarized in Section 6.
The ZIP distribution can be used to model count processes containing an excessive number of zeros. By definition, if $X$ is a ZIP random variable, it is defined on $\{0, 1, 2, \ldots\}$ (as for the standard Poisson distribution) and its probability mass function (p.m.f.) is given by

$$f_{ZIP}(x \mid \phi, \lambda) = \phi\, I_{\{0\}}(x) + (1 - \phi)\, f_P(x \mid \lambda), \quad x = 0, 1, 2, \ldots, \qquad [1]$$

where $I_{\{0\}}(x)$ is the indicator function and $f_P(x \mid \lambda) = \exp(-\lambda)\lambda^x / x!$ is the p.m.f. of the standard Poisson distribution, $\lambda > 0$ is the mean of the Poisson, and $\phi \in [0, 1]$ is the zero-inflation parameter. If $\phi = 0$, then the ZIP distribution coincides with the standard Poisson distribution with parameter $\lambda$. If $\phi = 1$, then the ZIP distribution reduces to the Dirac distribution on $x = 0$. Moreover, it is not difficult to verify that the cumulative distribution function (c.d.f.) of $X$ is given by

$$F_{ZIP}(x \mid \phi, \lambda) = \phi + (1 - \phi)\, F_P(x \mid \lambda), \quad x = 0, 1, 2, \ldots,$$

where $F_P(x \mid \lambda)$ is the c.d.f. of the standard Poisson distribution with parameter $\lambda$. The mean and the variance of the ZIP distribution with parameters $(\phi, \lambda)$ are, respectively, given by the following two expressions:

$$E(X) = \lambda(1 - \phi), \qquad Var(X) = \lambda(1 - \phi)(1 + \phi\lambda).$$

The parameters $\lambda$ and $\phi$ can be estimated by using the ML method. Suppose that a random sample of size $n$, say, $X_1, X_2, \ldots, X_n$, collected from a ZIP process, is available. Then, the maximum likelihood estimators (MLE) of the parameters $\lambda$ and $\phi$ can be obtained numerically by solving the following system of nonlinear equations:

$$\hat{\lambda} = \bar{x}^{+}\left(1 - e^{-\hat{\lambda}}\right), \qquad \hat{\phi} = 1 - \frac{\bar{x}}{\hat{\lambda}}, \qquad [3]$$

where $\bar{x}^{+}$ is the mean of the nonzero values of the sample and $\bar{x}$ is the (ordinary) sample mean. Clearly, there is no closed-form expression for $\hat{\lambda}$ or $\hat{\phi}$ and they must be determined iteratively. Moreover, Fisher's information matrix is given by (see, e.g., Patil and Shirke 2007)

$$J = \begin{pmatrix} J_{11} & J_{12} \\ J_{21} & J_{22} \end{pmatrix},$$

where the entries $J_{ij}$ are the expected negative second-order partial derivatives of the log-likelihood with respect to the two parameters. It is worth mentioning that the variances of $\hat{\lambda}$ and $\hat{\phi}$ are the diagonal elements of $J^{-1}$. It is not difficult to verify (i.e., via the off-diagonal elements of $J^{-1}$) that $\hat{\lambda}$ and $\hat{\phi}$ are not independent r.v.
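As an illustration of the quantities reviewed in this section, the following minimal Python sketch evaluates the ZIP p.m.f., its mean and variance, and the ML estimates. It assumes the standard ZIP likelihood equations (the mean of the nonzero values equals $\hat{\lambda}/(1 - e^{-\hat{\lambda}})$, and $\hat{\phi} = 1 - \bar{x}/\hat{\lambda}$) and solves them by a simple fixed-point iteration. The function names, the iteration scheme, and the fallback to the plain Poisson solution when $\hat{\phi}$ would be negative are illustrative assumptions, not the authors' implementation.

```python
import math


def zip_pmf(x, phi, lam):
    """P(X = x) for X ~ ZIP(phi, lam): phi * I{x = 0} + (1 - phi) * Poisson pmf."""
    poisson = math.exp(-lam) * lam ** x / math.factorial(x)
    return (phi if x == 0 else 0.0) + (1.0 - phi) * poisson


def zip_mean_var(phi, lam):
    """Mean lam * (1 - phi) and variance lam * (1 - phi) * (1 + phi * lam)."""
    mean = lam * (1.0 - phi)
    return mean, mean * (1.0 + phi * lam)


def zip_mle(sample, tol=1e-12, max_iter=10_000):
    """Return (phi_hat, lam_hat) by iterating lam <- x_plus * (1 - exp(-lam))."""
    positives = [x for x in sample if x > 0]
    if not positives:
        raise ValueError("all observations are zero; the estimates are degenerate")
    xbar = sum(sample) / len(sample)          # ordinary sample mean
    x_plus = sum(positives) / len(positives)  # mean of the nonzero values
    lam = x_plus  # start at or above the fixed point; the iteration decreases to it
    for _ in range(max_iter):
        new_lam = x_plus * (1.0 - math.exp(-lam))
        converged = abs(new_lam - lam) < tol
        lam = new_lam
        if converged:
            break
    phi = 1.0 - xbar / lam if lam > 0 else -1.0
    if phi < 0.0:
        # Assumption: no evidence of zero inflation, so use the Poisson boundary solution.
        return 0.0, xbar
    return phi, lam


sample = [0, 0, 0, 0, 2, 3, 1, 4, 0, 2]
phi_hat, lam_hat = zip_mle(sample)
print(f"phi_hat = {phi_hat:.4f}, lam_hat = {lam_hat:.4f}")
```

A root finder such as Newton's method could replace the fixed-point iteration; the iteration is used here only because it needs no derivatives and no third-party library.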
Therefore, it is logical to consider a multivariate control scheme based on an appropriate distance statistic between the ML estimators $\hat{\lambda}$ and $\hat{\phi}$ and the process parameters. We discuss this in the next section.
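For reference, the per-observation Fisher information of the ZIP model can be obtained by differentiating the logarithm of the p.m.f. in Eq. [1] twice and taking expectations. The expressions below are such a derivation, written with parameter labels rather than matrix indices; whether they match the normalization and parameter ordering of the matrix $J$ used in this article (Patil and Shirke 2007) is an assumption. Here $p_0$ denotes $P(X = 0)$.

$$p_0 = \phi + (1 - \phi)e^{-\lambda}, \qquad J_{\phi\phi} = \frac{1 - e^{-\lambda}}{(1 - \phi)\,p_0}, \qquad J_{\phi\lambda} = J_{\lambda\phi} = -\frac{e^{-\lambda}}{p_0}, \qquad J_{\lambda\lambda} = (1 - \phi)\left[\frac{1}{\lambda} - \frac{\phi\, e^{-\lambda}}{p_0}\right].$$

Because $J_{\phi\lambda} \neq 0$ for every $\lambda > 0$, the inverse matrix has nonzero off-diagonal elements, which is consistent with the remark that the two ML estimators are not independent.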

Joint monitoring schemes for ZIP processes
Let us assume that the quality characteristic follows a ZIP process with parameters $\phi$ and $\lambda$. When the process is in-control (IC), we denote the parameters as $\phi_0$ and $\lambda_0$, while in the out-of-control (OOC) case we denote them as $\phi_1 = \tau\phi_0$ ($0 < \tau < 1/\phi_0$) and $\lambda_1 = \delta\lambda_0$ ($\delta > 0$). The main goal is to detect a shift in either of the two parameters or in both. The aim is also to detect a change in the mean number $\mu_X$ (process average) of the "nonconforming" units, from an IC value $\mu_{0X}$ to an OOC value $\mu_{1X}$ ($\neq \mu_{0X}$). The value $\mu_{0X}$ ($\mu_{1X}$) is obtained by setting $(\phi, \lambda) = (\phi_0, \lambda_0)$ ($(\phi_1, \lambda_1)$) in the expression for the mean of the ZIP distribution. Note also that, when $\phi$ is fixed, the expected value $\mu_X$ is an increasing function of $\lambda$ and, when $\lambda$ is fixed, the expected value $\mu_X$ is a decreasing function of $\phi$.
Clearly, a change in either of the parameters affects the mean of the process. Moreover, certain shifts in both parameters may not necessarily change the process mean but may only alter the process variance, and vice versa. Therefore, it is necessary to develop control charts that are suitable for the simultaneous monitoring and detection of changes in the parameters of a ZIP process. A shift may occur in exactly one of the process parameters or in both. Also, we assume that we are in the standards-given case. This means that the IC values $\phi_0$, $\lambda_0$ are known from previous experience of the practitioner or have been accurately estimated from a sufficiently large phase I sample.
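A small numerical check may help to see why monitoring the mean alone is not enough. The sketch below uses the standard ZIP moments, mean $\lambda(1 - \phi)$ and variance $\lambda(1 - \phi)(1 + \phi\lambda)$, with an assumed IC setting $(\phi_0, \lambda_0) = (0.5, 2)$ and assumed shift multipliers of 1.2 for $\phi$ and 1.25 for $\lambda$; these particular values are chosen only for illustration.

```python
def zip_mean_var(phi, lam):
    """Mean and variance of a ZIP(phi, lam) random variable."""
    mean = lam * (1.0 - phi)
    var = mean * (1.0 + phi * lam)
    return mean, var


phi0, lam0 = 0.5, 2.0      # assumed IC setting
tau, delta = 1.2, 1.25     # assumed shift multipliers (tau < 1 / phi0)
phi1, lam1 = tau * phi0, delta * lam0

mean0, var0 = zip_mean_var(phi0, lam0)
mean1, var1 = zip_mean_var(phi1, lam1)
print(f"IC : mean = {mean0:.3f}, variance = {var0:.3f}")  # mean = 1.000, variance = 2.000
print(f"OOC: mean = {mean1:.3f}, variance = {var1:.3f}")  # mean = 1.000, variance = 2.500
```

Both parameters have shifted upward, yet the process mean is unchanged; only the variance reveals the change.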
3.1. Progressive monitoring using Wald statistic-based chart
In this article, we consider a random sample of size $m$ at each stage $t$ (time point) of inspection. Let $\hat{\theta}_t = (\hat{\phi}_t, \hat{\lambda}_t)$ be the ML estimator of $\theta = (\phi, \lambda)$ at time $t$. Clearly, $\hat{\theta}_t$ is based on $n = m \cdot t$ observations from a ZIP process. We introduce a charting statistic $DS_t$ based on the concept of Mahalanobis distance between the true parameter and its ML estimate. To be precise, $DS_t$ measures the distance between $\hat{\theta}_t$ and $\theta_0 = (\phi_0, \lambda_0)$, weighted by Fisher's information matrix $J$ (see Section 2) evaluated at $\lambda = \lambda_0$, $\phi = \phi_0$, and scaled by a multiplier $u(n)$. Note that the most stable run-length distribution may be achieved with $u(n) = \sqrt{n}$, but for small subgroup samples $u(n) = \sqrt{n/\log(n)}$ or $u(n) = [\log(n)]^2$ could also be considered. The performance advantages of these choices are discussed later. Note that our proposed statistic is similar to the Wald statistic described in Serfling (1980), Engle (1984), or Gombay (2002) in different contexts. The Wald-type statistic is popularly used for sequentially testing one or more parameters, based on ML estimators. The convergence of the MLEs to normality is extremely slow in the case of zero-inflated distributions. This fact makes the zero-inflated distributions a separate subject of interest. Note that the multiplier $\sqrt{n}$ has certain practical advantages for moderate subgroup sizes; this is also discussed later.
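To make the construction concrete, here is a minimal Python sketch of a Wald-type distance statistic. It assumes the quadratic form $u(n)\,(\hat{\theta}_t - \theta_0)\, J\, (\hat{\theta}_t - \theta_0)^{T}$ with the per-observation Fisher information evaluated at the IC values and the parameter ordering $(\phi, \lambda)$. The exact definition used for $DS_t$ may differ (for instance, in normalization), so the function names and this form are assumptions for illustration only.

```python
import math


def fisher_info(phi, lam):
    """Per-observation Fisher information of ZIP(phi, lam), ordered as (phi, lam)."""
    p0 = phi + (1.0 - phi) * math.exp(-lam)  # P(X = 0)
    j_pp = (1.0 - math.exp(-lam)) / ((1.0 - phi) * p0)
    j_pl = -math.exp(-lam) / p0
    j_ll = (1.0 - phi) * (1.0 / lam - phi * math.exp(-lam) / p0)
    return [[j_pp, j_pl], [j_pl, j_ll]]


# The three multipliers u(n) mentioned in the text (the log-based ones need n > 1).
MULTIPLIERS = {
    "sqrt_n": lambda n: math.sqrt(n),
    "sqrt_n_over_log_n": lambda n: math.sqrt(n / math.log(n)),
    "log_n_squared": lambda n: math.log(n) ** 2,
}


def ds_statistic(theta_hat, theta0, n, multiplier="sqrt_n"):
    """Assumed form: u(n) * (theta_hat - theta0) J(theta0) (theta_hat - theta0)^T."""
    (a, b), (_, d) = fisher_info(*theta0)
    d_phi = theta_hat[0] - theta0[0]
    d_lam = theta_hat[1] - theta0[1]
    quad = a * d_phi ** 2 + 2.0 * b * d_phi * d_lam + d * d_lam ** 2
    return MULTIPLIERS[multiplier](n) * quad


# Hypothetical estimates after n = 50 observations, with IC values (0.2, 2).
print(ds_statistic((0.30, 2.4), (0.2, 2.0), n=50))
```

The statistic is zero when the estimates coincide with the IC values and grows with the distance between them, so only an upper control limit is needed.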
Clearly, at every inspection time $t$, a random sample from the ZIP process must be available in order to compute $DS_t$. Consequently, any traditional approach to monitoring will require a large sample size at each time $t$; otherwise, the excessive number of zeros in the ZIP process will, in practice, yield a sample of zeros only. For small samples, it is not unlikely to obtain a sample without a nonzero value (especially when $\phi$ is large and $\lambda$ is small). In such a scenario, the ML estimates will be of a trivial nature. Therefore, as an empirical rule (see, e.g., Rakitzis and Castagliola 2016), researchers and practitioners are recommended to collect a large sample of size $n$ from the IC ZIP process at every $t \geq 1$, for example, $n \geq 100$ (depending on the values of $\phi_0$ and $\lambda_0$). Clearly, this is not always possible in practice and an alternative approach must be followed. The idea is described in the next subsection.

Implementation of the DS chart
A possible solution is to apply the idea of self-starting control charts. The suggested steps are the following:
Step 1. Let $X_t = (X_{t1}, X_{t2}, \ldots, X_{tm})$, $t \geq 1$, and start collecting samples of size $m$ from the process (i.e., $X_{tl} \sim ZIP(\phi, \lambda)$, $l \in \{1, 2, \ldots, m\}$) until a prespecified time $t_0$. Before time $t_0$, the samples $X_1, X_2, \ldots, X_{t_0 - 1}$ are used neither for estimating the parameters nor for plotting anything on the chart. Usually, $t_0$ is referred to as the warm-up time and the sample collected prior to time $t_0$ is often referred to as the pilot sample.
Step 2. At time $t_0$, obtain a sample of size $m$ from the process and combine it into one sample with all the previously available observations; that is, at time $t = t_0$, a sample of size $n_0 = m \cdot t_0$ is available. Compute the variance of the observed values and check whether it is greater than zero. If it is positive, use these observations for estimating the parameters of the process (by solving numerically the equations given in [3]) and evaluate the first value of the plotted statistic $DS_t$, that is, $DS_{t_0}$. Otherwise, if the variance of all $n_0$ observations is zero, increase $t_0$ by one at a time until a positive variance is attained.
Step 3. For simplicity, use the method-of-moments estimators as a starting solution. If the iterative method for solving the likelihood equations does not converge, use the method-of-moments estimators as a proxy solution.
Step 4. Plot $DS_{t_0}$ on the chart and, if it exceeds a specific upper control limit UCL, an OOC signal is given at time $t = t_0$; otherwise, go back to step 2 and continue sampling at time $t = t_0 + 1$.
This procedure continues until an OOC signal is triggered for the first time. As long as $DS_t < UCL$, $t \in \{t_0, t_0 + 1, \ldots\}$, we obtain at each time $t$ a sample of size $m$ from the process and combine it with all the previous observations into one sample. Up to time $t_0$, we collect $n_0 = m \cdot t_0$ observations and, in general, at any time $t$, we have $n = m \cdot t$ observations.
Clearly, with the proposed approach, it is not necessary to assemble large samples at each sampling stage $t$ in order to calculate the ML estimates of $\phi$ and $\lambda$. Instead, all the available observations from the beginning of the inspection until time $t \geq t_0$ are used for calculating the value of the charting statistic $DS_t$. Note also that the value of $DS_t$ as well as the ML estimates of $\phi$ and $\lambda$ are updated sequentially. The self-starting feature allows us to update the ML estimates of the process parameters without collecting very large samples at each sampling stage.
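The steps of this subsection can be sketched in Python as follows. The sketch is a simplified illustration under several assumptions: the charting statistic takes the form $\sqrt{n}\,(\hat{\theta}_t - \theta_0)\, J\, (\hat{\theta}_t - \theta_0)^{T}$ with a per-observation $J$, the likelihood equations are solved by fixed-point iteration, and the value of the UCL is supplied by the user. All function names are hypothetical.

```python
import math
import random


def sample_zip(phi, lam, rng):
    """One ZIP(phi, lam) draw; the Poisson part uses simple inversion."""
    if rng.random() < phi:
        return 0
    u, x, p = rng.random(), 0, math.exp(-lam)
    cdf = p
    while u > cdf and x < 1000:  # the cap guards against floating-point stalls
        x += 1
        p *= lam / x
        cdf += p
    return x


def moment_estimates(data):
    """Method-of-moments (phi, lam): start value and proxy solution (Step 3)."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / (n - 1)
    if var <= mean:  # no excess dispersion: plain Poisson boundary
        return 0.0, mean
    lam = mean + var / mean - 1.0
    return 1.0 - mean / lam, lam


def ml_estimates(data, tol=1e-10, max_iter=500):
    """Iterate lam <- x_plus * (1 - exp(-lam)); then phi = 1 - xbar / lam."""
    positives = [x for x in data if x > 0]
    xbar = sum(data) / len(data)
    x_plus = sum(positives) / len(positives)
    _, lam = moment_estimates(data)
    for _ in range(max_iter):
        new_lam = x_plus * (1.0 - math.exp(-lam))
        converged = abs(new_lam - lam) < tol
        lam = new_lam
        if converged:
            phi = 1.0 - xbar / lam if lam > 0 else -1.0
            # Estimate outside [0, 1): fall back to the proxy solution.
            return (phi, lam) if 0.0 <= phi < 1.0 else moment_estimates(data)
    return moment_estimates(data)  # no convergence: proxy solution (Step 3)


def ds_value(est, theta0, n):
    """Assumed form: sqrt(n) * (est - theta0) J(theta0) (est - theta0)^T."""
    phi0, lam0 = theta0
    p0 = phi0 + (1.0 - phi0) * math.exp(-lam0)
    a = (1.0 - math.exp(-lam0)) / ((1.0 - phi0) * p0)
    b = -math.exp(-lam0) / p0
    d = (1.0 - phi0) * (1.0 / lam0 - phi0 * math.exp(-lam0) / p0)
    dp, dl = est[0] - phi0, est[1] - lam0
    return math.sqrt(n) * (a * dp * dp + 2.0 * b * dp * dl + d * dl * dl)


def run_ds_chart(phi0, lam0, ucl, phi=None, lam=None, m=5, t0=10, max_t=5000, seed=1):
    """Return the time t of the first signal (or max_t if no signal occurs)."""
    phi = phi0 if phi is None else phi
    lam = lam0 if lam is None else lam
    rng = random.Random(seed)
    data = []
    for t in range(1, max_t + 1):
        data.extend(sample_zip(phi, lam, rng) for _ in range(m))  # Steps 1 and 2
        if t < t0 or len(set(data)) < 2:
            continue  # warm-up period, or zero variance: keep sampling
        est = ml_estimates(data)                                  # Steps 2 and 3
        if ds_value(est, (phi0, lam0), len(data)) >= ucl:         # Step 4
            return t
    return max_t


# Hypothetical UCL; an appropriate value must be calibrated by simulation.
print(run_ds_chart(0.2, 2.0, ucl=1.5, phi=0.5, lam=4.0))
```

Because the cumulative sample grows by $m$ observations per stage, the estimates are recomputed from all $n = m \cdot t$ observations at every stage; a production implementation would update the sufficient statistics incrementally instead.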

Progressive monitoring using max-type chart
Define, for any $t \geq t_0$ and $n = m \cdot t$, the statistics $B_{1n}$ and $B_{2n}$ as the standardized versions of $\hat{\lambda}_t$ and $\hat{\phi}_t$, respectively, where $Var(\hat{\lambda}_t \mid IC) = J^{-1}_{(11)}$ and $Var(\hat{\phi}_t \mid IC) = J^{-1}_{(22)}$. It is easy to see that both $B_{1n}$ and $B_{2n}$ approximately follow the standard normal distribution when the process is IC. We will use these two statistics to construct a max-type scheme for a ZIP process, following the idea of the max chart for joint monitoring of two parameters. Chen and Cheng (1998) considered a Shewhart-type max chart for joint monitoring of the parameters of the normal distribution when standards are known. A Shewhart-type max chart for joint monitoring of the parameters of the normal distribution when standards are unknown has also been considered. Li et al. (2016) discussed CUSUM max-type charts for joint monitoring of the parameters of the normal distribution assuming standards are unknown. Furthermore, Mukherjee et al. (2015) considered a similar scheme for the two-parameter exponential distribution. For nonparametric joint monitoring of location and scale based on the max chart, interested readers may see Mukherjee and Marozzi (2016). In the present context, a max-type monitoring statistic, $Max_t$, is defined from $B_{1n}$ and $B_{2n}$.
3.4. Implementation of the ZIP-max chart
Step 1. Same as step 1 of Subsection 3.2.
Step 2. At time $t = t_0$, obtain a sample of size $m$ from the process and combine it into one sample with all the previously available observations as in step 2 of Subsection 3.2. Use these observations for estimating the parameters of the process and evaluate $B_{1n}$ and $B_{2n}$.
Step 3. Plot $Max_{t_0}$ on the chart and, if it exceeds a specific upper control limit UCL, an OOC signal is given at time $t = t_0$; otherwise, go back to step 2 and continue sampling at time $t = t_0 + 1$. This procedure continues until an OOC signal is triggered for the first time. As long as $Max_t < UCL$, continue sampling $m$ observations at each stage from the process and combine them with all the previous observations into one sample, in a similar fashion to that described in Subsection 3.2. As before, this helps us avoid a large $m$ and makes the chart practically useful.
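A max-type statistic of this kind can be sketched in Python as below. The sketch assumes that the IC variances of the estimators are the diagonal elements of the inverse per-observation Fisher information divided by $n$, and that the plotted value is the larger of the two absolute standardized deviations; any additional multiplier used in the actual chart is not included. The function names are hypothetical.

```python
import math


def fisher_info(phi, lam):
    """Per-observation Fisher information of ZIP(phi, lam), ordered as (phi, lam)."""
    p0 = phi + (1.0 - phi) * math.exp(-lam)
    j_pp = (1.0 - math.exp(-lam)) / ((1.0 - phi) * p0)
    j_pl = -math.exp(-lam) / p0
    j_ll = (1.0 - phi) * (1.0 / lam - phi * math.exp(-lam) / p0)
    return [[j_pp, j_pl], [j_pl, j_ll]]


def ic_std_devs(phi0, lam0, n):
    """Asymptotic IC standard deviations of (phi_hat, lam_hat) from n observations."""
    (a, b), (_, d) = fisher_info(phi0, lam0)
    det = a * d - b * b
    var_phi, var_lam = d / det, a / det  # diagonal of the inverse 2x2 matrix
    return math.sqrt(var_phi / n), math.sqrt(var_lam / n)


def max_statistic(phi_hat, lam_hat, phi0, lam0, n):
    """Assumed form: max(|B1|, |B2|) with B1 for lambda and B2 for phi."""
    sd_phi, sd_lam = ic_std_devs(phi0, lam0, n)
    b1 = (lam_hat - lam0) / sd_lam
    b2 = (phi_hat - phi0) / sd_phi
    return max(abs(b1), abs(b2))


# Hypothetical estimates after n = 100 observations, with IC values (0.2, 2).
print(max_statistic(0.28, 2.3, 0.2, 2.0, n=100))
```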

Progressive monitoring using likelihood ratio
Define, for any $t \geq t_0$ and $n = m \cdot t$, the likelihood ratio statistic $LR_t$ computed from the cumulative sample $x_1, x_2, \ldots, x_n$; its expression involves the indicators $a_i$, where $a_i = 1$ if $x_i > 0$ and $a_i = 0$ otherwise. Several authors, including Mukherjee et al. (2015) for the two-parameter exponential case, considered likelihood ratio-based statistics for joint monitoring of two parameters in normal and two-parameter exponential populations. Here we consider a similar scheme for the joint monitoring of the two parameters of the ZIP distribution using $LR_t$ as the plotting statistic. The resulting likelihood ratio-based monitoring statistic is denoted by $LS_t$ (Eq. [7]).
3.6. Implementation of the ZIP-LR chart
Step 1. Same as step 1 of Subsection 3.2.
Step 2. At time $t = t_0$, obtain a sample of size $m$ from the process and combine it into one sample with all the previously available observations as in step 2 of Subsection 3.2. Now compute the plotting statistic $LS_{t_0}$ using Eq. [7].
Step 3. Plot $LS_{t_0}$ on the chart and, if it exceeds a specific upper control limit UCL, an OOC signal is given at time $t = t_0$; otherwise, go back to step 2 and continue sampling at time $t = t_0 + 1$. This procedure continues until an OOC signal is triggered for the first time. As long as $LS_t < UCL$, continue sampling $m$ observations at each stage from the process and combine them with all the previous observations into one sample, in a similar fashion to that described in Subsection 3.2.
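For illustration, a likelihood ratio statistic for the ZIP model can be sketched in Python as follows. The sketch assumes the generic form $2[\ell(\hat{\theta}_t) - \ell(\theta_0)]$, where $\ell$ is the ZIP log-likelihood of the cumulative sample; the exact expression and any multiplier used for the plotted statistic are not reproduced here, so this form and the function names are assumptions.

```python
import math


def zip_loglik(sample, phi, lam):
    """Log-likelihood of ZIP(phi, lam); a = 1 marks a positive observation."""
    p0 = phi + (1.0 - phi) * math.exp(-lam)  # P(X = 0)
    total = 0.0
    for x in sample:
        a = 1 if x > 0 else 0
        if a == 0:
            total += math.log(p0)
        else:
            total += math.log(1.0 - phi) - lam + x * math.log(lam) - math.lgamma(x + 1)
    return total


def lr_statistic(sample, theta_hat, theta0):
    """Assumed form: 2 * [loglik(theta_hat) - loglik(theta0)]."""
    return 2.0 * (zip_loglik(sample, *theta_hat) - zip_loglik(sample, *theta0))


sample = [0, 0, 0, 0, 2, 3, 1, 4, 0, 2]
theta_hat = (0.4309, 2.1086)  # approximate ML estimates for this sample
print(lr_statistic(sample, theta_hat, (0.2, 2.0)))
```

When the first argument is the ML estimate, the statistic is nonnegative by construction, so larger values indicate stronger evidence against the IC parameter values.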

Follow-up procedure
Given a signal in any of the three schemes, investigators may wish to identify whether the shift has occurred in $\phi$, in $\lambda$, or in both. In the context of joint monitoring of the two parameters of a normal distribution and of a shifted exponential distribution (see Mukherjee et al. 2015 for the latter), a follow-up procedure based on two separate likelihood ratio tests, one for each parameter, has been recommended. In a similar fashion, we may also consider here two separate likelihood ratio tests, one for $\lambda$ and the other for $\phi$, for post-signal follow-up. More specifically, if there is a signal at the $j$th stage, we propose to compute from the cumulative sample the statistics $LR_{\lambda,t}$ and $LR_{\phi,t}$, each of the form $2\ln(\cdot)$ of a likelihood ratio, where $n = m \cdot t$ and $n_0$ is the number of zeros in the sample; the estimate of $\lambda$ entering $LR_{\lambda,t}$ solves a nonlinear equation that involves the term $n_0(1 - \phi_0)e^{-\lambda}$. We propose to compute the observed values $\widehat{LR}_{\lambda,t}$ and $\widehat{LR}_{\phi,t}$ of $LR_{\lambda,t}$ and $LR_{\phi,t}$, and then evaluate the corresponding probabilities $p_1$ (associated with $LR_{\lambda,t}$) and $p_2$ (associated with $LR_{\phi,t}$) using a simple bootstrap technique based on resampling (see, e.g., Chatterjee and Qiu 2009). Thereafter, we may use the following rule:
i. If both $p_1$ and $p_2$ are low, say, less than $1/ARL_0$, a shift in both the parameters may be declared.
ii. If $p_1$ is low but not $p_2$, a shift only in $\lambda$ may be declared.
iii. If $p_2$ is low but not $p_1$, a shift only in $\phi$ may be declared.
iv. If, by chance, neither $p_1$ nor $p_2$ is low, we may consider the signal as a false alarm.
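The decision rule above can be written as a short Python function. The sketch takes the two bootstrap probabilities as given (it does not perform the resampling) and uses the threshold $1/ARL_0$ suggested in the text; the function name and the returned labels are hypothetical.

```python
def diagnose_signal(p1, p2, arl0=500):
    """Classify a signal from the follow-up probabilities (p1: lambda, p2: phi)."""
    threshold = 1.0 / arl0
    lam_low = p1 < threshold
    phi_low = p2 < threshold
    if lam_low and phi_low:
        return "shift in both parameters"
    if lam_low:
        return "shift in lambda only"
    if phi_low:
        return "shift in phi only"
    return "possible false alarm"


print(diagnose_signal(0.0004, 0.31))  # shift in lambda only
```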

Performance metrics of the proposed chart
In the present subsection, we consider the properties of the run-length distribution in order to investigate the performance of the proposed charts based on the $DS_t$, $Max_t$, and $LR_t$ statistics. For example, the run length (RL) of the DS chart is given by

$$RL_D = \min\{t \geq t_0 : DS_t \geq UCL\},$$

and it is the number of sampling stages until the chart triggers an OOC signal for the first time. The RL of the Max chart and of the LR chart can be defined similarly. Clearly, $RL_D$ cannot be lower than $t_0$. The quantities $E(RL_D)$ and $\sqrt{Var(RL_D)}$ are known as the average run length (ARL) and the standard deviation of the run length (SDRL), respectively. These are the most common performance measures of a control chart. When $(\phi, \lambda) = (\phi_0, \lambda_0)$, that is, when the process is IC, the ARL and SDRL are denoted as $ARL_{0,D}$ and $SDRL_{0,D}$, respectively. On the other hand, if $\phi \neq \phi_0$ and/or $\lambda \neq \lambda_0$, the process is said to be OOC and the ARL and SDRL are denoted as $ARL_{1,D}$ and $SDRL_{1,D}$, respectively. In a similar manner, we introduce the IC (OOC) ARL and SDRL of the max-type chart as $ARL_{0,M}$ ($ARL_{1,M}$) and $SDRL_{0,M}$ ($SDRL_{1,M}$), respectively. Further, we denote the IC (OOC) ARL and SDRL of the likelihood ratio-based scheme as $ARL_{0,L}$ ($ARL_{1,L}$) and $SDRL_{0,L}$ ($SDRL_{1,L}$).
Apart from the ARL and SDRL, an alternative performance measure is the number $N$ of observations to signal. The quantities $E(N)$ and $\sqrt{V(N)}$ are the average number of observations to signal (ANOS) and the standard deviation of the number of observations to signal (SDNOS). In the present context, we consider an additional test sample of fixed size $m$ at each stage of inspection and, therefore, we may write $ANOS = m \cdot ARL$. Note also that the ANOS cannot be lower than $m \cdot t_0$. For simplicity of notation, we denote the ANOS and SDNOS with the same subscripts as the ARL and SDRL to distinguish the three schemes.
The statistical design of the proposed schemes requires the determination of the UCL in order to have the desired IC ARL (or IC ANOS) value. We use Monte-Carlo simulations to estimate appropriate UCL values and subsequently evaluate the OOC performance of the proposed schemes for various shifts in $\phi_0$ and $\lambda_0$ using R 3.3.2 (R Core Team 2017).
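The calibration idea can be sketched generically in Python (the article's own computations were done in R). The sketch estimates the ARL and SDRL of a chart by simulation for a given UCL and then searches for the UCL that gives a target IC ARL by bisection, reusing the same random seed at every trial UCL so that the estimated ARL is monotone in the UCL. The chart is passed in as a function; to keep the example fast and checkable, a toy chart with independent exponential statistics is used in place of the ZIP charts. All names, the toy chart, and the bisection bracket are assumptions for illustration.

```python
import math
import random


def estimate_arl(run_length_fn, ucl, reps=4000, seed=0, cap=5000):
    """Monte Carlo ARL and SDRL, with each run length winsorized at `cap`."""
    rng = random.Random(seed)  # same seed for every UCL: common random numbers
    rls = [min(run_length_fn(ucl, rng), cap) for _ in range(reps)]
    arl = sum(rls) / reps
    sdrl = math.sqrt(sum((r - arl) ** 2 for r in rls) / (reps - 1))
    return arl, sdrl


def calibrate_ucl(run_length_fn, target_arl, lo, hi, iters=30, **kwargs):
    """Bisection on the UCL, assuming the ARL increases with the UCL."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        arl, _ = estimate_arl(run_length_fn, mid, **kwargs)
        if arl < target_arl:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def toy_run_length(ucl, rng):
    """Toy chart: i.i.d. Exp(1) statistics, so RL is geometric with p = exp(-ucl)."""
    p = math.exp(-ucl)
    return int(math.log(1.0 - rng.random()) / math.log1p(-p)) + 1


ucl = calibrate_ucl(toy_run_length, target_arl=500, lo=1.0, hi=12.0)
arl0, sdrl0 = estimate_arl(toy_run_length, ucl)
print(f"UCL = {ucl:.3f}, ARL0 = {arl0:.1f}, SDRL0 = {sdrl0:.1f}")
```

For the toy chart the exact answer is $UCL = \ln(500)$, which gives a convenient check of the procedure; for the ZIP charts, `run_length_fn` would instead simulate the progressive scheme with IC parameters.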

Numerical study
In the current section, we present the results of an extensive numerical study in order to investigate the performance of the proposed schemes. We consider $m = 5$ and $t_0 = 10$, while the statistical design is based on the desired IC ARL value, which is $ARL_0 = 500$. Furthermore, we choose three IC values for $\lambda$, that is, $\lambda_0 \in \{2, 5, 10\}$. For each of these $\lambda_0$, we choose three IC values of $\phi$, that is, $\phi_0 \in \{0.2, 0.5, 0.8\}$. Thus, we consider nine combinations of $(\phi_0, \lambda_0)$ as the possible IC scenarios for our investigation.
The reported ARL, SDRL, ANOS, and SDNOS values are obtained by using 25,000 replicates of the Monte-Carlo simulations. As in Antzoulakos and Rakitzis (2008) and Riaz et al. (2011), we additionally report the 5th, 25th, 50th, 75th, and 95th percentiles of the RL distributions to provide more information to practitioners about the RL distributions of the proposed schemes. Following Mukherjee et al. (2015), we winsorize individual run lengths at 5000 in the course of the Monte-Carlo simulation. Interestingly, the multiplier $\sqrt{n}$ in the plotting statistics enables us to obtain a stable UCL that is free of the nuisance parameters and depends only on the desired IC $ARL_0$ value.
Several interesting findings emerge from Tables 1-9. Most importantly, all the charts can detect upward and downward shifts in the parameters of the ZIP process. Nevertheless, the charts based on the DS statistic and the LR statistic are always superior to the chart based on the Max statistic. There is no clear winner between the DS and the LR charts. They perform similarly in most cases, but overall the DS chart is the most competitive scheme. A clear advantage of the proposed schemes is the following: With only one control limit, we can detect any kind of shift in $\lambda_0$ and $\phi_0$, either in the same or in opposite directions. Based on the computational results, it is easy to note that the OOC ARL value is a function of the absolute distance between the OOC and the IC values of the process parameters and is not affected by the direction of the shift.
It is worth mentioning that the CUSUM control charts of He et al. (2012) are not always able to detect the change in the ZIP process when both $\phi$ and $\lambda$ change in the same direction. This is the case in Table VII of He et al. (2012). Moreover, even though, for some specific shifts in $\lambda_0$ and $\phi_0$, the OOC mean is the same as the IC mean (e.g., this is the case in Table 2 for $(\phi_1, \lambda_1) = (0.60, 2.50)$), the proposed charts (especially the DS and LR charts) react rather quickly because of the change in the variance of the ZIP process. Finally, it is not difficult to verify that, for large shifts in the process average, $ARL \approx t_0$, whereas $ANOS \approx m \cdot t_0$.
There is a common perception that, if all test samples, which are collected sequentially, are combined into one, then the process may react late to a shift occurring at steady state, that is, if the process is in the IC state at the beginning of phase II and a shift occurs after some time. A reason for this perception is that the cumulative sample available after a shift is contaminated, because it mixes IC and OOC observations. Nevertheless, we show in this subsection, via Monte-Carlo simulation, that the proposed schemes have very good detection power if there is a shift in the steady state. We consider two situations.
i. Shift at steady state after 10 subgroup samples of size 5, that is, after 50 observations.
ii. Shift at steady state after 20 subgroup samples of size 5, that is, after 100 observations.
We summarize our findings in Table 10 for some selected choices of shifts, along with the OOC results of the corresponding zero-state shifts, in order to facilitate the comparison. Let us consider, for example, the initial IC parameters $(\lambda_0, \phi_0) = (2, 0.2)$. Suppose that there is a shift to $(\lambda_1, \phi_1) = (3, 0.25)$. If the shift occurs at zero state, the DS chart signals (on average) after 12.2 samples. If the same shift occurs at steady state, after 10 test samples, the DS chart signals (on average) after 22.7 test samples. This means, effectively, that the DS chart uses (on average) $22.7 - 10 = 12.7$ OOC test samples to detect the shift (because the first 10 test samples belong to the IC period). Further, if the shift occurs at steady state, after 20 test samples, the DS chart signals (on average) after 33.6 test samples. In a similar manner, this means, effectively, that the DS chart uses (on average) $33.6 - 20 = 13.6$ OOC test samples to detect the shift (because the first 20 test samples belong to the IC period). Therefore, the number of test samples from actual OOC situations that the DS chart uses to detect the shift is nearly the same under the zero-state and the two steady-state shift situations. The same may be said about shifts of other magnitudes as well as for the other two schemes. Similar to the zero-state situation, we see that, for various steady-state shifts, the DS chart and the LR chart have very similar performance.
In this context, it is worth noting that, in some of the nonparametric literature, sequential ranks (which, like our schemes, use cumulative samples) often work better if there is a delayed shift or a shift at an unknown time point than if the shift occurs at the beginning of the phase II analysis. This is clearly discussed in Bandyopadhyay and Mukherjee (2007) and Hušková and Hlávka (2012). Interested readers are referred to these and other related articles mentioned therein. We omit further details for brevity.

Performance comparisons of the proposed schemes with traditional phase II schemes
There are some well-known control charts for monitoring the parameters of a ZIP process. Examples are the Shewhart-type chart (hereafter ZIP-SH), the $p$-CUSUM chart (hereafter ZIP-PC), the $\lambda$-CUSUM chart (hereafter ZIP-LC), the $p$-$\lambda$ CUSUM chart (hereafter ZIP-PLC), and finally the T-CUSUM chart (hereafter ZIP-TC). For further details about these charts, interested readers may see He et al. (2012). Note also that the notation $p$ in He et al. (2012) is equivalent to what is referred to here as $(1 - \phi)$. Moreover, all these control charts are based on individual observations and are not useful for subgroup samples of size $m > 1$.
First, in the current article, we consider control charts that can be implemented for individual observations ($m = 1$) as well as for subgroup samples of size $m > 1$. Second, the ZIP-PC chart is useful only for detecting a downward shift in $\phi$, while the ZIP-LC chart is suitable only for detecting an upward shift in $\lambda$. The ZIP-PLC chart is useful for an upward shift in both parameters, while the ZIP-SH chart is suitable for an upward shift in $\lambda$ and a downward shift in $\phi$. The ZIP-TC chart is useful for shifts in either parameter, preferably for an upward shift in $\lambda$ and a downward shift in $\phi$, but is also useful for shifts in any direction and is the most attractive among all existing charts in terms of performance properties and ease of implementation. He et al. (2014) reconfirmed its dominating performance by comparing it with some other charts. Third, except for the ZIP-SH, all other charts assume that the target shift is known. Their optimal design and the determination of their upper control limit(s) depend on the magnitude of the shifts in either or both parameters. Even the ZIP-TC chart suffers from this limitation. On the other hand, because of the discrete nature of the ZIP model, it is difficult to find a suitable control limit for the ZIP-SH chart that ensures a target false-alarm probability.
From these perspectives, the proposed schemes are more general in nature. They are useful for shifts in either of the two parameters or in both parameters, in any direction. Furthermore, these schemes do not require any knowledge about the direction or the magnitude of possible shifts, either in determining the control limits or in implementation. In addition, they can be used for both individual observations and subgroups of size $m \geq 1$. The proposed schemes are not directly comparable with any of the existing schemes because, in practice, the shift size and its direction are often unknown, or we have subgroups of size $m > 1$, for which none of the existing schemes can be used. In contrast, our proposed schemes work smoothly in these settings. One can, however, compare the proposed schemes with the existing schemes in very specific situations, for example, when $m = 1$ and the direction and magnitude of the possible shift are known. Even in such cases, our proposed DS scheme outperforms the ZIP-SH scheme in almost all situations except for shifts of very high magnitude. In some situations, the DS scheme slightly underperforms the optimal ZIP-PC, ZIP-LC, ZIP-PLC, or ZIP-TC chart in known-shift situations when $m = 1$. Note that, if the parameter values under the OOC situation are unknown, these CUSUM charts cannot be constructed. As a consequence, when a guessed shift is used instead, either the false-alarm rate of the existing CUSUM schemes will increase or detection will be delayed.
We present the numerical results of the comparative study in Table 11. We consider 64 cases covering various situations, namely (i) different IC settings, (ii) shifts in different directions, (iii) cases where the mean remains the same but the variance differs, and (iv) cases where the mean changes while the variance remains the same. Note that the DS-type chart is more useful than the LR or Max schemes when $m = 1$; so, for the rest of the numerical comparison, we study only DS-type schemes. More specifically, we consider three cases, namely, $u(n) = \sqrt{n}$ (referred to as the DS scheme), $u(n) = \sqrt{n/\log(n)}$ (referred to as the MDS-A scheme), and $u(n) = (\log n)^2$ (referred to as the MDS-B scheme). From the numerical results in Table 11, we see that the MDS-A scheme outperforms the optimal ZIP-TC chart in 29 out of 64 situations. Additionally, it is very competitive in all other cases, as well as being more flexible. If, for a particular shift size, one or more of the existing charts are not useful, we indicate them by NA in Table 11. Additional computational details are provided in a supplementary file (available at http://www.asq.org/pub/jqt/).
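The three choices of $u(n)$ compared above can be written down directly. The following Python sketch is only illustrative: the function names are hypothetical, and the way $u(n)$ enters the charting statistic is defined elsewhere in the article and is not reproduced here. It assumes $n \geq 2$ so that $\log n > 0$.

```python
import math

# Candidate functions u(n) named in the text. The argument n is assumed to be
# a cumulative sample size with n >= 2, so that log(n) is strictly positive.

def u_ds(n: int) -> float:
    """DS scheme: u(n) = sqrt(n)."""
    return math.sqrt(n)

def u_mds_a(n: int) -> float:
    """MDS-A scheme: u(n) = sqrt(n / log(n))."""
    return math.sqrt(n / math.log(n))

def u_mds_b(n: int) -> float:
    """MDS-B scheme: u(n) = (log n)^2."""
    return math.log(n) ** 2

if __name__ == "__main__":
    # Tabulate the three functions for a few cumulative sample sizes.
    for n in (40, 100, 200):
        print(n, round(u_ds(n), 3), round(u_mds_a(n), 3), round(u_mds_b(n), 3))
```

Tabulating the three functions side by side makes it easy to see how differently they grow with the cumulative sample size.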
In summary, the advantages of the DS schemes over the existing CUSUM charts of He et al. (2012, 2014) are as follows:
i. There is no need to specify possible OOC values (shifts of interest) when determining the optimal control limit;
ii. The proposed chart is useful for rational subgroups as well as for individual observations;
iii. The proposed chart is competitive for shifts in the parameters in any direction.
Remark: As mentioned previously, for all the IC scenarios considered in constructing Tables 1-9, the UCL values of the proposed DS, LR, and Max schemes are the same. That is, practitioners do not even need information about the IC values of the parameters to obtain the control limits. This holds when the subgroup size $m$ is moderately large, say five or more.
For the DS scheme with $u(n) = \sqrt{n}$, the UCL depends only on the values of $ARL_0$, $m$, and $t_0$. This can be justified by the fact that the process $DS_{mt_0}, DS_{mt_0+m}, DS_{mt_0+2m}, \ldots$ generated by the successive values of the proposed charting statistic $DS_t$ is closely related to the Wald statistic process of order $s \geq 2$, where $s$ is the number of process parameters. Here, $s = 2$ (the parameters being $\phi$ and $\lambda$) and, in that case, the process may be approximated by a Bessel-type process (see Gombay 2002). Critical values for the respective sequential test, which uses a similar statistic for testing hypotheses on the process parameters, can be found in the book of Borodin and Salminen (2002). However, the distribution of the related stopping time variable (i.e., the run length $RL$, the first time at which the charting statistic exceeds a prefixed constant $c$, e.g., the UCL) is much more complicated and, to the best of the authors' knowledge, no related results exist in the literature.

Signal types and possible actions in monitoring number of defects
A major question that arises in practice is how to connect a signal with a specific action. It is clear from the numerical analysis that both $\lambda$ and $\phi$ can change in either direction (upward or downward), in the same direction or in opposite directions. This affects the behavior of the process and creates several OOC situations, each of which needs to be treated explicitly. Zhou and Zhu (2008) considered different actions (four different scenarios) when a signal is given under a statistical process monitoring procedure. Their approach is based on combining SPC with condition-based maintenance (CBM); this means that the process is monitored by SPC (e.g., a control chart) and, when an OOC state is detected, reactive maintenance is carried out. Let us assume that $X \sim \mathrm{ZIP}(\phi, \lambda)$, where $X$ denotes the number of defects that occurred in each sample. Following Zhou and Zhu (2008), we can summarize the actions on the different signals as follows:
i. A signal indicates that a change in one (or both) parameter(s) causes a decrease in both the mean and the variance of the process. This is, for example, the case $(\phi_1, \lambda_1) = (0.2, 1.5)$ or $(\phi_1, \lambda_1) = (0.25, 1.5)$ in Table 1. The process mean and variance shift from their IC values 1.6 and 2.24 to 1.2 and 1.56 or to 1.125 and 1.547, respectively. In the context of monitoring the number of defects, this means that the average number of defects produced by the process has decreased and the process variability has decreased as well. This is a good sign and there is no need to stop production; one can wait until planned maintenance.
ii. A signal indicates that a change in one (or both) parameter(s) causes an increase in the mean and a decrease in the variance of the process. This is, for example, the case $(\phi_1, \lambda_1) = (0.5, 1.0)$ or $(\phi_1, \lambda_1) = (0.7, 1.5)$ in Table 3. The process mean and variance shift from their IC values 0.4 and 1.04 to 0.5 and 0.75 or to 0.45 and 0.923, respectively. In that case, the variance decrease is a sign of improvement and can also be interpreted as a sign that the process is moving toward long-run stability. However, an increase in the process average (i.e., an increase in the average number of defects) is related to process deterioration and is usually of major concern. In that case, we suggest a call for reactive maintenance.
iii. A signal indicates that a change in one (or both) parameter(s) causes a decrease in the mean and an increase in the variance of the process. This is, for example, the case $(\phi_1, \lambda_1) = (0.6, 6.0)$ or $(\phi_1, \lambda_1) = (0.7, 7.0)$ in Table 5. The process mean and variance shift from their IC values 2.5 and 8.75 to 2.4 and 11.04 or to 2.1 and 12.39, respectively. In that case, even though the average number of defects has been reduced, the variance increase is a sign of process deterioration. Thus, the process is unstable and we suggest reactive maintenance.
iv. A signal indicates that a change in one (or both) parameter(s) causes an increase in both the mean and the variance. This is, for example, the case $(\phi_1, \lambda_1) = (0.7, 2.0)$ or $(\phi_1, \lambda_1) = (0.7, 2.5)$ in Table 3. The process mean and variance shift from their IC values 0.4 and 1.04 to 0.6 and 1.44 or to 0.75 and 2.063, respectively. This is clearly the most undesirable situation and we suggest reactive maintenance.
v. If there is a signal but no strong evidence for changes in either the mean or the variance (i.e., a false-alarm case), it is suggested to call for compensatory maintenance.
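The means and variances quoted in items i to iv follow from the standard ZIP moment formulas $E(X) = (1 - \phi)\lambda$ and $\mathrm{Var}(X) = (1 - \phi)\lambda(1 + \phi\lambda)$, where $\phi$ is the probability of an extra zero. The Python sketch below recomputes them and maps a parameter change to one of the four cases. The helper names are hypothetical, and the IC settings used in the usage lines, $(\phi_0, \lambda_0) = (0.2, 2.0)$, $(0.8, 2.0)$, and $(0.5, 5.0)$, are inferred here from the quoted IC means and variances rather than taken from the tables themselves.

```python
from typing import Tuple

def zip_moments(phi: float, lam: float) -> Tuple[float, float]:
    """Mean and variance of ZIP(phi, lam); phi is the extra-zero probability."""
    mean = (1.0 - phi) * lam
    var = mean * (1.0 + phi * lam)
    return mean, var

def signal_case(ic: Tuple[float, float], ooc: Tuple[float, float]) -> str:
    """Hypothetical helper: classify an IC -> OOC change into cases i-iv.

    Returns "other" when the mean or the variance is unchanged, since the
    four cases in the text only cover strict increases or decreases.
    """
    mean0, var0 = zip_moments(*ic)
    mean1, var1 = zip_moments(*ooc)
    if mean1 < mean0 and var1 < var0:
        return "i"    # both decrease: wait for planned maintenance
    if mean1 > mean0 and var1 < var0:
        return "ii"   # mean up, variance down: reactive maintenance
    if mean1 < mean0 and var1 > var0:
        return "iii"  # mean down, variance up: reactive maintenance
    if mean1 > mean0 and var1 > var0:
        return "iv"   # both increase: reactive maintenance
    return "other"

if __name__ == "__main__":
    # IC settings below are inferred from the quoted IC moments (assumption).
    print(zip_moments(0.25, 1.5))                 # second example of case i
    print(signal_case((0.2, 2.0), (0.2, 1.5)))    # case i
    print(signal_case((0.8, 2.0), (0.5, 1.0)))    # case ii
    print(signal_case((0.5, 5.0), (0.6, 6.0)))    # case iii
    print(signal_case((0.8, 2.0), (0.7, 2.5)))    # case iv
```

In practice the OOC parameters are unknown, so such a classification would be applied to estimates obtained after a signal rather than to known values.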

Example
In order to demonstrate the practical application of the proposed schemes, we consider the data from He et al. (2012, 2014). The data were collected from a light emitting diode (LED) packaging industry. He et al. (2012) considered a sample of size 200 and treated the first 100 data points as phase I observations. They removed four data points from the first 100 samples during phase I analysis and argued that the ZIP process is the appropriate model for the data set. The IC parameter values were $\lambda_0 = 5.5619$ and $\phi_0 = 0.8745$, as estimated in He et al. (2012) from the 96 observations remaining after outlier elimination. Further details on this data set can be found in He et al. (2012, 2014). For a fair comparison, we use sample size $m = 1$ and $t_0 = 40$. That is, for our study, $n_0 = m \cdot t_0 = 40$, as He et al. (2012) also considered in their phase I sample. In order to facilitate comparison with He et al. (2012, 2014), we set $ANOS_0 = ARL_0 = 200$ for analyzing the data; when $m = 1$, we know that $ANOS_0 = ARL_0$. Using 25,000 iterations of Monte Carlo simulation, we determine the upper control limit $UCL_{DS,200} = 132.1968$. Clearly, we get a signal at the very first points with each of the three schemes. It is therefore easy to understand that some shift had already taken place within the first 40 samples. This is natural as, according to He et al. (2012), there are two outliers within the first 40 samples. The details are omitted for brevity. For a more meaningful and fair comparison with the control schemes in He et al. (2012), we then remove the first four outliers and observe when our proposed schemes signal. We present the resulting control charts for the DS, MDS-A, and MDS-B schemes in Figure 1. We show the chart for the 100 phase II samples of He et al. (2012), without the initial 96 observations.
It is expected that the elimination of outliers from the warm-up sample will slow down detection and give more protection against false alarms (which is also important). We see that there is only one problematic observation (observation no. 103) among the first 30 phase II samples. In this part, He et al. (2012) could not detect any signal via the ZIP-PC scheme. A correct signal was triggered by the ZIP-PLC and ZIP-TC schemes at the right point, but it was followed by a series of false alarms. The process eventually came back to normalcy and then again went out of control after sample no. 127. The ZIP-PC scheme signaled for the first time at sample no. 162. Interestingly, the MDS-B scheme signals at sample no. 133, while the MDS-A and DS schemes both give their first signal at sample no. 137. Note that the MDS-A scheme marginally misses signals during sample nos. 133-136. The slight delay in the DS-type schemes is natural, as we omitted outliers from the warm-up samples, which would otherwise lead to quick detection.
We additionally consider a situation with $m = 4$, $t_0 = 10$, and $ARL_0 = 200$ (i.e., $ANOS_0 = 200 \times 4 = 800$). Using 25,000 iterations of Monte Carlo simulation, we find, in this case, the upper control limits $UCL_{DS,200} = 62$, $UCL_{Max,200} = 36$, and $UCL_{LR,200} = 62$ for the DS, Max, and LR charts, respectively. In Table 11, we also present the values of the plotting statistics for the DS, MDS-A, MDS-B, Max, and LR schemes, along with the ML estimates of the process parameters at each sampling stage $t$ ($\geq t_0$), in the columns entitled $\hat{\lambda}$ and $\hat{\phi}$. Note that the first plotting statistic in each case is based on all of the first 10 test samples, each of size four, that is, on 40 warm-up data points. The next plotting statistic in each case is based on the 40 warm-up data points and the next four sample observations that constitute a test sample. Similarly, the third plotting statistic in each case is based on the 44 previous observations and the next test sample of size four, and so on in Table 12. In Table 12, the points at which OOC signals are obtained are shown in boldface.
[Table 12. Plotting statistics for the DS, Max, and LR schemes for the LED data as in He et al. (2012). Columns: $t$, $\hat{\lambda}$, $\hat{\phi}$, $DS_t$, $MX_t$, $LR_t$.]
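The stage-wise ML estimates described above can be sketched as follows. The article does not state how the estimates are computed, so the EM iteration used here is a standard choice substituted for illustration. The function names `zip_mle` and `progressive_estimates` are hypothetical, and the data in the usage lines are made up rather than taken from the LED data set.

```python
import math
from typing import List, Sequence, Tuple

def zip_mle(xs: Sequence[int], max_iter: int = 1000, tol: float = 1e-12) -> Tuple[float, float]:
    """EM iterations for the ML estimates (phi_hat, lam_hat) of a ZIP sample."""
    n = len(xs)
    n_zero = sum(1 for x in xs if x == 0)
    total = sum(xs)
    if total == 0:
        raise ValueError("no positive counts: the ZIP MLE is not identifiable")
    # Starting values: treat half of the observed zeros as extra zeros.
    phi = 0.5 * n_zero / n
    lam = total / (n * (1.0 - phi))
    for _ in range(max_iter):
        # E-step: probability that an observed zero is an extra zero.
        w = phi / (phi + (1.0 - phi) * math.exp(-lam))
        # M-step: update the extra-zero probability and the Poisson mean.
        new_phi = n_zero * w / n
        new_lam = total / (n - n_zero * w)
        converged = abs(new_phi - phi) + abs(new_lam - lam) < tol
        phi, lam = new_phi, new_lam
        if converged:
            break
    return phi, lam

def progressive_estimates(data: Sequence[int], m: int, t0: int) -> List[Tuple[int, int, float, float]]:
    """Estimates at each stage t >= t0, each using all m * t observations so far."""
    out = []
    t = t0
    while m * t <= len(data):
        phi_hat, lam_hat = zip_mle(data[: m * t])
        out.append((t, m * t, phi_hat, lam_hat))
        t += 1
    return out

if __name__ == "__main__":
    toy = [0, 0, 0, 1, 2, 0, 3, 0, 0, 4] * 10   # made-up counts, not the LED data
    for t, n, phi_hat, lam_hat in progressive_estimates(toy, m=4, t0=10)[:3]:
        print(t, n, round(phi_hat, 4), round(lam_hat, 4))
```

With $m = 4$ and $t_0 = 10$, the first three stages use 40, 44, and 48 observations, matching the description of the progressive scheme.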
From Table 12, it is easy to see that an OOC signal is observed for the first time at the 32nd test sample (i.e., after 128 sample observations) in all three charts, where the value of each plotting statistic exceeds the corresponding upper control limit for a target $ANOS_0 = 200$. These findings are consistent with those of He et al. (2012) in the sense that they observed a clear shift only after the 128th sample point in the ZIP-LC and ZIP-TC charts. There were some early signals in these charts but no persistent shift; these may have been early false alarms, as $ANOS_0 = 200$ is not very high by industry standards. Nevertheless, if we set the target $ARL_0 = 200$, from Table 12 and Figure 2 it is easy to see that the DS chart and the LR chart signal at the 35th test sample (i.e., after 140 sample observations), but the Max chart does not signal at all. This is not surprising, as our simulation study shows that the Max chart is not very effective in monitoring the two parameters of a ZIP process simultaneously. He et al. (2012) finally concluded that a shift occurred after the 162nd sample, whereas a signal is clearly observed slightly earlier when progressive monitoring based on the DS chart or the LR chart is used. Further, it comes with the advantage that no phase I sample needs to be identified beforehand.
[Figure caption, partially recovered: ... Table VIII of He et al. (2014).]
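The bookkeeping behind these statements, that a signal at test sample $t$ corresponds to $m \cdot t$ individual observations and that $ANOS_0 = ARL_0 \times m$, can be sketched in a few lines of Python. The function names are hypothetical, and the statistic values in the usage lines are made up for illustration; they are not the entries of Table 12.

```python
from typing import Optional, Sequence, Tuple

def anos_from_arl(arl: float, m: int) -> float:
    """Average number of observations to signal for subgroups of size m."""
    return arl * m

def first_signal(stats: Sequence[float], ucl: float, t0: int, m: int) -> Optional[Tuple[int, int]]:
    """First stage t at which the plotting statistic exceeds the UCL.

    stats[0] is the statistic at stage t0, stats[1] at stage t0 + 1, and so on.
    Returns (t, m * t), i.e., the test-sample index and the number of
    individual observations used up to that point, or None if no signal.
    """
    for offset, value in enumerate(stats):
        if value > ucl:
            t = t0 + offset
            return t, m * t
    return None

if __name__ == "__main__":
    print(anos_from_arl(200, 4))                 # ANOS_0 for ARL_0 = 200, m = 4
    made_up = [10.0] * 22 + [70.0, 75.0]         # hypothetical statistic values
    print(first_signal(made_up, ucl=62.0, t0=10, m=4))
```

Reporting both the stage index and the observation count makes comparisons with charts for individual observations, such as those of He et al. (2012), more direct.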

Conclusions
The zero-inflated Poisson distribution is a well-known model that is applied frequently in zero-defect environments as well as to processes with a large number of zeros. In this article, we propose and compare three control charts suitable for monitoring a zero-inflated Poisson process. We provide detailed statistical design and implementation steps for each of the proposed schemes. We also introduce a post-signal follow-up procedure. We compare the three schemes under zero-state and steady-state shifts in terms of ARL and ANOS. Percentile points of the respective run-length distributions are also given. We also compare these schemes with Shewhart- and CUSUM-type charts for ZIP processes. Some variations of the proposed distance-based schemes are also considered. We carry out Monte Carlo simulation to determine the upper control limits of the proposed schemes. For reasonably high target $ARL_0$ values, such as 370 or 500, the upper control limits of these schemes do not depend on the IC values $\phi_0$ and $\lambda_0$ of the ZIP process. Our numerical analysis reveals that the proposed DS scheme is very powerful in the detection of small and moderate shifts in the process parameters. It can also detect both increasing and decreasing shifts from $\phi_0$ and $\lambda_0$, as well as shifts in opposite directions. Under certain circumstances, it outperforms the CUSUM charts of He et al. (2012) for ZIP processes in terms of the OOC ARL. The LR scheme behaves almost the same as the DS scheme, and the Max scheme is by far the worst in most cases.
Finally, future research should address a common practical problem, namely that the true values of the process parameters are unknown. In such situations, suitable modifications of the proposed schemes need to be studied in detail. In other words, in an unknown-parameter context for a ZIP process, the development of self-starting monitoring schemes is an interesting open research problem that is worth investigating. In this context, it is also important to study the effect of estimating the unknown parameters from a warm-up sample and using these estimates as the true values in the proposed schemes. Research in that direction is in progress. The development of time-weighted multivariate schemes (e.g., MEWMA) based on the DS or LR statistics also constitutes an important topic for future research.
[Figure caption, partially recovered: ... Table VIII of He et al. (2014). Subgroups are of size four.]