Analysis of medians under two-way model with and without interaction for Birnbaum–Saunders distributed response

The Birnbaum–Saunders (BS) distribution, well-known as the fatigue-life distribution, has been used in numerous disciplines ranging from engineering to medical sciences. In this article, we develop a test for analysis of medians for BS distributed response to assess the impact of two interacting factors on the median, where no test is presently available. The proposed integrated likelihood ratio test (ILRT) eliminates the nuisance shape parameters by integrating them out. The second-order accurate asymptotic chi-square distribution of ILRT is derived. An in-depth simulation study strongly supports its excellent performance even under small group sizes. Furthermore, ILRT developed under the one-way model is found uniformly superior over its peers, is straightway extendable under general multiway setup, and has potential to be extended to other non-normal response variables. Its genuine need in industry, where non-normal responses are commonly encountered, is highlighted through analysis of three real data sets: ILRT strongly picked out the deposition time as influential factor in epitaxial layer experiment, revealed significant impact of spools on fiber life for the failure times of Kevlar 49 fiber data, and gave more accurate parameter estimates in delivery time data experiment, as assessed by various model adequacy tools, where its competitors failed to deliver desired results.


Introduction
The Birnbaum-Saunders (BS) distribution is widely used in the context of material fatigue and reliability applications, also known as fatigue-life distribution.It is used in various situations where the accumulation of a particular factor forces a quantifiable characteristic to exceed a critical threshold.For example, migration of metallic flaws in nano-circuits due to heat in a computer chip, accumulation of harmful substances in the lungs from air pollution, the occurrence of natural disasters such as earthquakes and tsunamis, among many other situations.Recently, its use is not limited to analyze 'fatigue' or 'cumulative damage' type areas from where it was originated but also has found profound applications in business, economics, finance, industry, insurance, nutrition, psychology, quality control, among many other disciplines and is acquiring increasing popularity.For more details see Leiva [3], among many other references.
One-way and two-way analyses are commonly used to analyze data generated from industries and many other disciplines, where the data are most often likely to be nonnormally distributed.More often such data are non-negative and have a lifetime related probability distribution, where there is an acute paucity of literature related to two and multiway setup.Often the non-negative and/or skew nature of the data is totally ignored and normal-based methods are blindly used, leading to a possibility of incorrect decisions.Section 5 demonstrates analyses of three real data sets related to engineering, where the results based on F-test were not reliable.Thus, there seems to be an absolute need for developing correct statistical procedures for handling non-normal ANOVA problems.In spite of the fact that BS distribution is widely used in various disciplines, the two-way ANOVA setup under BS is absolutely un-handled in the literature, whereas there are only a few ways available in a one-way setup.Thus, the present work seems to be an attempt to fill this gap in the literature.
Noting that the mean and standard deviation of the BS distribution are directly proportional to the scale parameter, as well as the scale parameter of the distribution equals the median, assessing the impact of external factors on the median or equivalently on the scale is an important problem in industrial applications, leading to a factorial model for the scale parameter which henceforth is refereed to as the median.Presently no test is available in the literature under this scenario, while under the one-way model, two tests by Niu et al. [6] based on generalized pivotal quantities (GPQ) and Delta method are available.Regression models have been handled through the log-transformation on the BS distributed response, see for example Rieck and Nedelman [7] and Lemonte et al. [4] among others.
There is an acute paucity of literature on the analysis of factorial data with non-normal response variables.A few existing tests like generalized linear models make use of maximum likelihood-based procedures which work better under relatively larger sample sizes while in industry there could be many situations where gathering more data would be probably expensive.
The present work develops a simple test for the two-way setup with and without interaction for the median, based on the idea of eliminating nuisance shape parameters through the use of integrated likelihood (IL).An integrated likelihood ratio test (ILRT) is constructed and is shown to be asymptotically chi-square distributed up to the second-order.Simple ad-hoc multiplicative adjustments for small group sizes have been developed leading to improved performance under small samples rendering the ILRT uniformly usable.A simulation study shows that ILRT controlled the type-I error remarkably and uniformly well and attains very good power under varying realistic situations.A similar test under the one-way model is also developed and found to outperform the existing GPQ and Delta method-based tests.For similar work under normally distributed data, we refer to Kulkarni and Patil [9], SenGupta and Kulkarni [8], and Kulkarni and SenGupta [10] in the context of directional data.
The remainder of this article is organized as follows.Section 2 develops the ILRT under a two-way with and without interaction model and its asymptotic distributions.Section 3 is devoted to a thorough performance assessment of the ILRT based on a rigorous simulation study.Section 4 develops its version under a one-way model and reports its performance assessment.A rigorous analysis with several model adequacy checks of three real data sets from industrial applications presented in Section 5 highlights the advantages of the proposed test in industry.The paper is concluded with a section of conclusions.

The proposed ILRT
Let L(ψ ψ ψ, λ λ λ) be a likelihood function under consideration, ψ ψ ψ being the vector of parameters of interest and λ λ λ ∈ the vector of nuisance parameters.An IL is of the form where is a non-negative weight function on making the above integral convergent for every fixed ψ ψ ψ.L depends on the data and ψ ψ ψ, and can be used as a standard likelihood function for all likelihood-based inference procedures.
In the following, we develop analysis of medians, a version of ANOVA for the mean parameter under two-way with interaction setup.The medians are assumed to be influenced by two factors while the shape parameters are assumed to be unaffected.

Two-way setup with interaction
The probability density function (pdf) of a BS distribution (see Leiva [3]) under two-way setup with interaction form for the median ψ ij is is the unknown overall effect, α i is the unknown effect of the ith level of first factor, β j is the unknown effect of the jth level of second factor, (αβ) ij is the unknown interaction effect of the ith level of first factor and jth level of second factor and γ ij are unknown shape parameters.
Let T ij1 , . . ., T ijn ij be the observations from the (i, j)th cell following a BS( ijk /N) −1 be the overall sample size, arithmetic and harmonic mean, respectively, Tai..
) −1 be the arithmetic and harmonic means of ith row, with similar counter parts for column, Taij.= ijk /n ij ) −1 be the arithmetic and harmonic means of (i, j)th cell and ta , th , tai.. , thi.. , ta.j., th.j., taij., thij.denote their observed values, respectively.The hypothesis corresponding to the test for no interaction is  [1] recommend the uniform prior taking a constant value 1 over the entire nuisance parameter space as an apparent option when no prior information is available about the nuisance parameters.In the present case, an added advantage is that it gives a closed-form expression for the resulting IL.
Though analytical proof of the existence of a unique maximum likelihood estimate (MLE) is not tractable at stage, thousands of sample data sets were randomly generated and observed to give unique MLE's both under ¯l0 as well as ¯l1 , and practically no difficulty was found in employing the test developed here.
The expressions of resulting IL ratio and the ILRT statistics say, T ILRT2 are same as those given in Equations ( 6) and ( 7), respectively, with estimates of median parameter replaced by ψij0 = μ0 + αi0 (10) and are the MLE's of ψ ij under H 0 and whole parameter space, respectively.The asymptotic χ 2 distribution of T ILRT2 is stated in Theorem 2.2.
The proof is exactly in line with the one of Theorem 2.1 and is omitted.

Assessment of convergence of type-I error rate through simulations
The type-I errors of ILRT are simulated for various common group sizes n = 5( 5)50 (here and in the sequel the notation a(b)c denotes the set of values of n starting with a, ending with c, with successive increments by the number b), overall effect μ = 0.5, 1, 2, 3.5, 5; ith row level α 1 = 0.5, 1, 2, 3.5, 5, α i = α i−1 × h 2 , i = 2, . . ., I; incrementing h 2 through 0.5, 1, 1.5.By default, under H 0 , under the without interaction setup, β j = 0.5, 1, 2, 3.5, 5, j = 1, . . ., J while under the with-interaction setup, jth column level β 1 = 0.5, 1, 2, 3.5, 5, β j = β j−1 × h 2 , j = 2, . . ., J and (ij)th interaction effect (αβ) ij = 0.5, 1, 2, 3.5, 5, i = 1, . . ., I, j = 1, . . ., J (and other parametric combinations as specified in the set P given in Section 3.1 except GSP).The simulated type-I errors plotted in Figures 1 and 2 for with and without interaction setup, respectively, converge very well to the target level 0.05, indicating good conformation to the asymptotic χ 2 distribution specified by Theorems 2.1 and 2.2, for group sizes exceeding 5 and 30, respectively.Under smaller group sizes, the type-I errors are slightly smaller for the with-interaction setup and larger for the without-interaction setup than the desired level.Ad-hoc corrective adjustment factors suggested in the next subsection (Equations ( 12) and ( 13)) stabilized the type-I errors quite well to the desired level under small group sizes.

Corrective adjustments under small samples (two-way setup)
Let T α and q α be the αth quantiles of the test statistic T and its desired χ 2 distribution, respectively.Nothing that P(T > T 1−α ) = α, and hence that P[(q 1−α /T 1−α )T > q 1−α ] = α, it follows that the entity q 1−α /T 1−α works as a corrective factor (cf ) for adjusting the observed sizes to the desired level α under small group sizes.These adjustments are most  often essential under small group sizes since the asymptotic distributions of likelihoodbased tests work well under little larger group sizes.
The cf is calculated under different group sizes using the ratio of theoretical 95th quantiles of the desired χ 2 (I−1)(J−1) and χ 2 (J−1) (note that these are q 1−α at 5% level) distribution to the simulated 95th quantiles of T ILRT1 of ( 7) and T ILRT2 of (7, 10 and 11) under H 0 , respectively, based on 2000 simulations for each of the parametric combinations specified in Section 2.3.The resulting cf was found to depend mainly on the common group size and is given in Equations ( 12) and (13) for two-way with and without interaction setup, ( The ILRT is suggested to be used with this cf to be employed individually for each group based on the distinct group size n ij in place of n in the expression for cf (resulting in cf ij say) giving rise to the adjusted statistic for two-way with and without interaction setup .
The box plots of the simulated sizes of adjusted ILRT plotted in Figure 3(a,b) for two-way with and without interaction setup, respectively, are indicative of the improvement in the size performance of ILRT after adjustments.Throughout in the sequel, we use this adjusted version of ILRT and the word 'ILRT' is used to represent this adjusted ILRT throughout.

Performance assessment through simulation study
This section presents an exhaustive simulation study to test the performance of the proposed ILRT in terms of type-I error and power function under a fairly representative set of parametric combinations.
Type-I errors of ILRT are simulated for the parametric combinations in P based on 2000 simulations.The commonly used 5% level of significance was used.Figure 3( a,b) giving the box plots of the simulated sizes under two-way with and without interaction setup indicates well-concentrated type-I errors of ILRT around the nominal level.
Figure 4 displays the power functions (as a function of h 1 ) under I = 2; J = 2 and I = 6; J = 2 as representative of all patterns represented in the simulation study.For I = 6; J = 2 increments were slightly altered to be h 1 = 1(0.2)2.6 for a better visual display.The panel headings of each sub-figure are the pairs (VPS, GSP).
Clearly, ILRT gives consistently satisfactory power under all parametric combinations considered and hence it can be uniformly used under all real-life situations.ILRT-based homogeneity test under one-way setup can be similarly developed and is derived in the next section.Note that the test is equivalent to ANOVA for means under BS distribution when the shape parameters of all groups are the same.

ILRT under one-way setup
Consider the one-way setup with I groups, there being n i observations in ith group, i = 1, . . ., I. The pdf of the jth observation in the ith group t ij is is the median and γ i is the shape parameter.
Let T i1 , . . ., T in i be the random observations forming the ith group, following be the overall sample size, arithmetic and harmonic mean, respectively, Tai.= n i j=1 T ij /n i and Thi.= ( n i j=1 T −1 ij /n i ) −1 be the arithmetic and harmonic means of ith group and ta , th , tai , thi denote their observed values, respectively.The hypothesis of interest is to test The likelihood function of aggregate sample is Here, β β β = (β 1 , . . ., β I ) is the vector of unknown parameters of interest and γ γ γ = (γ 1 , . . ., γ I ) is the vector of unknown nuisance parameters.Similar to two-way setup, integration of the likelihood function with respect to the uniform prior yields and with the abuse of similar notation, and In the similar context, β0 and β β β are maximizers of rhs of ( 15) and ( 16), respectively.For numerical maximization, the vector ( ta × th , ( ta 1 .× th 1 ., . . ., ta I .× th I .)) worked well as initial starting value.Exactly parallel computations yield the following ILRT statistics for the one-way setup: The asymptotic χ 2 distribution of T ILRT3 is stated in Theorem 4.1.Its proof is similar to that of Theorem 2.1 and is omitted.

Corrective adjustments under small samples (one-way setup)
Based on 100,000 simulations, the simulated type-I errors of ILRT at 5% level of significance, for various group sizes n = 5(5)45; 100, 200; group shape parameters γ 1 = 0.5(0.5)5,γ i = γ i−1 × h 2 , i = 2, . . ., I, incrementing h 2 through 0.5, 1, 1.5; (and other parametric combinations as specified in the set Q given in Section 4.2.1 except GSP) converged well to the target level 0.05, indicating good conformation to the asymptotic χ 2 distribution specified by Theorem 4.1 for group sizes exceeding 45.As before, a minor ad-hoc corrective adjustment factor is developed using a parallel approach, stated in Equation ( 18), and succeeds very well in controlling the sizes, suggesting the use of corrected statistics given in Equation ( 19)

Performance assessment through simulation study
A rigorous simulation study is attempted to compare the performance of ILRT with the GPQ-based test in terms of type-I error and power function.

Type-I error rate
The study is carried out under the following fairly representative set of parametric combinations: number of groups I = 2(2)8; group shape parameters γ 1 = 0.5, 2, γ i = γ i−1 × h 2 , i = 2, . . ., I, incrementing h 2 through 0.8, 1, 1.15; group median parameters β i = 1, i = 1, . . ., I (kept fixed based on a pilot study showing no impact of β β β on the observed sizes under H 0 ), and n i = 5 and n i = 15, respectively, representing small and medium group sizes, i = 1, . . ., I. Similar to the two-way setup, four GSP were considered: GSP-1: 100% (all) small size groups (n i = 5, i = 1, . . ., I), GSP-2: 75% small size and 25% medium size (n i = 15) groups, GSP-3: 50:50 proportion of small and medium size groups and the last one GSP-4: 25% small size and 75% medium size groups.The four VPS considered were VPS-1 (small equal shape parameters (< 1)): γ 1 = 0.5; h 2 =1, VPS-2 (large shape under large size groups): γ 1 = 0.5; h 2 =1.15,VPS-3 (small shape under large size groups): γ 1 = 2; h 2 =0.8 and VPS-4 (large equal shape parameters (> 1)): γ 1 = 2; h 2 =1.As before, the set of all parametric combinations resulting from the above settings is denoted by Q in the sequel.Based on 5000 simulations, type-I errors of ILRT, LRT, GPQ and Delta method are simulated for each of the parametric combinations in Q.For GPQ, the type-I error is estimated as the proportion of the simulated 5000 P-values falling below the nominal level of α.A single P-value was calculated based on 1000 GPQs.Commonly used 5% level of significance was used.The box plots of simulated type-I errors plotted in Figure 6 revealed uniform performance of tests over Q, differing across the tests, namely, conservative sizes of GPQ, well-concentrated close to nominal sizes of ILRT and liberal (large) sizes of LRT and Delta method.
Thus, LRT and Delta method being not usable with respect to size criterion is not included in the further power study presented next.

Power comparison
Powers were simulated based on 5000 simulations for the aforementioned tests for the same parametric setup Q.Additionally, the median parameter vector β β β was systematically varied taking β 1 = 1 and 2)3 and additionally by h 1 = 5(5)30, 40, 50 under two groups so as to acquire reasonable power.Again, h 1 = 1 corresponds to the null hypothesis.
The power functions (as a function of increments h 1 ) for I = 2, 4, 6 and 8 groups are displayed in Figure 7.Here too, the panel headings of each sub-figure are the pairs (VPS, GSP).For two groups, GSP-2 and 4 are not valid as the partition can be only 50:50 or 100%.Notable observations revealed in the power comparison are reported below.

Observations based on power functions
ILRT performs excellent under two situations: (i) small size groups, irrespective of the shape parameter pattern in the groups and (ii) under larger values of shape parameter for all groups.In all other situations, both tests perform almost similar.

Examples
The three real data sets; two under the two-way and one under the one-way setup highlight the contribution of the ILRT tests developed here in industrial applications to arrive at more correct decisions.The model adequacy checks attempted by employing two graphical analyses, namely the quantile residual plots suggested by Dunn and Smyth [2] and interval plots of residuals, as well as the AIC and BIC criteria strongly support in favor of the decision given by ILRT.Matlab codes for data analysis are provided as supplementary material.

Example 1. Two-way setup: epitaxial layer thickness
The data are taken from Montgomery [5], where the epitaxial layer thickness (measured in μm) is studied to examine the impact of two factors, namely, deposition time (A) and arsenic flow rate (B), each one at two levels.Each of the resulting four combinations is replicated four times.The summary statistics sample median ( βij ) and shape parameter Cell number (i, j) ( ( γij ), i = 1, . . ., I, j = 1, . . ., J are reported in Table 1.The researchers were interested in reducing the variability in the thickness of the layer by keeping the average thickness level as close to the target level as possible.
Often, practitioners adopt a normal-based F-test with homogeneous variances by ignoring the non-negative feature and/or skewed nature of the response distribution.This popularly used analysis employed on the present data picked none of the two factors as having a significant impact on the epitaxial layer thickness.The data are also analyzed using the BS-based ILRT developed in Section 2 which seems more appropriate here on account of the non-negative nature of the response.The later was also validated by model adequacy checks described in the next paragraph.As noted earlier, this analysis is equivalent to testing the impact of factors on the cell medians, that is more relevant for a skewed distribution like BS. Results of analyses are reported in Table 2, where the ILRT strongly picked out the deposition time (factor A) as an influential factor, which the conventional F-based analysis was incapable to pick.
To assess the adequacy of the model supported by ILRT, the quantile residuals (r q ) are obtained as, r qijk = −1 {F(t ijk , μ, αi , γij )}; i = 1, . . ., I, j = 1, . . ., J and k = 1, . . ., n ij , where F(•) is the cumulative distribution function (CDF) of BS distribution and (•) is the standard normal CDF.Since the model has picked out a single significant factor, the estimates μ, αi are taken as the maximizers of the log-IL given in Equation ( 9) involving single factor and γij are the regular MLE's of the group shape parameters computed holding the parameters μ, α i fixed at μ, αi .For the model involving no significant factors supported by the F-test, r ijk = (t ijk − tijk ); i = 1, . . ., I, j = 1, . . ., J and k = 1, . . ., n ij .tijk = 14.514; i = 1, . . ., I, j = 1, . . ., J and k = 1, . . ., n ij being the overall mean of all observations.The P-value (0.4790) of the normality test for quantile residuals under ILRT based on BS (Figure 8(a)) distribution strongly supports the validity of the BS assumption for the data as well as the model proposed by ILRT.Furthermore, the almost uniform appearance of the interval plot (Figure 9(a)) of the residuals grouped cell-wise over the four cells (renumbered as (1, 1) = 1, (1, 2) = 2, (2, 1) = 3 and (2, 2) = 4) where each individual interval covers zero, under the BS model goes in the favor of the model given by the ILRT indicating factor A significantly affects the epitaxial layer thickness.Note that the similar plot for F-based residuals indicates heterogeneous variances for the residuals and the middle two cells do not cover zero, indicating that their mean is different from zero.Furthermore,  the smaller values of AIC, BIC and MSE reported in Table 2 for BS more strongly support in favor of the BS-based model, against the F-based model with no significant factors.
The main effect plot of deposition time (factor A) plotted in Figure 10 further suggests that epitaxial layer thickness increases as the deposition time increases.In the plot, the range of epitaxial layer thickness under both levels of deposition time falls in the allowable range of 14-15 μm.However, the main aim was to reduce the variability of the layer while being close to the target level.As the smaller level of deposition time controls the mean level within the target range, the smaller level, that is 10 min of deposition time is recommended to insure smaller median parameters and hence smaller variability.The level of arsenic flow rate can be arbitrarily chosen, as it does not produce any significant impact on the median.

Example 2. One-way setup: failure times of Kevlar 49 fiber
The test procedure under one-way setup is illustrated with a real data set from Leiva [3].The response variable is the failure time (measured in hours) of Kevlar 49 fiber subject to various stress levels (measured in MPa) and spools.Eight spools at a constant stress level 29.7 are considered for the present analysis.Table 3 presents the summary statistics, sample median ( βi ) and shape parameter ( γi ), i = 1, . . ., 8.
The data are analyzed as a one-way setup for the median under BS distribution assuming unequal shape parameters.ILRT (Equation ( 19)) is employed with the multiplicative corrective adjustment given in Equation ( 18).The P-value for the GPQ test was computed based on 100,000 samples.A conventional F-test is also employed on the data.The results of the data analysis are reported in Table 4.
The ILRT and F-test indicate a significant impact of spools on the fiber life while the GPQ is not capable of identifying this impact.Note that this is a situation with large number of small sized groups.In the simulation study for power comparison considered in  Section 4, this situation is represented in Figure 7 under I = 8 groups with the first column representing small sized groups.Thus, the present situation resembles first column of Figure 7 under I = 8 groups where ILRT is clearly an out-performer over GPQ indicating more trust in the results produced by ILRT over those of GPQ for this case.The main effect plot (Figure 11) indicates that the average failure time of fiber is maximum at spool-4.
As before, for model adequacy checks, the quantile residuals are obtained as, r qij = −1 {F(t ij , βi , γi )}; i = 1, . . ., I, j = 1, . . ., n i , where F(•) CDF of BS distribution and (•) is the standard normal CDF.For ILRT βi are obtained based on the log-IL given in Equation ( 16) and γi is the regular MLE computed holding the medians β i fixed at βi while for GPQ, GPQ-based estimates of parameters are used.As before, for conventional F-test r ij = (t ij − tij ); i = 1, . . ., I and j = 1, . . ., n i , where tij = ti., i = 1, . . ., I and j = 1, . . ., n i being the mean of ith group.The normal probability plots of residuals are plotted in Figure 12, which clearly indicate that the ILRT-based residuals are normal while others may not be so.On account of AIC, BIC and P-value for normality of quantile residuals, the model and hence the parameter estimates offered by ILRT seem to be more reliable.Here too, the uniform appearance of the interval plots (Figure 13(a)) without any trend under the H 1 model also goes in favor of the decision given by the ILRT indicating significant differences among the spools.Note also that the GPQ-based residuals indicate non-zero mean for spool numbered 1, 2, 4 and 7.

Example 3. Two-way setup: delivery time data
The data are taken from Montgomery [5], where the engineer is interested to test the impact of two different types of 32-ounce bottles (glass (level-1) and plastic (level-2)) (factor A) on the time to deliver 12-bottle cases of the product.Two workers (factor B) are used to   move 40 cases of product 50 ft on a standard type of hand truck and stacking the cases in a display to perform a task.Each treatment combination is replicated four times, and the delivery time is recorded.Sample medians and shape parameters of the resulting data under BS distribution assumption are tabulated in Table 5.
The data are analyzed using conventional normal-based F-test with homogeneous variances and also using the BS-based ILRT developed in Section 2. The results are assessed by model adequacy checks.Table 6 shows the results of the analysis, which show that both tests identified both factors as influential factors, and no significant interaction.However, it is notable that the model parameter estimates given by the two tests are different.To assess the adequacy of the model supported by ILRT, the quantile residuals are obtained as before, r qijk = −1 {F(t ijk , μ, αi , βj , γij )}; i = 1, . . ., I, j = 1, . . ., J and k = 1, . . ., n ij .Since the interaction term does not seem to be significantly different from zero, the estimates μ, αi , βj are taken as the maximizers of the log-IL given in Equation ( 4) and estimates of the shape parameter γij are the regular MLE's of the group shape parameters computed holding the parameters μ, α i , β j fixed at μ, αi , βj ; i = 1, . . ., I, j = 1, . . ., J.

Concluding remarks
The proposed ILRT provides a uniformly satisfactory option for two-way and one-way setup under BS distributed response variable commonly encountered in industries.The multiway generalization of the test is straightforward, and an attempt is being made to extend the test to other non-normal response variables namely when the data are Gamma, or Weibull distributed, after employing the close to-normal transformations suggested in Kulkarni and Powar [11], Kulkarni and Powar [12], respectively as well as under censoring themes, an area that is very less addressed in the literature in spite of its wide applicability.For one-way model, the ILRT developed here is observed to prominently outperform its GPQ-based peer test under almost all of scenarios.ILRT has an added advantage of having less computational efforts making it more useful for practitioners not using advanced software.The analysis of real data sets clearly highlights the need of ILRT in industrial applications.

Figure 1 .
Figure 1.Box plots portraying the rate of convergence of type-I errors of ILRT with common group size (n) under two-way with interaction setup.

Figure 2 .
Figure 2. Box plots portraying the rate of convergence of type-I errors of ILRT with common group size (n) under two-way without interaction setup.

Figure 3 .
Figure 3. Box plot of type-I errors of ILRT under two-way with and without interaction setup.(a) Two-way with interaction setup.(b) Two-way without interaction setup.

Figure 6 .
Figure 6.Simulated type-I errors of tests under one-way setup.

Figure 7 .
Figure 7. Simulated power functions under one-way setup.

Figure 8 .
Figure 8. Example 1.Normal probability plots of residuals: two-way setup.(a) Quantile residuals under ILRT based on BS.(b) Residuals under F-test based on normal.

Figure 9 .
Figure 9. Example 1. 95% confidence interval plots of the mean of residuals for each treatment combination: two-way setup.(a) Quantile residuals under ILRT based on BS.(b) Residuals under F-test based on normal.

Figure 10 .
Figure 10.Example 1. Main effect plot of deposition time factor under two-way model.

Figure 11 .
Figure 11.Example 2. Main effect plot of spool under one-way model.

Figure 12 .
Figure 12.Example 2. Normal probability plots of residuals: one-way setup.(a) Quantile residuals based on maximum IL estimates of parameters under H 1 .(b) Quantile residuals based on GPQ-based estimates of parameters under H 0 .(c) Residuals under F-test based on normal distribution with unequal treatment means.

Figure 13 .
Figure 13.Example 2. Interval plots of residuals: one-way setup.(a) Quantile residuals based on maximum IL estimates of parameters under H 1 .(b) Quantile residuals based on GPQ-based estimates of parameters under H 0 .(c) Residuals under F-test based on normal distribution with unequal treatment means.

Figure 14 .
Figure 14.Example 3. Normal probability plots of residuals: two-way setup.(a) Quantile residuals under ILRT based on BS.(b) Residuals under F-test based on normal.
For the model supported by the F-test, the residuals are r ijk = (t ijk − tijk ); where tijk = tai.. + ta.j.− ta , referring to the notation developed in Section 2.1, i = 1, . . ., I, j = 1, . . ., J and k = 1, . . ., n ij .The P-value (0.1250) of the normality test for quantile residuals under ILRT based on BS (Figure14(a)) distribution strongly supports the validity of the BS assumption for the data as well as the model parameter estimates obtained under ILRT.Furthermore, the almost uniform length and appearance of the interval plot (Figure15(a)) of the residuals grouped cell-wise over the four cells, where each individual interval covers zero, under the BS model goes in the favor of the model estimates given by the ILRT indicating factors A and B significantly affect the delivery time.Similar plot for F-based residuals indicates heterogeneous variances for the residuals.Furthermore, the smaller values of AIC and BIC reported in Table6under BS-based model more strongly support in favor of the BS-based model, against the F-based model.The main effect plot of two worker (factor B) and bottle type (factor A) plotted in Figure16further suggests that delivery time required for worker 2 and plastic bottle is smaller.

Figure 15 .
Figure 15.Example 3. 95% confidence interval plots of the mean of residuals for each treatment combination: two-way setup.(a) Quantile residuals under ILRT based on BS.(b) Residuals under F-test based on normal.

Figure 16 .
Figure 16.Example 3. Main effect plots of factors A and B.

Table 1 .
Example 1. Summary sample statistics for four treatment combinations.

Table 3 .
Example 2. Summary sample statistics for eight spools.

Table 5 .
Example 3. Summary sample statistics for four treatment combinations.
a Quantile residuals are used.