Maximum One-Factor-At-A-Time Designs for Screening in Computer Experiments

Abstract Identifying important factors from a large number of potentially important factors of a highly nonlinear and computationally expensive black box model is a difficult problem. Morris screening and Sobol’ design are two commonly used model-free methods for doing this. In this article, we establish a connection between these two seemingly different methods in terms of their underlying experimental design structure and further exploit this connection to develop an improved design for screening called Maximum One-Factor-At-A-Time (MOFAT) design. We also develop efficient methods for constructing MOFAT designs with a large number of factors. Several examples are presented to demonstrate the advantages of MOFAT designs compared to Morris screening and Sobol’ design methods.


Introduction
Identifying the most important factors (or inputs) affecting the output from a set of potentially important factors is an important problem in scientific investigations (Saltelli et al. 2004). Once the most important factors are identified, one can focus only on those factors, which can help reduce the cost of future investigations. This problem is known as factor screening in the literature. Doing factor screening with noisy observations is often called "variable selection" in statistics and "sensitivity analysis" in uncertainty quantification. This article deals with designing experiments so that factor screening can be done with less data. A recent review of this topic can be found in Woods and Lewis (2017).
The simplest and most intuitive method for doing screening experiments is to vary one factor at a time, which is arguably the most commonly used method by scientists and engineers. Fractional factorial designs (Wu and Hamada 2021) and definitive screening designs (Jones and Nachtsheim 2011) can provide a much better estimate of the factor's importance than onefactor-at-a-time (OFAT) designs by fitting linear or quadratic regression models. This is especially true when the observations are noisy. However, when the data are generated from computer models with negligible errors (Santner et al. 2003), nonlinear and higher-order interaction effects are often of interest and the problem of factor screening becomes much more challenging, which is the main focus of this article.
When nonlinear and higher-order interaction effects are suspected, we can fit nonparametric regression models such as a Gaussian process (GP) model to the data, and then estimate Sobol' indices (Oakley and O'Hagan 2004;Chen, Jin, and Sudjianto 2005)  importance. Space-filling designs (Joseph 2016) can be used for efficiently estimating such nonparametric regression models. However, this model-based approach targets response surface modeling rather than factor screening. Moreover, the model fitting can become cumbersome when there are a large number of factors involved in the experiment. Morris (1991) proposed a model-free screening method, which uses a collection of OFAT designs randomly placed within the experimental region. Each OFAT design gives an estimate of the "local" sensitivity of the computer model, which taken all together provides information on the factors' global effects and importance. On the other hand, Sobol ' (1990, 1993) proposed Monte Carlo-based designs for directly estimating the "global" sensitivity of the computer model, albeit with a larger number of runs. More details of these and other sensitivity analysis methods can be found in the book by Da .
In this article, we propose a new class of optimal designs for factor screening, which is rooted in the approaches of Sobol' and Morris. Therefore, in Section 2 we first review their approaches in detail. Then, in Section 3, we propose our maximum onefactor-at-a-time (MOFAT) designs, which possess optimality properties for screening using the Sobol' index. Efficient construction methods are given in Section 4, and several numerical illustrations are presented in Section 5. We conclude our article with some remarks in Section 6. Technical details and proofs are included in the online supplementary materials.

Literature Review
In this review, we focus on the methods proposed by Morris (1991) and Sobol' (1990and Sobol' ( , 1993. Readers are referred to Da Veiga et al. (2021) for details on other methods for screening.

Morris Screening
Let there be k input factors x = (x 1 , . . . , x k ) and a scalar output y. Unless otherwise stated, we assume that x ∈ [0, 1] k . Morris (1991) defined elementary effects to estimate the local sensitivity of factors by changing one-factor-at-a-time (OFAT). Let x i ( ) = (x 1 , . . . , x i ± , . . . , x k ) , where the + or − sign for is chosen so that x i + ∈ [0, 1] or x i − ∈ [0, 1]. Then, the k elementary effects are given by Several x's are chosen randomly from an L-level regular grid: Morris (1991) design can be viewed as a collection of l OFAT designs. An example with L = l = 4, k = 2, and = 1/(L − 1) is shown in the left panel of Figure 1. This illustration uses "strict" OFAT designs according to the classification by Daniel (1973), where a run is based on the previous run. Let d i(j) be the elementary effect of x i at x = x j , j = 1, . . . , l. Then, we can compute the following summary measures to understand the global effects of factors: Based on these two measures we can classify the factor effects as (i) negligible (low |μ i | and low σ i ), (ii) linear and additive (high |μ i | and low σ i ), and (iii) nonlinear and/or interactions (high σ i ). Campolongo, Cariboni, and Saltelli (2007) proposed another measure by taking the absolute values of d i(j) 's: which they claim as sufficient to rank order factors according to their importance. Instead of the "strict" OFAT design, one can also use the "standard" OFAT design as shown in the right panel of Figure 1, where all the changes are made from a standard or base run (shown as black circles in the figure). Although Morris (1991) has discussed both types of OFAT designs in his article, the "strict" version seems to be more commonly used in practice (Da Veiga et al. 2021). However, in this article, we are more interested in the "standard" version, which will be clear when we look at the design for the Sobol' method in the next section.

Sobol' Design
Sobol ' (1990, 1993) proposed variance-based indices which are popular tools for global sensitivity analysis. Let y be the scalar output and x = (x 1 , . . . , x k ) be the set of k input factors. Assuming the functional relationship between y and x is square integrable, the variance of y can be decomposed into the variance due to the main effects, two-factor interactions, etc., as follows: and so on, and x ∼i represents the set of all factors except x i .
In the literature, two popular variance-based measures of sensitivities are the Sobol' first-order index and total Sobol' index, where the former focuses on factors' main effects and the latter considers all interactions of factors (Sobol' 2001;Saltelli et al. 2008). Since our aim is factor screening, we focus on the total Sobol' indices which measure the total contributions of input factors x i (i = 1, . . . , k) to the output variance, including the variance components of all orders caused by the interactions with any other input factors. The total Sobol' indices can be computed as Sobol ' (2001) proposed sampling-based methods to estimate t i , which are also known by the name pick-freeze methods. The approach starts by constructing a design D as follows. 1. Randomly generate two l × k matrices A and B. 2. For any i = 1, . . . , k, generate the matrix C (i) via replacing the ith column of matrix A (i.e., a i = (A 1i , . . . , A li ) T ) with the ith column of matrix B (i.e., b i = (B 1i , . . . , B li ) T ). 3. Aggregate matrices A, C (1) , . . . , C (k) by rows to get the n-run and k-factor design (n = l(k + 1)): We will refer to the design generated using the foregoing three-step procedure as Sobol' design. Let A j and C (i) j (j = 1, . . . , l) be the jth runs in the matrices A and C (i) , respectively and let y(x) be the response for an input x in the design D. There are a number of possible Monte Carlo estimators available for the total Sobol' indices {t i } k i=1 . A commonly used estimator is (Jansen 1999): where In (4), var(y) is the sample variance n j=1 (y j −ȳ) 2 /(n−1) of the response values {y 1 , . . . , y n }, whereȳ denotes the sample mean. If t i is larger than t j (i, j = 1, . . . , k), we can conclude that factor i is more important than factor j. Thus, using { t i } k i=1 , we can easily select the s most important factors (s k) according to practical needs or screen out those factors with t i ≈ 0. Since var(y) is a constant, one can also use V tot i to rank order the factors. Interestingly, Sobol' design D in (3) can be viewed as a collection of l OFAT designs provided that A j = C (i) j for j = 1, . . . , l and i = 1, . . . , k. This is because we can reorganize the rows of a Sobol' design as D 12-run design with A, C (1) , and C (2) represented by black circles, red triangles, and green pluses, respectively, where the OFAT structure is shown using arrows.
The foregoing discussion makes an interesting connection between Sobol' design and Morris (screening) design. Morris design can be written in the form of a Sobol' design when a standard OFAT is used. Conversely, a Sobol' design can be converted to a Morris design with standard OFAT if A j = C (i) j for j = 1, . . . , l and i = 1, . . . , k. However, there are some subtle differences between the two. Morris design is deterministic up to the l base runs for the standard OFAT, whereas Sobol' design is completely random. Because of the randomness, Sobol' design typically needs a large l. We can try to reduce the number of runs in a Sobol' design by using a deterministic sample (such as a Quasi-Monte Carlo sample), but then the rows of the design may not be distinct and it may not reduce to a collection of l OFAT designs. Another crucial difference between the two is that the changes in the factor levels within an OFAT is fixed in Morris design (that is, constant ), whereas they change drastically in Sobol' design due to its inherent randomness.

MOFAT Designs
Now that we have established close connections between two of the most popular methods for screening in the previous section, we can focus our attention on the design structure in (3). In this section, we will improve/optimize this design structure so that we can estimate the importance of factors more efficiently and perform screening more reliably.
As can be seen in (5) and (1), both V tot i and μ * i are sample averages to approximate some expectations. Sobol' (2001) uses Monte Carlo or Quasi-Monte Carlo sequences to compute the average. Thus, the ith columns of A and C (i) should be two independent uniform samples from [0, 1]. The best uniform sample of size l that minimizes the discrepancy from a uniform distribution is given by {0.5/l, 1.5/l, . . . , (l − 0.5)/l} (Fang and Wang 1993). Thus, we can take the ith columns of A and C (i) as two random permutations of these l values. This makes A, C (i) (i = 1, . . . , k) as Latin hypercube designs (LHD, McKay, Beckman, and Conover (1979)).
Since there are many choices for LHDs, we can try to choose the one that improves the identification of important factors. One possibility is to choose the LHDs so as to maximize the total Sobol' index or μ * i because large values of them will help identify the important factors. This, however, depends on the underlying response surface. Since the response surface is unknown to us before the experiment, we should design the experiment in such a way that it will work well irrespective of the response surface. Thus, to evaluate the effectiveness of a design, we may assume the response surface to be a realization from a stochastic process.
Assume that y(x) is a Brownian motion (Zhang and Apley 2014) with Here the ω i can be viewed as a measure of importance of the ith factor. Suppose the experiment is performed according to the design in (3). Then, where the last step follows from the fact that the jth rows of A and C (i) differ only for the factor i, which is a unique property of the OFAT design. Since t i is proportional to V tot i , we can maximize the total Sobol' index by maximizing l j=1 |A ji −C (i) ji |, which is the L 1 distance between the ith columns of A and C (i) . Define the maximum one-factor-at-a-time (MOFAT) design as the design D that maximizes l j=1 LHDs. This definition leads to the following optimality result for our proposed design.
Proposition 1. If the response surface is a realization of a Brownian motion field with mean and variance functions given in (6), then a MOFAT design maximizes the expected value of V tot i for i = 1, . . . , k.
Finding a MOFAT design requires optimization over 2lk variables (i.e., the elements of matrices A and B), which can be expensive for large l and k. Below we provide a result to simplify this optimization. Define the following transformation: where x is the largest integer not exceeding x and the indicator function I(x < (1 + l)/2) equals to 1 if x < (1 + l)/2 and 0 otherwise. This transformation has the following property.
Now construct the design D as follows. In this description we have used integers {1, 2, . . . , l} as elements of D, which can be transformed to [0, 1] using x i ← (x i − .5)/l.
1. Find an appropriate l × k LHD A. Apply the transformation function T l defined in (8) to all elements in A to form the matrix B, that is, B pq = T l (A pq ) for p = 1, . . . , l and q = 1, . . . , k. 2. For i = 1, . . . , k, generate the matrix C (i) via replacing the ith column of matrix A with the ith column of matrix B. (3) to get the n-run and k-factor design, where n = l(k + 1).

Aggregate matrices
Proposition 2. For any l, designs generated by the foregoing three-step construction are MOFAT designs, where the minimum L 1 -distance between any pair of A ji and B ji (j = 1, . . . , l and i = 1, . . . , k) is l/2 and the L 1 -distance between the ith (i = 1, . . . , k) column of matrices A and C (i) is l 2 /2 .
Thus, with this result, B can be quickly obtained from A using the transformation in (8). Therefore, to find D, we only need to optimize lk variables (that is, the elements of matrix A), which is a great simplification to the original optimization problem. This approach to design construction matches with the specific choice of recommended in Morris (1991) when l is even. However, when l is odd, the 's resulting from the transformation in (8) are not constant, which makes MOFAT design different from Morris design even if it uses a standard OFAT.
Proposition 3. MOFAT designs with at least three columns do not have duplicated rows.
By Proposition 3, we directly have A j = C (i) j (j = 1, . . . , l and i = 1, . . . , k) in MOFAT designs with three or more columns (k 3). By the proof for Proposition 3, it is straightforward to see that MOFAT designs with k = 2 also have such a property, though they may have duplicated runs. Therefore, the MOFAT design, as a special class of Sobol' designs, can always be written as a collection of l OFAT designs, as discussed in Section 2.2. Moreover, Proposition 3 also justifies the use of MOFAT designs in computer experiments because replicates are undesirable due to the deterministic nature of the experiments.
The only remaining thing to decide in the aforementioned three-step construction of MOFAT designs is the choice of matrix A. An optimal A can be chosen so that the MOFAT design in Step 3 will have good space-filling properties. This is to ensure that no part in the experimental region is left out. In this work, we consider the maximin L 1 -distance criteria for measuring a design's space-fillingness (Johnson, Moore, and Ylvisaker 1990;Morris and Mitchell 1995), which seeks to spread out the design points evenly over the entire experimental region via maximizing the smallest L 1 -distance between any pair of points (L 1 -distance is chosen to be consistent with the distance metric in (7)). We adapt a threshold accepting (TA) global optimization algorithm (Dueck and Scheuer 1990;Xiao and Xu 2018) for searching the optimal A. The technical details, including the pseudo codes, for the proposed TA algorithm are given in the supplementary materials.
An example of MOFAT design with k = 2 and l = 4 is shown in the left panel of Figure 3. We can see that MOFAT design tends to explore the experimental region in a much better way than the Morris designs in Figure 1 and Sobol' design in Figure 2. Additionally, the MOFAT is balanced (each of the four levels appear three times for both the factors), while the Morris designs (Figure 1) and Sobol' design ( Figure 2) are not. Furthermore, MOFAT design has no duplicated runs, whereas one (Figure 1 right) or two (Figure 1 left) of the runs in Morris designs are replicated. Although Sobol' design has no duplicated runs in Figure 2 because of the random sampling, some of the runs are very close to each other, which is not desirable in a deterministic computer experiment. The proposed MOFAT design has only l levels, {.5/l, 1.5/l, . . . , (l − .5)/l}, for each factor. Besides factor screening, if response surface modeling is also of interest, we may try to improve the projections of MOFAT design by adjusting the values in A and B as follows: o t h e r w i s e , and (9) for i = 1, . . . , l. If (9) is used for A, then (10) should be used for B, and vice versa. This choice can be made using maximin L 1 -distance criterion. The resulting MOFAT design is shown in the right panel of Figure 3. We can see that now both the factors have eight levels as opposed to only four levels in the original design. Since factor screening is the focus of this article, we will use the original version of the MOFAT design for the rest of the article.

Efficient Construction of MOFAT Designs
In this section we develop some theoretical results to facilitate the optimal search of A used in Step 1 of the three-step construction of a MOFAT design. Of particular interest is the special case with l = 3 (i.e., the smallest MOFAT design) for which an algebraic construction method can be developed.

Large Number of Factors
When k is large (say k > 10), the optimization of design D can be time consuming. Here, we propose a method to further reduce the computation. Let P l be a matrix whose columns are all possible permutations of numbers (1, . . . , l): Let matrix A be the l-run and k -factor maximin L 1 -distance LHD, where k > l! and k = (k mod l!). The optimal choice of matrix A can now be obtained as described in Proposition 4. First we state a lemma, which is needed for the proof of the proposition.

Proposition 4. The matrix A formed by combining k/(l!)
replicates of the matrix P l and matrix A by column is the lrun and k-factor maximin L 1 -distance LHD, and the MOFAT design D developed by such a matrix A is space-filling under the maximin L 1 -distance criterion when k >> l.
By Proposition 4, when k is large, instead of searching for the optimal matrix A with k factors that will lead to the spacefilling MOFAT D, we only need to search for the much smaller maximin LHD A with (k mod l!) factors, which greatly saves the computation. Practitioners can adopt either the proposed TA in supplementary materials or some of the existing R packages such as SLHD (Ba 2015) and LHD (Wang, Xiao, and Mandal 2020), to find A .
Although we require k >> l to avoid the worst-case scenario in Proposition 4, both empirical results and theoretical proofs show that k 2l often suffices. Since the main purpose of using MOFAT designs is factor screening with economic run sizes, we would only use small or moderate l. Thus, we almost always have k 2l in practice. MOFATs in Proposition 4 also have improved space-filling properties in all slices (i.e., A and C (i) for  i = 1, . . . , k), which is in a similar spirit to space-filling Sliced LHDs (Ba, Myers, and Brenneman 2015).

A Special Case
When l = 3, it is possible to algebraically construct the MOFAT design with n = 3(k + 1) runs. Since the purpose of screening is to quickly identify the important factors with as few runs as possible, this special case is of great interest in practice.
This algebraic construction still follows the three-step procedure in Section 3 except that we have a deterministic construction of matrix A rather than doing an algorithmic search. Specifically, in its Step 1, the l × k matrix A is generated via combining k/(l!) replicates of P 3 and the first (k mod l!) columns of P 3 by columns, where the matrix P 3 is defined as 1 1 2 2 3 3 2 3 1 3 1 2 3 2 3 1 2 1 ⎞ ⎠ .
Proposition 5. When using a matrix A constructed using P 3 in the three-step construction of MOFAT designs, we have: 1. A is the maximin L 1 -distance design whose minimum L 1distance is 4k/3 ; 2. if k > 3, the resulting MOFAT design D has the minimum pairwise L 1 -distance of 1 which appears 2k times, and it achieves the bound in Lemma 2.
By Proposition 5, the selected design A is space-filling, and the resulting design D also has good space-filling property among other alternatives. Thus, such a matrix A is a robust choice for Step 1.

Numerical Studies
In this section, we conduct simulations to understand the performance of the proposed MOFAT design for factor screening. MOFAT designs can be generated using the R package MOFAT (Xiao and Joseph 2022).
We compare the proposed MOFAT design (l = 3) to Sobol' design and Morris design (with strict OFATs) introduced in Section 2. All designs have n = 33 runs and k = 10 factors. We use random samples generated from the uniform distribution to construct the matrices A and B in Sobol' designs (i.e., the Monte Carlo approach). Additional results using randomized Sobol' sequences to construct Sobol' designs (i.e., the quasi-Monte Carlo approach) are reported in supplementary materials. Morris designs are generated using the R function "morris" in the R package sensitivity at default settings . We have randomly assigned the factors to the columns of designs in different replications. This analysis is repeated 100 times.
In the top panels of Figure 4(a) and (b), we show the boxplots of V tot values defined in (5) from MOFAT and Sobol' designs, where larger values represent more significant factors. In Figure 4(a), it is seen that the proposed MOFAT can accurately identify the significant factors x 1 to x 5 in all the replications. The V tot values for x 1 and x 2 are almost always equal, which indicates that these two factors have the same importance. The V tot values for x 4 are always larger than those for x 5 , which shows that x 4 is Figure 4. Boxplots of V tot calculated from MOFAT and Sobol' designs (top panels), and boxplots of μ * calculated from MOFAT and Morris designs (bottom panels) for the Friedman function in Example 1. more important than x 5 . Comparing Figure 4(a) and (b), we can see that the V tot values from MOFAT designs are on the average larger than those from Sobol' designs, and they also have less variations. Additionally, in Figure 4(b), factors x 1 to x 5 may be mis-identified as insignificant (nearly zero V tot values) in some of the replications. Thus, MOFAT designs clearly outperform Sobol' designs in this example.
In the bottom panels of Figure 4(c) and (d), we show the boxplots of the μ * values defined in (1) from MOFAT and Morris designs, where larger values represent more significant factors. It can be seen that the proposed MOFAT can accurately identify the significant factors x 1 to x 5 along with their relative importance in all replications. Results from Morris designs involve larger variations and furthermore, factors x 1 and x 2 may be mis-identified in some of the replications. Thus, MOFAT designs outperform Morris designs under the μ * metric in this example.
We compare the proposed MOFAT design to Sobol' and Morris designs, all of which have n = 63 runs and k = 20 factors. The results are shown in Figure 5 (a few outliers are removed for better visualization; see supplementary materials for the original plots). Note that Morris (1991) used a larger design with 84 runs, where it is concluded that factors x 1 through x 10 are active with factors x 8 to x 10 being the most important.
In the top panels of Figure 5(a) and (b), we show the boxplots of V tot values from MOFAT and Sobol' designs, respectively. It can be seen that the MOFAT design can identify the potentially important factors x 1 to x 10 whose V tot values are larger than 0. Specifically, in Figure 5(a), factors x 8 , x 9 and x 10 have large V tot values and small variations. Clearly, they are among the most significant factors, which is consistent with the findings of Morris (1991). Comparing Figure 5(a) and (b), we can see that the V tot values from MOFAT designs are generally larger than those from Sobol' designs with less variations. Clearly, MOFAT designs outperform Sobol' designs in this example.
In the bottom panels of Figure 5(c) and (d), we show the boxplots of the μ * values from MOFAT and Morris designs, respectively. Although Morris screening design has produced much larger values for μ * , they have much larger variability than the μ * values from the MOFAT design. Overall, both designs seem to be doing a good job in identifying the important factors in this example.
As mentioned in the Introduction, another approach to identify important factors is to fit a GP model to the data obtained using a space-filling design and then compute the sensitivity indices using the fitted GP model (Oakley and O'Hagan 2004;Chen, Jin, and Sudjianto 2005). This approach is attempted for this example. First, we fit a GP model to the data using the R package DiceKriging (Roustant, Ginsbourger, and Deville 2012) generated from a 63-run Maximin LHD obtained using the R package SLHD (Ba 2015). Next, we simulate data from the fitted GP model and calculate V tot using a very large Sobol' design with l = 1, 000. Figure 6 shows the boxplots of V tot from 100 replications. Comparing Figures 5(a) and 6, we can see that the V tot values from the GP-based approach are generally smaller than those of the MOFAT design, risking the correct identification of important factors. Furthermore, the GP-based approach requires more than four seconds of computations (per trial) as compared to less than a second for the MOFAT designbased approach. This computational saving can become quite substantial when more number of factors are involved. Thus, Figure 6. Boxplots of V tot calculated from the GP-based approach for the Morris function in Example 2.  ∈ [63.1, 116] transmissivity of lower aquifer (m 2 /yr) x 6 = H l ∈ [700,820] potentiometric head of lower aquifer (m) x 7 = L ∈ [1120,1680] length of borehole (m) 9855,12045] hydraulic conductivity of borehole (m/yr) overall, the MOFAT design-based approach is superior to the GP-based approach for factor screening both in terms of quality and speed.
Example 3. The Borehole function models the rate of flow of water (m 3 /yr) through a borehole that is drilled from the ground surface through two aquifers (Worley 1987): Table 1 gives the details of the eight factors in (13). In the analysis, we have standardized the ranges of all factors to [0, 1]. Moon, Dean, and Santner (2012) considered a modification of the borehole function by adding 12 inert factors (x 9 , . . . , x 20 ). We use this setup to test the performance of MOFAT designs.
As in the previous example, we consider the designs (MOFAT, Sobol' , and Morris) with n = 63 runs and k = 20 factors. In the top panels of Figure 7 V tot values from MOFAT and Sobol' designs, respectively. It can be seen that the MOFAT design identifies x 1 as the most important factor, followed by x 4 , x 6 , and x 7 , which is consistent with the findings of Moon, Dean, and Santner (2012) where they identified the same set of factors using a 70-run twostage method. In addition, MOFAT designs also identify x 8 as an important factor. Comparing Figure 7  Example 4. Consider a variant of the Easom function (Pires et al. 2010) with k = 10 input factors x 1 , . . . , x 10 ∈ (−10, 10), where the first two factors are active and the last eight are inert. It has the following form: where c 1 and c 2 are randomly sampled from the uniform distribution U(0, 10). In (14), nonzero outputs are clustered around two centers (c 1 , c 2 ) and (−c 1 , −c 2 ). As an illustration, Figure 8. Visualization of the modified Easom function in Example 4 for factors x 1 and x 2 when c 1 = c 2 = π (nonzero outputs are clustered around (π , π) and (−π , −π)).
the response surface of f (x) in the space of the active factors x 1 and x 2 with c 1 = c 2 = π is shown in Figure 8.
First, we consider MOFAT designs of different sizes under the V tot metric for factor screening. We replicate all analysis 100 times, where c 1 and c 2 are drawn randomly in each replication. In Figure 9(a), we show the boxplot of √ V tot values calculated from the 33-run MOFAT designs (l = 3) from the algebraic construction. Clearly, MOFAT designs with l = 3 cannot solve the problem. This is because nearly all nonzero outcomes are clustered in two regions centered at (c 1 , c 2 ) and (−c 1 , −c 2 ), and the MOFAT design with three levels cannot easily capture these two small regions. Thus, all outputs y(A j ) ≈ y(C (i) j ) ≈ 0, and all V tot i ≈ 0, where i = 1, 2 and j = 1, 2, 3. In such a case, we may need to use larger values of l to explore more regions of the response surface. Figure 9(b) and (c) show the results from 66-run MOFAT designs (l = 6) and 132-run MOFAT designs (l = 12), respectively. It can be seen that the 132-run MOFAT designs (l = 12) can reliably identify the significant factors x 1 and x 2 . In Figure 9(d), we also show the results from 132-run Sobol' designs, where a larger fraction of the replications will declare x 1 and x 2 to be inert compared to the MOFAT designs.
Next, we look at the performance of 132-run MOFAT designs and Morris designs under the μ * metric. We can see from Figure 10(a) and (b) that MOFAT designs can clearly identify  the active factors x 1 and x 2 , while Morris designs cannot do this in most replications.

Conclusions
Although OFAT designs are not favored in traditional experimentation, Morris (1991) showed that when several OFAT designs are placed in different parts of the experimental region, they can work as an efficient tool for factor screening in complex computer models. We noticed that when a specific form of the OFAT design is used in Morris screening, the resulting design structure resembles Sobol' design (Sobol' 1990(Sobol' , 1993 used for computing global sensitivity indices. We exploited this connection in this article and developed an improved design called MOFAT design. Different from Sobol' and Morris designs, MOFAT design is obtained by optimizing a spacefilling design criterion and therefore, the MOFAT design is more deterministic. Furthermore, because the runs in MOFAT design are optimized, it requires as small as n = 3(k + 1) runs to screen k factors, which can be extremely beneficial when the computer models under investigation are expensive to evaluate. Several numerical studies are presented to illustrate its superior performance over the Sobol' and Morris designs. We believe that MOFAT designs with n = 3(k + 1) can perform well in practice. Yet, when outcomes are highly concentrated in small regions, designs with larger size may be required to identify the important factors. Since we have little or no information on the shape of the response surface before the experiment, it would be useful to develop a sequential framework for MOFAT design that starts with n = 3(k + 1) runs and add more runs to it when necessary. We leave this as a topic for future research.
In our simulations, the μ * measure often performed better than the total Sobol' index for factor screening. At this time, we do not advocate one measure over the other, but this observation is interesting and warrants further study.

Supplementary Materials
The supplementary materials contain technical proofs of the theoretical results, details of the algorithm for constructing MOFAT designs, and codes for reproducing Figure 4. The R package MOFAT (Xiao and Joseph 2022) available through CRAN can be used for generating MOFAT designs.