A practical method for evaluating worker allocations in large-scale dual resource constrained job shops

In two recent articles, Lobo et al. present algorithms for allocating workers to machine groups in a Dual Resource Constrained (DRC) job shop so as to minimize L max , the maximum job lateness. Procedure LBSA delivers an effective lower bound on L max , while the heuristic HSP delivers an allocation whose associated schedule has a (usually) near-optimal L max value. To evaluate an HSP-based allocation's quality in a given DRC job shop, the authors first compute the gap between HSP's associated L max value and LBSA's lower bound. Next they refer this gap to the distribution of a "quasi-optimality" gap that is generated as follows: (i) independent simulation replications of the given job shop are obtained by randomly sampling each job's characteristics; and (ii) for each replication, the associated quasi-optimality gap is computed by enumerating all feasible allocations. Because step (ii) is computationally intractable in large-scale problems, this follow-up article formulates a revised step (ii) wherein each simulation replication invokes HSP2, an improved version of HSP, to yield an approximation to the quasi-optimality gap. Based on comprehensive experimentation, it is concluded that the HSP2-based distribution does not differ significantly from its enumeration-based counterpart, and that the revised evaluation method is computationally tractable in practice. Two examples illustrate the use of the revised method.


Introduction
In conventional job shop scheduling, it is assumed that the system's ability to complete jobs on time is constrained only by the number of machines that are available to process jobs. This perspective on the job shop scheduling problem ignores the additional constraint imposed by the number of workers that are available to operate the machines. Dual Resource Constrained (DRC) systems are subject to limits on the availability of both machines and manpower (Treleven and Elvers, 1985). Lobo et al. (2013a, 2013b) address the problem of minimizing L max , the maximum job lateness, in a DRC job shop. Because the job shop scheduling problem with even a single constrained resource is NP-hard (Lenstra and Rinnooy Kan, 1979), Lobo et al. (2013a, 2013b) develop search procedures (mainly heuristics) to find the following: (i) the most promising (hoped to be optimal) allocation of available workers to the machine groups (departments) in the job shop; and (ii) the best achievable schedule for the job shop based on allocation (i). The objective of this article is to develop a practical method for applying the approach of Lobo et al. (2013b) to large-scale DRC job shop problems.
Given an allocation ϑ of workers to machine groups, Lobo et al. (2013a) derive a corresponding lower bound LB ϑ on the value of L max for all schedules based on that allocation. They also develop LBSA, a search algorithm that is guaranteed to deliver an allocation ϑ * whose corresponding lower bound LB ϑ * exactly equals the smallest value of LB ϑ taken over all feasible allocations. The authors use LB ϑ * as a benchmark for evaluating heuristic solutions to the DRC job shop scheduling problem. Lobo et al. (2013b) establish criteria for verifying that an allocation is optimal, i.e., that the allocation corresponds to a schedule that minimizes L max . For situations in which the LBSA-delivered allocation ϑ * does not satisfy the optimality criteria, Lobo et al. (2013b) develop HSP, a heuristic search procedure that delivers an allocation ϑ HSP , which generally results in an improved schedule. Specifically, ϑ HSP usually enables the Virtual Factory (an iterative heuristic scheduler developed by Hodgson et al. (1998)) to generate a schedule whose L max value, denoted by VF ϑ HSP , is less than VF ϑ * , the corresponding L max value for the Virtual Factory-generated schedule based on allocation ϑ * . Ideally, ϑ HSP enables the Virtual Factory to generate a schedule for which VF ϑ HSP is close not only to the minimum feasible value of L max but also to the lower bound LB ϑ * . The Virtual Factory is used as the heuristic scheduler because of its proven track record in successfully generating optimal or near-optimal schedules in job shop scheduling problems for which the primary objective is to minimize L max (Weintraub et al., 1999; Hodgson et al., 2000; Thoney et al., 2002; Zozom et al., 2003; Hodgson et al., 2004; Schultz et al., 2004). Nevertheless, Lobo et al. (2013a, 2013b) note that their approach can work with other heuristic schedulers.
It should be emphasized that throughout this article, we restrict attention to situations in which the LBSA-delivered allocation ϑ * does not satisfy the optimality criteria of Lobo et al. (2013b). In an application for which ϑ * satisfies either of the optimality criteria of Lobo et al. (2013b), we know with certainty that ϑ * is an optimal worker allocation so that the probabilistic method developed in this article is not needed; but, as explained below, this method will confirm the optimality of ϑ * if it is used in such an application.
A "VF-best" allocation ϑ VFB enables the Virtual Factory to generate a schedule whose corresponding "quasi-optimal" L max value, denoted by VF ϑ VFB , is the smallest value of L max that can be achieved by the Virtual Factory for a given DRC job shop scheduling problem, taken over all feasible worker allocations for that problem. The value VF ϑ VFB is termed a "quasi-optimal" L max value because although all feasible allocations are explored, the Virtual Factory is not guaranteed to generate a schedule yielding the exact minimum value of L max for each of those worker allocations. By exploiting the generally superior quality of schedules generated by the Virtual Factory, Lobo et al. (2013b) seek to answer the following question: For a given worker allocation ϑ in a real instance of the DRC job shop scheduling problem, what is our degree of confidence that ϑ is a VF-best allocation? In other words, for a simulation replication of the given job shop in which the key characteristics of each job are randomly sampled, what is the probability that the resulting (random) quasi-optimality gap VF ϑ VFB − LB ϑ * for the simulation replication is at least as large as the (fixed) value VF ϑ − LB ϑ * for the real DRC job shop scheduling problem that must actually be solved? Letting F VFB (·) denote the c.d.f. of the quasi-optimality gap, this question (1) can be answered by evaluating 1 − F VFB ( VF ϑ − LB ϑ * ) as an easily interpreted measure of our degree of confidence that the allocation ϑ is a VF-best allocation for the specific problem at hand.
To estimate the c.d.f. F VFB (·) for a given DRC job shop scheduling problem by means of a simulation experiment, Lobo et al. (2013b) perform the following steps: (i) a total of Q = 500 independent simulation replications of the given DRC job shop scheduling problem are performed; and (ii) for each simulation replication, the VF-best allocation ϑ VFB and its associated quasi-optimality gap VF ϑ VFB − LB ϑ * are obtained by enumeration of the allocation search space-that is, by invoking the Virtual Factory to generate a schedule for every feasible allocation.
However, because the size of the allocation search space grows exponentially with an increase in either the number of machine groups or the number of machines, the computational complexity of enumeration is prohibitive in large-scale DRC job shop scheduling problems. Experimental results supporting this conclusion can be found in Lobo et al. (2013b).
In this article we formulate and evaluate a revised method for rapidly assessing the quality of a given worker allocation in a specific instance of the DRC job shop scheduling problem. The revised method is based on HSP2, a version of HSP that is designed for large-scale applications. Specifically, in step (ii) of the revised method, HSP2 is used instead of enumeration to yield the allocation ϑ HSP2 and its associated gap VF ϑ HSP2 − LB ϑ * , where the latter quantity is taken as an approximation to the quasi-optimality gap VF ϑ VFB − LB ϑ * . The revised method has proved to be computationally tractable in problems of realistic size and complexity. Through comprehensive experimentation using progressively larger DRC job shop scheduling problems, we compare the empirical reference distributions delivered by enumeration with the corresponding distributions delivered by HSP2, where the comparisons are based on visual inspection as well as the Mann-Whitney test. Except for situations in which the staffing level is so high as to render question (1) moot, we conclude that the difference between the HSP2-based distribution and its enumeration-based counterpart is not significant from either a practical or a statistical perspective. Two numerical examples illustrate the use of the new method for probabilistically evaluating the quality of a given worker allocation in a given instance of the DRC job shop scheduling problem.
The rest of this article is organized as follows. Section 2 provides background on the solution approach of Lobo et al. (2013a, 2013b) for a given instance of the DRC job shop scheduling problem. In Section 3 we present our revised method for evaluating the quality of a given worker allocation. In Section 4 we describe the techniques used to validate the adequacy and reliability of the revised method by visual and statistical comparisons of the distribution of VF ϑ HSP2 − LB ϑ * with that of VF ϑ VFB − LB ϑ * ; and in Section 5 we summarize the experimental results obtained by applying these validation techniques. Section 6 contains two numerical examples illustrating the intended use of the revised method in practical applications. The main findings of this work are summarized in Section 7. The Online Supplement to this article contains the full set of experimental results.

Background and motivation
Beginning with Nelson (1967), there has been extensive research activity expended on DRC systems, especially DRC job shops. Treleven (1989), Gargeya and Deane (1996), and Hottenstein and Bowman (1998) survey research on DRC job shop scheduling through the 1990s. Hottenstein and Bowman (1998) find that there are two main questions regarding the allocation of workers:
• When should workers transfer from one machine group to another?
• To which machine group (where) should those workers move?
The authors find that most of the cited research addresses these questions through investigation of worker flexibility, centralization of control, worker allocation rules, queue discipline, and the cost of transferring workers. Felan et al. (1993) examine the effect that labor flexibility and staffing levels have on key measures of job shop performance. The authors consider a homogeneous group of workers whose flexibility comes from each worker's ability to work in a varying number of departments within the job shop; moreover, the authors assume that for a given staffing level, each department has the same number of workers assigned to it. Although Felan et al. (1993) allow workers to transfer between departments, the authors are concerned with finding the optimal combination of staffing level and workforce flexibility level, rather than optimizing the allocation of workers to departments given a staffing level. Current work on DRC job shop scheduling has focused on incorporating more realistic assumptions about worker behavior, e.g., learning, fatigue, and forgetfulness (Malhotra et al., 1993; Kher et al., 1999; Kher, 2000; Jaber and Neumann, 2010).
To allocate fully cross-trained workers in a DRC job shop, Lobo et al. (2013a, 2013b) formulate methods that are motivated by the authors' examination of, and involvement with, the operations in the U.S. Navy's Aviation Depots, a large-scale apparel producer, and a major high-end furniture manufacturer. In each case, cross-trained workers are allocated on an ad hoc basis; and in each of these disparate organizations, there is a clearly recognized need for a more systematic and effective approach to the allocation of workers. The approach taken in Lobo et al. (2013a, 2013b) and in this article makes extensive use of the Virtual Factory, which was developed by Hodgson et al. (1998) as part of their solution to the J M ||L max single resource constrained job shop scheduling problem. Incorporating a transient, deterministic simulation of a job shop, the Virtual Factory is based on an approach proposed by Vepsalainen and Morton (1988). The Virtual Factory is an iterative procedure that sequences jobs using a revised slack calculation on the first iteration. On each subsequent iteration, the Virtual Factory sequences the jobs using an iteratively revised slack calculation that includes revised queuing time estimates from the previous iteration. Iterations continue until a previously computed lower bound on L max is achieved or until the number of iterations reaches a prespecified limit; then the Virtual Factory delivers the best schedule encountered over all iterations.

Lower bound on L max and allocation optimality criteria of Lobo et al. (2013a, 2013b)
The DRC job shop scheduling problem considered by Lobo et al. (2013a, 2013b) encompasses N jobs, each of which must be processed on some of the M machines in the job shop, where there are W workers available to operate the machines; and W < M, so that the job shop is not fully staffed. Each job has its own route specifying the order in which the job visits its required machines, as well as the processing times required by the job on those machines. Each job also has its own due-date, the time by which all processing must be finished for the job to be considered on time; thus the job's lateness L is the job's completion time minus its due-date. The job shop is divided into S machine groups, and the main problem is to make an allocation ϑ of the W workers to the S machine groups so as to minimize L max , the maximum lateness taken over all N jobs. Following Pinedo (2012, Section 2.1), we use the notation J M |W|L max to refer to this DRC job shop scheduling problem.
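As a concrete illustration of the objective, lateness and L max follow directly from the definitions above; the job data in this sketch are hypothetical:

```python
# Hypothetical completion times and due-dates for N = 4 jobs.
completion = [120, 95, 210, 180]
due_date = [100, 110, 200, 180]

# Lateness of each job: completion time minus due-date (negative means early).
lateness = [c - d for c, d in zip(completion, due_date)]

# L_max is the maximum lateness taken over all jobs.
L_max = max(lateness)  # here L_max == 20, driven by the first job
```

A worker allocation affects the completion times (and hence L max) only through the schedule it induces; the due-dates are fixed attributes of the jobs.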
As mentioned in Section 1, procedure LBSA of Lobo et al. (2013a) delivers a worker allocation ϑ * and the associated value LB ϑ * that is a lower bound on the value of L max over all feasible allocations and over all schedules based on those allocations. To verify that a given allocation is optimal (i.e., it corresponds to a schedule that minimizes L max ), Lobo et al. (2013b) establish two theorems providing optimality criteria that are readily checked in practice. Theorem 1 is a general criterion for verifying the optimality of any feasible allocation (including allocation ϑ * ), while Theorem 2 is a criterion for verifying the optimality of allocation ϑ * under certain termination conditions for LBSA.
To provide a meaningful indication of the likelihood that the LBSA-delivered allocation ϑ * is optimal in practical applications, Lobo et al. (2013a) perform an extensive simulation experiment. The experimental job shop consists of M = 80 machines organized into S = 10 machine groups of equal size, and there are N = 1200 jobs to be completed. Each job has an average of eight operations, where each operation processing time is randomly sampled from the discrete uniform distribution on the integers {1, . . . , 40}. The due-date range of the jobs varies from 200 to 3000 in increments of 400. The staffing level is either 60, 70, 80, or 90%, where a 60% staffing level means that 48 workers are available for allocation. This experimental design yields 64 designated DRC job shop scheduling problems, i.e., 64 different combinations of job shop type, staffing level, and due-date range. For each designated DRC job shop scheduling problem, the authors generate Q = 200 simulation replications of the problem using the method given in Appendix A of Lobo et al. (2013a). All simulation replications of a designated DRC job shop scheduling problem share the following "structural" characteristics:
• the number of jobs;
• the number of machines and the number of machine groups;
• the pattern of symmetric or asymmetric loading of the machine groups;
• the due-date range; and
• the level of staffing, i.e., the ratio of the number of workers to the number of machines expressed as a percentage between 0% and 100%.
However, for each simulation replication, the following characteristics of each job are randomly sampled from the appropriate probability distributions:
• the job's due-date;
• the number of operations required to complete the job;
• the route of the job through the job shop (i.e., the ordered sequence of machines that the job must visit); and
• the processing time of each operation.
In addition, for each combination of job shop type and due-date range, the Q = 200 simulation replications are generated so that for i = 1, . . . , Q, the i th randomly generated simulation replication is the same one used with each staffing level. This use of the Monte Carlo method of common random numbers enables sharper comparisons of system performance for different levels of staffing (Chang et al., 1985). Lobo et al. (2013b) find that depending on the structural characteristics of a given DRC job shop scheduling problem, on a simulation replication of that problem there can be a substantial probability that ϑ * does not satisfy their optimality criteria. This finding led to the development of heuristics to search for an allocation that enables the Virtual Factory to generate a schedule whose L max value is less than VF ϑ * .
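A minimal sketch of the common-random-numbers device, assuming one seed per replication: seeding replication i's generator with i guarantees that every staffing level is evaluated on identical sampled job characteristics. The sampling below is a simplified placeholder, not the method of Appendix A of Lobo et al. (2013a):

```python
import random

def sample_jobs(seed, n_jobs=5):
    """Sample the random job characteristics for one replication.
    (Placeholder: only operation processing times are sampled here; the
    real method also samples routes, operation counts, and due-dates.)"""
    rng = random.Random(seed)
    return [rng.randint(1, 40) for _ in range(n_jobs)]

# Replication i always uses seed i, so each staffing level sees the same
# jobs, and differences in L_max are attributable to staffing alone.
for staffing_pct in (60, 70, 80, 90):
    jobs_rep_1 = sample_jobs(seed=1)
    # ... schedule jobs_rep_1 at this staffing level ...
```

Reusing the seed across staffing levels removes sampling noise from the between-level comparison, which is exactly the variance-reduction effect cited above.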

HSP: The heuristic of Lobo et al. (2013b) for solving the DRC job shop scheduling problem
The heuristic HSP of Lobo et al. (2013b) is designed to find a VF-best allocation for the DRC job shop scheduling problem when the LBSA-delivered allocation ϑ * does not satisfy the relevant optimality criteria (i.e., Theorems 1 and 2 of Lobo et al. (2013b)). Three distinct search heuristics are incorporated into HSP, namely the Local Neighborhood Search Strategy (LNSS), Queuing Time Search Strategy 1 (QSS1), and Queuing Time Search Strategy 2 (QSS2). LNSS makes use of the structure of the allocation search space to select promising allocations for evaluation. In selecting the next allocation to be evaluated, both QSS1 and QSS2 exploit the Virtual Factory's estimates of job queuing time at each machine group. LNSS is initialized using allocation ϑ * ; then QSS1 is initialized using the best allocation encountered by LNSS, where the best allocation is the one that enables the Virtual Factory to generate a feasible schedule with the smallest L max value encountered thus far. Next QSS2 is initialized using the best allocation encountered by either LNSS or QSS1. The allocation ϑ HSP finally returned by HSP is the best allocation encountered by LNSS, QSS1, or QSS2. The experimental results of Lobo et al. (2013b) indicate that across the 64 designated DRC job shop scheduling problems, the performance of HSP is typically very good. For each designated problem, in a substantial percentage of the simulation replications in which allocation ϑ * does not satisfy the relevant optimality criteria, the HSP-delivered allocation ϑ HSP is a VF-best allocation (in this case allocation ϑ VFB is obtained via enumeration). However, the authors note that the difference VF ϑ HSP − LB ϑ * is usually non-zero; therefore, if allocation ϑ VFB and consequently the value VF ϑ VFB are both unknown, then we cannot be certain that ϑ HSP is a VF-best allocation.
To gauge the likelihood that, for a specific instance of the DRC job shop scheduling problem, allocation ϑ HSP is in fact a VF-best allocation, Lobo et al. (2013b) develop a probabilistic method for evaluating allocation quality.

The probabilistic method of Lobo et al. (2013b) for evaluating allocation quality
For each DRC job shop scheduling problem used in their experimentation, Lobo et al. (2013b) generate Q = 500 simulation replications of that problem as detailed in Section 2.2. Attention is restricted to only the Q 1 simulation replications in which allocation ϑ * does not satisfy the relevant optimality criteria (Theorems 1 and 2 of Lobo et al. (2013b)), where 0 ≤ Q 1 ≤ Q. (In situations where Q 1 = 0 with high probability, there is of course no need to search for a VF-best allocation; therefore, in the following discussion, we assume that Q 1 > 0.) For the i th simulation replication (i = 1, . . . , Q 1 ), the partial enumeration strategy of Lobo et al. (2013b) is used to obtain ϑ VFB i and VF ϑ VFB i . Given an arbitrary allocation ϑ, the quantity PLB(ϑ) = VF ϑ − LB ϑ * measures the performance of ϑ relative to the lower bound on L max . Computing PLB(ϑ VFB ) for the Q 1 replications of interest yields the random sample VFB = { PLB(ϑ VFB i ) : i = 1, . . . , Q 1 }, which is used to estimate the distribution of the random variable PLB(ϑ VFB ) taken over all possible realizations (i.e., simulation replications) of the given DRC job shop scheduling problem. Given a real instance of the DRC job shop scheduling problem that must actually be solved and a specific worker allocation ϑ that yields the fixed quantity PLB(ϑ) for the real problem, Lobo et al. (2013b) seek to test the null hypothesis that ϑ is a VF-best allocation for the real problem. They test this hypothesis by estimating the probability that for a simulation replication of the given DRC job shop scheduling problem, the random variable PLB(ϑ VFB ) = VF ϑ VFB − LB ϑ * will be at least as large as the fixed value PLB(ϑ) for the real problem. This upper-tail probability can be viewed as the p-value or level of significance for a test of the null hypothesis (Bickel and Doksum, 2007, pp. 221-223). The upper-tail probability can also be interpreted as the user's level of confidence that ϑ is a VF-best allocation for the real problem.
For example, if approximately 90% of the observations in the random sample VFB are at least as large as PLB(ϑ) and Q 1 is reasonably large, then with roughly 90% confidence we may conclude that ϑ is a VF-best allocation for the real DRC job shop scheduling problem to be solved. Lobo et al. (2013b) and Lobo et al. (2014) fit a continuous distribution to the random sample VFB in situations for which VFB contains a sufficiently large subsample + VFB of non-zero observations, so as to enable a meaningful assessment of the adequacy of the fitted distribution. In particular, letting Q 1 + denote the number of non-zero observations in VFB , if Q 1 + > max{ 0.9Q 1 , 50 }, then the entire random sample VFB is fitted using a continuous distribution. (The best-fitting continuous distributions identified by Lobo et al. (2013b) and Lobo et al. (2014) in various cases include the generalized beta distribution, the shifted gamma distribution, the bounded Johnson distribution, the unbounded Johnson distribution, the shifted lognormal distribution, and the shifted Weibull distribution.) On the other hand, if 50 < Q 1 + ≤ 0.9Q 1 , then the authors use VFB to fit a mixed distribution of the form

F VFB (x) = [(Q 1 − Q 1 + )/Q 1 ] F 0 (x) + (Q 1 + /Q 1 ) F + VFB (x),

where F 0 (x) is the c.d.f. of the degenerate distribution with unit probability mass at the origin, and F + VFB (x) is estimated using the distribution from the above list that provides the best fit to the subsample + VFB . Finally, if Q 1 + ≤ 50, then F VFB (x) is simply estimated by the empirical c.d.f. of VFB . In this way Lobo et al. (2013b) and Lobo et al. (2014) use the simulation-generated random sample VFB of size Q 1 to compute an estimate F VFB Q 1 (·) of F VFB (·). Then for a given real instance of the associated DRC job shop scheduling problem and a specific worker allocation ϑ that yields PLB(ϑ) > 0 in the real problem, the authors take 100{ 1 − F VFB Q 1 [PLB(ϑ)] }% as their approximate level of confidence that ϑ is a VF-best allocation for the real problem; and if PLB(ϑ) = 0, then their confidence level is 100% (i.e., it is known with certainty) that ϑ is a VF-best allocation for the real problem.
Equivalently, 1 − F VFB Q 1 [PLB(ϑ)] is the approximate p-value associated with the authors' simulation-based test of the null hypothesis that ϑ is a VF-best allocation for the real problem; and if PLB(ϑ) = 0, then the associated p-value is exactly 1.0.
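When the empirical c.d.f. is used (the case of few non-zero observations above), the confidence level reduces to an upper-tail proportion; a minimal sketch with hypothetical gap values:

```python
def confidence_vf_best(gaps, plb):
    """Approximate confidence (as a fraction) that an allocation whose
    performance gap is `plb` is VF-best: the proportion of simulated
    quasi-optimality gaps that are at least as large as `plb`."""
    if plb == 0:
        return 1.0  # the lower bound is attained, so the allocation is VF-best
    return sum(1 for g in gaps if g >= plb) / len(gaps)

# Hypothetical quasi-optimality gaps from Q1 = 10 replications.
gaps = [0, 0, 5, 8, 12, 15, 20, 22, 30, 41]
conf = confidence_vf_best(gaps, 14)  # 5 of the 10 gaps are >= 14, so conf == 0.5
```

A value near 1 supports the null hypothesis that the allocation is VF-best; a value near 0 suggests that a better allocation likely exists.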

A practical method for evaluating worker allocations in large-scale DRC job shops
To evaluate the quality of a given allocation, the method of Lobo et al. (2013b) requires some form of enumeration of the allocation search space on each simulation replication so as to compute the associated "quasi-optimality" gap. Because of its computational complexity, such enumeration is impractical for even a single simulation replication of a large-scale DRC job shop scheduling problem. In this article we develop a revised method for evaluating allocation quality based on the following idea: on the i th simulation replication of a given problem, the revised method delivers a readily computed, heuristic-based approximation to the enumeration-based observation PLB(ϑ VFB i ) that is too expensive to compute. In other words, we use "simulated" simulation replications of the random variable PLB(ϑ VFB ) to evaluate the quality of a given worker allocation.
The usefulness of the revised method depends critically on a heuristic approximation to PLB(ϑ VFB i ) that is sufficiently fast and accurate on each simulation replication i of a large-scale DRC job shop scheduling problem. For this purpose we formulated HSP2, an improved version of HSP whose performance is more stable across a broad range of DRC job shop scheduling problems with varying degrees of size and complexity. In the original HSP algorithm, elapsed execution time is one of the main stopping criteria. Among the search substrategies comprising HSP, the heuristic LNSS has up to 45 seconds for searching different allocations, whereas the heuristics QSS1 and QSS2 together have a total of 15 seconds for this purpose, ensuring that HSP's total execution time is at most 1 minute per allocation. However, each of these substrategies invokes the Virtual Factory for each allocation searched; and the execution time required by the Virtual Factory to generate a schedule based on a specific allocation increases with the size of the associated job shop. Therefore, the time needed by HSP to search a fixed number of allocations increases with the size of the job shop, and the performance of HSP may vary substantially across DRC job shop scheduling problems of different sizes if the execution time is kept constant.
HSP2 is an improved version of HSP in which the main stopping criterion for each of its constituent search substrategies is based on the number of allocations searched, not the execution time. In HSP2, the substrategy LNSS is allowed to search up to 10 allocations, while QSS1 and QSS2 are each allowed to search up to four allocations. These limits are based on the average number of allocations that we have found are necessary to ensure adequate performance of HSP2 in DRC job shop scheduling problems of the largest size considered in all our current and previous work.
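The design change can be sketched as a search loop whose budget is a count of evaluated allocations rather than a wall-clock limit. The greedy neighborhood logic and the toy cost function below are simplified stand-ins for LNSS/QSS1/QSS2 and the Virtual Factory, not the actual procedures:

```python
def count_limited_search(start, neighbors, cost, max_evals):
    """Greedy local search that stops after evaluating `max_evals`
    allocations, so its effort is independent of how long each
    evaluation (e.g., a Virtual Factory run) takes."""
    best, best_cost = start, cost(start)
    evals = 1
    frontier = [start]
    while frontier and evals < max_evals:
        current = frontier.pop()
        for cand in neighbors(current):
            if evals >= max_evals:
                break
            c = cost(cand)
            evals += 1
            if c < best_cost:
                best, best_cost = cand, c
                frontier.append(cand)
    return best, best_cost

# Toy example: allocations are tuples of workers per group; a neighbor
# moves one worker between groups; "cost" stands in for the scheduler's L_max.
def neighbors(a):
    for i in range(len(a)):
        for j in range(len(a)):
            if i != j and a[i] > 0:
                b = list(a)
                b[i] -= 1
                b[j] += 1
                yield tuple(b)

cost = lambda a: abs(a[0] - 3) + abs(a[1] - 2) + abs(a[2] - 1)
best, c = count_limited_search((6, 0, 0), neighbors, cost, max_evals=10)
```

Because the budget is expressed in evaluations, the search effort per replication no longer shrinks as each Virtual Factory run gets slower on larger shops, which is the stability property HSP2 is designed to provide.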
In the revised method for evaluating allocation quality, attention is focused on the Q 2 ≤ Q simulation replications of a given DRC job shop scheduling problem in which allocation ϑ * does not satisfy the relevant optimality criteria. Applying HSP2 to the i th such simulation replication (i = 1, . . . , Q 2 ), we obtain the random sample HSP2 = { PLB(ϑ HSP2 i ) : i = 1, . . . , Q 2 }. The c.d.f. F HSP2 (x) underlying HSP2 should be reasonably close to F VFB (x) because HSP2 is designed to ensure that for i = 1, . . . , Q 2 , the HSP2-delivered observation PLB(ϑ HSP2 i ) is a reasonably close approximation to the observation PLB(ϑ VFB i ) that would have been generated by jointly using the Virtual Factory and enumeration of the allocation search space on the i th simulation replication of the given problem.
Next we fit a c.d.f. F HSP2 Q 2 (x) to the simulation-generated random sample HSP2 along the same lines as in the discussion of Equations (3) to (5) in the previous section. In particular, we fit a continuous distribution to the entire sample HSP2 only if HSP2 contains a sufficiently large subsample + HSP2 of non-zero observations, with Q 2 + > max{ 0.9Q 2 , 50 }; the rest of the discussion of Equations (4) and (5) applies similarly. Given a real instance of the associated DRC job shop scheduling problem and a specific worker allocation ϑ that yields PLB(ϑ) > 0 in the real problem, we take 100{ 1 − F HSP2 Q 2 [PLB(ϑ)] }% as our approximate level of confidence that ϑ is a VF-best allocation for the real problem; and if PLB(ϑ) = 0, then our confidence level is 100% (i.e., we know with certainty) that ϑ is a VF-best allocation for the real problem. The main advantage of the revised method for evaluating allocation quality is that it requires orders of magnitude less execution time to yield the simulation-generated observations { PLB(ϑ HSP2 i ) : i = 1, . . . , Q 2 } on which the estimated c.d.f. F HSP2 Q 2 (x) is based. In Section 6 we present two numerical examples that illustrate the application of the revised method to two moderately large DRC job shop scheduling problems.

Techniques for validating the revised method
As a first step in gauging the differences and similarities between the two distributions respectively underlying VFB and HSP2 for a given DRC job shop scheduling problem, we compute the usual sample statistics for the paired-sample differences D i = PLB(ϑ HSP2 i ) − PLB(ϑ VFB i ), where the paired observations PLB(ϑ VFB i ) and PLB(ϑ HSP2 i ) are computed from the same simulation replication i of the given DRC job shop scheduling problem for i = 1, . . . , Q, with Q = Q 1 = Q 2 . In particular, we compute the sample mean and variance of the paired-sample differences for each designated DRC job shop scheduling problem in the overall simulation experiment; the resulting paired-sample Student's t-statistic of Equation (11), t = D̄ / (S D / √Q), can then be used to test the hypothesis that for a given DRC job shop scheduling problem, E[D] = 0, so that the underlying distributions from which VFB and HSP2 are sampled must have the same mean. Since we know that PLB(ϑ VFB ) ≤ PLB(ϑ HSP2 ) when both random variables are computed from the same simulation replication of a given DRC job shop scheduling problem, the appropriate alternative hypothesis based on the paired-sample t-statistic is that E[D] > 0, so that the distribution corresponding to HSP2 has a larger mean than the distribution corresponding to VFB.

The following two techniques are used to test the hypothesis that the random samples VFB and HSP2 are both drawn from the same underlying distribution. The first technique is a plot of the empirical c.d.f. of each random sample on the same graph; the degree of overlap between the two empirical c.d.f.s, as judged by the reader, provides an indication of how likely it is that the two data sets are drawn from the same underlying distribution. For the random sample { PLB(ϑ VFB i ) : i = 1, . . . , Q 1 } based on Q 1 independent simulation replications of a designated DRC job shop scheduling problem, the corresponding empirical c.d.f. is defined as

F VFB Q 1 (x) = (1/Q 1 ) Σ i=1,...,Q 1 I{ PLB(ϑ VFB i ) ≤ x } for all real x,

where for a condition C, the indicator function I{C} = 1 if condition C is true, and I{C} = 0 otherwise.
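The empirical c.d.f. and the paired-sample t-statistic can be sketched as follows; the gap samples are hypothetical:

```python
import math

def ecdf(sample):
    """Empirical c.d.f. of `sample`: the function x -> fraction of
    observations that are <= x."""
    n = len(sample)
    return lambda x: sum(1 for v in sample if v <= x) / n

def paired_t(diffs):
    """Paired-sample Student's t-statistic: the mean difference divided
    by its estimated standard error."""
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n)

# Hypothetical paired gaps from the same Q = 5 replications.
plb_vfb = [0, 4, 7, 10, 12]
plb_hsp2 = [0, 5, 7, 12, 15]
diffs = [h - v for h, v in zip(plb_hsp2, plb_vfb)]  # each >= 0 by design
t = paired_t(diffs)  # a large positive t favors the alternative E[D] > 0

F = ecdf(plb_vfb)
frac = F(7)  # 3 of the 5 observations are <= 7, so F(7) == 0.6
```

Overlaying `ecdf(plb_vfb)` and `ecdf(plb_hsp2)` on one graph reproduces the visual comparison described above.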
We also use the Mann-Whitney (or Wilcoxon) nonparametric statistical test (Conover, 1999, pp. 272-281) to test the null hypothesis that F VFB (x) = F HSP2 (x) for all real x. As explained in the preceding paragraph, the Mann-Whitney test should be a one-sided test, where the alternative hypothesis is that PLB(ϑ HSP2 ) is stochastically larger than PLB(ϑ VFB ), meaning that F HSP2 (x) ≤ F VFB (x) for all real x (Lehmann, 1975, p. 66). Because the Mann-Whitney test requires that the random samples VFB and HSP2 be mutually independent, for each designated DRC job shop scheduling problem we used two independent sets of simulation replications to compute the random samples VFB and HSP2 . The Mann-Whitney test was performed using the function scipy.stats.mannwhitneyu in SciPy (Oliphant, 2007).
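In pure Python, the counting form of the U statistic underlying this test can be sketched as follows (in practice scipy.stats.mannwhitneyu, with its `alternative` argument, supplies the one-sided p-value); the samples below are hypothetical:

```python
def mann_whitney_u(x, y):
    """Counting form of the Mann-Whitney U statistic for sample x:
    the number of pairs (xi, yj) with xi > yj, counting ties as 1/2."""
    return sum(1.0 if xi > yj else (0.5 if xi == yj else 0.0)
               for xi in x for yj in y)

# Hypothetical independent gap samples (two separate replication sets).
plb_vfb = [0, 3, 5, 8, 11]
plb_hsp2 = [1, 4, 6, 9, 14]
u = mann_whitney_u(plb_vfb, plb_hsp2)
# Under H0 (identical distributions), E[U] = 5 * 5 / 2 = 12.5; here u == 10.0,
# mildly favoring "plb_vfb is stochastically smaller than plb_hsp2".
```

The one-sided test rejects H0 when U is sufficiently far below its null expectation, matching the alternative that PLB(ϑ HSP2 ) is stochastically larger.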

Experimental validation results
An extensive simulation experiment was performed to provide credible evidence that, regardless of the size of DRC job shop considered, the random sample obtained using the new method ( HSP2 ) and the random sample obtained using the original method ( VFB ) are both drawn from the same underlying distribution. Section 5.1 defines the experimental design and the DRC job shop scheduling problems used, while the results of the extensive simulation experiment are presented in the rest of the section.

Test problems used in the experimental validation
The DRC job shop used in Lobo et al. (2013a, 2013b) has N = 1200 jobs and M = 80 machines spread uniformly across S = 10 departments; in our experience, this combination of M, S, and N can reasonably be considered the specification of a "large" DRC job shop for the experimental validation. Similarly, we selected DRC job shops of "medium" and "small" sizes for the experimental validation based on our experience with such systems; the defining features of the three job shop sizes are summarized in Table 1. In each selected DRC job shop, there are M/S machines in every department (machine group). In Table 1, the column subheadings Min., Max., and Same under the column heading No. job operations respectively denote the minimum and maximum number of operations for each job and the maximum number of a job's operations that can occur in the same machine group. In the DRC job shop of medium size with M = 64 machines, we set the number of workers W so as to approximate the desired staffing-level percentages as closely as possible.
The definition of a designated DRC job shop scheduling problem must be expanded to include the size of the job shop being considered, so that a designated problem now refers to the combination of the size of the job shop, the type of job shop, the staffing level, and the due-date range.
In the symmetric (or balanced) version of the selected DRC job shops, on average each machine group has 100/S% of the job shop's workload pass through it. Asymmetry in a selected job shop is created by altering the probability that a machine group is on each job's route. Thus, if machine group i has probability 0.075 of being on each job's route, then on average machine group i will have 7.5% of the job shop's workload pass through it. Table 2 contains the probabilities of each job being routed through each machine group in the asymmetric (or unbalanced) version of each selected DRC job shop.
In all DRC job shops considered, the due-date of each job is a function of the due-date range, which varies from 200 to 3000 in increments of 400; and the processing time of each operation is randomly sampled from the discrete uniform distribution on the integers {1, . . . , 40}.
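The job characteristics described above can be sketched as follows. The function and parameter names are ours, and the assumption that each job's due date is sampled uniformly over the due-date window is illustrative rather than taken from the text; the routing probabilities and the discrete uniform processing times on {1, . . . , 40} are as described:

```python
import random

def sample_job(route_probs, due_date_range, n_ops):
    """Randomly sample one job's characteristics (illustrative sketch).

    route_probs    : per-machine-group probabilities of appearing on a job's
                     route (uniform 1/S for a symmetric shop, unequal for an
                     asymmetric one, as in Table 2)
    due_date_range : width of the due-date window (200 to 3000 in steps of 400)
    n_ops          : number of operations for this job
    """
    groups = list(range(len(route_probs)))
    # Choose the machine group visited by each operation according to route_probs.
    route = random.choices(groups, weights=route_probs, k=n_ops)
    # Each operation's processing time is discrete uniform on {1, ..., 40}.
    proc_times = [random.randint(1, 40) for _ in range(n_ops)]
    # Assumed mechanism: due date drawn uniformly over the due-date window.
    due_date = random.randint(0, due_date_range)
    return route, proc_times, due_date
```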

Paired-sample Student's t-tests
For the small, medium, and large asymmetric DRC job shop scheduling problems described in Tables 1 and 2, Table 3 summarizes the sample statistics {D i }, D̄, S D , and t as defined by Equations (9), (10), and (11). The sample statistics for the small, medium, and large symmetric DRC job shop scheduling problems are given in Table 4. All results are expressed in units of an average operation processing time, which equals 20.5 time units because each operation processing time is uniformly distributed on the integers {1, . . . , 40}. Also note that because the paired-sample Student's t-statistic is being computed, the random samples VFB and HSP2 are not mutually independent: for each designated DRC job shop scheduling problem, both VFB and HSP2 are constructed from the same set of Q = 500 simulation replications.
At the 1% level of significance, the results indicated that at the 80% and 90% staffing levels there were statistically significant differences between the means of the underlying populations from which VFB and HSP2 were respectively sampled, for both symmetric and asymmetric job shops and across all due-date ranges. A similar conclusion applied to the large asymmetric job shop with 70% staffing. On the other hand, for the remaining DRC job shop scheduling problems (asymmetric and symmetric problems with 60% and 70% staffing, across all due-date ranges), the results indicated that there were no practically significant differences between the expected values of the two underlying populations from which VFB and HSP2 were respectively sampled.

Graphical and statistical comparisons of distributions underlying VFB and HSP2
For the empirical c.d.f. plots that follow, for each designated DRC job shop scheduling problem the random samples VFB and HSP2 used are the same as those used for the results in Tables 3 and 4; that is, both VFB and HSP2 are constructed from the same set of simulation replications. The Mann-Whitney test, however, requires that the random samples VFB and HSP2 be mutually independent. Thus, for the results presented in Tables 5 and 6, two independent sets of Q = 500 simulation replications were used to generate the random samples VFB and HSP2. For each designated DRC job shop scheduling problem, Tables 5 and 6 contain the following information: (i) Q 1 , the size of the random sample VFB, i.e., the number of replications in the first set of Q = 500 simulation replications in which allocation ϑ * did not satisfy the optimality criteria (Theorems 1 and 2 of Lobo et al. (2013b)); (ii) Q 2 , the size of the random sample HSP2, i.e., the number of replications in the second set of Q = 500 simulation replications in which allocation ϑ * did not satisfy the optimality criteria; and (iii) the p-value of the one-sided Mann-Whitney test performed on the mutually independent random samples VFB and HSP2.
In asymmetric DRC job shop scheduling problems with lower staffing levels (60% and 70%), there is strong evidence that the new method yields an excellent approximation to the distribution of the quasi-optimality gap. However, in asymmetric DRC job shop scheduling problems with higher staffing levels (80% and 90%), the evidence indicates that the approximation to the distribution of the quasi-optimality gap achieved by the new method is not especially convincing.
In particular, the approximation to the distribution of the quasi-optimality gap achieved by the new method is less than satisfactory for the following asymmetric DRC job shop scheduling problems:
• small job shop with 90% staffing and due-date range 1800-3000;
• medium job shop with 90% staffing and due-date range 1400-3000;
• large job shop with 80% staffing and due-date range 1400-3000; and
• large job shop with 90% staffing and due-date range 1400-3000.
However, in all of the aforementioned asymmetric DRC job shop scheduling problems, the vast majority of PLB(ϑ HSP2 ) values are less than a single average operation processing time (20.5 time units). Thus, the asymmetric DRC job shop scheduling problems in which the new method does not yield an excellent approximation to the distribution of the quasi-optimality gap are also problems that do not necessarily require the approximation in the first place. Figures 3 and 4 display the relevant empirical c.d.f. plots. In general, in symmetric DRC job shop scheduling problems there is strong evidence that the new method yields an excellent approximation to the distribution of the quasi-optimality gap. In the symmetric DRC job shop scheduling problems with 60, 70, and 80% staffing, the degree of overlap between the empirical c.d.f.s on these plots gives the appearance of a single plotted empirical c.d.f.; this congruence is reflected in the associated Mann-Whitney p-values. In symmetric DRC job shop scheduling problems with a staffing level of 90%, although there is strong agreement in the overall shape of the two empirical c.d.f.s, the discrepancy between them is manifested as a downward shift of the empirical c.d.f. corresponding to the random sample HSP2 relative to that corresponding to the random sample VFB. In general, however, this discrepancy is not reflected in the Mann-Whitney p-values.

Overall results
DRC job shop scheduling problems with lower staffing levels and large optimality gaps are the problems for which a probabilistic characterization of the optimality gap is necessary and for which question (1) is important: a larger optimality gap translates into greater uncertainty about the quality of the allocation, making question (1) more relevant. The results of the large-scale simulation experiment indicate that the revised method yields a distribution that is neither practically nor statistically different from its enumeration-based counterpart. The revised method works well across a wide variety of both asymmetric and symmetric DRC job shop scheduling problems, and it works as well on small problems as on medium and large ones. Thus, there is good reason to believe that the new method will also work well on DRC job shop scheduling problems much larger than the largest problems considered in this article.

Examples illustrating the revised method for evaluating allocation quality
The first numerical example involves a large asymmetric job shop with 70% staffing and a due-date range of 200. The revised method was used to generate the random sample HSP2 and to fit a distribution to that sample. The resulting fit is the mixed distribution F HSP2 Q 2 (x) = 0.301 F 0 (x) + 0.699 F c (x), where F 0 (·) is the c.d.f. of the distribution degenerate at zero and F c (·) ∼ Weibull(1.47, 51.86, 1.00). In the legend of Fig. 5, the label "Datapoints" refers to Q 2 + , and the p-value refers to the chi-squared goodness-of-fit test statistic for the fitted distribution.
For the given DRC job shop scheduling problem, allocation ϑ * failed to satisfy either of the optimality criteria; and we obtained LB ϑ * = 3702 and VF ϑ * = 3825 so that PLB(ϑ * ) = 123. From these results we know that

1 − F HSP2 Q 2 (123) = 1 − [0.301 + 0.699 F c (123)] ≈ 0.0208;

consequently, our level of confidence is only 2.08% that ϑ * is a VF-best allocation. On the other hand, for the HSP2-generated allocation ϑ HSP2 , we obtained VF ϑ HSP2 = 3712 and PLB(ϑ HSP2 ) = 10 so that

1 − F HSP2 Q 2 (10) = 1 − [0.301 + 0.699 F c (10)] ≈ 0.6477,

indicating we are 64.77% confident that allocation ϑ HSP2 is a VF-best allocation.
The second example involves a large symmetric job shop with 80% staffing and a due-date range of 2200. Again the revised method was used to generate the random sample HSP2 and to fit a distribution to that sample. The resulting fit is a continuous distribution F HSP2 Q 2 expressed in terms of Φ(·), the standard normal c.d.f.; the fitted distribution is shown in Fig. 6 (probability distribution fitting, symmetric job shop, 80% staffing, 2200 due-date range). For this problem, our level of confidence is only 0.31% that ϑ * is a VF-best allocation.
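Under the assumption that Weibull(1.47, 51.86, 1.00) lists the shape, scale, and location parameters in that order, the confidence levels in the first example can be reproduced directly from the fitted mixed distribution (the function names below are ours):

```python
import math

def weibull_cdf(x, shape, scale, loc):
    """C.d.f. of a three-parameter (location-shifted) Weibull distribution."""
    if x <= loc:
        return 0.0
    return 1.0 - math.exp(-((x - loc) / scale) ** shape)

def confidence_vf_best(plb, p0=0.301, shape=1.47, scale=51.86, loc=1.00):
    """Confidence that an allocation with gap plb is VF-best: 1 - F(plb),
    where F(x) = p0 + (1 - p0) * F_c(x) is the fitted mixed distribution."""
    f = p0 + (1.0 - p0) * weibull_cdf(plb, shape, scale, loc)
    return 1.0 - f

print(round(100 * confidence_vf_best(123), 2))  # ~2.08, the baseline allocation
print(round(100 * confidence_vf_best(10), 2))   # ~64.77, the HSP2 allocation
```

These values match the 2.08% and 64.77% reported in the first example, which supports the assumed parameter ordering.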
In both of the foregoing examples, the revised method of Section 3 enabled rapid evaluation of the quality of a given worker allocation. In addition, the revised method allowed the user to quantify the increase in solution quality that is obtained by using one allocation instead of another: in both examples the HSP2-generated allocation ϑ HSP2 was substantially better than the baseline allocation ϑ * .

Conclusions
In this article we formulated and validated a revised method for evaluating the quality of a given worker allocation in a given DRC job shop scheduling problem. Building on the original method of Lobo et al. (2013b), the revised method was designed to be applicable in large-scale problems. When used in conjunction with the Virtual Factory (Hodgson et al., 1998), the revised method has proved to be computationally tractable in problems of realistic size and complexity. Based on the results of our experimental validation of the revised method, we concluded that the method was reliable and effective except in situations for which the staffing level was so high as to make the search for near-optimal worker allocations practically unnecessary.

Supplemental material
Supplemental data for this article can be accessed on the publisher's website.