A scheme for unifying optimization and constraint satisfaction methods

Optimization and constraint satisfaction methods are complementary to a large extent, and there has been much recent interest in combining them. Yet no generally accepted principle or scheme for their merger has evolved. We propose a scheme based on two fundamental dualities: the duality of search and inference, and the duality of strengthening and relaxation. Optimization as well as constraint satisfaction methods can be seen as exploiting these dualities in their respective ways. Our proposal is that rather than employ either type of method exclusively, one can focus on how these dualities can be exploited in a given problem class. The resulting algorithm is likely to contain elements from both optimization and constraint satisfaction, and perhaps new methods that belong to neither.

Our scheme is based on exploiting two dualities: the duality of search vs. inference and the duality of strengthening vs. relaxation. Some of these ideas are anticipated in Hooker (1994).
Branching algorithms provide one example of these dualities at work. The search/inference duality is evident in Bockmayr and Kasper's (1998) observation that both optimization and constraint satisfaction rely on "branch and infer." Branching is a search mechanism. During the branching process, one can generate inferences in the form of cutting planes (as in optimization) or constraint propagation to achieve domain reduction (as in constraint satisfaction). The combination of branching and inference is usually much more effective than either alone.
The strengthening/relaxation duality is also evident in branching algorithms. When one branches on the possible values of a variable, the resulting subproblems are strengthenings of the original problem in the sense of a restriction; they have an additional constraint that fixes the value of the variable and therefore shrinks the feasible set. In optimization one typically solves a relaxation of the problem at each node of the search tree in order to obtain bounds on the optimal value, often a continuous relaxation such as a linear programming or Lagrangean relaxation. The reduced variable domains obtained in a constraint satisfaction algorithm in effect represent a relaxation of the problem at that node of the search tree. In any feasible solution the variables must take values in these domains, but an arbitrary selection of values from the domains need not comprise a feasible solution. Again, enumeration of strengthenings is more effective when combined with relaxation of some kind.
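The interplay just described can be sketched in a few lines of code. The sketch below is illustrative only: it minimizes a linear objective over variables that must take distinct values, strengthens by splitting a domain at each branch, and bounds each node by the relaxation obtained by dropping the all-different requirement (each variable then independently takes its cheapest value). The instance is a hypothetical one, not a model taken from the paper.

```python
# Sketch: branch-and-bound in which branching strengthens the problem
# (splitting a domain shrinks the feasible set) and a relaxation bounds it
# (dropping the all-different constraint enlarges the feasible set).

def relaxation_bound(c, domains):
    # Relaxation: ignore all-different, so each variable independently
    # takes the cheapest value in its domain.  This is a valid lower bound.
    return sum(cj * min(d) for cj, d in zip(c, domains))

def branch_and_bound(c, domains):
    best = [float("inf"), None]          # incumbent value, incumbent solution

    def dfs(domains):
        if relaxation_bound(c, domains) >= best[0]:
            return                       # relaxation bound prunes this node
        if all(len(d) == 1 for d in domains):
            vals = [min(d) for d in domains]
            if len(set(vals)) == len(vals):      # all-different holds
                best[0] = sum(cj * v for cj, v in zip(c, vals))
                best[1] = vals
            return
        j = next(i for i, d in enumerate(domains) if len(d) > 1)
        v = min(domains[j])
        for sub in ({v}, domains[j] - {v}):      # two strengthenings
            dfs(domains[:j] + [sub] + domains[j + 1:])

    dfs(list(domains))
    return best

print(branch_and_bound([4, 3, 5], [set(range(1, 6)) for _ in range(3)]))
```

The two branches at each node partition the feasible set, so no solution is lost, while the relaxation bound prunes subtrees that cannot improve on the incumbent.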
The thesis of this paper is that because both constraint programming and optimization use problem-solving strategies based on the same dualities, their methods can be naturally combined. Rather than employ optimization methods exclusively or constraint satisfaction methods exclusively, one can focus on how these dualities can be exploited in a given problem class. The resulting algorithm is likely to contain elements from both optimization and constraint satisfaction, and perhaps new methods that belong to neither.
Section 1 begins the paper with a simple illustration of how the dualities might operate in a branching context. It does so first in a constraint satisfaction setting and then in an integer programming setting. It then combines the two approaches.
The next two sections explore the dualities more deeply and propose methods that belong to neither constraint satisfaction nor integer programming. Section 2 investigates the search/inference duality. It explains how constraint programmers exploit problem structure to apply effective inference algorithms. It also frames the search/inference duality as a formal optimization duality that generalizes classical linear programming duality. This perspective forges a link between the concept of a nogood in constraint satisfaction and Benders decomposition in optimization. In addition, it provides a general method for sensitivity analysis.
Section 3 takes up the duality of strengthening and relaxation. This duality is studied in optimization under the guise of Lagrangean and surrogate duality, each of which finds a strong relaxation by searching over a parameterized family of relaxations. Both are defined only for inequality constraints, but both are special cases of a general relaxation duality that can be developed for a much wider range of problems.
A particularly promising maneuver is to combine constraint programming's approach to inference with optimization's approach to relaxation. When formulating a problem, the constraint programmer often identifies a group of constraints that show special structure and represents them with a single global constraint, such as alldifferent, element, or cumulative. The optimizer often relaxes a problem by transforming it into an instance of a specially structured class of problems that can be solved to optimality, such as a linear programming problem. Both of these techniques are key to the success of the respective fields. There is a natural way to combine them: design relaxations of the sort used in optimization for global constraints of the sort used in constraint programming. This idea is illustrated in section 4 by presenting relaxations for element constraints.
The paper concludes by suggesting issues for future research.

A motivating example
A small example can illustrate how dualities can operate in a constraint satisfaction and in an integer programming setting, as well as in a combined mode. One example can illustrate only a few of the relevant ideas, but it will help make the discussion to follow more concrete. Consider the following optimization problem: The set D_j = {1, ..., 5} is the initial domain of each of the variables x_j. The optimal solution is (x_1, x_2, x_3) = (2, 3, 1), with optimal value 22. We will solve the problem by constraint satisfaction methods, by integer programming, and finally by a combined approach. The particular methods illustrated are not the best available for either constraint satisfaction or integer programming. They are chosen because they help illustrate how methods may be combined. The intention is not to compare the performance of constraint programming and integer programming, but to present a combined approach.

Constraint satisfaction
A constraint satisfaction method can solve (1) by solving the feasibility problem of finding a solution with value better than the best found so far, z̄. Initially z̄ = ∞. Each time a feasible solution is found, the search continues with z̄ set to the value of that solution. A possible search tree is shown in Table 1. The nodes are traversed in the depth-first order shown. At node 1, where D_1 = D_2 = D_3 = {1, 2, 3, 4, 5}, the search branches on x_1. This creates a subproblem at node 2 by setting D_1 = {1}, and one at node 7 by setting D_1 = {2, 3, 4, 5}. (Other branching schemes are possible.) Similarly, the search branches on x_2 at node 2, creating nodes 3 and 6. Because x_1 = 1 at node 2, setting x_2 = 1 or x_3 = 1 would be inconsistent with the alldifferent constraint. The domains of x_2, x_3 are therefore reduced to D_2 = D_3 = {2, 3, 4, 5}. There are several different domain reduction algorithms that remove domain elements inconsistent with alldifferent, varying in degree of efficiency and completeness (Marriott and Stuckey, 1998; Régin, 1994).
A feasible solution is found at node 4 that permits one to set z̄ = 25. At node 5, another type of domain reduction, based on maintaining bounds consistency, can be applied (Marriott and Stuckey, 1998). It infers from (2) bounds on x_3 in terms of min D_1 and min D_2, where min D_j is the smallest element in D_j; similar bounds are inferred for x_1 and x_2. This changes neither D_1 = {1} nor D_2 = {2}, but reduces D_3 = {4, 5} to the empty set, because (6) implies that x_3 ≤ 3. In general, the implications of one constraint are propagated to other constraints by means of a constraint store, which in the present case consists of the reduced domains that are inferred from one constraint and made available to others. The subproblem at node 5 is therefore infeasible. Node 6 reveals a solution of value z̄ = 23, and finally, the optimal value z̄ = 22 is discovered at node 8. The search completes its proof of optimality in nine nodes. Constraint satisfaction therefore solves (1) with a combination of search (branching) and inference (domain reduction).
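The node-by-node process just described can be sketched as follows. The sketch propagates only the alldifferent constraint and the objective cut 4x_1 + 3x_2 + 5x_3 ≤ z̄ − 1; the paper's inequality constraint (2) is not reproduced here, so the search tree differs from Table 1, and the propagation routine is a simplified stand-in for the algorithms cited above.

```python
# Sketch: depth-first search with domain reduction for the example's
# objective min 4x1 + 3x2 + 5x3 under an alldifferent constraint.
# Constraint (2) of the paper is omitted, so this reproduces the
# mechanism, not the tree of Table 1.

c = [4, 3, 5]

def propagate(domains, zbar):
    """Reduce domains; return None if some domain becomes empty."""
    changed = True
    while changed:
        changed = False
        for j, d in enumerate(domains):
            if len(d) == 1:                  # alldifferent propagation
                v = min(d)
                for k, dk in enumerate(domains):
                    if k != j and v in dk:
                        dk.discard(v)
                        changed = True
        if any(not d for d in domains):
            return None
        if zbar != float("inf"):             # bounds consistency for the
            lb = sum(cj * min(d) for cj, d in zip(c, domains))  # cut z <= zbar-1
            for j, d in enumerate(domains):
                slack = (zbar - 1) - (lb - c[j] * min(d))
                hi = slack // c[j]
                if max(d) > hi:
                    domains[j] = {v for v in d if v <= hi}
                    changed = True
            if any(not d for d in domains):
                return None
    return domains

def solve():
    best = [float("inf"), None]

    def dfs(domains):
        domains = propagate([set(d) for d in domains], best[0])
        if domains is None:
            return
        if all(len(d) == 1 for d in domains):
            vals = [min(d) for d in domains]
            val = sum(cj * v for cj, v in zip(c, vals))
            if val < best[0]:
                best[0], best[1] = val, vals
            return
        j = next(i for i, d in enumerate(domains) if len(d) > 1)
        v = min(domains[j])
        for sub in ({v}, domains[j] - {v}):
            dfs(domains[:j] + [sub] + domains[j + 1:])

    dfs([set(range(1, 6)) for _ in range(3)])
    return best

print(solve())
```

Each feasible solution tightens z̄, and the objective cut then acts exactly like the bounds-consistency reduction at node 5: domains shrink, sometimes to the empty set, proving subproblems infeasible without further branching.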

Integer programming
Integer programming exploits the search/inference duality along with a duality of strengthening and relaxation. Due to the restricted vocabulary of integer programming, however, the model is more complex.
There are other and better integer programming models for this particular problem. Model (7)–(10) is used here to illustrate big-M constraints and how they may be avoided in general.
The problem is again solved by branching. This time, however, a continuous relaxation of the problem is solved at each node of the search tree. The continuous relaxation of (7)–(10), which is solved at the root node, is obtained by deleting the integrality constraints on x_j and replacing y_jk ∈ {0, 1} with 0 ≤ y_jk ≤ 1. The optimal value of the resulting linear programming relaxation provides a lower bound on the optimal value of (7)–(10).
In integer programming, inference often takes the form of cutting plane generation. Cutting planes are inequalities that are satisfied by every integer solution of the continuous relaxation but possibly violated by some noninteger solutions. By "cutting off" noninteger solutions, cutting planes can provide a tighter bound when added to the constraint set of the continuous relaxation. In this case, one might add the cutting planes

x_1 + x_2 + x_3 ≥ 5
2x_1 + x_2 + 2x_3 ≥ 9   (11)

(These cutting planes are derived from the inequality constraints alone.) There is a vast literature describing how cutting planes may be generated for highly structured problem classes, such as traveling salesman, job shop scheduling, set covering, set packing, and a host of other problems. A search tree for the problem represented by (7)–(10) and (11) appears in Table 2. The search branches on variables that have nonintegral values in the continuous relaxation, in the order x_1, x_2, x_3, y_12, y_13, y_23. At node 1, y_23 = 1/5, and one branches by setting y_23 = 0 and y_23 = 1, creating nodes 2 and 7. At node 2, x_1 = 5/2, and the branches are defined by x_1 ≤ 2 and x_1 ≥ 3. The branching constraints are added to the relaxation at each node. For example, y_23 = 0 is added at node 2. The first feasible (i.e., integral) solution is found at node 5. Its optimal value provides an upper bound z̄ = 22 on the optimal value of the original problem; i.e., the constraint 4x_1 + 3x_2 + 5x_3 ≤ 22 is added to the problem at node 5. At node 6, the value of the relaxation is 23, so further branching at node 6 cannot lead to an optimal solution, and the tree is pruned at this point. This bounding mechanism provides the name, branch-and-bound, for this particular kind of search.

Table 2. Solution of an integer programming problem by branching, relaxation, and cutting plane generation.

Branch-and-bound search relies on the interplay of the search/inference and strengthening/relaxation dualities.
It searches by branching and draws inferences by cutting plane generation (which results in branch and cut). Branching likewise creates strengthenings of the problem by adding constraints. A relaxation of this strengthened problem is created by dropping integrality requirements and applying a linear programming algorithm.
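The branch-and-bound loop can be sketched generically. Rather than reproduce the linear programming relaxation of (7)–(10), the sketch below uses a hypothetical 0-1 knapsack instance (a maximization, unlike the running example), because its LP relaxation has a closed-form greedy solution and no LP solver is needed; the structure (solve the relaxation, prune by bound, branch on the fractional variable) is the same.

```python
# Sketch: branch-and-bound with an LP relaxation solved at every node.
# Hypothetical 0-1 knapsack instance; the LP relaxation is solved greedily
# by value density, and branching fixes the fractional variable.

def lp_relaxation(items, cap, fixed):
    """items sorted by value/weight descending; fixed maps index -> 0 or 1.
    Returns (upper bound, index of fractional item or None), or (None, None)
    if the fixed choices already exceed capacity."""
    rem = cap - sum(items[i][1] for i, t in fixed.items() if t == 1)
    if rem < 0:
        return None, None
    val = sum(items[i][0] for i, t in fixed.items() if t == 1)
    for i, (v, w) in enumerate(items):
        if i in fixed:
            continue
        if w <= rem:
            rem -= w
            val += v
        else:
            return val + v * rem / w, i      # item i is fractional
    return val, None                         # relaxation solution is integral

def branch_and_bound(items, cap):
    best = [0.0]

    def dfs(fixed):
        ub, frac = lp_relaxation(items, cap, fixed)
        if ub is None or ub <= best[0]:
            return                           # infeasible, or pruned by bound
        if frac is None:
            best[0] = ub                     # integral relaxation: feasible
            return
        for t in (1, 0):                     # strengthen: x_frac = 1, then 0
            dfs({**fixed, frac: t})

    dfs({})
    return best[0]

items = [(60, 10), (100, 20), (120, 30)]     # (value, weight), density-sorted
print(branch_and_bound(items, 50))
```

Branching constraints (fixing an item in or out) are strengthenings added to the relaxation at each node, exactly as in the text; the bound test plays the role of the pruning at node 6.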

A combined approach
The respective advantages of constraint satisfaction and integer programming are easily combined in this case. Domain reduction and cutting plane generation are simply two forms of inference, and both can be used. In fact, domain reduction can be applied to the cutting planes (11) as well as the original constraint (8). The advantages of relaxation are also available. The integer programming relaxation was obtained by dropping integrality constraints from the large model (7)–(10). But the largest part of this model, (9)–(10), adds little to the quality of the relaxation. One can use the original model (1) as a problem statement and for feasibility checks, but create a continuous relaxation that consists of (7)–(8), the cutting planes (11), and the bounds 1 ≤ x_j ≤ 5. Because the relaxation is distinguished from the model, both are more succinct.
A search tree appears in Table 3. At each node constraint propagation is first applied to the original problem, and if successful, the bounds in the relaxation are adjusted accordingly (we add the branching constraints both to the original model and the relaxation). As soon as a feasible solution is found, a constraint is added to the problem indicating the bound on the solution, and it is updated as necessary.
The search branches on the alternatives x_2 ≥ 2, x_3 ≥ 2 at node 1, because the solution of the relaxation sets x_2 = x_3 = 1; the alternatives are obviously exhaustive. At node 2 the search branches on a nonintegral variable x_1. Because the solution of the relaxation at node 3 is integral and satisfies alldifferent, it is feasible, and no further branching is needed.
Due to the combined effects of constraint propagation and relaxation, a combined approach may produce a search tree that is smaller than those that result from constraint programming or integer programming methods. In addition, processing may be faster at each node than in integer programming, because the relaxation is smaller. There may also be nodes at which one need not solve the relaxation, because constraint propagation alone (which is often faster than solving an LP) may determine that the problem is infeasible.

Table 3. Solution of a constraint satisfaction problem by branching, domain reduction, relaxation, and cutting plane generation.

Duality of search and inference
A search method examines possible values of the variables until an acceptable solution is found. An inference method attempts to derive a desired implication from the constraint set. Popular search methods include branching (which examines partial solutions) and local search heuristics (which examine complete solutions). Inference methods include cutting plane methods (in which inequalities are inferred) and domain reduction (in which smaller domains are inferred). Search and inference tend to work best in combination. Search alone may happen upon a good solution early in the process, but it must examine many other solutions before determining that it is good. Inference alone can rule out whole families of solutions as inferior, but this is not the same as finding a good solution. Working together, search and inference can find and verify good solutions more quickly.
As illustrated above, a common strategy for running search and inference in parallel is to use inference in the context of a branching search. Constraint satisfaction infers smaller domains, for instance by maintaining consistency. Smaller domains result in less branching. The example illustrates the maintenance of bounds consistency for inequalities and hyperarc consistency (Marriott and Stuckey, 1998) for alldifferent constraints. The cutting planes of optimization are designed to strengthen continuous relaxations, but can reduce domains as well, for example if one maintains bounds consistency for them.
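Bounds consistency for a linear inequality can be sketched directly. Assuming positive coefficients and the ≥-form used by cut (11), the tightest lower bound on x_j is obtained by giving every other variable its largest value; the interval domains below are hypothetical, not those of any particular node in the example.

```python
# Sketch: bounds consistency for sum_j a_j x_j >= b with a_j > 0 over
# integer interval domains [lo_j, hi_j].
import math

def tighten_lower_bounds(a, b, lo, hi):
    new_lo = list(lo)
    for j in range(len(a)):
        # give every other variable its maximum, then solve for x_j
        rest = sum(a[k] * hi[k] for k in range(len(a)) if k != j)
        new_lo[j] = max(lo[j], math.ceil((b - rest) / a[j]))
    return new_lo

# cut (11): 2x1 + x2 + 2x3 >= 9, with hypothetical reduced domains
print(tighten_lower_bounds([2, 1, 2], 9, [1, 1, 1], [1, 2, 5]))
```

Here the tightened domains of x_1 and x_2 feed into the bound on x_3, illustrating how a cutting plane, once posted, can participate in domain reduction like any other constraint.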
Combining search and inference also provides an effective way to exploit problem structure. Inferences drawn from specially structured constraints can reveal which regions of the solution space are unproductive and need not be examined. This is discussed first below. The interaction of search and inference can also be interpreted as a formal duality. This leads to a link between nogoods and Benders decomposition, as well as a general approach to sensitivity analysis.

Inference and structure
One advantage of using inference in the context of search is that it can exploit special structure. If the problem or some part of it exhibits a pattern that has been closely analyzed offline in order to identify strong implications, these implications can be generated quickly. The practical success of optimization and constraint satisfaction owes much to this stratagem.
The two fields have developed different and complementary approaches to recognizing structure. Constraint programmers identify sets of constraints, at the modeling stage, that can be recognized as a single global constraint (Beldiceanu and Contejean, 1994; Laburthe, 1997a, 1997b; Régin and Puget, 1997; Régin, 1999). The cumulative constraint, for example, requires that a set of tasks be scheduled so that, at any given time, their total consumption of resources is within bounds. A variety of scheduling problems are special cases of this general pattern. When they are formulated as such, the solver can apply domain reduction procedures that deal with the constraints globally rather than one at a time, resulting in much greater reduction of domains.
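As a concrete illustration of the condition that cumulative enforces, the following sketch checks a discrete-time schedule against a resource capacity. It is a feasibility check only, not the domain reduction procedure a solver would apply, and the task data are hypothetical.

```python
# Sketch: check the cumulative condition for tasks given as
# (start, duration, resource demand) tuples against a capacity.

def cumulative_ok(tasks, capacity):
    usage = {}
    for start, dur, demand in tasks:
        for t in range(start, start + dur):
            usage[t] = usage.get(t, 0) + demand
    return all(u <= capacity for u in usage.values())

tasks = [(0, 3, 2), (1, 2, 2), (3, 2, 3)]
print(cumulative_ok(tasks, 4))    # first two tasks overlap at t = 1, 2
print(cumulative_ok(tasks, 3))
```

A global propagator for cumulative reasons about all tasks at once to prune start times; this check merely states the condition such a propagator maintains.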
In optimization, the modeler typically identifies a problem as an instance of a class for which solution methods have been designed, such as linear programming, network flow, or 0-1 programming problems. Beyond this point the recognition of structure is often automated as part of the solution algorithm. The problem is scanned for opportunities to generate knapsack cuts, fixed charge cuts, covering inequalities, etc. In some cases substructures are identified, as for example subgraphs in a traveling salesman problem that give rise to comb inequalities. Another difference from the constraint satisfaction community is that constraint generation is aimed at strengthening a continuous relaxation rather than raising the degree of consistency of the constraint set. Interestingly, for a brief period in the early days of operations research, optimizers used constraints that were unrelated to the continuous relaxation. They were part of the implicit enumeration schemes of that day (e.g., Garfinkel and Nemhauser, 1970). Perhaps such constraints have since been neglected because the community, unaware of the theory of consistency, has not had a clear understanding of how they might accelerate search.
One of the most impressive traits of the human mind is its pattern-recognition ability. The constraint satisfaction approach uses this ability to identify structure primarily in the modeling stage. Optimization uses it primarily in the design of solution algorithms that automatically detect patterns. To restrict oneself to one approach or the other seems a mistake. The modeler's insight into the practical situation should be used, as should the mathematician's analysis when the problem is sufficiently stylized to apply it.

Inference duality in optimization
One way to capture the duality of search and inference in a more rigorous setting is to state it as a formal optimization duality. For this purpose, a general optimization problem might be written

minimize_{x∈D} f(x) subject to x ∈ S   (12)

where S is the feasible set and D = D_1 × ... × D_n the domain. This can be viewed as a search problem: find an x ∈ S ∩ D that minimizes f(x). The inference dual is

maximize z subject to (x ∈ S) ⇒ (f(x) ≥ z)   (13)

where the arrow indicates implication: for all x ∈ D, if x ∈ S then f(x) ≥ z. If an optimal value exists for (12), it is the same as the optimal value of (13). So optimization can be viewed as an inference problem: what is the tightest lower bound on f(x) one can infer from x ∈ S?
The optimization literature has closely studied inference duality in the special case of linear programming. Here (12) becomes

minimize cx subject to Ax ≥ b, x ≥ 0   (14)

where A is an m × n matrix. The dual problem is to infer the strongest possible inequality cx ≥ z (i.e., the tightest lower bound z) from Ax ≥ b, x ≥ 0. A fundamental result of linear programming (the Farkas Lemma) states that if Ax ≥ b, x ≥ 0 is feasible, it implies cx ≥ z if and only if some nonnegative linear combination uAx ≥ ub of Ax ≥ b dominates cx ≥ z. That is, uA ≤ c and ub ≥ z for some u ≥ 0. So the dual problem can be written

maximize_{u∈R^m} ub subject to uA ≤ c, u ≥ 0   (15)

This is the classical linear programming dual. It has the same (possibly infinite) optimal value z* as (14), unless both (14) and (15) are infeasible. The solution u of the dual problem can be viewed as encoding a proof that cx ≥ z*, because it specifies a linear combination of Ax ≥ b that dominates cx ≥ z*.
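The certificate reading of the dual can be checked numerically. In the small instance below (hypothetical numbers), u = (1, 1) satisfies uA ≤ c, so cx ≥ ub holds for every feasible x; the LP optimum here is in fact 10, attained at x = (1.2, 1.6), so this u also proves optimality.

```python
# Sketch: a dual solution u >= 0 with uA <= c encodes a proof that
# cx >= ub for every x with Ax >= b, x >= 0.  Hypothetical data.

A = [[2, 1],
     [1, 3]]
b = [4, 6]
c = [3, 4]
u = [1, 1]                                   # candidate dual solution

uA = [sum(u[i] * A[i][j] for i in range(2)) for j in range(2)]
ub = sum(u[i] * b[i] for i in range(2))
assert all(uA[j] <= c[j] for j in range(2)) and all(ui >= 0 for ui in u)

for x in ([2.0, 2.0], [1.5, 1.5], [4.0, 1.0]):   # sample feasible points
    assert all(sum(A[i][j] * x[j] for j in range(2)) >= b[i] for i in range(2))
    cx = sum(c[j] * x[j] for j in range(2))
    print(cx, ">=", ub)                      # weak duality holds at each point
```

The dominance check (uA ≤ c, u ≥ 0) is exactly the Farkas condition of the text: it certifies the bound without reference to any particular feasible point.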
Linear programming has the convenient property that a solution of the dual always has polynomial length (i.e., linear programming belongs to both NP and co-NP). A proof of optimality can in general be exponentially long. For example, if x in (14) is restricted to be integer, so that (14) becomes an integer programming problem, then the inference dual can no longer be written in the form (15). The dual must be solved by a proof of optimality that most commonly takes the form of an exhaustive search tree, which has exponential size in general.
A major benefit of inference duality is that it provides a scheme for sensitivity analysis. This kind of analysis is very important in practice because it indicates how the solution would be affected by perturbations of the problem data. It allows one to focus on the data that really matter.
Up to a point, sensitivity analysis is straightforward. Given an optimal solution of the primal problem (12), one can analyze under what data alterations this solution remains feasible. However, this says nothing about whether it remains optimal, and this is where the inference dual comes into play. Because a solution of the inference dual is a proof, one can analyze under what data perturbations the proof remains valid and the solution therefore remains optimal (assuming it remains feasible as well).
This scheme works out nicely in linear programming. Let x* be an optimal solution of (14) and u* an optimal solution of the dual problem (15). Let (14) be perturbed so that it minimizes (c + Δc)x subject to (A + ΔA)x ≥ b + Δb. Obviously, x* remains feasible if (A + ΔA)x* ≥ b + Δb. However, it remains optimal as well if u* remains a valid proof that (c + Δc)x ≥ z*; i.e., if u*(A + ΔA) ≤ c + Δc and u*(b + Δb) ≥ z*.
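The dual-certificate view of sensitivity can be sketched directly: perturb the data and test whether u* still encodes a valid proof. The data and perturbations below are hypothetical.

```python
# Sketch: u* remains a proof that (c+dc)x >= z* under a perturbation
# (dA, db, dc) when u*(A+dA) <= c+dc and u*(b+db) >= z*.

def certificate_survives(u, A, dA, b, db, c, dc, zstar):
    m, n = len(A), len(A[0])
    lhs = [sum(u[i] * (A[i][j] + dA[i][j]) for i in range(m)) for j in range(n)]
    rhs = sum(u[i] * (b[i] + db[i]) for i in range(m))
    return all(lhs[j] <= c[j] + dc[j] for j in range(n)) and rhs >= zstar

A = [[2, 1], [1, 3]]; b = [4, 6]; c = [3, 4]
u = [1, 1]; zstar = 10                   # dual solution and optimal value
zero = [[0, 0], [0, 0]]

print(certificate_survives(u, A, zero, b, [0, 0], c, [1, 0], zstar))
print(certificate_survives(u, A, zero, b, [0, 0], c, [-1, 0], zstar))
```

Raising a cost coefficient leaves the proof intact, while lowering one breaks the dominance condition, so the analysis flags exactly the perturbations under which optimality would have to be re-established.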
Until very recently, the optimization community has approached the sensitivity question for discrete problems in a different fashion, using the concepts of value function, superadditive duality, etc. These approaches tend to make sensitivity analysis computationally very difficult. Both optimization and constraint satisfaction problems could benefit from the inference duality approach. The branch-and-bound tree for an integer programming problem, for example, can be analyzed to determine under what problem perturbations the tree remains a proof of optimality (Dawande and Hooker, 2000; Hooker, 1996, 1999). If branching and domain reduction prove a problem instance to be infeasible, one can examine for what problem perturbations this proof remains valid. More generally, a solution of the inference dual provides an explanation (in the form of a proof) for why a solution is optimal, or why the problem is infeasible. In practice, an explanation of the solution could be more valuable than the solution itself. Perhaps methods can be developed to reduce this proof to its bare essentials, delineating as clearly as possible the reason for optimality or infeasibility. These ideas remain largely unexplored.

Nogoods and Benders decomposition
If in the midst of search a trial solution is found to be unsatisfactory, the reasons for its failure can be analyzed. This analysis may lead to a constraint that rules out many solutions that fail for the same reason. By adding this constraint to the problem one can avoid unnecessary search. Such a constraint is a nogood, a well-known idea in the constraint satisfaction literature (Tsang, 1993). A nogood is a way of learning from one's mistakes. It combines search and inference in a particular way: the inferred constraints (nogoods) are occasioned by the discovery of bad solutions.
A related idea has evolved in the optimization literature. One way to combine search and inference is to search over values of some of the variables and use inference to project the constraints onto these variables. Let the variables of (12) be partitioned as follows.

minimize_{x∈D_x, y∈D_y} f(x, y) subject to (x, y) ∈ S   (16)

We will search over values of x. If we examine a particular value x̄, we can find optimal values for the remaining variables subject to x = x̄. This poses the subproblem

minimize_{y∈D_y} f(x̄, y) subject to (x̄, y) ∈ S   (17)

The variables are partitioned in such a way that the subproblem has special structure that makes it easier to solve. The variables may decouple, for example. The next step is to solve the inference dual of the subproblem by generating a proof of optimality for its optimal solution y(x̄). This proof might take the form of a branching tree. By examining the conditions under which this proof is valid, we may be able to define a function B_x̄(x) that provides a lower bound on the optimal value of (16) for a given value of x. Two examples of this are provided below. Obviously, B_x̄(x̄) = f(x̄, y(x̄)), but even when x has some value other than x̄, it may be possible to determine what kind of lower bound the dual proof still provides. If z is the objective function value of (16), this analysis yields the nogood z ≥ B_x̄(x). It states that there is no point in examining solutions x for which the objective function value is less than B_x̄(x). The master problem minimizes the objective function subject to the nogoods z ≥ B_x̄^1(x), ..., z ≥ B_x̄^K(x) accumulated so far, where x̄^1, ..., x̄^K are the solutions examined so far. Each time the master problem is solved, the solution x̄ generates another nogood. The nogoods in effect project the constraint set onto the variables x. It is usually unnecessary to generate all of the nogoods that define the projection, because the process stops when a nogood is satisfied by the previous x̄. When this strategy is applied to a problem of the form

minimize f(x) + cy subject to g(x) + Ay ≥ b   (19)

the result is Benders decomposition. The subproblem is a linear programming problem:

minimize_{y∈R^n} f(x̄) + cy subject to Ay ≥ b − g(x̄)   (20)

The dual solution u(x̄) of (20) proves the bound z ≥ f(x̄) + u(x̄)(b − g(x̄)). Because the proof remains valid when x has values other than x̄, we have the nogood z ≥ B_x̄(x) = f(x) + u(x̄)(b − g(x)), also known as a Benders cut. When the subproblem (17) is solved by branching, one may be able to construct a boolean formula P(x) such that the branching tree remains a proof of optimality of y(x̄) whenever P(x) = 1. It is shown in , for example, that in integer programming the branching proof can be viewed as encoding a resolution proof whose premises are implied by constraints that are violated at the leaf nodes of the tree. The violated constraints imply the premises when x = x̄. One can let P(x) = 1 for all values of x for which these constraints continue to imply the premises. Then z ≥ B_x̄(x) = f(x̄, y(x̄))P(x) is a valid nogood (if z ≥ 0). Nogoods can be combined with branching. Suppose that at a given node of the branching tree, constraint propagation or cutting planes prove infeasibility, or more generally prove z ≥ ẑ, where ẑ = ∞ in the case of infeasibility. One can view this proof as a solution of the inference dual of a subproblem containing the variables that have not yet been fixed at that node. Then one might derive a nogood z ≥ B_x̄(x), where x is the vector of variables that have been fixed. This nogood can serve as a valid constraint throughout the rest of the tree search. Thus at each node one can generate complementary constraints: constraints involving the unfixed variables by means of constraint propagation and cutting plane methods, and nogoods involving the fixed variables. Again there is cross-fertilization. The optimization community has apparently never used nogoods in branching search. The constraint satisfaction community has apparently never used generalized Benders decomposition as a means to generate nogoods, although Beringer and De Backer (De Backer and Beringer, 1993; Beringer and De Backer, 1995) have done related work.
The ability of Benders decomposition to exploit structure could give new life to the idea of a nogood, which has received limited attention in practical algorithms.
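The nogood-generating loop can be sketched on a toy instance of the form (19) with a single continuous variable, so that the subproblem dual is available by inspection (u equals the objective coefficient c when the constraint is binding, 0 otherwise). All numbers are hypothetical.

```python
# Sketch: Benders-style master/subproblem loop on a toy problem
#   minimize f(x) + c*y  subject to  y >= b - g(x), y >= 0, x in X
# with f(x) = x and g(x) = 2x.  Each iteration solves the subproblem for
# the master's proposal, reads off the LP dual u, and adds the nogood
# (Benders cut) z >= f(x) + u*(b - g(x)).

def solve():
    X = range(4)
    f = lambda x: x
    g = lambda x: 2 * x
    b, c = 5, 3
    cuts = []                                # list of dual multipliers u
    incumbent, best_x = float("inf"), None
    xbar = 0                                 # initial master proposal
    while True:
        ybar = max(0, b - g(xbar))           # subproblem solution at xbar
        value = f(xbar) + c * ybar
        if value < incumbent:
            incumbent, best_x = value, xbar
        u = c if b - g(xbar) > 0 else 0      # subproblem dual, by inspection
        cuts.append(u)

        def master_lb(x):                    # master: max over Benders cuts
            return max(f(x) + uk * (b - g(x)) for uk in cuts)

        xbar = min(X, key=master_lb)
        if master_lb(xbar) >= incumbent:     # bounds meet: optimal
            return incumbent, best_x

print(solve())
```

Each cut is a nogood in the sense of the text: it rules out every x whose objective value the dual proof already shows cannot beat the bound, and the loop stops when the master's bound meets the incumbent.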
Duality of strengthening and relaxation

A strengthening of a problem shrinks the feasible set, and a relaxation enlarges it. Solving a strengthened minimization problem provides an upper bound on the optimal value of the original problem. Relaxing the problem provides a lower bound. The interplay of strengthening and relaxation is an old theme in optimization that goes under the name of primal-dual methods. These appear, for example, in dual-ascent and other Lagrangean methods for discrete optimization, out-of-kilter and related methods for network flow problems (Bazaraa et al., 1990), and the primal-dual simplex method for linear programming. All of these exploit the same formal duality, which is defined below. Branch-and-bound search can be regarded as a primal-dual method in a somewhat different sense that will also be discussed.
Properly chosen strengthenings and relaxations may be much easier to solve than the original problem. Solving several of them and choosing the tightest bounds that result may therefore provide a practical way of bracketing the optimal value of a problem that cannot be solved to optimality.
As noted earlier, enumeration of strengthenings usually takes the form of branching or local search. It is less obvious how to enumerate relaxations of a problem, but a clever method has evolved over the years: one parameterizes relaxations. Each parameter setting yields a different relaxation. The problem of finding parameters that yield the tightest bound might be called the relaxation dual problem. Several well-known dualities in optimization are special cases, including the linear programming dual, the Lagrangean dual, and the surrogate dual.
These classical duals, however, apply only to problems with inequality constraints. The general relaxation dual may provide a key to relaxing constraints that take other forms. This is particularly important for bringing the advantages of relaxation to constraint satisfaction methods, which permit a much broader repertory of constraints than the inequality constraints of mathematical programming. The first section below suggests how this might be done.
Branch-and-bound search dualizes strengthening and relaxation in a different way. The second section suggests how generalization of this idea can also result in new methods.

Relaxation duality
We begin by defining strengthening and relaxation more carefully. The definition stated above assumes that the objective function in a strengthening or relaxation is the same as in the original problem. It need not be. The following problem is a strengthening of (12) in a more general sense if S′ ⊆ S and f′(x) ≥ f(x) for x ∈ S′:

minimize_{x∈D} f′(x) subject to x ∈ S′   (21)

Problem (21) is a relaxation of (12) if S′ ⊇ S and f′(x) ≤ f(x) for x ∈ S. Equivalently, let the epigraph E of an optimization problem (12) be the set {(z, x) | z ≥ f(x), x ∈ S}. A strengthening's epigraph is a subset of E, and a relaxation's epigraph is a superset.
Suppose that a family of relaxations is parameterized by λ ∈ Λ, so that f′(x) = f(x, λ) and S′ = S(λ). Each relaxation is written

minimize_{x∈D} f(x, λ) subject to x ∈ S(λ)   (22)

with optimal value θ(λ). This is a valid relaxation if

S(λ) ⊇ S for all λ ∈ Λ, and f(x, λ) ≤ f(x) for all x ∈ S, λ ∈ Λ.   (23)

The problem of finding a relaxation that gives the tightest lower bound is the relaxation dual,

maximize_{λ∈Λ} θ(λ)   (24)

The classical Lagrangean relaxation replaces the objective function with a lower bound that is obtained by penalizing infeasible solutions and perhaps rewarding feasible ones. It is defined only when the constraints have inequality form, so that S = {x | g_i(x) ≤ 0, i ∈ I}. It is obtained by setting f(x, λ) = f(x) + Σ_{i∈I} λ_i g_i(x) and S(λ) = D for λ ≥ 0. Note that the feasible set is the same for all λ. This is a valid relaxation because clearly S(λ) ⊇ S, and f(x) + Σ_{i∈I} λ_i g_i(x) ≤ f(x) for all λ ≥ 0 and all x ∈ S (so that g_i(x) ≤ 0). In this case, the relaxation dual is the Lagrangean dual, which is widely used in integer and nonlinear programming to obtain bounds.
The surrogate relaxation leaves the objective function untouched, but replaces the inequality constraints with a nonnegative linear combination of those constraints. It is obtained by setting $f(x, \lambda) = f(x)$ and $S(\lambda) = \{x \in D \mid \sum_{i \in I} \lambda_i g_i(x) \le 0\}$, with $\lambda \ge 0$. The relaxation dual in this instance is the surrogate dual, which can also be used to obtain bounds for integer and nonlinear programming. The Lagrangean and the surrogate dual of a linear programming problem are both equivalent to the linear programming dual.
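A similarly minimal sketch (again with our own toy data) of the surrogate bound, which keeps $f$ but aggregates the constraints into one nonnegative combination:

```python
from itertools import product

# Toy problem (our data): min x1 + x2 with two constraints g_i(x) <= 0
D = list(product(range(5), repeat=2))
f = lambda x: x[0] + x[1]
gs = (lambda x: 3 - x[0] - 2 * x[1],                # i.e. x1 + 2*x2 >= 3
      lambda x: 3 - 2 * x[0] - x[1])                # i.e. 2*x1 + x2 >= 3

def surrogate_bound(lam):
    # S(lam): a single nonnegative combination replaces all the constraints
    S_lam = [x for x in D if sum(l * g(x) for l, g in zip(lam, gs)) <= 0]
    return min(f(x) for x in S_lam)

opt = min(f(x) for x in D if all(g(x) <= 0 for g in gs))
assert all(surrogate_bound(lam) <= opt for lam in [(1, 0), (0, 1), (1, 1)])
```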
A relaxation dual can be defined for a much wider range of problems than those involving inequality constraints. It is necessary only that the relaxation observe the formal properties (23). This can be illustrated by the traveling salesman problem. The traditional continuous relaxation requires that the problem be written with inequality constraints, resulting in a long problem statement (exponentially long in the most popular formulation). However, the problem can be written very succinctly as follows.
  minimize $\sum_i c_{y_i y_{i+1}}$ subject to $\text{all-different}(y_1, \ldots, y_n)$,

where $y_{n+1} = y_1$. Here $y_i$ is the $i$th city visited and $c_{jk}$ the cost on arc $(j, k)$. One can of course write a relaxation for this formulation by reverting to the inequality model and relaxing the integrality constraints. However, this is not a practical option when an inequality formulation of the problem at hand is unavailable, too large, or has a weak relaxation. A generalized Lagrangean relaxation can perhaps accommodate such cases.
In the case of the traveling salesman problem, a generalized Lagrangean relaxation might be given by

  $f(x, \lambda) = \sum_i c_{y_i y_{i+1}} + \sum_j \lambda_j (N_j - 1)$   (25)

and $S(\lambda)$ defined by letting each $y_i$ range over $\{1, \ldots, n\}$, where $N_j$ is the number of $y_i$'s equal to $j$. Because (25) can be written

  $f(x, \lambda) = \sum_i \left( c_{y_i y_{i+1}} + \lambda_{y_i} - \lambda_i \right)$,

the value $\theta(\lambda)$ can be readily computed by dynamic programming. The dual (24) can be solved by subgradient optimization, because $(N_1 - 1, \ldots, N_n - 1)$ is a readily available subgradient. In other types of problem, a concept from constraint satisfaction may help provide a useful relaxation of the constraint set. The dependency graph $G$ for a problem (12) indicates the extent to which variables decouple (Tsong, 1983). It contains a vertex for each variable and an edge $(i, j)$ when variables $x_i$ and $x_j$ occur in the same constraint, or in the same term of the objective function (which we may, for simplicity, assume to be a sum of terms). If vertices (along with adjacent edges) are removed from $G$ in order $1, \ldots, n$, the induced width of $G$ with respect to this ordering is the maximum degree of a vertex at the time it is removed.
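Before turning to dependency graphs, the traveling salesman relaxation (25) can be sketched in a few lines (the 4-city cost matrix and all names are our own illustration): $\theta(\lambda)$ is computed by dynamic programming over closed walks, since all-different is relaxed away, and $\lambda$ is updated along the subgradient $(N_1 - 1, \ldots, N_n - 1)$. Because all-different amounts to the equalities $N_j = 1$, the multipliers are unrestricted in sign and $\theta(\lambda)$ bounds the optimal tour cost for every $\lambda$.

```python
from itertools import permutations

# 4-city cost matrix (our toy data); 99 on the diagonal forbids self-loops
c = [[99, 2, 9, 10],
     [1, 99, 6, 4],
     [15, 7, 99, 8],
     [6, 3, 12, 99]]
n = len(c)

def theta(lam):
    """Optimal value of the relaxation: min over closed walks y_1..y_n
    (all-different dropped) of sum_i (c[y_i][y_{i+1}] + lam[y_i]) - sum(lam)."""
    best = (float("inf"), None)
    for s in range(n):                              # fix the start city y_1 = s
        V = {s: (lam[s], (s,))}                     # cheapest walk ending at each city
        for _ in range(n - 1):
            W = {}
            for u, (cost, path) in V.items():
                for v in range(n):
                    cand = cost + c[u][v] + lam[v]
                    if v not in W or cand < W[v][0]:
                        W[v] = (cand, path + (v,))
            V = W
        for u, (cost, path) in V.items():           # close the walk with arc u -> s
            total = cost + c[u][s] - sum(lam)
            if total < best[0]:
                best = (total, path)
    return best

opt = min(sum(c[p[i]][p[(i + 1) % n]] for i in range(n))
          for p in permutations(range(n)))          # true optimum by brute force

lam = [0.0] * n
for step in (1.0, 0.5, 0.25):                       # a few subgradient steps on (24)
    bound, walk = theta(lam)
    assert bound <= opt                             # theta(lam) is always a lower bound
    N = [walk.count(j) for j in range(n)]           # subgradient (N_1-1, ..., N_n-1)
    lam = [l + step * (N[j] - 1) for j, l in enumerate(lam)]
```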
Problem (12) can be solved by nonserial dynamic programming (Bertele and Brioschi, 1972) in time that is exponential in the induced width of $G$. Although the induced width is normally too large for this to be practical, relaxations can be defined for which it is small. This might be done by removing several arcs from $G$ to obtain $G(\lambda)$, where $\lambda$ is a list of the arcs removed. For each arc $(x_i, x_j)$ removed, replace each constraint containing both $x_i$ and $x_j$ with two projections of the constraint. (The objective function can be analogously treated.) The projections are obtained by projecting the constraint onto all of its variables except $x_i$ and onto all of its variables except $x_j$. Once the projections are computed, the resulting problem has dependency graph $G(\lambda)$.
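As a small illustration of the quantity being relaxed, the following sketch (example graph and names ours) computes the induced width under a fixed ordering; we follow the standard convention that eliminating a vertex links its remaining neighbours (the fill-in), and the example shows that dropping an arc can lower the width:

```python
def induced_width(n, arcs, order):
    """Induced width of a dependency graph under elimination ordering `order`.
    Removing a vertex connects its remaining neighbours (the 'induced' fill-in)."""
    adj = {v: set() for v in range(n)}
    for i, j in arcs:
        adj[i].add(j); adj[j].add(i)
    removed, width = set(), 0
    for v in order:
        nbrs = adj[v] - removed
        width = max(width, len(nbrs))
        for a in nbrs:                    # connect the neighbours left behind
            adj[a] |= nbrs - {a}
        removed.add(v)
    return width

cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
w_full = induced_width(4, cycle, [0, 1, 2, 3])           # a 4-cycle has width 2
w_relaxed = induced_width(4, cycle[1:], [0, 1, 2, 3])    # dropping arc (0,1) gives 1
```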
The relaxed set $S(\lambda)$ is now defined by the projected problem for $G(\lambda)$, and $\theta(\lambda)$ is computed by nonserial dynamic programming. The dual problem (24) might be attacked by local search methods over the space of $\lambda$'s.
These represent only two examples of how discrete relaxations might be parameterized and bounds obtained by solving a relaxation dual. The potential of this approach is largely unexplored.

Searches that combine strengthening and relaxation
The most popular strategy for combining strengthening and relaxation in a search procedure is to enumerate strengthenings and solve a relaxation of each. In integer programming, for instance, one might enumerate strengthenings in a branch-and-bound tree and solve the continuous relaxation of the strengthened problem at each node. A rationale for this strategy is that it hedges against the liabilities of both strengthening and relaxation: strengthenings may not be easy to solve until they become very strong (i.e., almost all variables are fixed), whereas a continuous relaxation may be easy to solve but may also be very weak. By solving a relaxation at each node of a search tree, one solves an easy problem that may nonetheless be a relatively strong relaxation because several variables have been fixed. Bounds derived from the relaxations can be used in a branch-and-bound scheme.
This represents only one way that strengthening and relaxation can interact. There are others. For example, the reverse strategy is seldom recognized: one can solve strengthenings of a relaxation. The only requirement is that the relaxation remain an easy problem when strengthened, for example when variables are fixed. This is normally the case. A feasible solution is found when the solution of a strengthening is feasible in the original problem. One backtracks whenever a feasible solution is found, or when it can be determined that no solution of the current strengthening is feasible in the original problem. Bounding can be used as before.
In integer programming, the reverse strategy is identical to the original strategy, because continuous relaxation and variable fixing are commutative operations. Fixing a variable in a continuous relaxation has the same effect as relaxing the problem after fixing that variable. Perhaps this is why the reverse strategy has not been noticed.
In general, relaxation and strengthening do not commute. For example, if $x_1, x_2 > 0$, the constraint $x_1 x_2 \ge 4$ can be relaxed to $x_1 + x_2 \ge 4$ by writing a first-order Taylor series approximation at the point $(x_1, x_2) = (2, 2)$. The relaxation becomes $x_2 \ge 0$ when $x_1$ is fixed to, say, 4. Reversing the order, fixing $x_1 = 4$ changes the nonlinear constraint to $x_2 \ge 1$, which relaxes to itself and is different from $x_2 \ge 0$.
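The example can be checked numerically. This snippet (ours) encodes both orders of the two operations for $x_1 x_2 \ge 4$ and exhibits a point accepted by one but rejected by the other:

```python
# Relax-then-fix vs. fix-then-relax for the constraint x1*x2 >= 4, x1, x2 > 0
def taylor_relax(x1, x2):          # first-order Taylor cut at (2, 2): x1 + x2 >= 4
    return x1 + x2 >= 4

def fix_then_relax(x2):            # fix x1 = 4 first: 4*x2 >= 4 is linear,
    return x2 >= 1                 # so it relaxes to itself, x2 >= 1

relax_then_fix = lambda x2: taylor_relax(4, x2)   # fixing x1 = 4 in the cut: x2 >= 0

# x2 = 0.5 satisfies the relax-then-fix set but not the fix-then-relax set
assert relax_then_fix(0.5) and not fix_then_relax(0.5)
```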
Viewing search consciously as an interplay of strengthening and relaxation can therefore lead one to combine them in different ways and obtain new methods. The effectiveness of these new methods has yet to be tested.

Generating relaxations via inference
As mentioned earlier, the global constraints of constraint programming provide an opportunity to exploit structure not only for purposes of domain reduction, but for relaxation as well.
To clarify this point, it should be acknowledged that a global constraint is sometimes relaxed in order to compute reduced domains. The result is not, however, normally the sort of relaxation that is recommended here; namely, one that can be solved to optimality in order to obtain useful bounds, such as a linear programming relaxation.
It is true that domain reduction can itself be viewed as a process that generates a relaxation. It in effect derives ``in-domain'' constraints that restrict each variable to a reduced domain. The constraint programming community normally views in-domain constraints as comprising a constraint store that propagates the implications of one constraint to other constraints. However, they can also be viewed as comprising a relaxation that is easily solved: merely choose one value from each domain (Bockmayr and Kasper, 1998). It may even be practical to optimize the objective function subject to the in-domain constraints. But even in this case, the result is unlikely to provide a useful bound.
The desired sort of relaxation has usually been obtained in the form of cutting planes that are derived from inequality constraints. This imposes a severe limitation, because most useful global constraints are neither expressed nor easily expressible in inequality form. It is often possible, however, to derive linear inequality relaxations, as well as other soluble relaxations, from constraints other than inequalities. This has been done even in traditional operations research for disjunctions of linear inequalities (Balas, 1975; Beaumont, 1990). This and more recent work are summarized in (Hooker and Osorio, 2000).
As an illustration, we present here continuous relaxations for element constraints, which are important due to their role in implementing variable subscripts. To highlight the overall strategy of attaching both domain reduction procedures and relaxations to a global constraint, we also analyze domain reduction for element constraints.

Discrete variable subscripts
Variable subscripts are rapidly becoming ubiquitous in the modeling of combinatorial optimization problems. The element constraint is well known in the constraint programming world (Van Hentenryck, 1989; Marriott and Stuckey, 1998) as a way of indexing discrete variables, but variable subscripts have now also been introduced in mathematical modeling languages such as AMPL (Fourer, 1994; Fourer et al., 1995) and OPL (Van Hentenryck, 1999a). While domain reduction ensuring arc- or hyperarc consistency is relatively simple for indexing over discrete variables, the case of continuous variables is much less explored.
The element constraint is written

  $\text{element}(y, (v_1, \ldots, v_k), z)$.   (26)

In the simplest case, $y$ is a single variable whose initial domain is $\{1, \ldots, k\}$, and $v_1, \ldots, v_k$ is a list of values. The variable $z$ can be discrete or continuous. The constraint says that $z$ must take the $y$th value in the list. An element constraint of this form implements a term with a variable subscript. A term of the form $c_{f(y)}$, where $f(y)$ is a function of the variable $y$, is implemented by imposing the constraint $\text{element}(y, (c_{f(1)}, \ldots, c_{f(k)}), z)$ and replacing all occurrences of $c_{f(y)}$ with $z$. For example, if $y \in \{1, 2, 3, 4\}$ the term $c_{y, y+1}$ is replaced by $z$ and the constraint $\text{element}(y, (c_{12}, c_{23}, c_{34}, c_{45}), z)$ is imposed. Because the simplest element constraint (26) contains only two variables, arc consistency is equivalent to full consistency. It is achieved in the obvious way. Let $D_z, D_y$ be the current domains of $z$ and $y$, respectively, and let $\bar D_z, \bar D_y$ be the new, reduced domains. Then the two rules

  $\bar D_z = D_z \cap \{v_j \mid j \in D_y\}$ and $\bar D_y = D_y \cap \{j \mid v_j \in D_z\}$,

applied in a fix-point iteration, will achieve arc consistency.
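A minimal sketch of the two rules (function and example data are ours):

```python
def reduce_element(Dz, Dy, v):
    """Fix-point of the two arc-consistency rules for element(y, (v_1..v_k), z)."""
    while True:
        Dz2 = Dz & {v[j - 1] for j in Dy}         # z must equal a still-indexable value
        Dy2 = {j for j in Dy if v[j - 1] in Dz}   # y can index only values in D_z
        if (Dz2, Dy2) == (Dz, Dy):
            return Dz, Dy
        Dz, Dy = Dz2, Dy2

# example: v_1..v_4 = 10, 30, 40, 50 with D_z = {10, 20, 40}, D_y = {1, 2, 3}
Dz, Dy = reduce_element({10, 20, 40}, {1, 2, 3}, [10, 30, 40, 50])
# Dz reduces to {10, 40} and Dy to {1, 3}
```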
Indexing among values is just an instance of the general case of variables (as opposed to constants) with variable subscripts, written $\text{element}(y, (x_1, \ldots, x_k), z)$. Because element now contains $k + 2$ variables, arc consistency does not imply hyperarc consistency. However, full hyperarc consistency can be obtained with the following rules (of which the rules above are a special case):

  (a) $\bar D_z = D_z \cap \bigcup_{i \in D_y} D_{x_i}$;
  (b) $\bar D_y = D_y \cap \{i \mid D_z \cap D_{x_i} \ne \emptyset\}$;
  (c) if $\bar D_y = \{i\}$, then $\bar D_{x_i} = D_{x_i} \cap D_z$; otherwise $\bar D_{x_i} = D_{x_i}$.

Example 1 Suppose the initial domains are such that $D_z$ intersects only $D_{x_3}$. Rules (a), (b) and (c) then imply that $y$ is fixed to 3, which means that $x_3 = z$. The common domain of $x_3$ and $z$ is the intersection of their original domains. □
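A sketch of rules (a), (b) and (c) (the function and the example domains are ours, chosen so that only one index survives and $y$ is fixed to 3):

```python
def reduce_element_vars(Dz, Dy, Dx):
    """One pass of the hyperarc-consistency rules for element(y, (x_1..x_k), z).
    Dx[i] is the domain of x_i, for i = 1..k."""
    Dy = {i for i in Dy if Dz & Dx[i]}                 # (b) D_z must meet D_xi
    Dz = Dz & set().union(*(Dx[i] for i in Dy))        # (a) z lies in some D_xi
    if len(Dy) == 1:                                   # (c) y fixed: x_i = z
        i = next(iter(Dy))
        Dx = dict(Dx)
        Dx[i] = Dz = Dz & Dx[i]
    return Dz, Dy, Dx

Dz, Dy, Dx = reduce_element_vars(
    Dz={2, 3, 6}, Dy={1, 2, 3},
    Dx={1: {5}, 2: {7}, 3: {2, 3, 4}})
# y is fixed to 3, so x_3 = z and their common domain is {2, 3}
```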

Continuous variable subscripts
While the discrete cases above are well known, variable subscripts in continuous linear inequalities are far less explored. In this case, constraint propagation is replaced by cutting plane generation. Suppose for example that $x$ is a continuous variable in the constraint $x_{f(y)} \ge b$. The inequality $z \ge b$ is inserted into the LP model along with additional constraints that define $z$. If the value of $y$ is fixed, e.g., to 1, one adds the constraint $z = x_{f(1)}$. If the current domain of $y$ is $\{1, 2\}$, however, the constraint that defines $z$ is a disjunction:

  $(x_{f(1)} = z) \vee (x_{f(2)} = z)$.   (27)

Although (27) cannot be added to a linear model, a linear relaxation of it, defined by cutting planes, can be used instead.

In general, a subscripted variable $x_{f(y)}$ is represented by replacing it with $z$ and adding the linear relaxation of the general disjunction

  $\bigvee_{i \in D_y} (x_{f(i)} = z)$.   (28)

To be useful, the variables must also have bounds, such as $0 \le x_{f(j)} \le m_{f(j)}$ for $j \in D_y$. We assume in what follows that $|D_y| \ge 2$.
If each upper bound is the same value $m_0$, we get the following valid inequalities for (28):

  $\sum_{i \in D_y} x_{f(i)} \le z + (|D_y| - 1)\, m_0$,   (29)
  $\sum_{i \in D_y} x_{f(i)} \ge z$,   (30)
  $0 \le x_{f(i)} \le m_0, \quad i \in D_y$,   (31)
  $0 \le z \le m_0$.   (32)

The inequality (30) is a surrogate inequality (Balas, 1979), and the bounds (31)-(32) are from before. Let $C$ be the set defined by (28) and (31)-(32), and let $P$ be the polyhedron defined by (29)-(32). It can easily be shown that $C \subseteq P$: at any point of $C$ at least one of the $x_{f(i)}$'s is equal to $z$ while the rest lie in $[0, m_0]$, and (29)-(30) follow immediately.
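The two aggregate inequalities for the common-bound case can be spot-checked by enumeration. In this sketch (parameters ours), every integer point satisfying the disjunction, i.e., with some $x_{f(i)}$ equal to $z$ and all variables in $[0, m_0]$, also satisfies $\sum_i x_{f(i)} \le z + (|D_y| - 1) m_0$ and $\sum_i x_{f(i)} \ge z$:

```python
from itertools import product

k, m0 = 3, 4                                  # |D_y| = 3, common upper bound m0 = 4
checked = 0
for pt in product(range(m0 + 1), repeat=k + 1):
    *xs, z = pt
    if z in xs:                               # the disjunction: some x_i equals z
        assert sum(xs) <= z + (k - 1) * m0    # aggregate upper cut
        assert sum(xs) >= z                   # surrogate lower cut
        checked += 1
```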
We note that the inequalities defining $P$ are facets of the convex hull of $C$: one can exhibit $|D_y| + 1$ affinely independent points of $C$ that satisfy (29) at equality, and likewise $|D_y| + 1$ affinely independent points that satisfy (30) at equality. The bounds (31)-(32) are obviously facets of the convex hull of $C$. It is shown in Hooker (2000) that $P$ is in fact the convex hull of the disjunction (28).
If the upper bounds differ, the convex hull relaxation can be much more complex. In this case one can use a weaker and simpler relaxation by letting $m_0 = \max_i \{m_{f(i)}\}$ in (29) and (32) and replacing (31) with the actual bounds.
One can augment this relaxation with a second relaxation. First write the disjunction (28) in the weaker form of two disjunctions,

  $\bigvee_{i \in D_y} (x_{f(i)} - z \ge 0)$,   (33)
  $\bigvee_{i \in D_y} (z - x_{f(i)} \ge 0)$,   (34)

and replace each with the linear ``elementary'' relaxation described in Beaumont (1990):

  $\sum_{i \in D_y} \frac{x_{f(i)} - z}{\bar m_{f(i)}} \ge -(|D_y| - 1)$, where $\bar m_{f(i)} = \max_{j \ne i} \{m_{f(j)}\}$,   (35)
  $\sum_{i \in D_y} \frac{z - x_{f(i)}}{m_{f(i)}} \ge -(|D_y| - 1)$.   (36)

When all upper bounds are the same, $m_0 = \max_i \{m_{f(i)}\}$, (35)-(36) become the following, which are dominated by (29)-(30):

  $\sum_{i \in D_y} x_{f(i)} \ge |D_y|\, z - (|D_y| - 1)\, m_0$,   (37)
  $\sum_{i \in D_y} x_{f(i)} \le |D_y|\, z + (|D_y| - 1)\, m_0$.   (38)

However, when the upper bounds differ, it is advantageous to use both (29)-(30) and (35)-(36).

Example 2 The goal is to generate linear inequalities to represent $x_y \ge b$ in a linear programming solver, where the current domain of $y$ is $\{1, 2\}$. Suppose initially that $0 \le x_j \le 5$ for $j = 1, 2$. Then one can generate the inequality $z \ge b$ and define $z$ with the relaxation of (27). The latter is given by (29)-(32), which in this case is

  $x_1 + x_2 \le z + 5$,   (39)
  $x_1 + x_2 \ge z$,   (40)
  $0 \le z \le 5$.   (41)

Thus the constraints $z \ge b$ and (39)-(41) are added to the LP model. Now suppose the upper bounds are different: $0 \le x_1 \le 4$, $0 \le x_2 \le 5$. The elementary relaxation (35)-(36) becomes

  $4 x_1 + 5 x_2 - 9 z \ge -20$,
  $5 x_1 + 4 x_2 - 9 z \le 20$.

These inequalities are combined with $z \ge b$, (39), (41) and the bounds $0 \le x_1 \le 4$, $0 \le x_2 \le 5$ in the LP model. □

An alternative to the disjunctive relaxation discussed above would be to use a variant of the conventional ``big-M'' formulation. Equations (33)-(34) would then form a relaxation of (28) as

  $z - x_{f(i)} \le M (1 - y_i), \quad i \in D_y$,   (42)
  $x_{f(i)} - z \le M (1 - y_i), \quad i \in D_y$,   (43)
  $\sum_{i \in D_y} y_i = 1$,   (44)
  $y_i \ge 0, \quad i \in D_y$,   (45)

where $M = \max_i \{m_{f(i)}\}$. Note that we introduce $|D_y|$ new continuous variables in this relaxation. The projection of (42)-(45) onto $(x_{f(1)}, \ldots, x_{f(|D_y|)}, z)$ is very weak, so as a relaxation the big-M formulation is not only more costly ($|D_y|$ new variables and $2|D_y| + 1$ new constraints) but almost useless. Its use is in a search framework where $y$ is needed for branching purposes, i.e., where the framework does not allow branching on constraints, e.g., on parts of a disjunction. Instead, the $y_i$'s in the big-M formulation above are used to simulate that capability.
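For the differing-bounds case of Example 2 ($0 \le x_1 \le 4$, $0 \le x_2 \le 5$), the two elementary-relaxation cuts as we derive them, scaled by 20, can be verified exhaustively (sketch ours):

```python
from itertools import product

m1, m2 = 4, 5                                     # differing upper bounds on x1, x2
checked = 0
for x1, x2 in product(range(m1 + 1), range(m2 + 1)):
    for z in (x1, x2):                            # a disjunct of (27) holds: z = x_i
        assert 5 * x1 + 4 * x2 - 9 * z <= 20      # cut from the (z - x_i) disjunction
        assert 4 * x1 + 5 * x2 - 9 * z >= -20     # cut from the (x_i - z) disjunction
        checked += 1
```

Both cuts are tight at feasible points (e.g., the first at $x_1 = 4$, $x_2 = z = 0$), so neither can be uniformly strengthened.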

Incremental cutting plane generation
Although useful on their own, the relaxations for variable subscripts described in the previous section are mainly intended for use within a branch-and-bound search. Due to branching and inference (such as constraint propagation), the domain of the indexing variable $y$ will shrink as we descend in the search tree. This means that $x_{f(i)}$ should be removed from the inequalities when $i \notin D_y$ (by setting the corresponding coefficient to zero), and also that the coefficients $|D_y|$ and $m_j$ need to be updated in equations (29)-(38) when $D_y$ is modified. This last point is related to how the $M$'s in a conventional ``big-M'' formulation are selected and handled. Ideally, they should be updated when the variable bounds change, to obtain the strongest possible relaxation, but this often seems to be neglected.

Future research directions
A number of research directions are identified in the foregoing. They may be summarized as follows:
- Deciding what to relax. Learn how to identify subsets of constraints that have a useful continuous relaxation.
- Continuous relaxations for global constraints. Find useful continuous relaxations for common global constraints.
- Relaxation duals. Use the idea of a relaxation dual to create discrete relaxations for common global constraints.
- Sensitivity analysis. Develop inference-based sensitivity analysis for problem classes by analyzing when problem perturbations leave the proof of optimality (or infeasibility) intact. Also, learn how to generalize this analysis so as to explain the solution.
- Using nogoods in branch-and-cut search. Investigate the possibility of using nogoods obtained by inference-based Benders decomposition as cuts that are complementary to the traditional cuts in branch-and-cut search.
- Finding nogoods that exploit structure. Use generalized Benders decomposition as a means to identify nogoods that exploit problem structure and perhaps thereby improve the utility of nogoods.
- Strengthening and relaxation. Experiment with new ways of combining strengthening and relaxation during search, for instance by solving strengthenings of a relaxation.
- Unified solution technology. Solve a wide variety of problems with a view to how the search/inference and strengthening/relaxation dualities may be exploited, with the aim of building a solution technology that unifies and goes beyond classical optimization and constraint satisfaction methods.
Türkay, M and Grossmann, IE, 1996. ``Logic-based MINLP algorithms for the optimal synthesis of process networks''. Computers and Chemical Engineering 20, 959-978.
Van Hentenryck, P, 1989. Constraint Satisfaction in Logic Programming. MIT Press, Cambridge, MA.