Set manipulation with Boolean functional vectors for symbolic reachability analysis

Symbolic techniques usually use characteristic functions for representing sets of states. Boolean functional vectors provide an alternate set representation which is suitable for symbolic simulation. Their use in symbolic reachability analysis and model checking is limited, however, by the lack of algorithms for performing set operations. We present algorithms for set union, intersection and quantification that work with a canonical Boolean functional vector representation and show how this enables efficient symbolic simulation based reachability analysis. Our experimental results for reachability analysis indicate that the Boolean functional vector representation is often more compact than the corresponding characteristic function, thus giving significant performance improvements on some benchmarks.


Introduction
Binary Decision Diagrams (BDDs) [3] enabled the efficient representation and manipulation of Boolean functions. Symbolic techniques use BDDs to encode and manipulate sets of states for automatic verification techniques based on state traversal, such as symbolic model checking. The most common encoding used for state sets is the characteristic function representation. Boolean functional vectors provide an alternate, though not as widely used, representation.
Boolean functional vectors represent a bit-level decomposition of the state set which is suitable for symbolic simulation. The decomposition is often more compact than the equivalent characteristic function representation. However, while there are straightforward algorithms for set manipulations with characteristic functions, similar algorithms do not exist for Boolean functional vectors.
In an early paper on symbolic state traversal, Coudert, Berthet and Madre [6] used symbolic simulation for the image-computation step in reachability analysis, as shown in Figure 1. Starting with the set of initial states, they iteratively perform image computation on a given circuit model. The union of the obtained image and the previously reached states gives the set of reached states for that iteration. The procedure stops (Loop Control) when no new states are discovered, i.e., a fixed point is reached. A selection heuristic is used to choose the smaller of the newly discovered states and all the discovered states for the next iteration.
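The loop just described can be sketched with explicit Python sets standing in for the symbolic representations. This is only an illustration of the control structure: the names `reachable`, `image`, `size` and the toy `step` model are hypothetical, and set cardinality is used as a stand-in for whatever size measure the selection heuristic actually applies.

```python
def reachable(init, image, size=len):
    """States reachable from `init`; `image(states)` returns their successors."""
    reached = set(init)
    source = set(init)            # set handed to the next image computation
    while True:
        new = image(source) - reached
        if not new:               # loop control: fixed point reached
            return reached
        reached |= new
        # Selection heuristic: continue from whichever set is cheaper to
        # represent. `size` is an assumed stand-in for that cost.
        source = min(new, reached, key=size)


# Hypothetical toy model: a counter that steps 0 -> 1 -> ... -> 5 -> 0.
def step(states):
    return {(s + 1) % 6 for s in states}


print(sorted(reachable({0}, step)))
```

Either choice of `source` is sound, because the image of all reached states contains the image of the newly discovered ones.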
The image computation was performed by symbolic simulation with Boolean functional vectors. However, all set manipulation was done with characteristic functions. The conversion between the two representations is costly and, since it creates the characteristic function anyway, there are no benefits to using Boolean functional vectors.
In [7], Coudert and Madre replaced the symbolic simulation with a range computation by constraining the transition functions with the characteristic function. This avoids the conversion from characteristic functions to Boolean functional vectors. To convert the Boolean functional vector (obtained from the range computation) to a characteristic function, they proposed two algorithms based on recursively splitting the vector. While most state-traversal programs today use the transition relation approach with partitioned transition relations and early quantification [8], the transition function approach of recursive splitting is still used sometimes, e.g., in [11] where a hybrid approach is used effectively. However, in almost all reachability analysis, the characteristic function is used as the primary set representation. McMillan's conjunctive decomposition [10] is one approach where this is not the case. As we show in Section 2.7, this decomposition is closely related to the Boolean functional vector representation.
Boolean functional vectors are also used in Symbolic Trajectory Evaluation (STE) [4], which is a model checking technique based on symbolic simulation. However, the specification language is restricted and does not require fixed-point computations, thus avoiding the need for set manipulations.
In this paper we present algorithms for set union, intersection and quantification that work directly on a canonical Boolean functional vector representation (we do not have a negation algorithm). These algorithms do not construct the characteristic function either explicitly or implicitly, thus enabling a symbolic simulation based state traversal using Boolean functional vectors without paying the price for unnecessary conversions.

Set Representation and Manipulation
A characteristic function represents a constraint which must be satisfied for a vector to be in the encoded set. Consider the set of vectors $S = \{000, 001, 010, 011, 100, 101\}$ (we abbreviate bit-vectors with bit-strings, e.g., 000 for $[0, 0, 0]$). In our example, if we use variables $v_1$, $v_2$ and $v_3$ for the first, second and third bits respectively, the characteristic function is $\chi_S = \overline{v_1} + \overline{v_2}$ (Table 1). This expresses the constraint that the first two bits cannot both be 1.
Formally, given variables $V = (v_1, \ldots, v_n)$ and $B = \{0, 1\}$, a characteristic function $\chi_S(V)$ represents the set:
$$S = \{X \in B^n \mid \chi_S(X) = 1\}$$
Boolean functional vectors, on the other hand, map a given input vector to a vector in the encoded set. For example, the set $S = \{000, 001, 010, 011, 100, 101\}$ can be represented by the Boolean functional vector
$$F = [v_1,\ \overline{v_1} \cdot v_2,\ v_3]$$
In general, a Boolean functional vector represents the set of Boolean vectors given by its range. Formally, given variables $V = (v_1, \ldots, v_m)$ and $B = \{0, 1\}$, the Boolean functional vector $F = [f_1(V), \ldots, f_n(V)]$ represents the set:
$$S = \{X \in B^n \mid \exists Y \in B^m : X = F(Y)\}$$
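As a small check of the two definitions, the following sketch enumerates both representations of the running example by brute force. Boolean functions are modelled as plain Python callables on bit tuples (an assumption made for illustration, not the BDD representation used in the paper), and the vector `F` is taken to be $[v_1, \overline{v_1} \cdot v_2, v_3]$, consistent with the component functions discussed later in the text.

```python
from itertools import product


def chi_S(v):
    """Characteristic function of S: the first two bits are not both 1."""
    v1, v2, _ = v
    return int(not (v1 and v2))


# Assumed Boolean functional vector for S: [v1, (not v1) and v2, v3].
F = [
    lambda v: v[0],
    lambda v: int((not v[0]) and v[1]),
    lambda v: v[2],
]


def bfv_range(F, m):
    """Set represented by F: its range over all m-bit input vectors."""
    return {tuple(f(v) for f in F) for v in product((0, 1), repeat=m)}


def chi_set(chi, n):
    """Set represented by chi: all n-bit vectors that satisfy it."""
    return {x for x in product((0, 1), repeat=n) if chi(x)}


S = bfv_range(F, 3)
assert S == chi_set(chi_S, 3)
print(sorted("".join(map(str, x)) for x in S))
```

The characteristic function filters candidate vectors, while the functional vector generates members directly from arbitrary inputs.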

Canonical Representation
Unlike the characteristic function, the Boolean functional vector representation of a set is not unique: distinct vectors can represent the same set, such as $\{000, 001, 010, 011, 100, 101\}$. Also, there is no Boolean functional vector for the empty set. However, for non-empty sets, it is possible to obtain a canonical Boolean functional vector representation by placing some restrictions [6,13]. The empty set can be treated as a special case.

Table 1. Representing a set by its characteristic function and a Boolean functional vector.
Firstly, we use exactly the same number of variables as the number of vector components. Each variable corresponds to a choice for one component. The functional vector then represents a mapping from the set of all vectors of length n onto the set being represented.
The second restriction is that a vector which is in the set must be mapped to itself. Thirdly, we use a distance metric to map vectors not in the set to the nearest vector in the set. For vectors $X = [x_1, \ldots, x_n]$ and $Y = [y_1, \ldots, y_n]$, the distance is
$$d(X, Y) = \sum_{i=1}^{n} 2^{n-i} \, (x_i \oplus y_i)$$
The distance metric ensures uniqueness, i.e., no two vectors are equidistant from a given vector. Note that we have assigned decreasing weights to the bits, starting with the first bit. In general, a different ordering could be used to assign weights. This corresponds to a permutation of the bits. We refer to this order as the component order.
For the set $S = \{000, 001, 010, 011, 100, 101\}$, if we use the choice variables $v_1$, $v_2$ and $v_3$ for the first, second and third bits (vector components) respectively, we get the canonical symbolic encoding
$$F = [v_1,\ \overline{v_1} \cdot v_2,\ v_3]$$
Given a characteristic function for a set, the corresponding Boolean functional vector can be obtained by using the conversion algorithms of Coudert, Berthet and Madre [6] or the parameterization procedure of [1]. In our usage, however, we start with the canonical vectors for elementary sets and build other vectors by the set manipulation algorithms discussed below. The distance metric is never used explicitly, but the nearest-distance property is maintained by the algorithms.
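The three restrictions can be checked by brute force on the running example. The sketch below assumes that the bit weights are powers of two, $2^{n-i}$ for the i-th bit, which gives decreasing weights and rules out ties; the helper names `distance` and `canonical_map` are hypothetical.

```python
from itertools import product


def distance(x, y):
    """Weighted Hamming distance; the first bit carries the largest weight."""
    n = len(x)
    return sum(2 ** (n - 1 - i) * (a ^ b) for i, (a, b) in enumerate(zip(x, y)))


def canonical_map(S, n):
    """Map every n-bit vector to the nearest member of the non-empty set S."""
    return {x: min(S, key=lambda y: distance(x, y))
            for x in product((0, 1), repeat=n)}


S = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1), (1, 0, 0), (1, 0, 1)}
nearest = canonical_map(S, 3)

# Members of the set map to themselves.
assert all(nearest[x] == x for x in S)

# The mapping agrees with the vector [v1, (not v1) and v2, v3] on every input.
for v, y in nearest.items():
    assert y == (v[0], int((not v[0]) and v[1]), v[2])

print(nearest[(1, 1, 0)], nearest[(1, 1, 1)])
```

The two vectors outside the set, 110 and 111, are mapped to 100 and 101: the lower-weight second bit is flipped rather than the first.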

Interpreting Boolean Functional Vectors
In developing the set manipulation algorithms for Boolean functional vectors, we found it useful to interpret a Boolean functional vector as an ordered selection process, starting with the highest weighted component.
For $S = \{000, 001, 010, 011, 100, 101\}$, we can choose the first bit to be either 0 or 1. We use the choice variable $v_1$ to represent this free choice. The value of the second bit is restricted by our selection of the first bit. When the first bit is chosen to be 0, the second bit can be either 0 or 1. However, if the first bit is chosen to be 1, the second bit is forced to be 0. This dependency is captured by the second component function $\overline{v_1} \cdot v_2$. The third bit value, which can be chosen independently of the first two choices, is represented by the choice variable $v_3$.
In general, the i-th component will depend on the first i variables only. The i-th component can be represented as
$$f_i = f^1_i + f^c_i \cdot v_i$$
where $v_i$ is the i-th choice variable and $f^1_i$ and $f^c_i$ are functions of the first $(i-1)$ variables. The function $f^1_i$ represents the condition under which $f_i$ is forced-to-one because of previous selection choices. $f^c_i$ represents the condition under which we have a free choice for $f_i$. The forced-to-zero condition ($f^0_i$) does not appear in $f_i$ but is easily computed, since the three conditions are mutually exclusive and complete. Any two of the three functions $f^1_i$, $f^0_i$ and $f^c_i$ are sufficient to define the i-th component.
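One way to recover the three conditions from a component is by cofactoring with respect to its own choice variable: the component is forced-to-one where it is 1 even with $v_i = 0$, and forced-to-zero where it is 0 even with $v_i = 1$. The sketch below illustrates this on the second component of the running example; the function `conditions` and the callable-based encoding are assumptions made for illustration.

```python
from itertools import product


def conditions(f, i):
    """Split component f (choice variable v[i]) into (f1, f0, fc)."""
    def f1(v):   # forced-to-one: f is 1 even when v_i = 0
        return f(v[:i] + (0,) + v[i + 1:])

    def f0(v):   # forced-to-zero: f is 0 even when v_i = 1
        return 1 - f(v[:i] + (1,) + v[i + 1:])

    def fc(v):   # free choice: neither forced condition holds
        return int(not f1(v) and not f0(v))

    return f1, f0, fc


# Second component of the running example: f2 = (not v1) and v2.
f2 = lambda v: int((not v[0]) and v[1])
f1, f0, fc = conditions(f2, 1)

for v in product((0, 1), repeat=3):
    assert f1(v) + f0(v) + fc(v) == 1            # mutually exclusive, complete
    assert f2(v) == (f1(v) | (fc(v) & v[1]))     # f_i = f1 + fc * v_i
    assert (f1(v), f0(v), fc(v)) == (0, v[0], 1 - v[0])
```

For this component the second bit is never forced to 1, is forced to 0 exactly when $v_1 = 1$, and is free otherwise.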

Set Union
With the selection interpretation of Boolean functional vectors, a naive union algorithm suggests itself. In selecting a vector from the union, we can choose from either of the operand sets. Hence, the i-th component is forced to a value in the union only when it is forced to that value in both sets. If we have a free choice in either set, or if one set allows us to choose 1 and the other 0, we have a free choice in the union for that component.
Consider computing the union S of the sets $S_0 = \{000\}$ and $S_1 = \{011\}$, represented by the functional vectors $[0, 0, 0]$ and $[0, 1, 1]$ respectively. Since the first component is forced-to-zero in both sets, it is forced-to-zero in the union as well. For the second component, we get a free choice since one set allows 0 while the other allows 1. Similarly, by our naive algorithm we would get a free choice for the third component, giving the Boolean functional vector $[0, v_2, v_3]$ corresponding to the set $\{000, 001, 010, 011\}$, an over-approximation of the correct union.
The problem occurs because after we make a choice for the second bit in our example, we restrict ourselves to one of the two sets and, hence, we do not have a free choice for the third bit, which is forced to one or zero depending on the choice made with $v_2$.
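In order to compute the Boolean functional vector H corresponding to the union of (the sets represented by) F and G, we compute exclusion conditions $F^x$ and $G^x$, with components $f^x_i$ and $g^x_i$. Initially, neither set is excluded:
$$f^x_1 = 0, \qquad g^x_1 = 0$$
Whenever we do make a choice that restricts us to one of the sets, we update the exclusion conditions to restrict the selection procedure to the remaining set. A set is excluded while selecting the $(i+1)$-th component either if it has already been excluded earlier or if its i-th component is forced to a value and we selected the other value:
$$f^x_{i+1} = f^x_i + f^1_i \cdot \overline{h_i} + f^0_i \cdot h_i, \qquad g^x_{i+1} = g^x_i + g^1_i \cdot \overline{h_i} + g^0_i \cdot h_i$$
The union can now be computed as:
$$h^1_i = f^1_i \cdot g^1_i + f^x_i \cdot g^1_i + g^x_i \cdot f^1_i$$
$$h^0_i = f^0_i \cdot g^0_i + f^x_i \cdot g^0_i + g^x_i \cdot f^0_i$$
$$h_i = h^1_i + h^c_i \cdot v_i, \qquad h^c_i = \overline{h^1_i + h^0_i}$$
The i-th bit is forced-to-one in the union if it is forced-to-one in both the sets or if one set is excluded and the bit is forced-to-one in the other set. The forced-to-zero computation is similar.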
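The exclusion-based union can be sketched by evaluating the rules one input vector at a time instead of symbolically on BDDs. The update rules in the code are written from the verbal description in this section; the helpers `canonical`, `forced`, `union_at` and `union_range` are hypothetical names, and `canonical` builds canonical vectors by brute force under the assumption of power-of-two bit weights.

```python
from itertools import product


def canonical(S, n):
    """Canonical vector of a non-empty set S, as a list of n callables."""
    def nearest(v):
        # Weight 2**(n-1-i) on bit i: the distance is the XOR read as an integer.
        return min(S, key=lambda y: sum((a ^ b) << (n - 1 - i)
                                        for i, (a, b) in enumerate(zip(v, y))))
    return [lambda v, i=i: nearest(v)[i] for i in range(n)]


def forced(f, i, v):
    """(forced-to-one, forced-to-zero) flags of component i at input v."""
    lo = f(v[:i] + (0,) + v[i + 1:])
    hi = f(v[:i] + (1,) + v[i + 1:])
    return lo, 1 - hi


def union_at(F, G, v):
    """Evaluate the union vector H at input v."""
    fx = gx = 0                       # exclusion flags: neither set excluded
    h = []
    for i in range(len(F)):
        f1, f0 = forced(F[i], i, v)
        g1, g0 = forced(G[i], i, v)
        h1 = (f1 & g1) | (fx & g1) | (gx & f1)
        h0 = (f0 & g0) | (fx & g0) | (gx & f0)
        hi = 1 if h1 else 0 if h0 else v[i]
        h.append(hi)
        # A set is excluded once the selected bit contradicts its forced value.
        fx |= (f1 & (1 - hi)) | (f0 & hi)
        gx |= (g1 & (1 - hi)) | (g0 & hi)
    return tuple(h)


def union_range(F, G, n):
    return {union_at(F, G, v) for v in product((0, 1), repeat=n)}


S0, S1 = {(0, 0, 0)}, {(0, 1, 1)}
print(sorted(union_range(canonical(S0, 3), canonical(S1, 3), 3)))
```

On the example from the text, the third bit follows the choice made for the second bit, so the over-approximation of the naive algorithm does not arise.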

Set Intersection
Consider the sets $S_0 = \{000, 010\}$ and $S_1 = \{001, 010, 011\}$ represented by the vectors $F = [0, v_2, 0]$ and $G = [0, v_2, \overline{v_2} + v_3]$ respectively. While selecting a vector for the intersection, we cannot choose the second bit to be 0 since that would give conflicting values for the third bit.
A conflict is introduced when a bit is forced-to-one in one set and forced-to-zero in the other. To handle these conflicts, we introduce elimination conditions E. The function $e_i$ represents the conditions which lead to a conflict downstream, irrespective of the remaining choices. Since there are no downstream components for the n-th bit, its elimination condition $e_n$ is 0. The elimination condition $e_{i-1}$ includes conflicting choices for the i-th bit position and also further downstream conflicts which cannot be resolved by either value of the i-th choice variable $v_i$. For our example, the elimination conditions are $E = [0, \overline{v_2}, 0]$.
Before computing the intersection, we should normalize the operand sets by propagating the constraints imposed by the elimination conditions to remove the conflict-inducing choices. In the example, we would get the normalized sets $F^n = [0, 1, 0]$ and $G^n = [0, 1, v_3]$ by substituting 1 for $v_2$ to eliminate vectors with the second bit 0. However, we can instead compute an approximation K to the intersection from the original sets and then make a final (forward) pass to propagate the elimination constraints: the correct intersection is obtained from K by substituting the restricted choices for the choice variables.
The intersection algorithm requires a quadratic number of BDD operations. In symbolic reachability analysis (Figure 2), however, we use symbolic simulation for image computation and thus avoid intersection as part of the relational cross product.
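The idea of eliminating choices that lead to downstream conflicts, followed by a forward selection pass, can be illustrated pointwise. The sketch below is not the BDD formulation of the paper: the recursive predicate `dead` is an assumed explicit stand-in for the elimination conditions, and `canonical`, `allowed` and `intersect` are hypothetical helper names.

```python
from functools import lru_cache
from itertools import product


def canonical(S, n):
    """Canonical vector of a non-empty set S, as a list of n callables."""
    def nearest(v):
        return min(S, key=lambda y: sum((a ^ b) << (n - 1 - i)
                                        for i, (a, b) in enumerate(zip(v, y))))
    return [lambda v, i=i: nearest(v)[i] for i in range(n)]


def allowed(f, i, p, b, n):
    """True if canonical component i can take value b after valid prefix p."""
    return f(p + (b,) + (0,) * (n - i - 1)) == b


def intersect(F, G, n):
    """Return a function v -> H(v), or None if the intersection is empty."""
    @lru_cache(maxsize=None)
    def dead(p):
        # Elimination condition: prefix p cannot be completed in both sets.
        if len(p) == n:
            return False
        return not ok(p, 0) and not ok(p, 1)

    def ok(p, b):
        i = len(p)
        return (allowed(F[i], i, p, b, n) and allowed(G[i], i, p, b, n)
                and not dead(p + (b,)))

    if dead(()):
        return None                   # empty set: handled as a special case

    def h_at(v):
        p = ()
        for i in range(n):            # forward pass: keep v_i when permitted
            p += (v[i] if ok(p, v[i]) else 1 - v[i],)
        return p

    return h_at


SF = {(0, 0, 0), (0, 1, 0)}
SG = {(0, 0, 1), (0, 1, 0), (0, 1, 1)}
H = intersect(canonical(SF, 3), canonical(SG, 3), 3)
print(sorted({H(v) for v in product((0, 1), repeat=3)}))
```

For the example sets, the prefix with second bit 0 is eliminated because the third bit would be forced to conflicting values, leaving only 010.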

Cofactors and Quantification
We can compute the Shannon cofactors of a vector F by cofactoring the individual components, i.e.,
$$F|_{v_i = c} = [\, f_1|_{v_i = c}, \ldots, f_n|_{v_i = c} \,]$$
where c is either 0 or 1. This corresponds to fixing a value for a choice variable. Existential and universal quantification (set smoothing and consensus) can then be computed by the union and intersection, respectively, of the cofactors with respect to the variable being quantified. In reachability analysis (Figure 2), we will existentially quantify out the variables corresponding to the inputs and the choice variables for the current state elements as part of the re-parameterization procedure described below.

Re-Parameterization
Starting with a canonical representation of the inputs, we can obtain a Boolean functional vector for the outputs of a circuit by symbolic simulation. This vector, which represents each output value as a function of the input choice variables, must be re-parameterized to obtain a canonical representation for the output.
The re-parameterization is done by existentially quantifying the input choice variables. In general, any non-canonical Boolean functional vector can be made canonical by this procedure, by quantifying out the variables used in the non-canonical form.

Related Work
In [10], McMillan considers a canonical conjunctive decomposition of characteristic functions. We now show that this decomposition is closely related to the Boolean functional vector.
Given the choice variables $V = (v_1, \ldots, v_n)$ and the (canonical) Boolean functional vector $F = [f_1, \ldots, f_n]$, the vector $\hat{F} = [v_1 \equiv f_1, \ldots, v_n \equiv f_n]$ represents a conjunctive decomposition of the characteristic function for the set, i.e., $\chi_{IMG(F)} = \bigwedge_{i=1}^{n} (v_i \equiv f_i)$. The difference between $F$ and $\hat{F}$ is that $F$ maps an input vector to a vector in the represented set while $\hat{F}$ represents a vector of constraints, one for each bit, which must be satisfied for set membership. Thus, $\hat{f}_i = (v_i \equiv f_i)$ is a constraint for the i-th bit. The paper also gives algorithms for set manipulation using the conjunctive decomposition which, given the connection above, are in essence performing the same operations as our algorithms. The algorithms for the conjunctive decomposition are based on the generalized cofactor operation [6]. When the component order and the BDD variable order are the same, the BDD constrain operator can be used for the generalized cofactor. In this case, the algorithms with the conjunctive decomposition require fewer BDD operations than with Boolean functional vectors and are, hence, more efficient.
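The connection can be checked by brute force on the running example. The sketch assumes the canonical vector $[v_1, \overline{v_1} \cdot v_2, v_3]$ and models Boolean functions as Python callables; `chi_from_bfv` is a hypothetical helper name.

```python
from itertools import product

# Assumed canonical vector of the running example: [v1, (not v1) and v2, v3].
F = [
    lambda v: v[0],
    lambda v: int((not v[0]) and v[1]),
    lambda v: v[2],
]


def chi_from_bfv(F):
    """Conjunction over i of (v_i == f_i(v)); meaningful for canonical F."""
    return lambda v: int(all(v[i] == f(v) for i, f in enumerate(F)))


chi = chi_from_bfv(F)
inputs = list(product((0, 1), repeat=3))
members = {v for v in inputs if chi(v)}               # vectors passing all constraints
image = {tuple(f(v) for f in F) for v in inputs}      # range of F

assert members == image
print(sorted(members))
```

The conjunction holds exactly for the inputs that F maps to themselves, which for a canonical vector are the members of the set.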
In [14], the authors present an algorithm to convert a non-canonical Boolean functional vector into a canonical form. They do so by maintaining a relation between the non-canonical and canonical expressions for each component. They also mention a possible optimization. However, without more details, it is hard to compare their algorithm with our re-parameterization procedure.

Symbolic Reachability Analysis with Boolean Functional Vectors
Our approach to symbolic reachability analysis is shown in Figure 2. Unlike Figure 1, we do not convert to the characteristic function at any stage. Instead, the re-parameterization and set union are performed on the Boolean functional vectors using either our algorithms or by converting to the conjunctive decomposition.
In our experiments described below, we used fixed variable orderings. The reason is that in addition to BDD variable ordering, we need to develop component reordering for the components of the Boolean functional vector. We used the same order for component ordering and BDD variable ordering. In this case, it is more efficient to use the algorithms from [10], as explained in Section 2.7.
We performed reachability experiments on some of the non-trivial ISCAS89 benchmark circuits. Table 2 shows the results obtained and compares them with the reachability analysis implemented in VIS [2], using the IWLS95 set of heuristics [12] with default settings. The results for our approach are listed under BFV. Each row in the table lists the name of the circuit, the variable ordering used, the runtime in seconds and the peak live BDD node count in thousands. We only list those cases where at least one of the two tools completed. The variable orders we used were the static ordering obtained from VIS (S1), the static ordering obtained from our tool (S2), an ordering obtained after a VIS run with dynamic ordering enabled (D), orders from the pdtexp distribution [5] (P) and others (O) available to us; D and P are biased in favor of characteristic functions. The experiments were run on a 336 MHz UltraSPARC-II with the memory limit set to 1 GB and the time limit set to 10 hours.
From the table we see that the new approach performs well with s3271 and s4863 but cannot complete s3330. On the other hand, we were able to complete s3330 with VIS, but not s3271. For s1512, BFV was much slower than VIS.
The two approaches differ in several aspects, e.g., we use a dynamic quantification schedule based on a simple support-based cost heuristic. (Computing the cost dynamically does not impose much additional overhead, since we compute supports to avoid BDD operations on vector components that do not depend on the variable being quantified.) The variable ordering requirements for the two representations can be quite different. For characteristic functions, it is essential that related variables occur together in the variable order. With Boolean functional vectors, the requirement is mainly that the important variables occur early in the order. Functional dependencies [9] are automatically factored out by the representation. Consider the characteristic function $\chi = (v_1 \equiv v_2) \cdot (v_3 \equiv v_4) \cdot (v_5 \equiv v_6)$, where a good variable ordering is one in which the pairs $(v_1, v_2)$, $(v_3, v_4)$ and $(v_5, v_6)$ occur together. With the Boolean functional vector, all orderings are good in this case. We believe this property makes the variable ordering requirements less restrictive for Boolean functional vectors, e.g., in Table 2, BFV is able to complete the reachability of three of the benchmarks using a statically generated variable ordering (S1 or S2).
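The factoring of functional dependencies can be seen on the six-variable example by computing its canonical vector by brute force. The sketch assumes that each pair of variables is constrained to be equal, that the component order is $v_1, \ldots, v_6$, and that bit weights are powers of two; `nearest` is a hypothetical helper.

```python
from itertools import product

n = 6
# Members of the set: each pair of adjacent bits is equal.
S = [x for x in product((0, 1), repeat=n)
     if x[0] == x[1] and x[2] == x[3] and x[4] == x[5]]


def nearest(v):
    """Nearest member of S under the weighted distance (first bit heaviest)."""
    return min(S, key=lambda y: sum((a ^ b) << (n - 1 - i)
                                    for i, (a, b) in enumerate(zip(v, y))))


# Every component of the canonical vector depends on a single choice variable:
# F = [v1, v1, v3, v3, v5, v5].
for v in product((0, 1), repeat=n):
    assert nearest(v) == (v[0], v[0], v[2], v[2], v[4], v[4])

print(len(S))
```

Each component is a single literal, so the size of the vector's representation does not depend on where the paired variables sit relative to each other in the order.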
It is often the case that the Boolean functional vector representation is much smaller than the corresponding characteristic function. Table 3 lists the sizes of the characteristic function and the Boolean functional vector for the reachable states of s4863 (the size of the characteristic function BDD was obtained by converting the Boolean functional vector to the corresponding characteristic function). The size for BFV is the shared size of all the components; the individual components are usually much smaller, which can help avoid the intermediate blowup, e.g., in s4863 with order O, since the algorithms work on one component at a time.
In summary, the Boolean functional vector approach forms a viable alternative to the traditional reachability analysis method. The property of Boolean functional vectors to factor out functional dependencies can often reduce the variable ordering requirements.

Figure 1. Reachability Analysis with the Coudert, Berthet, Madre [6] approach. State sets are represented by their characteristic functions for set manipulation and Boolean functional vectors (BFVs) for image computation by symbolic simulation.

Figure 2. Reachability Analysis with Boolean Functional Vectors. Our algorithms make it unnecessary to convert to characteristic functions for set manipulations, as is done in Figure 1.

Table 2. Results for Reachability Analysis with fixed variable orders for some ISCAS89 circuits using VIS and with Boolean functional vectors. T.O. (M.O.) indicates that the time (memory) limit was exceeded.

Table 3. Sizes of the BDD for the characteristic function and the shared size of the BDDs for the Boolean functional vector for the reachable sets of s4863.