Symbolic Model Checking for Probabilistic Processes

Abstract. We introduce a symbolic model checking procedure for Probabilistic Computation Tree Logic (PCTL) over labelled Markov chains as models. Model checking for probabilistic logics typically involves solving linear equation systems in order to ascertain the probability of a given formula holding in a state. Our algorithm is based on the idea of representing the matrices used in the linear equation systems by Multi-Terminal Binary Decision Diagrams (MTBDDs) introduced in Clarke et al. [14]. Our procedure, based on the algorithm used by Hansson and Jonsson [24], uses BDDs to represent formulas and MTBDDs to represent Markov chains, and is efficient because it avoids explicit state space construction. A PCTL model checker is being implemented in Verus [9].


Introduction
Probabilistic techniques, and in particular probabilistic logics, have proved successful in the specification and verification of systems that exhibit uncertainty, such as fault-tolerant systems, randomized distributed systems and communication protocols. Models for such systems are variants of probabilistic automata (such as the labelled Markov chains used in e.g. [24,34,35,17]), in which the usual (boolean) transition relation is replaced with its probabilistic version given in the form of a Markov probability transition matrix. The probabilistic logics are typically obtained by "lifting" a non-probabilistic logic to the probabilistic case by constructing, for each formula $\varphi$ and each real number $p$ in the interval $[0,1]$, the formula $[\varphi]_{\geq p}$, in which $p$ acts as a threshold for truth: for the formula $[\varphi]_{\geq p}$ to be satisfied (in the state $s$), the probability that $\varphi$ holds in $s$ must be at least $p$ (see [26,32,25] for a different approach).
With such logics one can express quantitative properties such as "the probability of the message being delivered within $t$ time steps is at least 0.75" (see e.g. the timing or average-case analysis of real-time or randomized distributed systems [24,23,5,6,2]) or (the more prevalent) qualitative properties, for which $\varphi$ is required to be satisfied by almost all executions (which amounts to showing that $\varphi$ is satisfied with probability 1, see e.g. [1,17,23,24,21,22,29,30,34]).
Much has been published concerning verification methods for probabilistic logics. Probabilistic extensions of dynamic logic [26] and of temporal and modal logics, e.g. [2,6,17,24,21,27,30,31,34], and automatic procedures for checking satisfaction for such logics have been proposed. The latter are based on reducing the calculation of the probability of formulas being satisfied to a linear algebra problem: for example, in [24], the calculation of the probability of 'until' formulas is based on solving the linear equation system given by an $n \times n$ matrix, where $n$ is the size of the state space. Optimal methods are known (for sequential Markov chains, the lower bound is single exponential in the size of the formula and polynomial in the size of the Markov chain [18]), but these algorithms are not of much practical use when verifying realistic systems. As a result, the efficiency of probabilistic analysis lags behind efficient model checking techniques for conventional logics, such as symbolic model checking [11,12,10,8,15,28], for which tools capable of tackling industrial scale applications are available (cf. smv). This is undesirable, as probabilistic approaches allow one to establish that certain properties hold (in some meaningful probabilistic sense) where conventional model checkers fail, either because the property simply is not true in the state (but holds in that state with some acceptable probability), or because exhaustive search of only a portion of the system is feasible.
The main difficulty with current probabilistic model checking is the need to integrate a linear algebra package with a conventional model checker. Despite the power of existing linear algebra packages, this can lead to inefficient and time-consuming computation through the implicit requirement for the construction of the state space. This paper proposes an alternative, which is based on expressing the probability calculations in terms of Multi-Terminal Binary Decision Diagrams (MTBDDs) [16]. MTBDDs are a generalization of (ordered) BDDs in the sense that they allow arbitrary real numbers in the terminal nodes instead of just 0 and 1, and so can provide a compact representation for matrices. As a matter of fact, in [13] MTBDDs have been shown to perform no worse than sparse matrices. Thus, converting to MTBDDs ensures smooth integration with a symbolic model checker such as smv and has the potential to outperform sparse matrices due to the compactness of the representation, in the same way as BDDs have outperformed other methods. As with BDDs, precise time complexity estimates of model checking for MTBDDs are difficult to obtain, but the success of BDDs in practice [8,28] serves as sufficient encouragement to develop the foundations of MTBDD-based probabilistic model checkers.
In this paper we consider a probabilistic extension of CTL called Probabilistic Computation Tree Logic (PCTL), and give a symbolic model checking procedure which avoids the explicit construction of the state space. We use finite-state labelled Markov chains as models. The model checking procedure is based on that of [24,18], but we use BDDs to represent the boolean formulas, and a suitable combination of BDDs and MTBDDs for probabilistic formulas. Currently, we are implementing PCTL symbolic model checking in Verus [9]. For reasons of space we omit much detail from this paper, which will be reported in [4]. We assume some familiarity with BDDs, automata on infinite sequences, probability and measure theory [8,33,20].
We use discrete time Markov chains as models (we do not consider nondeterminism).
Let $AP$ denote a finite set of atomic propositions. A labelled Markov chain over a set of atomic propositions $AP$ is a tuple $M = (S, P, L)$ where $S$ is a finite set of states, $P : S \times S \to [0,1]$ is a transition matrix, i.e. $\sum_{t \in S} P(s,t) = 1$ for all $s \in S$, and $L : S \to 2^{AP}$ is a labelling function which assigns to each state $s \in S$ a set of atomic propositions. We assume that there are $2^n$ states for some $n$, and that there are sufficiently many atomic propositions to distinguish them (i.e. $L(s) \neq L(s')$ for all states $s$, $s'$ with $s \neq s'$).

Example 1. We consider a simple communication protocol similar to that in [24]. The system consists of three entities: a sender, a medium and a receiver. The sender sends a message to the medium, which in turn tries to deliver the message to the receiver. With probability $\frac{1}{100}$, the message gets lost, in which case the medium tries again to deliver the message. With probability $\frac{1}{100}$, the message is corrupted (but delivered); with probability $\frac{98}{100}$, the correct message is delivered. When the (correct or faulty) message is delivered, the receiver acknowledges the receipt of the message. For simplicity, we assume that the acknowledgement cannot be corrupted or lost. We describe the system in a simplified way where we omit all irrelevant states (e.g. the state where the receiver acknowledges the receipt of the correct message). We use the following four states:

$s_{init}$: the state in which the sender passes the message to the medium
$s_{del}$: the state in which the medium tries to deliver the message
$s_{lost}$: the state reached when the message is lost
$s_{error}$: the state reached when the message is corrupted

The transition $s_{del} \to s_{init}$ stands for the acknowledgement of the receipt of the correct message, $s_{error} \to s_{init}$ for the acknowledgement of the receipt of the corrupted message. We use two atomic propositions $a_1$, $a_2$ and the labelling function $L(s_{init}) = \emptyset$, $L(s_{del}) = \{a_1, a_2\}$, $L(s_{lost}) = \{a_2\}$, $L(s_{error}) = \{a_1\}$.
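As an illustration only, the labelled Markov chain of Example 1 can be written down explicitly and checked against the two requirements of the definition (rows of $P$ sum to 1, labels distinguish states). This is a minimal sketch; the dictionary layout and the names `STATES`, `is_stochastic` and `labels_distinguish_states` are assumptions made for this example and are not part of the paper.

```python
from fractions import Fraction

# Hypothetical short names for s_init, s_del, s_lost, s_error of Example 1.
STATES = ["init", "del", "lost", "error"]

# Transition matrix P as nested dicts; entries that are absent are 0.
P = {
    "init": {"del": Fraction(1)},
    "del": {
        "init": Fraction(98, 100),   # correct message delivered
        "lost": Fraction(1, 100),    # message lost
        "error": Fraction(1, 100),   # message corrupted
    },
    "lost": {"del": Fraction(1)},    # medium tries again
    "error": {"init": Fraction(1)},  # acknowledgement of corrupted message
}

# Labelling function L: state -> set of atomic propositions.
L = {
    "init": frozenset(),
    "del": frozenset({"a1", "a2"}),
    "lost": frozenset({"a2"}),
    "error": frozenset({"a1"}),
}


def is_stochastic(P, states):
    """Every row of the transition matrix must sum to exactly 1."""
    return all(sum(P[s].values()) == 1 for s in states)


def labels_distinguish_states(L):
    """L(s) != L(s') whenever s != s'."""
    return len(set(L.values())) == len(L)
```

Exact rationals are used here so that the row-sum check is not affected by floating-point rounding.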
In this section we present the syntax and semantics of the logic PCTL (Probabilistic Computation Tree Logic) introduced by Hansson & Jonsson [24]. PCTL is a probabilistic extension of CTL which allows one to express quantitative properties of probabilistic processes such as "the system terminates with probability at least 0.75". PCTL contains atomic propositions and the operators next-step $X$ and until $U$. The operators $X$ and $U$ are used in connection with an interval of probabilities. The syntax of PCTL is as follows:

$\varphi ::= tt \mid a \mid \varphi_1 \wedge \varphi_2 \mid \neg\varphi \mid [X\varphi]_{\sqsupseteq p} \mid [\varphi_1 U \varphi_2]_{\sqsupseteq p}$

where $a$ is an atomic proposition, $p \in [0,1]$, and $\sqsupseteq$ is either $\geq$ or $>$. Formulas of the form $X\varphi$ or $\varphi_1 U \varphi_2$, where $\varphi$, $\varphi_1$, $\varphi_2$ are PCTL formulas, are called path formulas.
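PCTL formulas are interpreted over the states of a labelled Markov chain, whereas path formulas are interpreted over paths. The subscript $\sqsupseteq p$ denotes that the probability of paths starting in the current state fulfilling the path formula is $\sqsupseteq p$. Thus, PCTL is like CTL, except that the path operators A and E in CTL have been replaced by the operator $[\cdot]_{\sqsupseteq p}$. The usual derived constants and operators are: $ff = \neg tt$, $\varphi_1 \vee \varphi_2 = \neg(\neg\varphi_1 \wedge \neg\varphi_2)$, $\varphi_1 \to \varphi_2 = \neg\varphi_1 \vee \varphi_2$. Operators for modelling "eventually" or "always" can be derived by: $[\Diamond\varphi]_{\geq p} = [tt\ U\ \varphi]_{\geq p}$, $[\Box\varphi]_{\geq p} = \neg[\Diamond\neg\varphi]_{>1-p}$, and similarly for $>p$.

Let $M = (S, P, L)$ be a labelled Markov chain. The satisfaction relation $\models\ \subseteq S \times PCTL$ is given by

$s \models tt$ for all $s \in S$
$s \models a$ iff $a \in L(s)$
$s \models \varphi_1 \wedge \varphi_2$ iff $s \models \varphi_1$ and $s \models \varphi_2$
$s \models \neg\varphi$ iff $s \not\models \varphi$
$s \models [X\varphi]_{\sqsupseteq p}$ iff $Prob\{\pi \in Path_\omega(s) : \pi \models X\varphi\} \sqsupseteq p$
$s \models [\varphi_1 U \varphi_2]_{\sqsupseteq p}$ iff $Prob\{\pi \in Path_\omega(s) : \pi \models \varphi_1 U \varphi_2\} \sqsupseteq p$
$\pi \models X\varphi$ iff $\pi(1) \models \varphi$
$\pi \models \varphi_1 U \varphi_2$ iff there exists $k \geq 0$ with $\pi(i) \models \varphi_1$, $i = 0, 1, \ldots, k-1$, and $\pi(k) \models \varphi_2$.

For a path formula $f$ the set $\{\pi \in Path_\omega(s) : \pi \models f\}$ is measurable [34,18]. If $s \models \varphi$ then we say $s$ satisfies $\varphi$ (or $\varphi$ holds in $s$). The truth value of formulas involving the linear time quantifiers $\Diamond$ and $\Box$ can be derived:

$s \models [\Diamond\varphi]_{\sqsupseteq p}$ iff $Prob\{\pi \in Path_\omega(s) : \pi(k) \models \varphi \text{ for some } k \geq 0\} \sqsupseteq p$
$s \models [\Box\varphi]_{\sqsupseteq p}$ iff $Prob\{\pi \in Path_\omega(s) : \pi(k) \models \varphi \text{ for all } k \geq 0\} \sqsupseteq p$.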
Given a probabilistic process $\mathcal{P}$, described by a labelled Markov chain $M = (S, P, L)$ with an initial state $s$, we say $\mathcal{P}$ satisfies a PCTL formula $\varphi$ iff $s \models \varphi$. For instance, if $a$ is an atomic proposition which stands for termination and $\mathcal{P}$ satisfies $[\Diamond a]_{\geq p}$, then $\mathcal{P}$ terminates with probability at least $p$.

Multi-terminal binary decision diagrams
Ordered Binary Decision Diagrams (BDDs) [7,8,15,28] are a compact representation of boolean functions $f : \{0,1\}^n \to \{0,1\}$. They are based on the canonical representation of the binary tree of the function as a directed graph obtained through folding internal nodes representing identical subfunctions (subject to an ordering of the variables to guarantee uniqueness of the representation) and using 0 and 1 as leaves. In [16] it is shown how one can generalize BDDs to cogently and efficiently represent matrices in terms of so-called multi-terminal binary decision diagrams (MTBDDs).
Formally, MTBDDs can be defined as follows. Let $x_1, \ldots, x_n$ be distinct variables, which we order by $x_i < x_j$ iff $i < j$. A multi-terminal binary decision diagram (MTBDD) over $(x_1, \ldots, x_n)$ is a rooted, directed graph with vertex set $V$ containing two types of vertices, nonterminal and terminal. Each nonterminal vertex $v$ is labelled by a variable $var(v) \in \{x_1, \ldots, x_n\}$ and has two children $left(v)$, $right(v) \in V$. Each terminal vertex $v$ is labelled by a real number $value(v)$. For each nonterminal node $v$, we require $var(v) < var(left(v))$ if $left(v)$ is nonterminal, and similarly, $var(v) < var(right(v))$ if $right(v)$ is nonterminal. A suitable adaptation of the operator REDUCE [7] yields an operator which accepts an MTBDD as its input and returns the corresponding reduced MTBDD.
Each MTBDD $Q$ over $\{x_1, \ldots, x_n\}$ represents a function $F_Q : \{0,1\}^n \to \mathbb{R}$ and, vice versa, each function $F : \{0,1\}^n \to \mathbb{R}$ can be described by a unique reduced MTBDD over $(x_1, \ldots, x_n)$. In the sequel, by the MTBDD for a function $F : \{0,1\}^n \to \mathbb{R}$ we mean the unique reduced MTBDD $Q$ with $F_Q = F$. If all terminal vertices are labelled by 0 or 1, i.e. if the associated function $F_Q$ is a boolean function, the MTBDD specializes to a BDD over $(x_1, \ldots, x_n)$.
MTBDDs are used to represent $D$-valued matrices as follows. Consider a $2^m \times 2^m$ matrix $A$. Its elements $a_{ij}$ can be viewed as the values of a function $f_A : \{1, \ldots, 2^m\} \times \{1, \ldots, 2^m\} \to D$, where $f_A(i,j) = a_{ij}$. Using the standard encoding $c : \{0,1\}^m \to \{1, \ldots, 2^m\}$ of boolean sequences of length $m$ into the integers, this function may be interpreted as a $D$-valued boolean function $f : \{0,1\}^{2m} \to D$ where $f(x,y) = f_A(c(x), c(y))$ for $x = (x_1, \ldots, x_m)$ and $y = (y_1, \ldots, y_m)$. This transformation now allows matrices to be represented as MTBDDs. In order to obtain an efficient MTBDD representation, the variables of $f$ are permuted. Instead of the MTBDD for $f(x_1, \ldots, x_m, y_1, \ldots, y_m)$, we use the MTBDD obtained from $f(x_1, y_1, x_2, y_2, \ldots, x_m, y_m)$. This convention imposes a recursive structure on the matrix from which efficient recursive algorithms for all standard matrix operations are derived [16].
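To make the effect of the interleaved variable order concrete, the following sketch builds a reduced MTBDD for a small matrix by exhaustive Shannon expansion with a unique table. It is an illustration under stated assumptions, not the construction of [16]: the functions `matrix_as_function`, `build_mtbdd` and `evaluate` are hypothetical names, indices are 0-based rather than ranging over $\{1, \ldots, 2^m\}$, and the builder enumerates all $2^{2m}$ assignments, so it is only suitable for tiny examples.

```python
def matrix_as_function(A, m, interleaved=True):
    """View a 2^m x 2^m matrix as a function of 2m bits (0-based indices)."""
    def f(bits):
        if interleaved:              # variable order x1, y1, x2, y2, ...
            x, y = bits[0::2], bits[1::2]
        else:                        # variable order x1, ..., xm, y1, ..., ym
            x, y = bits[:m], bits[m:]
        row = int("".join(map(str, x)), 2)
        col = int("".join(map(str, y)), 2)
        return A[row][col]
    return f


def build_mtbdd(f, nvars):
    """Return (root id, unique table) of the reduced ordered MTBDD of f."""
    unique = {}                      # node description -> node id

    def mk(key):
        # Share structurally identical nodes (merging rule).
        return unique.setdefault(key, len(unique))

    def rec(prefix):
        if len(prefix) == nvars:
            return mk(("leaf", f(prefix)))
        lo = rec(prefix + (0,))
        hi = rec(prefix + (1,))
        if lo == hi:                 # redundant test (elimination rule)
            return lo
        return mk(("node", len(prefix), lo, hi))

    return rec(()), unique


def evaluate(root, unique, bits):
    """Follow the path selected by bits and return the terminal value."""
    by_id = {node_id: key for key, node_id in unique.items()}
    node = by_id[root]
    while node[0] == "node":
        _, level, lo, hi = node
        node = by_id[hi if bits[level] else lo]
    return node[1]


m = 2
identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
root_i, nodes_i = build_mtbdd(matrix_as_function(identity, m, True), 2 * m)
root_b, nodes_b = build_mtbdd(matrix_as_function(identity, m, False), 2 * m)
print(len(nodes_i), len(nodes_b))   # node counts, terminals included
```

For the $4 \times 4$ identity matrix the interleaved order gives the smaller diagram of the two, which is the kind of saving the permuted order is meant to achieve.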

Representing labelled Markov chains by MTBDDs
To represent the transition matrix of a labelled Markov chain by an MTBDD we abstract from the names of states and instead, similarly to [8,15], use binary tuples of atomic propositions that are true in the state. Let $M = (S, P, L)$ be a labelled Markov chain.
We fix an enumeration $a_1, \ldots, a_n$ of the atomic propositions and identify each state $s$ with the boolean $n$-tuple $e(s) = (b_1, \ldots, b_n)$ where $b_i = 1$ iff $a_i \in L(s)$. In what follows, we identify $P$ with the function $F : \{0,1\}^{2n} \to [0,1]$, $F(x_1, y_1, \ldots, x_n, y_n) = P((x_1, \ldots, x_n), (y_1, \ldots, y_n))$, and represent $M$ by the MTBDD for $P$ over $(x_1, y_1, \ldots, x_n, y_n)$. The associated MTBDD is denoted by P.
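Example 2. For the system in Example 1 we use the encoding $e(s_{init}) = 00$, $e(s_{del}) = 11$, $e(s_{lost}) = 01$, $e(s_{error}) = 10$. With rows and columns indexed by 00, 01, 10, 11 (in this order), the values of the matrix $P$, and hence of the function $F$, are given by:

$$P = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0.98 & 0.01 & 0.01 & 0 \end{pmatrix}$$

(In the diagram of the MTBDD P for $F$, the thick lines stand for the "right" edges, the thin lines for the "left" edges.)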
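The encoding $e(s)$ and the resulting matrix can be derived mechanically from the labelling and the transition probabilities of Example 1. The sketch below does this with plain dictionaries; the names `encode`, `index` and `encoded_matrix` are hypothetical, and rows and columns are numbered from 0 in the order 00, 01, 10, 11.

```python
from fractions import Fraction

AP = ["a1", "a2"]                      # fixed enumeration a_1, ..., a_n

# Labelling and transition probabilities as described in Example 1.
L = {"init": set(), "del": {"a1", "a2"}, "lost": {"a2"}, "error": {"a1"}}
P = {
    "init": {"del": Fraction(1)},
    "del": {"init": Fraction(98, 100), "lost": Fraction(1, 100),
            "error": Fraction(1, 100)},
    "lost": {"del": Fraction(1)},
    "error": {"init": Fraction(1)},
}


def encode(state):
    """e(s) = (b_1, ..., b_n) with b_i = 1 iff a_i is in L(s)."""
    return tuple(1 if a in L[state] else 0 for a in AP)


def index(bits):
    """Read a bit tuple as a binary number (most significant bit first)."""
    return int("".join(map(str, bits)), 2)


def encoded_matrix():
    """Transition matrix with rows and columns ordered by the encoding."""
    n = len(L)
    M = [[Fraction(0)] * n for _ in range(n)]
    for s, row in P.items():
        for t, prob in row.items():
            M[index(encode(s))][index(encode(t))] = prob
    return M


for row in encoded_matrix():
    print([str(v) for v in row])
```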

Operators on MTBDDs
Our model checking algorithm makes use of several operators on MTBDDs proposed in Bryant [7] and Clarke et al. [14]. We briefly describe them below.
Operator BDD: takes an MTBDD Q and an interval I, and returns the BDD rep-

Matrix and vector operators:
The standard operations on matrices and vectors have corresponding operations on the MTBDDs that represent them [13]. If MTBDDs $A$ and $Q$ over $2n$ and $n$ variables represent the matrix $A$ and vector $q$ respectively, then $\mathrm{MVMULTI}(A, Q)$ denotes the MTBDD over $n$ variables that represents the vector $A \cdot q$.
Operator SOLVE: [8] presents a method to decompose a regular matrix $A$ into a lower and an upper triangular matrix and a permutation matrix. Using this LU-decomposition we can obtain an operator $\mathrm{SOLVE}(A, Q)$ that takes as its input an MTBDD $A$ over $2n$ variables, where the corresponding matrix $A$ is regular, and an MTBDD $Q$ over $n$ variables which represents a vector $q$, and returns an MTBDD $Q'$ over $n$ variables which represents the unique solution of the linear equation system $A \cdot x = q$. Alternatively, we can use iterative techniques to solve the equations; our experiments indicate that this performs better.
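The text does not say which iterative technique is meant. As one possible reading, the sketch below applies plain fixed-point iteration to a system written in the form $x = Ax + b$, operating on ordinary Python lists rather than on MTBDDs. The function name `solve_iteratively`, the tolerance and the iteration cap are assumptions made for this illustration; convergence is only guaranteed when the iteration is contracting (spectral radius of $A$ below 1).

```python
def solve_iteratively(A, b, tol=1e-12, max_iter=10_000):
    """Approximate the solution of x = A x + b by fixed-point iteration."""
    n = len(b)
    x = [0.0] * n
    for _ in range(max_iter):
        new = [sum(A[i][j] * x[j] for j in range(n)) + b[i] for i in range(n)]
        # Stop once two successive iterates agree up to the tolerance.
        if max(abs(u - v) for u, v in zip(new, x)) < tol:
            return new
        x = new
    raise RuntimeError("fixed-point iteration did not converge")


# Hypothetical 2 x 2 instance: x0 = 0.01 * x1 + 0.98 and x1 = x0.
A = [[0.0, 0.01],
     [1.0, 0.0]]
b = [0.98, 0.0]
solution = solve_iteratively(A, b)
print(solution)
```

Each step is a matrix-vector product followed by a vector addition, which is why such schemes fit the matrix and vector operators described above without requiring a decomposition of the matrix.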

Description of (MT)BDDs by relational terms of the μ-calculus
We will use the μ-calculus as a notation for describing (MT)BDDs. In the algorithm in the next section, all our (MT)BDDs are either over $2n$ variables (in which case they represent $2^n \times 2^n$ matrices), or over $n$ variables (in which case they represent vectors of length $2^n$). For example, if $B$, $C$ are BDDs over $n$ variables and $u = (u_1, \ldots, u_n)$, $v = (v_1, \ldots, v_n)$, then $D = \lambda u\, v\, [B(u) \wedge C(v)]$ is a BDD over $2n$ variables; if $B$, $C$ represent the vectors $(b_i)_{1 \leq i \leq n}$ and $(c_i)_{1 \leq i \leq n}$ respectively, then $D$ represents the matrix whose element in the $i$th row and $j$th column is $b_i \wedge c_j$. The BDD $E = \lambda u\, [B(u) \wedge C(u)]$ is a BDD over $n$ variables, representing the vector $(b_i \wedge c_i)_{1 \leq i \leq n}$.
We write TRUE for the BDD over $n$ variables which returns 1 in all cases of its arguments. We write $\neg B$ instead of $\lambda x\, [\neg B(x)]$, and $B_1 \wedge B_2$ for the BDD $\lambda x\, [B_1(x) \wedge B_2(x)]$. If $x = (x_1, \ldots, x_n)$, $y = (y_1, \ldots, y_n)$ then $x = y$ abbreviates the formula $\bigwedge_{1 \leq i \leq n} (x_i \leftrightarrow y_i)$.
We require one further operator. If the labelled Markov chain $M = (S, P, L)$ is represented by an MTBDD P as described in Section 4.1, and $B_1$, $B_2$ are BDDs that represent the characteristic functions of subsets $S_1$, $S_2$ of $S$, then $\mathrm{REACH}(B_1, B_2, \mathrm{BDD}(P, >0))$ represents the set of states $s \in S$ from which there exists an execution sequence $s = s_0, s_1, \ldots, s_k$ with $k \geq 0$ and $s_0, \ldots, s_{k-1} \in S_1$, $s_k \in S_2$. It is used in the operator UNTIL defined in Section 5.

Operator REACH: Let $B_1$, $B_2$ be BDDs with $n$ variables and $T$ a BDD with $2n$ variables. We define $\mathrm{REACH}(B_1, B_2, T)$ to be the BDD over $n$ variables which is given by the μ-calculus formula

$$\mu Z\, \big[\lambda x\, [\, B_2(x) \vee (B_1(x) \wedge \exists y\, [Z(y) \wedge T(x, y)])\, ]\big].$$

This operator uses the method of [8] to obtain the BDD for a term involving the least fixed point operator $\mu$.
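The least fixed point defining REACH can be illustrated on explicit sets of states instead of BDDs. The sketch below iterates $Z \mapsto S_2 \cup (S_1 \cap pre(Z))$ from the empty set until nothing changes; the function name `reach`, the edge-set representation of $T$ and the short state names are assumptions made for the example.

```python
def reach(sat1, sat2, edges):
    """Least fixed point of Z = sat2 | (sat1 & pre(Z)) over explicit sets.

    edges is a set of pairs (s, t) with a transition of positive
    probability from s to t, playing the role of the BDD T.
    """
    z = set()
    while True:
        pre = {s for (s, t) in edges if t in z}   # states with a successor in z
        new = set(sat2) | (set(sat1) & pre)
        if new == z:                              # fixed point reached
            return z
        z = new


# Transitions of positive probability in Example 1 (hypothetical short names).
EDGES = {
    ("init", "del"),
    ("del", "init"), ("del", "lost"), ("del", "error"),
    ("lost", "del"),
    ("error", "init"),
}

reachable = reach({"del", "lost"}, {"init"}, EDGES)
print(sorted(reachable))
```

Because the iteration starts from the empty set and the update is monotone, the loop terminates after at most $|S|$ rounds on a finite state space.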

Model checking for PCTL
Our model checking algorithm for PCTL is based on established BDD techniques (i.e. converting boolean formulas to their BDD representation), which it combines with a new method, namely expressing the probability calculation for the probabilistic formulas in terms of MTBDDs. In the case of $[X\varphi]_{\sqsupseteq p}$ the probability is calculated by multiplying the transition matrix by the boolean vector set to 1 iff the state satisfies $\varphi$, whereas for $[\varphi_1 U \varphi_2]_{\sqsupseteq p}$ we derive an operator called UNTIL, based on [24], which we express in terms of MTBDDs.

Let $M = (S, P, L)$ be a labelled Markov chain which is represented by an MTBDD P over $2n$ variables as described in Section 4.1. For each PCTL formula $\varphi$, we define a BDD $B_\varphi$ over $x = (x_1, \ldots, x_n)$ that represents $Sat(\varphi) = \{s \in S : s \models \varphi\}$. We compute the BDD representation $B_\varphi$ of a PCTL formula $\varphi$ by structural induction:

$B_{[X\varphi]_{\sqsupseteq p}} = \mathrm{BDD}(\mathrm{MVMULTI}(P, B_\varphi), \sqsupseteq p)$
$B_{[\varphi_1 U \varphi_2]_{\sqsupseteq p}} = \mathrm{BDD}(\mathrm{UNTIL}(B_{\varphi_1}, B_{\varphi_2}, P), \sqsupseteq p)$

The operator $\mathrm{UNTIL}(B_{\varphi_1}, B_{\varphi_2}, P)$ assigns to each state $s \in S$ the probability of the set of full paths from $s$ satisfying $\varphi_1 U \varphi_2$; formally, it represents the function $S \to [0,1]$, $s \mapsto p_s$, where

$$p_s = Prob\{\pi \in Path_\omega(s) : \pi \models \varphi_1 U \varphi_2\}.$$

Our method for computing $p_s$ is based on the partition of $S$ introduced in [18], but we must compute with BDDs. We first compute the set $V = \{s \in S : p_s > 0\}$ and then set $V' = V \setminus Sat(\varphi_2)$. We then have: $p_s = 1$ if $s \models \varphi_2$; $p_s = 0$ if $s \notin V$; and for the remaining cases (i.e. those such that $s \in V'$)

$$p_s = \sum_{t \in V'} P(s,t) \cdot p_t + \sum_{t \in Sat(\varphi_2)} P(s,t) \cdot p_t + \sum_{t \in S \setminus V} P(s,t) \cdot p_t.$$

In the second term, each $p_t = 1$, and in the third term, each $p_t = 0$. Therefore $(p_s)_{s \in V'}$ satisfies a $|V'|$-dimensional equation system of the form $x = Ax + b$, or equivalently $(I - A)x = b$, where $I$ is the $|V'| \times |V'|$ identity matrix. One can show this system has a unique solution using the method in [24,18].
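The next-step case can be illustrated with explicit vectors in place of MTBDDs: a matrix-vector product followed by a threshold test. The sketch below uses the chain of Example 1 with states ordered 00, 01, 10, 11 (that is, $s_{init}$, $s_{lost}$, $s_{error}$, $s_{del}$); the helper names `mv_multiply` and `threshold` and the sample formula $[X a_1]_{\geq 1/2}$ are assumptions chosen for the illustration.

```python
from fractions import Fraction

# Transition matrix of Example 1, states ordered 00, 01, 10, 11.
P = [
    [0, 0, 0, 1],                                            # s_init
    [0, 0, 0, 1],                                            # s_lost
    [1, 0, 0, 0],                                            # s_error
    [Fraction(98, 100), Fraction(1, 100), Fraction(1, 100), 0],  # s_del
]


def mv_multiply(matrix, vec):
    """Explicit counterpart of MVMULTI: the vector matrix * vec."""
    return [sum(p * v for p, v in zip(row, vec)) for row in matrix]


def threshold(vec, p, strict=False):
    """Explicit counterpart of BDD(Q, >= p) or BDD(Q, > p)."""
    return [1 if (v > p if strict else v >= p) else 0 for v in vec]


# Sat(a_1) as a 0/1 vector: a_1 holds in 10 (s_error) and 11 (s_del).
sat_a1 = [0, 0, 1, 1]

next_probs = mv_multiply(P, sat_a1)            # Prob of "X a_1" per state
sat_next = threshold(next_probs, Fraction(1, 2))
print(next_probs, sat_next)
```

Here $s_{init}$ and $s_{lost}$ move to $s_{del}$ with probability 1, so they satisfy the sample formula, while $s_{del}$ reaches an $a_1$-state only with probability $\frac{1}{100}$.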
We now demonstrate how UNTIL can be expressed in terms of MTBDDs. Let $B_i = B_{\varphi_i}$, $i = 1, 2$. The set $V$ is given by the BDD $B = \mathrm{REACH}(B_1, B_2, \mathrm{BDD}(P, >0))$, and $V'$ by $B' = \lambda x\, [B(x) \wedge \neg B_2(x)]$. In order to avoid the BDD for the "new" transition matrix $A$ with $\lceil \log_2 |V'| \rceil$ variables, we instead reformulate the equation in terms of the matrix $P' = (p'_{s,t})_{s,t \in S}$ which is given by: $p'_{s,t} = P(s,t)$ if $s, t \in V'$ and $p'_{s,t} = 0$ in all other cases. The MTBDD P' for $P'$ can be obtained from the MTBDD P representing the Markov transition matrix. The following lemma shows that $I - P'$ is regular (we omit the proof).

Lemma 1. Let $V'$, $P'$, $I$ be as above. Then $I - P'$ is regular. The unique solution $x = (x_s)_{s \in S}$ of the linear equation system $(I - P') \cdot x = q$, where $q = (q_s)_{s \in S}$, $q_s = \sum_{t \in Sat(\varphi_2)} P(s,t)$, satisfies: $x_s = p_s$ if $s \in V'$.
The algorithm for the operator UNTIL is shown in Figure 1. It first calculates the BDDs $B$ and $B'$ for $V$ and $V'$. The BDD $B^2$ over $2n$ variables (built from $B'$, and distinct from $B_2$) is used as a mask to obtain $P'$ from $P$; it sets to 0 the entries not corresponding to states in $V'$. We next calculate the MTBDD $Q$ for the vector $q$, and use the operator SOLVE to obtain the MTBDD $Q'$ satisfying $F_{Q'}(s) = p_s$ for all $s \in V'$. The result, the MTBDD $Q''$ for the vector $p = (p_s)_{s \in S}$, is obtained from the MTBDD for the function $F(x) = \max\{F_{B_2}(x),\ F_{Q'}(x) \cdot F_{B'}(x)\}$, which uses $Q'$ for all $s \in V'$ and ensures that 1 is returned as the probability of the states already satisfying $\varphi_2$.

Example 3. Let $\varphi = [\textit{try to deliver}\ U\ \textit{correctly delivered}]_{\geq 0.9}$ where $\textit{try to deliver} = a_2$ and $\textit{correctly delivered} = \neg a_1 \wedge \neg a_2$. We consider the system in Example 1. Our algorithm first computes the BDDs $B_1$ for $Sat(\textit{try to deliver}) = \{s_{del}, s_{lost}\}$ and $B_2$ for $Sat(\textit{correctly delivered}) = \{s_{init}\}$, and then applies Algorithm $\mathrm{UNTIL}(B_1, B_2, P)$. $V = \{s_{init}, s_{del}, s_{lost}\}$ is represented by the BDD $B$, $V' = \{s_{del}, s_{lost}\}$ by the BDD $B'$. Thus, with rows and columns indexed by 00, 01, 10, 11, $B^2$, $P'$ and $A$ stand for the matrices

$$B^2 = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 \end{pmatrix}, \quad P' = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0.01 & 0 & 0 \end{pmatrix}, \quad A = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 0 \\ 0 & -0.01 & 0 & 1 \end{pmatrix}.$$

Figure 1.
Algorithm: $\mathrm{UNTIL}(B_1, B_2, P)$
Input: A labelled Markov chain represented by an MTBDD $P$ over $2n$ variables, BDDs $B_1$, $B_2$ over $n$ variables
Output: MTBDD over $n$ variables which represents the function that assigns to each state the probability of a path from the state reaching a $B_2$-state via an execution sequence through $B_1$-states
Method:
$B := \mathrm{REACH}(B_1, B_2, \mathrm{BDD}(P, >0))$;
$B' := \lambda x\, [B(x) \wedge \neg B_2(x)]$;
$B^2 := \lambda x_1 y_1 \ldots x_n y_n\, [B'(x_1, \ldots, x_n) \wedge B'(y_1, \ldots, y_n)]$;
$P' := \mathrm{APPLY}(P, B^2, \times)$;
$I := \lambda x_1 y_1 \ldots x_n y_n\, [x = y]$;
$A := \mathrm{APPLY}(I, P', -)$;
$Q := \mathrm{MVMULTI}(P, B_2)$;
$Q' := \mathrm{SOLVE}(A, Q)$;
$Q'' := \mathrm{APPLY}(B_2, \mathrm{APPLY}(Q', B', \times), \max)$;
Return($\mathrm{REDUCE}(Q'')$).
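The steps of the UNTIL algorithm can be traced on explicit matrices for the formula of Example 3. The sketch below mirrors the steps (REACH, masking, $I - P'$, the vector $q$, solving, and the final maximum) using Python lists and exact fractions in place of (MT)BDDs. The functions `reach`, `solve` and `until` are hypothetical stand-ins for the corresponding operators, and Gauss-Jordan elimination is used here only for simplicity; it is not claimed to be how SOLVE is realised in the implementation.

```python
from fractions import Fraction

# Example 1, states indexed 0..3 in the order 00, 01, 10, 11
# (s_init, s_lost, s_error, s_del).
P = [
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
    [Fraction(98, 100), Fraction(1, 100), Fraction(1, 100), 0],
]


def reach(sat1, sat2, P):
    """States with a positive-probability path through sat1 into sat2."""
    z = set()
    while True:
        new = set(sat2) | {s for s in sat1 if any(P[s][t] > 0 for t in z)}
        if new == z:
            return z
        z = new


def solve(A, b):
    """Gauss-Jordan elimination over exact fractions (A assumed regular)."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for c in range(n):
        piv = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[piv] = M[piv], M[c]
        M[c] = [v / M[c][c] for v in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                factor = M[r][c]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[c])]
    return [M[r][n] for r in range(n)]


def until(sat1, sat2, P):
    """Vector of probabilities p_s of satisfying phi1 U phi2."""
    n = len(P)
    V = reach(sat1, sat2, P)                       # B
    Vp = V - set(sat2)                             # B'
    # P' keeps P(s, t) only where both s and t lie in V' (mask B^2).
    Pp = [[P[s][t] if s in Vp and t in Vp else 0 for t in range(n)]
          for s in range(n)]
    A = [[(1 if s == t else 0) - Pp[s][t] for t in range(n)]
         for s in range(n)]                        # I - P'
    q = [sum(P[s][t] for t in sat2) for s in range(n)]
    x = solve(A, q)                                # Q'
    return [max(1 if s in sat2 else 0, x[s] if s in Vp else 0)
            for s in range(n)]                     # Q''


probs = until({1, 3}, {0}, P)          # try to deliver U correctly delivered
sat_phi = {s for s in range(4) if probs[s] >= Fraction(9, 10)}
print(probs, sorted(sat_phi))
```

Under these assumptions the computed probabilities are 1 for $s_{init}$, $\frac{98}{99}$ for $s_{lost}$ and $s_{del}$, and 0 for $s_{error}$, so every state except $s_{error}$ satisfies the formula of Example 3.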

Implementing PCTL model checking
We are integrating PCTL symbolic model checking within Verus [9], which is a tool specifically designed for the verification of finite-state real-time systems. Verus has already been used to verify several interesting real-time systems: an aircraft controller, a medical monitor, the PCI local bus, and a robotics controller. These examples were not originally modeled using probabilities. However, these systems exhibit behaviors which can best be described probabilistically. The integration of PCTL model checking with Verus allows us to verify stochastic properties of these and other interesting applications.
The Verus language is an imperative language with a syntax resembling that of the C language, with additional special primitives to express timing aspects such as deadlines, priorities, and delays. An important feature of Verus is the use of the wait statement to control the passage of time. In Verus, time only passes when a wait statement is executed: non-wait statements execute in zero time. This feature allows a more accurate control of time and leads to models with fewer states, since consecutive statements not separated by a wait statement are compiled into a single state. To describe probabilistic transitions we extend the Verus language with the probabilistic select statement.
From the Verus description of the application, the tool generates automatically a labeled state-transition graph and the corresponding transition probability matrix using BDDs and MTBDDs respectively.
The first experimental results of our PCTL symbolic model checking implementation are promising: Parrow's Protocol (which is of a similar size to Example 1) can be verified in less than a second. We have modeled a fault-tolerant system [23, pp. 168-171] with three processors that has about 35000 reachable states (out of $10^8$ states). A safety property of this system took only a few seconds to check. Next we plan to evaluate how well PCTL symbolic model checking performs as a formal verification tool in real applications by modeling industrial-size systems.
$\pi(k)$ denotes the $(k+1)$-th state of $\pi$. An execution sequence is also called a path, and a full path iff it is infinite. $Path_\omega(s)$ is the set of full paths $\pi$ with $first(\pi) = s$. For $s \in S$, let $\Sigma(s)$ be the smallest $\sigma$-algebra on $Path_\omega(s)$ which contains the basic cylinders $\{\pi \in Path_\omega(s) : \rho \text{ is a prefix of } \pi\}$ where $\rho$ ranges over all finite execution sequences starting in $s$. The probability measure $Prob$ on $\Sigma(s)$ is the unique measure with $Prob\{\pi \in Path_\omega(s) : \rho \text{ is a prefix of } \pi\} = P(\rho)$ where $P(s_0 s_1 \ldots s_k) = P(s_0, s_1) \cdot P(s_1, s_2) \cdot \ldots \cdot P(s_{k-1}, s_k)$.
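The measure of a basic cylinder is simply the product of the transition probabilities along its finite prefix. A minimal sketch for the chain of Example 1 follows; the function name `path_probability` and the short state names are assumptions made for the illustration.

```python
from fractions import Fraction

# Transition probabilities of Example 1; entries that are absent are 0.
P = {
    "init": {"del": Fraction(1)},
    "del": {"init": Fraction(98, 100), "lost": Fraction(1, 100),
            "error": Fraction(1, 100)},
    "lost": {"del": Fraction(1)},
    "error": {"init": Fraction(1)},
}


def path_probability(P, rho):
    """P(s_0 s_1 ... s_k) = P(s_0, s_1) * ... * P(s_{k-1}, s_k)."""
    prob = Fraction(1)
    for s, t in zip(rho, rho[1:]):
        prob *= P[s].get(t, Fraction(0))
    return prob


# Message lost once, then delivered correctly.
rho = ["init", "del", "lost", "del", "init"]
print(path_probability(P, rho))
```

A prefix consisting of a single state has probability 1, matching the fact that its cylinder is all of $Path_\omega(s)$.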