Department of Computer & Information Science
Technical Reports (CIS) University of Pennsylvania
Year
Model Checking of Hierarchical State Machines Rajeev Alur
Mihalis Yannakakis
University of Pennsylvania,
Bell Laboratories,
University of Pennsylvania Department of Computer and Information Science Technical Report No. MS-CIS-98-15. This paper is posted at ScholarlyCommons@Penn. http://repository.upenn.edu/cis reports/865
Model Checking of Hierarchical State Machines MS-CIS-98-15
Rajeev Alur and Mihalis Yannakakis
University of Pennsylvania School of Engineering and Applied Science Computer and Information Science Department Philadelphia, PA 19104-6389
Model Checking of Hierarchical State Machines (Extended abstract)
Rajeev Alur University of Pennsylvania and Bell Laboratories
[email protected]
Mihalis Yannakakis Bell Laboratories mihalisQresearch. bell-labs.com
April 13, 1998 Abstract Model checking is emerging as a practical tool for detecting logical errors in early stages of system design. We investigate the model checking of hierarchical (nested) systems, i.e. finite state machines whose states themselves can be other machines. This nesting ability is common in various software design methodologies and is available in several commercial modeling tools. The straightforward way to analyze a hierarchical machine is to flatten it (thus, incurring an exponential blow up) and apply a model checking tool on the resulting ordinary FSM. We show that this flattening can be avoided. We develop algorithms for verifying linear time requirements whose complexity is polynomial in the size of the hierarchical machine. We address also the verification of branching time requirements and provide efficient algorithms and matching lower bounds.
1
Introduction
Finite state machines (FSMs) are widely used in the modeling of systems in various areas. Descriptions using FSMs are useful to represent the flow of control (as opposed to data manipulation) and are amenable to formal analysis such as model checking. In the simplest setting, an FSM consists of a labeled graph whose vertices correspond to system states and edges correspond to system transitions. In practice, to describe complex systems using FSMs, several extensions are useful. We focus on hierarchical (Nested) FSMs in which vertices of an FSM can be ordinary states or superstates which are FSMs themselves. The ability to nest FSMs is common in many specification formalisms and methods. It was popularized by the introduction of STATECHARTS [Har87], and exists in related specification formalisms, eg. MODECHARTS [JM87], RSML[LHHR94]. It is a central component of various object-oriented software development methodologies that have been developed in recent years, such as OMT [RBPE91], ROOM[SGW94], and the Unified Modeling Language (UML [BJR97]). This capability is commonly available also in commercial computer-aided software engineering tools that are coming out, such as STATEMATE(by i-Logix), OBJECTIMEDEVELOPER(by ObjecTime), RATIONALROSE (by Rational). The nesting capability is useful also in formalisms and tools for the requirements and testing phases of the software development cycle. On the requirements side, it is used to specify scenarios (or use cases [Jac92]) in a structured manner. For instance, the new ITU standard 2.120 (MSC'96) for message sequence charts [MSC96, RGG961 formalizes scenarios of distributed systems in terms of hierarchical graphs built from basic MSG's. The Lucent u B E T toolset for behavioral requirements engineering (based on the MSC/POGA prototype tools [HPR97]) uses a similar formalism. Although formally these are not exactly hierarchical FSM's, our algorithms can be used to infer and check temporal properties of the whole hierarchical system from the properties of the basic MSG's. On the testing side, FSM's are used often to model systems for the purpose of test generation, and again the nesting capability is useful to model large systems. For example, Teradyne's commercial tool TESTMASTER[Apf95] is based on a hierarchical FSM model, and so is an internal Lucent test tool developed over many years for the testing of a large enterprise switch. Although these models are primarily developed for test generation, they can be used also for formal analysis. This is useful for systems with informal and incomplete requirements and design documentation, as is often the case, and especially for software that was developed and evolved over a long period of time, when the test models are updated for continued regression testing as the system evolves. As a simple example of a hierarchical FSM, consider specification of a digital clock. The top-level machine consists of a cycle though 24 superstates, one per hour of the day. Each such state, in turn, is a hierarchical state machine consisting of a cycle through 60 superstates counting minutes, each of which, in turn, is an (ordinary) state machine consisting of a cycle counting seconds. Hierarchical state machines have two descriptive advantages over ordinary FSMs. First, superstates offer a convenient structuring mechanism that allows us to specify systems in a stepwise refinement manner, and to look at it at different levels of granularity. Such structuring is particularly essential for specifying large FSMs via a graphical interface. Second, by allowing sharing of component FSMs (for instance, the 24 superstates of the top-level FSM of digital clock are mapped to the same hierarchical FSM corresponding to an hour), we need to specify components only once and then can reuse them in different contexts, leading to modularity and succinct system representations. In this paper, we consider algorithms for model checking when the description is given as a hierarchical state machine. Model checking is emerging as a practical method for automated debugging of complex reactive systems such as embedded controllers and network protocols (see [CK96, CW96] for surveys). Commercial tools for verification of hardware systems have appeared in the market in the last two years, eg. Lucent's FORMALCHECK system [FChk97]. On the software side, model checkers such as SPIN [Ho197] and VFSMVALID [FOG951 have been shown to be useful in the design and analysis of software in telecommunication and a wide variety of other areas (see for example the references in [Ho197] and the proceedings of the annual Spin workshop). In model checking, a high-level description of a system is compared against a logical correctness requirement to discover inconsistencies. Given a hierarchical FSM, one can obtain an ordinary FSM by flattening it, that is, by recursively substituting each superstate with its associated FSM. Such a flattening, however, can cause a blow-up, particularly when there is a lot of sharing. For instance, the hierarchical description of the digital clock has 24 + 60 + 60 = 144 vertices, while the flattened description
has 24 * 60 * 60 = 86,400 vertices. Thus, if we first flatten the machine, and then employ the existing model checking algorithms, the worst-case complexity would be exponential in the original description of the structure. Our results establish that such a flattening is unnecessary. Our first result concerns invariant verification problem, that is, establishing that all reachable states are included within a specified region. Invariant verification is the most common model checking- problem in practice, and can model safety requirements such as mutual exclusion and absence of deadlocks. We show that, even though some FSM may appear repeatedly in different contexts, it needs to be searched just once. We give a depth-first search algorithm that performs the reachability analysis with time complexity linear in the size of the hierarchical structure. Our second verification problem concerns verification of linear-time requirements [Pnu77, VW86, Ho197, Kur941 such as eventual reception. The commonly used formalisms for specifying requirements of system behaviors are automata and linear temporal logic. In the automata-theoretic formulation, we are given a hierarchical FSM I< and a Buchi automaton A that accepts undesirable behaviors, and we wish to check whether or not the languages of Ii' and A have a nonempty intersection. We show that this problem can be solved in time O ( ( K 1 . lAI2) (if li' was an ordinary FSM, this complexity would be O((Ii'1. JAl)). When the linear-time specification is given by a formula cp of (propositional) linear temporal logic (LTL) , using the known translations from LTL to Buchi automata [VW86, GPVW951, we get an algorithm for LTL model checking with time complexity O(IK( . 41~1). We note that usually the formulas 4 and automata A that specify correctness properties are very small (few temporal operators or states), while the system model Ii' is very large. Our third verification problem concerns branching-time requirements specified in the logic CTL [CE81, QS82]. The logic CTL can express both existential and universal pat11 properties, and is the specification logic of choice in symbolic model checkers. Unlike the linear-time case, the fact that the same FSM can appear in different contexts has a greater impact on verification of branching formulas. In fact, the complexity depends on the number of exit nodes of an FSM (exit-nodes of a hierarchical FSM Ii' are the states that are connected directly to the states of a higher level hierarchical FSM in which I< is embedded - some systems restrict to one exit node, while other systems allow multiple exit nodes). We give an algorithm for verifying that a hierarchical FSM K satisfies a CTLformula p with time complexity 0(1I<1 . 2191d), where each of the machines has a t most d exit nodes. We prove matching lower bounds by establishing that (1) the problem is P s ~ ~ c ~ - c o m p line tthe e size of the formula for single-exit machines, and (2) the problem is P s P A c E - c o ~ ~ ~ ~ ~ ~ in the size of the machine, when multiple exits are allowed. -
Related work Model checking for ordinary finite-state machines was first introduced in [CE81], and has been studied extensively since then. The complexity of analysis of concurrent finite-state machines is also well understood (see, for instance, [DH94]); concurrency makes the analysis problem exponentially harder. To our knowledge, there has been no previous study of the impact of hierarchical descriptions on the analysis problem. It is worth noting that hierarchical FSMs can be viewed as a special case of pushdown automata, where the length of the stack is bounded a priori. Model checking of pushdown automata is known to be decidable, both for linear-time and branching-time requirements in EXPTIME [BS92, BEM971.
2
Hierarchical State Machines
There are many variants of finite-state machines. We choose to present our results in terms of Kripke structures due t o their prevalent use in model checking literature [CE81]. Similar results hold for the other variants of FSMs, eg. Mealy and Moore type FSMs where inputs and/or outputs occur on the states or transitions. Kripke structures are state-transition graphs whose states are labeled with atomic propositions. Formally, given a finite set P of atomic propositions, a (flat) Kripke structure M over P consists of (1) a finite set W of states, (2) an initial state in E W, (3) a set R of transitions, where each transition is a pair of states, and (4) a labeling function L that labels each state with a subset of P. A hierarchical Ii'ripke structure Ii' over a set of propositions P is a tuple ( K l ,. . .I<,) of structures, where each K ihas the following components:
fail wait
Figure 1: A sample hierarchical structure A finite set Ni of nodes. A finite set Bi of boxes (or supernodes). An initial node ini E Ni. A subset Oi of Ni, called exit-nodes. A labeling function Xi : Ni H 2P that labels each node with a subset of P. : Bi H {i 1 . .. n) that maps each box to an index greater than i. An indexing function An edge relation E i . Each edge in Ea is a pair (u, v) such that (i) the source u is either a node of or a pair (wl, wZ),where wl is a box of ki,. with Y,(wl) = j and w2 is an exit-node of I$, and (ii) the sink v is either a node or a box of Ki.
+
lc
The edges connect nodes and boxes with one another. Since the structure associated with a box can have multiple exit-nodes, an edge out of a box specifies the exit-node also. An example of a hierarchical Kripke structure is shown in Figure 1. The top-level structure I
KF
for each structure K i , the set Wi of states of is defined inductively: (1) every node of 16 belongs then (u, v) belongs to Wi. t o Wi, and (2) if u is a box of Ka with x ( u ) = j, and v is a state of
IY,
KF
The set Ri of transitions of is defined inductively: (1) for (u, v) E Ei, if the sink v is a node then (u, v) E Ri, and if the sink v is a box with Yi(v) = j then ( u , (v, inj)) E Ri, and (2) if w is a box of Ki with x ( w ) = j , and (u, v) is a transition of K,: then ((w, u), (w, v)) belongs to Ri. The labeling function Ls : Wi H 2P of I<: is defined inductively: if w is a node of then Li(w) = Xi(w) , and if w = (u, v), where u is a box of I
x
Figure 2: The expanded structure The structure (W;,ini, R i , Li)is a flat Kripke structure over P, and is called the expanded structure of K ; , denoted h-r. We denote I<: also as K F , the expanded structure of Ii'. The size of K i , denoted IKi 1, is the sum of INi 1 , (BiI, IEil, and C b E BIOYz(b)I. , That is, for every box, we count the number of exit-nodes of the structure indexed by the box. The size of Ii' is the sum of the sizes of K i . The nesting depth of K , denoted nd(K), is the length of the longest chain i l , ia, . . . ij of indices such that a box of I<;,is mapped t o ir+1. Observe that the size of the expanded structure K F can be exponential in the nesting depth, and is o ( I K ( ~ ~ ( ~ ) ) . Other variants of this definition are possible. First, we can allow multiple entry-nodes in the definition. Such a structure can be translated t o our definition by replacing a structure with k entry-nodes with k structures, each with a single entry-node. We allow explicitly multiple exit-nodes because the number of exit-nodes has an important bearing on the complexity of the analysis problems t o be studied (particularly, for branching time). Second, we can allow edges of the form (u, v) where u is a box meaning that the edge may be taken whenever the control is inside the box u. That is, for an edge ( u , v), the expanded structure has an edge from every state with first component u t o v. Such a definition is useful in modeling interrupts, and can be translated to our definition by introducing a dummy exit node. Finally, the basic (unnested) nodes of the hierarchical structure could be themselves other types of objects; for example, they could be basic message sequence charts, in which case the hierarchical structure specifies a hierarchical (or high-level) MSC [HPR97, RGG961. In this case there is concurrency within each basic MSC, but the hierarchical graph provides a global view of the system. The propositions on the nodes reflect properties for the basic objects (eg. basic MSC's), and then from these we can infer properties of the executions of the whole hierarchical system. For example, a property may be whenever process A request a certain resource from B it is eventually granted; in this case there are two propositions (request and grant), we would label each node (basic MSC) with respect to them, and then check the required property in the hierarchical graph.
3 3.1
Verifying Linear-Time Requirements Reachability
For a hierarchical structure I<,a state v is reachable from state u if there is a path from state u to state v in the expanded structure K F . The input to the reachability problem consists of a hierarchical structure li',and a subset T C UiNi of nodes, called the target region. Given ( K , T ) , the reachability problem is t o determine whether or not some state whose last component is in the target region T is reachable from the entry-node inl. The target region is usually specified implicitly, for instance, by a propositional formula over P. We assume that given a node u, the test u E T can be performed in O(1) time. The reachability problem can be used t o check whether a hierarchical structure satisfies an invariant. The reachability algorithm is shown in Figure 3. We assume that the sets of nodes and boxes of all the structures are disjoint. We use N to denote UiNi, E t o denote UiEi etc. The algorithm performs a depth-first search, using the global data structure visited t o store the nodes and boxes. While processing a box b with Y ( b ) = i , the algorithm checks if the entry-node ini was visited before. The first time the algorithm visits some box b with index i, it searches the structure I<;by invoking DFS(ini). At the end of
Input: A hierarchical structure I< and a target region T . T). Output: The answer to the reachability problem (I<, visited: set of nodes and boxes, initially empty. done: for 1 i 5 n , done(i) is a subset of outi.
<
function DFS(u) if u E T then return YES; Insert(u, visited); if u E N then foreach ( u , V ) E E do if v # visited then DFS(v)
else i := Y(u);
if ini @ visited then DFS(ini); done(i) := visited foreach v E done(i) do foreach ((u, v), w) E E do if w @ visited then DFS(w)
n Oi;
end DFS. DFS(inl)
return NO. Figure 3: Reachability Algorithm this search, the set of exit-nodes of I
Theorem 1 (Reachability Algorithm) Algorithm 3 correctly solves the reachability problem (I<, T ) with time complexity O(I KI). For flat Kripke structures, deciding reachability between two states is in NLOGSPACE.For hierarchical structures, however, the reachability problem becomes PTIME-hardeven if we require a single exit for every structure.
.
Theorem 2 (Complexity of Reachability) T h e reachability problem (I<, T ) is PTIME-complete. Proof. By reduction from the alternating reachability problem (to appear in the full paper).
3.2
Cycle detection
The basic problem encountered during verification of liveness requirements is to check whether a specified state can be reached repeatedly. As in the reachability problem, the input t o the cycle-detection problem consists of a hierarchical Kripke structure I<,and a target region T C N. Given ( K , T ) , the cycle-detection problem is t o determine whether there exists a state u whose last component is in the target region T such that (1) u is reachable from the entry-node i n l , and (2) u is reachable from itself. The cycle-detection algorithm is shown in the appendix. The algorithm involves two searches, a primary and a secondary. The algorithm uses a global data structure visitedp to store the states encountered during the primary search, and visiteds to store the states encountered during the secondary search. The primary search is performed by the function DFSp. When the search backtracks from a node in the target region,
or from a box whose exit-node is in the target region, it initiates a secondary search. The secondary search is performed by the function DFSs, and it searches for a cycle. Since the stack stores a path leading to the node or the box from which the secondary search was initiated, secondary search terminates with answer YES when it encounters a node or a box on the stack. The interleaving of the two searches is similar to the cycle-detection algorithm for ordinary Kripke structures [CVWY92], except that we have both boxes and nodes here; the correctness is postponed to the full paper. For complexity analysis, observe that for every u E N U B , DFSp(u) is invoked a t most once, and DFSs(u) is invoked a t most once. This gives linear time complexity. Like the reachability problem, the cycle-detection problem is also PTIME-complete. Theorem 3 (Cycle-detection Algorithm) The cycle-detection problem ( K , T ) can be solved by a nested depth-first search with time complexity O(IK1).
3.3
Automata emptiness
Let M = (W, in, R , L) be a Kripke structure over proposition set P. For a state w E W , a source-w trajectory of M is an infinite sequence wowlw2.. . of states in W such that wo = w and W * R W ~for + ~all i 2 0. An initialized trajectory is a source-in trajectory. The trace corresponding to the trajectory wowlw2.. . is the infinite sequence L(wo)L(wl) . . . over 2P obtained by replacing each state with its label. The language C(M) consists of all the initialized traces of M . A Biichi automaton A over P consists of a Kripke structure M over P and a set T of accepting states. An accepting trajectory of A is an initialized trajectory wowlwa.. . of M such that wi E T for infinitely many i . The language C(A) consists of all traces corresponding to accepting trajectories of A. The input to the automata-emptiness problem consists of a hierarchical structure K over P and an automaton A over P. Given (I<, A), the automata-emptiness problem is to determine whether the intersection C(A) flC(lCF) is empty. This is the automata-theoretic approach to verification: if the automaton A accepts undesirable or bad behaviors, checking emptiness of C(A) n C(KF) corresponds to ensuring that the model has no bad behaviors. We solve the automata-emptiness problem (I<, A) by reduction to a cycle detection problem for the hierarchical structure obtained by constructing the product of K with A as follows. Let K = (kill . . .I<,) where each ICi = ( N i , B i , i n i , O i , X i , ~ , E i and ) , let A = ( W = { w l , . . . w , ) , w ~ , R , L , T ) . The product structure K @ A is the hierarchical structure ( K l l , . . . I
A box of ICij is a pair (b, w), where b E Bi and w E W ; the index of (b, w) is m(i' and w = wjl.
-
+
1) j r if Y (b) = ir
The entry-node of Kij is (ini, wj), and a node (u, w) is an exit-node of Kaj if u E Oi Consider a transition (w, x) of A. While pairing it with the edges in Ei, there are four cases to consider depending on whether the edge connects a node to a node, a node to a box, a box to a node, or a box to a box. (i) For an edge (u,v) of ICi with u , v E Nil X ( u ) = L(w), and X(v) = L(x), there is an edge ((u, w), (v, x)) in Kij. (ii) For an edge ( u , b) of Ii'i with u E Nil b E Bi, X(u) = L(w), and X ( Z ~ ~ = ( ~L(x), ) ) there is an edge ((u, w), (b, x)) in lCij. (iii) For an edge ((b, u), v) of Ki with b E Bi, u E OY(b)lv E Nil X(u) = L(w), and X(v) = L(z), for every y E W , there is an edge (((b, y), (u, w)), (v, x)) in Kij. (iv) For an edge ((b, u), c) of Ki with b, c E Bi, u E OY(b),X ( U ) = L(w), and X ( i n y(,)) = L(x), for every y E W , there is an edge (((b, y), (u, w)), (c, x)) in Kij . The correctness of the product construction is captured by the following lemma which says that some trace of lCF is accepted by A iff there is reachable cycle in lC@A containing a node of the form (u, w) with w E T. Lemma 4 (Product Construction) The intersection C(KF) cycle-detection question (Ii' 8 A, N x T ) is YES.
n C(A)
is nonempty iff the answer to the
Hence, we can solve the automata-emptiness question using the Cycle-detection Algorithm. In the construction of K @I A, the number of structures gets multiplied by the number of states of A, and the size of each structure gets multiplied by the size of A.
Theorem 5 (Automata-emptiness Algorithm) T h e a u t o m a t a - e m p t i n e s s question ( K ,A) c a n be solved by reduction t o t h e cycle-detection problem i n t i m e O(JA12.IK().
3.4
Linear Temporal Logic
Requirements of trace-sets can be specified using the temporal logic LTL. A formula cp of LTL over propositions P is interpreted over an infinite sequence over 2 P . A hierarchical structure K satisfies a formula cp, written Ii' (= cp, iff every trace in C ( K F ) satisfies the formula cp. The input t o the LTL m o d e l checking problem consists of a hierarchical structure K and a formula cp of LTL. The model checking problem (Ii', p ) is t o determine whether or not K satisfies p. To solve the model checking problem, we construct a Buchi automaton A,, such that the language C(A,,) consists of precisely those traces that do not satisfy cp. This can be done using one of the known translations from LTL t o Buchi automata [VW86, GPVW951. The size of A,, is 0(21,I). Then, the hierarchical structure Ii' satisfies p iff C(KF) fl C(A,,) is empty. Thus, the model checking problem reduces to the automata-emptiness problem. The complexity of solving the automata-emptiness problem, together with the cost of translating an LTL formula to a Biichi automaton, yields the following.
Theorem 6 (LTL Model Checking) T h e LTL model-checking problem (I<,cp) can be solved i n t i m e O(lI
Theorem 7 (LTL Model Checking Complexity) T h e LTL model-checking problem (Ii',cp) i s PSPACEcomplete.
4
Verifying Branching-Time Requirements
Now we turn our attention t o verifying requirements specified in the branching-time temporal logic CTL. Given a set P of propositions, the set of CTL formulas is defined inductively by the grammar
where p E P. For a Kripke structure M = (W, i n , R, L ) and a state w E W, the satisfaction relation w bMp is defined below: W ~ P iff p E L(w); w iff w cp; w b p A $ iff w k c p a n d w k g ; iff there exists a state u with wRu and u 1cp; w b 3 0 cp iff there exists a source-w trajectory wow1 . . . such that wi p for all i 2 0; w b 3ncp cp 3U$ iff there exists a source-w trajectory wow1 . . . such that w wk: b $ for some k 2 0, and wi cp for all 0 5 i < k. The Kripke structure M satisfies the formula cp, written M cp, iff i n bMcp. A hierarchical Kripke structure K satisfies the CTL formula cp iff KF 1 cp. The CTL model-checking problem is to determine, given a hierarchical Kripke structure Ii' and a CTL formula cp, whether K satisfies cp.
Input: A hierarchical structure Ii' and a CTLformula p . Output: The answer to tlie model-checking problem (K, p ) sub(p) := list of subformulas of p in increasing order of size. foreach $ E sub(p) do case $:
$ E P: skip; $ = YX: foreach u E N do if x @ X ( u ) then Insert($,X(u)); $ = $1 A $2: foreach u E N do if $1 E X ( u ) and $2 E X ( u ) then Insert ($,X(u)); $ = 3 0 X : Ii' := CheckNext(Ii',x); $ = 3 0 ~ K: := CheckAlways(Ii', x); 1C, = $1 3U$2: Ii' := CheckUntil(Ii', $2) if p E X ( i n l ) then return YES else return NO. Figure 4: CTLModel Checking
4.1
Single-exit case
First, we consider the case when every structure has a single exit-node. In this case, the edges from boxes need not specify the exit-node of the structure associated with the box, and we assume that each Ei is a binary relation over Ni U Bi. We will use outi to denote the unique exit-node of I - . The overview of the model checking algorithm is shown in Figure 4. At the beginning, the function X labels each node with a set of atomic propositions. The algorithm considers subformulas of p starting with the innermost subformulas, and labels each node with the subformulas that are satisfied at that node. At the end, if the entry-node of the first structure is labeled with p, then the original structure satisfies p. Processing of atomic propositions and subformulas of the form TX and $1 A $2 is straightforward. Handling of temporal operators requires modifying the structure, each subformula can double the number of structures. We will consider the cases corresponding to the operators r)and 3U. Processing of
r)
Consider a hierarchical structure K. Consider a formula $ = 3 0 p for a proposition p. We wish t o compute the set of nodes at which $ is satisfied. Consider a node u of K2. Multiple boxes of may be mapped t o li2,and hence, u may appear in multiple contexts in the expanded structure (i.e. there may be multiple states whose last component is u ) . If u is not the exit-node, then the successors of u do not depend on the context. Hence, the truth of $ is identical in all states corresponding to u, and can be determined from the successors of u within $ holds in u if some successor node of u is labeled with p or if the entry-node of the structure associated with some successor box of u is labeled with p. If u is the exit-node of Ii'2, then the truth of $ may depend on the context. For instance, the truth 3 0 abort at the exit-node fail of the structure K2 of Figure 1 is different in the two instances; the formula is false in ( t r y l , fail) and is true in (try2, fail). In this case, we create two copies of K2: li'; and I(:. The superscript indicates whether or not has some successor that satisfies p and is outside ICz. A box of the exit-node of which has a successor satisfying p is mapped t o Ii'i and t o Kg otherwise. The exit-node of I(: is labeled with $. The exit-node of Ii'; is labeled with $ only if it has a successor within K 2 that satisfies p. Now we proceed to define the computation of CheckNext more formally. The input to CheckNext is a hierarchical structure Ii' and a formula X. Let I<= (K1, . . . K,), where each K i = ( N i , Bi, ini, outi, X i , Ei). It returns a hierarcl~icalstructure I<' = ( I ~K:, ~ ,. . .I{:, IC;) by replacing each I(i with two structures Ii': = (Ni, Bi, ini, outi, X:, XI, E i ) and Ii't = (Ni, B i , ini, outi, X:, XI, Ei). Let us say that u E Ni U Bi has a X-successor if either (i) there exists a node v of with (u, v) E Ei and x E Xi(v), or (ii) there exists a box b of Ki with ( u , b) E Ei and x E Xk (ink) for k = X(b).
x,
For a box b of I6 with X(b) = j, if b has a X-successor then Xt(b) = 2 j , and otherwise, Y,'(b) = 2 j - 1.
For a node u of K:, if node u has a X-successor then 3 0 x E x:(u), and otherwise, 3 0 x $ x:(u). For a node u of K:, if node u has a X-successor or if u is the exit-node of Ii'i, then 3 0 x E X: (u), and otherwise, 3 0 x 4 X: (u). Observe that if the exit-node outi has a X-successor then K: and I(il are identical, and in this case, we can delete one of them (that is, there is no need to create two instances). P r o c e s s i n g of 3U Whether a node u of a structure Ii'i satisfies the until-formula $1 may depend on what happens after exiting K i , and thus, different occurrences may assign different truth values to $1 3U$2, requiring splitting of and $2. Let each structure into two. The input to CheckUntiE is a hierarchical structure Ii' and formulas K = (K1, . . .I&), where each Ki = (Ni, Bi, ini, outi, Xi, E l Ei). The computation proceeds in two phases. In the first phase, we partition the index-set (1,. . . n} into three sets, YES, NO, and MAYBE,with the following interpretation. An index i belongs to YES when the entry-node ini satisfies the until-formula $1 within the expanded-structure K.: Then, in K F , for every occurrence of K? the entry-node ini satisfies $1 3U$2. Now consider an index i that does not belong to YES. It belongs to MAYBEif within K: there is a (finite) trajectory from ini to exiti that contains only states labeled with $1. In this case, it depending on whether is possible that for some occurrences of ~i': in K ~the, entry-node satisfies $1 or not the exit-node satisfies $1 3Ugz. In the last case, the index i belongs to NO, and in every occurrence of K:, the entry-node does not satisfy the formula $1 %$2. The desired partitioning is computed by the following procedure. YES:= 0; NO:= 0; MAYBE:=0; For i = n downto 1 do If there exists a source-ini trajectory wo . . . wm over Ni U Bi such that (i) (wj1wj+l) E E for 0 j < m, (ii) for 0 j < m, wj E N with $1 E X(wj) or wj E B with Y(wj) €MAYBE,and (iii) w, E N with $2 E X(w,) or w, E B with Y(w,) EYES Then add i to YES Else If there exists a source-ini trajectory wo . . . w, over Ni U Bi such that (i) (wjl wj+l) E E for 0 5 j < m , (ii) for 0 5 j < m, wj E N with $1 E X(wj) or wj E B with Y(wj) €MAYBE,and (iii) w, = outi Then add i to MAYBE Else add i to NO.
<
<
The computation for each index i can be performed by a depth-first search starting a t the node ini in time 1Ki\. In the second phase, the new hierarchical structure K' along with the labeling with $1 3U$2 is constructed. Each structure K iis split into two: KP and K i . A box b that is previously mapped to will be mapped t o I<: if there is a path starting at b that satisfies $1 3U$2, and otherwise to I{:. Consequently, nodes within I<: can satisfy $1 3U$2 along a path that exits K i l while nodes within I(i0 can satisfy dl 3U$2 only if they satisfy it within Ki. The new structure K' = ( K y , I<:, . . . K:, I<:) is obtained by replacing each Ii'i of K with two structures K: = (Ni, Bi, ini, outi, X:, yZo, Ei) and I<: = (Ni, Bi, ini, outi, X:, q l , Ei). For u , v E Ni U Bi, let us say that v is $;-successor of u if there exists a finite sequence wo . . . w, over Ni U Bi such that (i) wo = u , (ii) w, = v, (iii) (wk, wk+l) E Ea for 0 5 k < m, (iv) if u E Bi then $1 E X(outy(,)), and (v) for 1 5 k m, wk E Ni with $1 E Xi(wk) or wk E Bi with Yi(wk) €MAYBE. For u E Ni U B i , let us say that u has a $2-successor if there exists v E Ni with (u, v) E Ei and $2 E X(v) or v E Bi with (u, v) E Ei and Y(v) EYES.
<
.
For the indexing of boxes of K:, consider a box b with Y,(b) = j . If b has a $;-successor u which has r , q o ( b ) = 2j, else xO(b)= 2 j - 1. a $ 2 - s u ~ ~ e s s othen For the indexing of boxes of I<:, consider a box b with Yi(b) = j . If b has a $;-successor u which has a $z-successor, or if outi is a $;-successor of b, then x l ( b ) = 2j, else xl(b)= 2 j - 1.
For labeling of a node u of K!, if u has a $T-successor v which has a $2-successor, then X;(U) equals X i ( u ) with g1 3Ut+h2added t o i t , else it equals Xi(u). For labeling of a node u of if u has a $;-successor v which has a $2-successor, or if outi is a $;-successor of u , then X:(u) equals Xi(u) with $1 3 7 4 9 added to it, else it equals X i ( u ) . The con~putationrequired in this phase can be performed in time O(I K 1) in a manner similar t o the labeling algorithm of CTL [CEsl].
Complexity Processing of each subformula requires time linear in the size of the current structure. Processing of every temporal connective at worst doubles the size of the current structure. This gives the following result.
Theorem 8 (CTL Model Checking: Single exit case) Algorithm 4 solves the CTL model checking probp), for a single-exit hierarchical structure K , in time O((Ii'(. 21~1). lem (I<, It is known that deciding whether a flat Kripke structure M satisfies a CTL formula p can be solved in space O(lp1 . log (MI) [BVW94]. For a hierarchical structure K , the size of the expanded structure K F is ~ ( l h - l ~ ~ it( follows ~ ) ) , that CTL model checking for hierarchical structures is in PSPACE.We now establish a PSPACElower bound for the case of single-exit structures.
.
Theorem 9 (CTL Model Checking Complexity: Single exit case) Model checking of CTL formulas for single-exit hierarchical structures is P s P A c E - c o ~ ~ ~ ~ ~ ~ . Proof. By reduction from QBF (proof left t o the full paper).
4.2
Multiple exit case
Now consider the case when the hierarchical structure K has multiple exits. The model checking algorithm is similar t o the algorithm for the single exit case, except now more splitting may be required. For instance consider a structure K i with 2 exit-nodes u and v , and a formula = 3 0 p. For different boxes mapped to K i , whether the exit-node u satisfies 14 can vary, and similarly, whether the exit-node v satisfies $ can vary. Consequently, we need to split Ki into four copies, depending whether both, only u , only v , or none, have $-successor outside 1 6 . In general, if there are d exit-nodes, processing of a single temporal subformula can generate 2d copies of each structure in the worst case.
Theorem 10 (CTL Model Checking) The CTL model checking problem (I<,p) can be solved in time O(lI{l . 21qld), where each structure of Ii: has at most d exit-nodes. The alternative approach of applying known model checking procedures t o the expanded structure gives a ~ )alternatively, ), a PSPACEbound. The next result states that the lower time bound of O((p1. I K J ~ ~ ( or bound of PSPACEapplies even for some small fixed CTL formula, and thus, CTL model checking becomes indeed harder with multiple exit nodes.
.
Theorem 11 (CTL Model Checking Complexity) The structure-complexity of CTL model-checking for hierarchical structures is P s P A c E - c o ~ ~ ~ ~ ~ ~ . Proof. By reduction from QBF (proof left t o the full paper).
5
Conclusions
In this paper, we have established that verification of hierarchical machines can be done without flattening them first. We presented efficient algorithms, and matching lower bounds, for the model checking problem for all the commonly used specification formalisms. Our results are summarized in Figure 5. Our results establish that hierarchical specifications offer exponential succinctness at a minimal price. A topic of further research concerns aiialysis problems for communicating hierarchical FSMs. Acknowledgements. We thank Sampath Kannan for fruitful discussions.
Figure 5: Summary of results
References [Apf95]
L. Apfelbaum. Automated functional test generation. In Proc. IEEE Autotestcon Conference, 1995. See also, http://teradyne.com/prods/sst/product~center/t~main.html.
[BVW94]
0 . Bernholtz, M.Y. Vardi, and P. Wolper. An automata-theoretic approach t o branching-time model checking. In Computer Aided Verification, Proc. 6th Int. Conference, LNCS 818, pages 142-155, 1994.
[BJR97]
G . Booch, I. Jacobson, and J . Rumbaugh. Unified Modeling Language User Guide. Addison Wesley, 1997. See also http://www.rational.com/uml/qr/.
[BEM97]
A. Bouajjani, J . Esparza, and 0 . Maler. Reachability analysis of pushdown automata: Application to model checking. Proc. CONCUR19'I, LNCS 1243, 1997.
[BS92]
0. Burkart and B. Steffen. Model checking for context-free processes. Proc. CONCUR'92, LNCS 630, pp. 123-137, 1992.
[CE81]
E.M. Clarke and E.A. Emerson. Design and synthesis of synchronization skeletons using branching time temporal logic. In Proc. Workshop on Logic of Programs, LNCS 131, pages 52-71. Springer-Verlag, 1981.
[CK96]
E.M. Clarke and R.P. Kurshan. Computer-aided verification. IEEE Spectrum, 33(6):61-67,1996.
[CW96]
E.M. Clarke and J.M. Wing. Formal methods: State of the art and future directions. ACM Computing Surveys, 1996.
[CVWY92] C. Courcoubetis, M.Y. Vardi, P. Wolper, and M. Yannakakis. Memory efficient algorithms for the verification of temporal properties. Formal Methods in System Design, 1:275-288, 1992. [DH94]
D. Drusinsky and D. Harel. On the power of bounded concurrency I: finite automata. JACM 41(3), 1994.
[FChk97]
Bell Labs Design Automation. FormalCheck, www.bel1-labs.com/formalcheck.
[FOG951
A.R. Flora-Holmquist, J.D. O'Grady, and M.G. Staskauskas. Telecommunications software design using virtual finite state machines. Proc. Intl. Switching Symposium (ISS95), pp. 103-107, 1995.
[GPVW95] R. Gerth, D. Peled, M.Y. Vardi, and P. Wolper. Simple on-the-fly automatic verification of linear temporal logic. Protocol Specification Testing and Verification, pp. 3-18, 1995. [Hart371
D. Harel. Statecharts: A visual formalism for complex systems. Science of Computer Programming, 8:231-274, 1987.
[Ho196]
G.J. Holzmann. Early fault detection tools. LNCS 1055, pp. 1-13, 1996. (invited paper, reprinted in Software Concepts and Tools, Vol. 17, No. 2, pp. 63-69, 1996).
[Ho197]
G.J. Holzmann. The model checker SPIN. IEEE Trans. on Software Engineering, 23(5):279-295, 1997.
[HPR97]
G.J. Holzmann, D. A. Peled, M. H. Redberg. Design tools for for requirements engineering. Bell Labs Technical Journal, 2(1):86-95, 1997.
[Jac92]
I. Jacobson. Object-oriented software engineering: a use case driven approach. Addison-Wesley, 1992.
[JM87]
F. Jahanian and A.K. Mok. A graph-theoretic approach for timing analysis and its implementation. IEEE Trans. on Computers, C-36(8):961-975, 1987.
[Kur94]
R.P. Kurshan. Computer-aided Verification of Coordinating Processes: the automata-theoretic approach. Princeton University Press, 1994.
[LHHR94] N.G. Leveson, M. Heimdahl, H. Hildreth, and J.D. Reese. Requirements specification for process control systems. IEEE Trans. on Software Engineering, 20(9), 1994. [MSC96]
ITU-T Recommendation 2.120. Message Sequence Chart (MSC). 1996.
[Pnu77]
A. Pnueli. The temporal logic of programs. I11 Proc. of the 18th IEEE Symposium on Foundations of Computer Science, pages 46-77, 1977.
[QS82]
J.P. Queille and J . Sifakis. Specification and verification of concurrent programs in CESAR. Proc. of the 5th Intl. Symp. on Programming, LNCS 137, pp. 195-220, 1982.
[RBPE91] J . Rumbaugh, M. Blaha, W. Premerlani, F. Eddy, and W. Lorensen. Object-oriented modeling and design. Prentice-Hall, 1991. [RGG96]
E. Rudolph, J . Grabowski, P. Graubmann. Tutorial on Message Sequence Charts (MSC'96). FORTE/PSTV'96, 1996.
[SGW94]
B. Selic, G. Gullekson, and P. T. Ward. Real-time object oriented modeling and design. J . Wiley, 1994. See also http://www.objectime.com/.
[VW86]
M.Y. Vardi and P. Wolper. An automata-theoretic approach to automatic program verification. Proc. of the First IEEE Symp. on Logic in Computer Science, pp. 332-344, 1986.
Appendix: Cycle Detection Algorithm Input: A hierarchical structure K and a target region T. Output: The answer t o the cycle-detection problem (I<,T). uisitedp, visiteds: sets of nodes and boxes, initially empty done: for 1 i n , done(i) is a subset of outi Stack: Stack of nodes and boxes
< <
function DFSp ( u ) Push(u, Stack); Insert ( u , visitedp); if u E N then foreach ( u , u ) E E do if v @ visitedp then D F S p ( v ) ; if u E T and u (t visiteds then D F S s ( u ) else i := Y ( u ) ; if ini @ visitedp then D F S p ( i n i ) ; done(i) := visitedp n Oi; foreach v E done(i) do foreach ( ( u ,v ) , w ) E E do if w @ visitedp then D F S p ( w ) ; if v E T then foreach ( ( u ,v ) , w ) E E do if w E Stack then return Y E S ; if w @ visiteds then D F S s ( w ) ; Pop(Stack); end D F S p . function D F S s ( u ) Insert ( u , visiteds); if u E N then foreach ( u ,u ) E E do if v E Stack then return Y E S ; if v 6 visiteds then D F S s ( u ) else foreach u E done(i) do foreach ( ( u ,v ) , w ) E E do if w E Stack then return Y E S ; if w @ visiteds then D F S s ( w ) end D F S s . D F S p ( i n 1) return NO.
Symbolic Exploration of Transition Hierarchies* Rajeev Alur*
Thomas
A. Henzinger*
Sriram K. R a j a m a n i t
Abstract. In formal design verification, successful model checking is typically preceded by a laborious manual process of constructing design abstractions. We present a methodology for partidy-and in some cases, fully-bypassing the abstraction process. For this purpose, we provide to the designer abstraction operators which, if used judiciously in the description of a design, structure the corresponding state space hierarchically. This structure can then be exploited by verification tools, and makes possible the automatic and exhaustive exploration of state spaces that would otherwise be out of scope for existing model checkers. Specifically, we present the following contributions: - A temporal abstraction operator that aggregates transitions and hides intermediate steps. Mathematically, our abstraction operator is a function that maps a flat transition system into a two-level hierarchy where each atomic upper-level transition expands into an entire lower-level transition system. For example, an arithmetic operation may expand into a sequence of bit operations. - A BDD-based algorithm for the symbolic exploration of multi-level hierarchies of transition systems. The algorithm traverses a level-n transition by expanding the corresponding level-(n - 1) transition system on-the-fly. The level-n successors of a state are determined by computing a level-(n - 1) reach set, which is then immediately released from memory. In this fashion, we can exhaustively explore hierarchically structured state spaces whose flat counterparts cause memory overflows. - We experimentally demonstrate the efficiency of our method with three examples-a multiplier, a cache coherence protocol, and a multiprocessor system. In the first two examples, we obtain significant improvements in run times and peak BDD sizes over traditional state-space search. The third example cannot be model checked at all using conventional methods (without manual abstractions), but can be analyzed fully automatically using transition hierarchies.
1
Introduction
Formal design verification is a methodology for detecting logical errors in high-level designs. I n formal design verification, t h e designer describes a system i n a language w i t h a mathematical semantics, a n d t h e n t h e system description is analyzed against various
* This research was supported in part by the Office of Naval Research Young Investigator award N00014-95-1-0520, by the National Science Foundation CAREER award CCR9501708, by the National Science Foundation grant CCR-9504469, by the Air Force Office of Scientific Research contract F49620-93-1-0056, by the Army Research Office MURI grant DAAH-04-96-1-0341, by the Advanced Research Projects Agency grant NAG2-892, and by the Semiconductor Research Corporation contract 95-DC-324.036. ** University of Pennsylvania,
[email protected] University of California at Berkeley,
[email protected] University of California at Berkeley,
[email protected]
correctness requirements. The paradigm is called model checking when the analysis is performed automatically by exhaustive state-space exploration. A variety of model checkers, such as COSPAN [HZR96], Mur4 [Di196], SMV [CKSVG96], SPIN [HP96], and VIS [BSVH+96] have been proven effective aids in the design of error-prone system components such as cache coherence protocols [CK96]. As we seek t o enhance the applicability of model checking to complex designs, we are faced with the so-called state-explosion problem: the size of the state space grows exponentially with the size of the system description, making exhaustive state-space exploration infeasible. Consequently, to use the current tools effectively, one needs t o focus on a critical system component. Since the behavior of an individual con~ponent typically depends on its interaction with other components, a component cannot be analyzed in isolation; rather, for a meaningful analysis, all relevant aspects of the surrounding system need to be identified. This process, called abstraction, is usually performed in an informal, manual fashion, and requires considerable expertise. Indeed, it is not uncommon that a successful verification or falsification run, using a few seconds of CPU time, depends on months of manual labor for constructing abstractions that are neither too coarse to invalidate the correctness requirements, nor too detailed t o exhaust the tool capacities. The goal of our research is t o systematize and, whenever possible, automate the construction of useful abstractions. Our approach is to provide the designer, within the system description language, with operators for writing mental design abstractions into the system description. The judicious use of such operators is called design for verifiability, because it simplifies-and in some cases, eliminates-the process of "rediscovering" abstractions after the design is completed. Our abstraction operators are motivated by the two main, orthogonal structuring principles for designs: (1) spatial aggregation together with hiding of spatial details, and (2) temporal aggregation together with hiding of temporal details. Type-(1) abstractions enable the design of components at different levels of spatial granularity: an ALU can be designed by aggregating gates, then used as an atomic block (after hiding internal gates and wires) in the design of a processor. Type-(2) abstractions enable the design of components at different levels of temporal granularity: an arithmetic operation can be designed by aggregating bit operations, then used as an atomic step (after hiding intermediate results) in the design of an instruction. Spatial, type-(1) abstractions can be written into a system description using, for aggregation, the parallel composition of subsystems and, for hiding, the existential quantification of variables. According operators are provided by most system description languages. They are exploited heavily, both to facilitate the description of complex systems, and t o obtain heuristics for coping with state explosion. For instance, in symbolic state-space exploration using BDDs, instead of building a single transition relation for the entire system, one typically maintains a set of transition relations, one for each component [TSLSO]. By contrast, most system description languages do not provide operators for defining temporal, type-(2) abstractions. We have introduced such an operator, called next, and shown how it facilitates the description of complex systems, in a language called Reactive Modules [AH96]. In this paper, we show how the n e x t operator can be exploited in symbolic state-space exploration to enhance the power of model checking. Specifically, if M is a system description, and cp is a condition on the variables of M , then n e x t cp for M describes the same system at a more abstract temporal level: a -
-
single transition of n e x t cp for M aggregates as many transitions of M as are required to satisfy the condition cp, and hides the intermediate steps. For example, if M is agatelevel description of an ALU, and cp signals the completion of an arithmetic operation, then n e x t cp for M is an operation-level description of the ALU. Mathematically, the semantics of n e x t cp for M is defined as a two-level hierarchy of transition systems: each transition of the upper-level (e.g., operation-level) transition system abstracts an entire lower-level (e.g., gate-level) transit ion system. Then, by nesting n e x t operators we obtain multi-level hierarchies of transition systems. The structuring of a state space into a multi-level transition hierarchy makes possible the exhaustive exploration of very large state spaces. This is because after the traversal of a level-n transition, the computed reach set for the corresponding level-(n - 1) transition system represents hidden intermediate steps and can be removed from memory. In Section 2, we briefly review the language of Reactive Modules and give a simple example of a transition hierarchy. In Section 3, we introduce an algorithm for the symbolic exploration of transition hierarchies. In Section 4, we present experimental results that demonstrate the efficiency of our algorithm. For this purpose, we design a system comprising two processors with simple instruction sets, local caches, and shared memory. If we simply put together these components, using parallel composition but no n e x t operator, the resulting flat transition system is far beyond the scope of existing model checkers. If, however, we use the n e x t operator t o aggregate and hide internal transitions between synchronization points before composing the various subsysten~s, the resulting transition hierarchy can be explored using the search routines of VIS, and correctness requirements can be checked fully automatically. Thus, the description of a design using n e x t can eliminate the need for manual abstractions in verification. R e l a t e d work. The concept of temporal abstraction is inspired by the notion of multiform time in synchronous programming languages [BlGJSl, Ha1931, and by the notion of action rejinement in algebraic languages [AH89]. All of that work, however, concerns only the modeling of systems, and not automatic verification. Temporal abstraction is implicitly present also in the concept of stuttering [Lam83]: a stuttering transition of a system is a transition that leaves all observable variables unchanged. Ignoring differences in the number of stuttering transitions leads t o various notions of stutter-insensitive equivalences on state spaces (e.g., weak bisimulation). This suggests the following approach to model checking: for each component system, compute the appropriate stutter-insensitive equivalence, and before search, replace the component by the smaller quotient space. This approach, which has been implemented i11 tools such as the Concurrency Workbench [CPS93], requires the manipulation of the transition relations for individual components, and has not been shown competitive with simple search (cf. Section 3.1 vs. Section 3.2). Partial-order methods avoid the exploration of unnecessary interleavings between the transitions of component systems. Gains due to partial-order reduction, in space and time, for verification have been reported both in the case of enumerative [HP94] and BDD-based approaches [ABH+97]. By declaring sequences of transitions to be atomic, the n e x t operator also reduces the number of interleavings between concurrent transitions. However, while partial-order reductions need to be "discovered" a posteriori from the system description, transition hierarchies are specified a priori by the designer, as integral part of the system description.
2 2.1
Example: From Bit Addition to Word Addition Reactive Modules
We specify systems as Reactive Modules. A formal definition of reactive modules can be found in [AHSG]; here we give only a brief introduction. The state of a reactive module is determined by the values of three kinds of variables: the external variables are updated by the environment and can be read by the module; the interface variables are updated by the module and can be read by the environment; the private variables are updated by the module and cannot be read by the environment (this distinction is similar to 1/0 automata [LynSG]). For example, Figure 2 shows a module that adds two words. The environment of the word-adder consists of two modules: a command module, which provides the operands to be added and an instruction that they be added, and a bit-adder, which is called repeatedly by the word-adder. Hence the word-adder has the external variables add0pl and addOp2 of type WORD, which contain the two operands provided by the command module, the external variable doAdd of type BOOLEAN, which is set by the command module whenever the two operands should be added, and three output bits of the bit-adder: the sum bitResult, the carry-out cOut, and the flag doneBitAdd, which is set whenever a bit-addition is completed. The word-adder has the interface variables addResult of type WORD, which contains the sum, overJEow of type BOOLEAN, which indicates addition overflow, doneAdd of type BOOLEAN, which is set whenever a wordaddition is complete, and four input bits for the bit-adder: the operands bit1 and bit2, the carry-in cIn, and the flag doBitAdd, which instructs the bit-adder to perform a bitaddition. The word-adder has the private variables state of type FLAGTYPE, which indicates if an addition is being executed, and bitCount of type LOGWORD, which tracks the position of the active bits during the addition of two words. We assume that, once a word-addition is requested, the command module keeps the variable doAdd true until the word-adder signals completion of the addition by setting doneAdd to true. The state of a reactive module changes in a sequence of rounds. In the first round, the initial values of all interface and private variables are determined. In each subsequent round, the new values of all interface and private variables are determined, possibly dependent on some latched values of external, interface, and private variables from the previous round, and possibly dependent on some new values of external variables from the current round. No assumptions are made about the initial values of external variables, nor on how they are updated in subsequent rounds. However, in order to avoid cyclic dependencies between variables, it is not permitted that within a single round, a module updates an interface variable x dependent on the new value of an external variable y while the environment updates y dependent on the new value of x. This restriction is enforced by collecting variables into atoms that can be ordered linearly such that in each round, the variables within an atom can be updated simultaneously provided that all variables within earlier atoms have already been updated. Thus, a round consists of several subrounds-one per atom. A round of the word-adder consists of four subrounds: first, the command module may provide new operands and issue an add instruction; second, the word-adder may initiate a bit-addition; third, the bit-adder may perform a bit-addition; fourth, the word-adder may record the result of a bit-addition and signal a completion of the word-addition. Accordingly, the interface and private variables of the word-adder are
addO[l
iAdd
add";
addResult overflow doneAdd
doBitAdd
"
bztl
"
bzt2
" cIny
I T cout
BitAdder
batResult doneBitAdd
Fig. 1. Word-adder and bit-adder
grouped into two atoms: bill, bit2, cIn, doBitAdd, state, and bitcount are updated in the second subround of each round; addResult, ouerjlow, and doneAdd in the fourth. The first and third subrounds of each round are taken by the command module and the bit-adder, respectively. The bit-adder, shown in Figure 3, needs one round for bitaddition, but can choose to wait indefinitely before servicing a request. A word-addition of two n-bit numbers, therefore, requires a t least n rounds-one round for each bitwise addition. In the first of these rounds, the word-adder reacts to the command module, and the first bits may (or may not) be added. In the last of these rounds, the n-th bits are added, and the word-adder signals completion of the addition. Figure 1 gives a block diagram of the word-adder composed with the bit-adder, and Figures 2 and 3 provide formal descriptions of both components. For each atom, the initial values of the variables and their new values in each subsequent (update) round are specified by guarded commands (as in UNITY [CM88]). In each round, unprimed symbols, such as x , refer to the latched value of the variable x , and primed symbols, such as x', refer to the new value. An atom reads the variable x if its guarded commands refer to the latched value of x. The atom awaits x if its guarded commands refer to the new value x'. The await dependencies between variables are required to be acyclic. A variable is history-free if it is not read by any atom. For obvious reasons, the values of history-free variables do not have to be stored during state-space traversal. 2.2
Flat vs. Hierarchical Models
We discuss two operations for building complex reactive modules from simple ones. The parallel-composition operator abstracts spatial complexities of a system by collecting the atoms of several modules within a single n~odule.The next operator abstracts temporal complexities of a system by combining several rounds of a module as subrounds of a single round. Intuitively, if M is a reactive module and cp is a condition on the
Module WordAdder type F L A G T Y P E : { I D L E , W O R K I N G ) ; interface addResult : W O R D ; b i t l , bite, c l n , doBitAdd, doneAdd, overflow : B O O L E A N private state : F L A G T Y P E ; bitcount : L O G W O R D external a d d O p l , addOp2 : W O R D ; doAdd, doneBitAdd, bitResult, c O u t : B O O L E A N a t o m controls state, b i t c o u n t , doBitAdd, b i t l , bite, c I n reads doneBitAdd, bitResult, cOut awaits a d d O p l , addOp2, doAdd init 1 true -+ state' := IDLE update ( s t a t e = I D L E ) A doAddl -+ state' := W O R K I N G ; bitCountf := false; doBitAddl := true; bitl' := addOpll[O]; bit2' := addOp2'[0]; cln' :=false 0 ( s t a t e = W O R K I N G ) A (bitCount < W O R D L E N G T H - 1 ) doneBitAdd bitl' := a d d O p l t [ b i t C o u n t+ I ] ; bit2' := addOp2'[bitCount + 11; cln' = c O u t ; bitCountl = b i t c o u n t + 1 1 ( s t a t e = W O R K I N G ) A ( b i t c o u n t = W O R D L E N G T H - 1 ) A doneBitAdd doBitAdd1 = false; state' := I D L E endatom
-
a t o m controls addResult, doneAdd, overflow reads b i t c o u n t awaits bitResult, doneBitAdd, cOut init 1true doneAddl :=false update 1( b i t c o u n t < W O R D L E N G T H - 1 ) doneBitAddl addResultf[bitCount] := bitResultf 1 ( b i t c o u n t = W O R D L E N G T H - 1 ) doneBitAddt addResultf[bitCount] := bitResuIt; doneAddl := true; overflow' = c O u t l endatom
-
-
endmodule Fig. 2 . Word-adder
variables of M , then n e x t cp for M is a module that, in a single round, iterates as many rounds of M as are necessary to make the so-called aggregation predicate cp true. Formal definitions of parallel composition and next-abstraction can be found in [AH96]. Consider, for example, the following two modules: C o n c r e t e A d d e r = W o r d A d d e r I( B i t A d d e r A b s t r a c t A d d e r = n e x t ( d o A d d j d o n e A d d ) for C o n c r e t e A d d e r
Each of the two modules is a model for an addition unit that consists of two components,
Module BitAdder interface doneBitAdd, bitResult, cOut : BOOLEAN external bitl, bit2, cIn, doBitAdd : BOOLEAN a t o m doneBitAdd, bitResuIt, cOut awaits bitl , bit2, cIn,doBitAdd init true doneBitAddl :=false update 0 doBitAddl + bitResultl := bitl' $ bit2' $ cInl; cOutl := (bit1'&bit2~)l(bit2'&cIn')I(cIn~& bitl I); doneBitAddl := true 0 true doneBztAddl :=false; endatom endmodule
-
-
Fig. 3. Bit-adder a word-adder composed with a bit-adder. The two models differ only in their level of temporal granularity. In the flat model ConcreteAdder, the addition of two n-bit words takes a t least n rounds. In the hierarchical model AbstractAdder, the addition of two n-bit words takes a single round. This is because the next-abstracted module combines into a single round as many rounds as are necessary t o make either doAdd false or doneAdd true. In other words, in the flat model, bit-additions are atomic. Thus the flat model is adequate under the assumption that the addition unit is put into an environment that interacts with the addition unit only before and after bit-additions, but does not interrupt in the middle of a bit-addition. By contrast, in the hierarchical model, word-additions are atomic. Therefore the hierarchical model is adequate only under the stronger assumption that the addition unit is put into an environment t h a t interacts with the addition unit only before and after word-additions, but does not interrupt in the middle of a word-addition. While the flat model is adequate in more situations, we will see that the hierarchical model can be verified more efficiently, and therefore should be preferred whenever it is adequate.
3
Symbolic Exploration of Hierarchical State Spaces
The reason why next-abstraction can be exploited by model checking can be seen as follows. If the semantics of a reactive module is viewed as a state-transition graph, with each round corresponding t o a transition, then the hierarchical module next cp for M may have many fewer states than the flat module M. In particular, the next operator removes all states of M in which the aggregation predicate cp is not true; these states are called transient. When exploring the state space of the flat module, both transient and nontransient reachable states need to be stored as they are con~puted.By contrast, when exploring the state space of a hierarchical module, transient states need t o be computed in order to find nontransient successor states, but once these states are found, the transient states need not be stored. Moreover, some of the variables that are not history-free in the flat model may become independent of (nontransient) predecessor
states in the hierarchical model, and thus in effect history-free. This results in further savings in memory for storing states. The savings are particularly pronounced when hierarchical models are composed. Consider two flat models M and N , and two hierarchical models M' = ( n e x t cp f o r M ) and N' = ( n e x t ?I, f o r N ) . The hierarchical composition M ' )I N' is an adequate model for the composed systems if the aggregation predicates cp and 4 characterize the synchronization points of the two components; that is, if M interacts with N whenever cp becomes true, and N interacts with M whenever $ becomes true. The possible interleavings of transitions of M and N may lead to an explosion of transient states of M 11 N (states in which neither cp nor $ is true), which can be avoided by exploring instead M' 11 N'. In other words, if the interaction between two component systems can be restricted, then some of the state-explosion problem may be avoided. Indeed, as we shall see, in complex systems with many components but well-defined interactions between the components, the computational savings, both in time and memory, can be enormous. In the following, we first define the transition relations of composite and hierarchical modules from the transition relations of the components. Then we present a nestedsearch algorithm that explores the state space of a hierarchical module efficiently. The nested-search algorithm uses an implicit, algorithmic representation of the transition relation of a hierarchical module for image computation.
3.1
Explicit Definition of T r a n s i t i o n R e l a t i o n s
The state-transition graph of a reactive module can be specified by a symbolic transition relation. Given a module M with variables X , the symbolic transition relation of M is a boolean function TM(X,X'). Let x and X' be two states of M , i.e., valuations to the variables of M . Then the function TM(X, X I ) evaluates to true iff there is an edge in the state-transition graph from state x to state x ' . All modules are built from individual atoms using parallel composition and the n e x t operator. It is straightforward to construct the symbolic transition relation of an atom. For complex modules, the symbolic transition relation is defined inductively. Parallel composition. Consider the module M = M1 11 M2. Let TMI(XllX i ) and TM,(Xz, Xa) be the symbolic transition relations of MI and M2, respectively. Then the symbolic transition relation of M is given by the conjunction
N e x t a b s t r a c t i o n . Consider the module M = ( n e x t cp f o r N ) . Let TN(X, X') be the symbolic transition relation of N. For all natural numbers i 2 0, define
TL
Let = T Z 1 be the limit of the fixpoint computation sequence T&,Th,T&,. . . (finite convergence is guaranteed for finite-state systems). Then the symbolic transition relation of M is given by TM(X,XI) = p ( X ) A p(X1) A T ~ (X'). x,
The reachable state set of a module can be computed by iterated application of the transition relation. For this purpose, it is theoretically possible to construct, using the above definitions, a BDD for the symbolic transition relation of a hierarchical module. In practice, however, during the construction the intermediate BDDs often blows up and results in memory overflow. For parallel composition, it is a common trick to leave the transition relation conjunctively decomposed and represent it as a set of BDDs, rather than computing their conjunction as a single BDD [TSLSO]. Early quantification heuristics are then used to efficiently compute the image of a state set under a conjunctively partitioned transition relation. For next abstraction, we propose a similar approach. 3.2
Efficient Computation with Implicit Transition Relations
For model checking, it suffices to represent the symbolic transition relation of a module not explicitly, as a BDD, but implicitly, as an algorithm that given a state set, computes the set of successor states. This algorithm can then be iterated for reachability analysis and more general verification problems. Given a module M , the single-step successor f u n c t i o n of M is a function R b from state sets of M to state sets of M. Let a be a set of states of M. Then R ~ ( uis) the set of all states X' of M such that there is a state x E u with T M ( X , X I ) ; that is, R h ( a ) is the image of a under the transition ) single atoms. For complex relation TM. It is straightforward to compute R ~ ( ufor modules, the single-step successor function is computed recursively. Parallel composition. Consider the module M = MI 11 M2 and a set a of states ( a ) and ( a ) be the images of a for MI and M2, respectively. Then of M . Let
RL~
R ~ ( u=) RL, ( u )A R h 2 ( a ) . Next abstraction. Consider the module M = (next cp for N) and a set a of states of M . Let R h ( u ) be the image of a for N. Then R h ( a ) is computed by the nestedsearch procedure described in Algorithm 3.2. Algorithm 3.2 {Given module M = (next p for N), single-step successor function RL, and state set a, compute R h (a)} {We will assume a C p} FirstLevelImage := { } SecondLevelImage := R ~ ( c T ) SecondLevelReachSet := { } loop FirstLevelImage := FirstLevelImage U (SecondLevelImage n p ) SecondLevelReachSet := SecondLevelReachSet U (SecondLevelImage n p) SecondLevelImage := RL(SecondLeue1ReachSet) until (SecondLevelReachSet does not change) return (FirstLeveZImage)
In contrast to a BDD for TM(X,XI), which explicitly represents the transition relation of module M , the recursive algorithm for computing the function R b implicitly represents the same information. In practice, a mixture of explicit symbolic representation of
transition relations (for small modules) and implicit image computation (for complex modules) will be most efficient. We report on our experiences with nested search in the following section.
Experiments
4
The aim of our experiments is to investigate the efficiency of the proposed method for the automatic reachability analysis of complex designs. All experimental results reported in this paper were obtained by modeling the systems in Verilog and using the vl2mv Verilog compiler along with VIS [BSVHt96]. We implemented a new command in VIS, called abstract-reach, based on Algorithm 3.2. 4.1
Multiplier
We model a word-multiplier that functions by repeated addition, using the word-adder described earlier. With help of the next operator, we can model the multiplier a t various levels of temporal detail. We experiment with two options: -
-
Model 1: For addition, use the flat module ConcreteAdder explained in Section 2. In this model, bit-additions appear as atomic actions of the multiplier. Model 2: For addition, use the hierarchical module AbstractAdder. In this model, word-additions appear as atomic actions.
We perform reachability analysis with both models. Model 1 is given to VIS directly, and reachability analysis is performed using the compute-reach command of VIS. In order t o analyze Model 2, we use the abstract-reach command with the aggregation predicate doAdd j doneAdd. As a result, the states in which doAdd is true and doneAdd is false become transient states. We experiment with two 4-bit operands and an 8-bit result. In this case, Model 1 has 68 latches and 1416 gates. After the next abstraction, 24 of these latches become history-free; that is, their values are independent of previous liontransient values. In particular, the local variables of the adder become history-free, and hence, are represented by trivial functions in the BDD that represents the reachable states. Table 1 shows the peak BDD sizes for both models. 4.2
Cache Coherence Protocol
We describe the various components of a generic cache coherence protocol before discussing our results. Each cache block can be in one of three states: INVALID, READSHARED, or WRITEEXCLUSIVE. Multiple processors can have the same memory location in their caches in the READ-SHARED state, but only one processor can have a given location in the WRITE-EXCLUSIVE state. There is a directory that, for each memory location, has a record of which processors have cached that location and what states (READ-SHARED, WRITE-EXCLUSIVE) these blocks are in. Due to want of space, we will not explain the protocol formally. An example scenario gives the general flavor. Suppose that Processor 1 has a location in WRITEXXCLUSIVE, and Processor 2 wants to read this location. First Cache 2 records a write miss and communicates that to the directory. The directory then sends a message to Processor 1, requesting it to move the state of the block under consideration from WRITE-EXCLUSIVE
to READSHARED. Cache 1 acknowledges this request and also sends the latest version of the data in this block to the directory. The directory then services Cache 2 with this data, and Cache 2 gets the block as READSHARED. Each of these steps involves a transaction on the bus, which could take an arbitrary number of rounds due to the asynchronous nature of the bus. We experiment with two levels of temporal granularity. Model 1 is a flat model of the memory system, and Model 2 is a hierarchical model that abstracts temporal detail about the bus. While a bus transaction can consume multiple rounds in Model 1, it is forced to always complete in a single round in Model 2. For our experiments, we choose a 1-bit address bus and 1-bit data bus. In this case, Model 1 has 44 latches, of which 6 latches become history-free in Model 2. The peak BDD sizes during reachability analysis for both models are reported in Table 1. 4.3
Processor-Memory System
Aiming for a more dramatic improvement over flat modeling, we compose several systems whose interactions are limited. We put together two processors, each with an ALU consisting of the adder and multiplier described earlier, and the cache protocol, to obtain a complete processor-memory system. A block diagram of the system is shown in Figure 4. The processors have a simple instruction set: load/store register to/from memory, add two register operands, multiply two register operands, compare two registers, and a conditional branch. Again we experiment with two models. Model 1 is flat, and Model 2 is coiistructed by composing next-abstracted versions of the multipliers, adders, and bus protocol.
Fig. 4. Processor-Memory System
We choose an 1-bit wide address bus and a 2-bit wide data bus. In this case, Model 1
has 147 latches, of which 36 latches become history-free in Model 2 (15 latches in each multiplier, and 6 in the cache protocol). Here, reachability analysis for Model 1 is beyond the capability of current verification tools. However, fully automatic reachability analysis succeeds for Model 2, which structures the design using the n e x t operator. Consider, for exanlple, the situation where both processors start a multiplication a t the same time. In Model 1, there are several transient states due t o the interleaving of independent suboperations of the two multipliers. These transient states are entirely absent in Model 2. Indeed, nested search (Algorithm 3.2) is the key t o verifying this example: we run out of memory when trying to compute an explicit representation of the transition relation for Model 2.5 4.4
Processor-Memory System with Programs
Finally, we add programs to the processors of the processor-memory system. Processor 1 computes A * B. Processor 2 computes C * D. Processor 1 then adds up both results and computes A * B C * D. The two processors synchronize through a flag in memory. The last entry of the table shows the reachability results for the processormemory system constrained by these programs. Again, Model 2 completes reachability using abstract-reach, whereas Model 1 does not. Thus we are able to verify invariants on Model 2, such as confirming that the results computed by the programs are correct.
+
Time for reachability (sec) Peak BDD size Example Model 2 Model 1 Model 2 Model 1 Multiplier 122 6561 26979 157 Cache Coherence 21498 42467 227 310 * Proc/Mem System space out 53103 816 * with Program space out 9251 153 T a b l e 1. Experimental Results
5
Conclusions
We introduced a formal way of describing a design using both temporal and spatial abstraction operators for structuring the description. The temporal abstraction operator next induces a hierarchy of transitions on the state space, where a high-level transition corresponds t o a sequence of low-level transitions. We exploited transition hierarchies in symbolic reachability analysis and presented an algorithm for proving invariants of reactive modules using hierarchical search. We tested the algorithm on arithmetic circuits, cache coherence protocols, and processor-memory systems, using an extension of VIS. The experimental results are encouraging, giving fully automatic results even on systems that are amenable to existing tools only after manual abstractions. Transition hierarchies can be exploited to give efficiencies in enumerative reachability analysis as The transition relation for the multiplier module alone, as computed by VIS, has 7586 BDD nodes and is composed of 6 conjunctive components. Even "and9'-ing the components together results in memory overflow.
well [AH96]. We are currently building a formal verification tool for reactive modules, called MOCHA, which will incorporate both symbolic and enumerative hierarchical search as primitives. While the next operator is ideally suited for abstracting subsystems that interact with each other at predetermined synchronization points, it does not permit the "out-oforder execution" of low-level transitions. We currently investigate additional abstraction operators, such as operators that permit the temporal abstraction of pipelined designs.
References [AH891 L. Aceto and M. Hennessy. Towards action refinement in process algebras. In Proc. of LICS 89: Logic in Computer Science, pages 138-145. IEEE Computer Society Press, 1989. [ABH+97] R. Alur, R.K. Brayton, T.A. Henzinger, S. Qadeer, and S.K. Rajamani. Partialorder reduction in symbolic state-space exploration. In Proc. of CAV 97: Computer-Aided Verification, LNCS 1254, pages 340-351. Springer-Verlag, 1997. [AH961 R. Alur and T.A. Henzinger. Reactive modules. In Proc. of LICS 96: Logic in Computer Science, pages 207-218. IEEE Computer Society Press, 1996. [BlGJgl] A. Benveniste, P. le Guernic, and C. Jacquemot. Synchronous programming with events and relations: The SIGNAL language and its semantics. Science of Computer Programming, 16:103-149, 1991. [BSVHt96] R.K. Brayton, A. Sangiovanni-Vincentelli, G.D. Hachtel, F. Somenzi, A. Aziz, S.-T. Cheng, S. Edwards, S. Khatri, Y. Kukimoto, S. Qadeer, R.K. Ranjan, T.R. Shiple, G. Swamy, T . Villa, A. Pardo, and S. Sarwary. VIS: a system for verification and synthesis. In Proc. of CAV 96: Computer-Aided Verification, LNCS 1102, pages 428-432. Springer-Verlag, 1996. [CK96] E.M. Clarke and R.P. Kurshan. Computer-aided verification. IEEE Spectrum, 33(6):61-67, 1996. [CKSVG96] E.M. Clarke, K.L. McMillan, S. Campos, and V. Hartonas-Garmhausen. Symbolic model checking. In Proc. of CAV 96: Computer-Aided Verification, LNCS 1102, pages 419-422. Springer-Verlag, 1996. [CM88] K.M. Chandy and J. Misra. Parallel Program Design: A Foundation. AddisonWesley, 1988. [CPS93] R.J. Cleaveland, J. Parrow, and B. Steffen. The Concurrency Workbench: a semantics-based tool for the verification of finite-state systems. ACM Transactions on Programming Languages and Systems, 15(1):36-72, 1993. [Di196] D.L. Dill. The M u r d verification system. In Proc. of CAV 96: Computer-Aided Verification, LNCS 1102, pages 390-393. Springer-Verlag, 1996. [I-Id931 N. Halbwachs. Synchronous Programming of Reactive Systems. Kluwer Academic Publishers, 1993. [HP94] G.J. Holzmann and D.A. Peled. An improvement in formal verification. In Proc. of FORTE 94: Formal Description Techniques, pages 197-211. Chapman & Hall, 1994. [HP96] G.J. Holzmann and D.A. Peled. The state of SPIN. In Proc. of CAV 96: ComputerAided Verification, LNCS 1102, pages 385-389. Springer-Verlag, 1996. [HZR96] R.H. Hardin, Z. Har'El, and R.P. Kurshan. COSPAN. In Proc. of CAV 96: Computer-Aided Verification, LNCS 1102, pages 423-427. Springer-Verlag, 1996. [Lam831 L. Lamport. What good is temporal logic? In Proc. of Information Processing 83: IFIP World Computer Congress, pages 657-668. Elsevier Science Publishers, 1983. [Lyn96] N.A. Lynch. Distributed Algorithms. Morgan-Kaufmann, 1996.
[TSLSO] H. J. Touati, H. Savoj, B. Lin, R.K. Brayton, and A. Sangiovanni-Vincentelli. Implicit s t a t e enumeration of finite-state machines using BDDs. In Proc. of ICCAD 90: ComputerAided Design, pages 130-133. I E E E Computer Society Press, 1990.