Gray-box extraction of execution graphs for anomaly detection

Many host-based anomaly detection systems monitor a process by observing the system calls it makes, and comparing these calls to a model of behavior for the program that the process should be executing. In this paper we introduce a new model of system call behavior, called an execution graph. The execution graph is the first such model that both requires no static analysis of the program source or binary, and conforms to the control flow graph of the program. When used as the model in an anomaly detection system monitoring system calls, it offers two strong properties: (i) it accepts only system call sequences that are consistent with the control flow graph of the program; (ii) it is maximal given a set of training data, meaning that any extensions to the execution graph could permit some intrusions to go undetected. In this paper, we formalize and prove these claims. We additionally evaluate the performance of our anomaly detection technique.


INTRODUCTION
Many host-based intrusion detection systems (e.g., [4,11,17,18]) and related sandboxing and confinement systems (e.g., [12,19]) monitor the system calls emitted by a process in order to detect deviations from a previously constructed model of system call behavior. We coarsely divide these systems into "black box", "gray box" and "white box" approaches, based on the information they use to build the model to which they compare system calls at run time. On the one hand, black-box and gray-box methods build a model of system-call behavior by monitoring sample executions. Within this space, black-box detectors employ only the system call number (and potentially the arguments, though we do not consider arguments in this paper) that pass through the system call interface when system calls are made (e.g., [4,16]). A gray-box detector extracts additional runtime information from the process, e.g., by looking into the process' memory (e.g., [3,15]). On the other hand, white-box approaches obtain the model by statically analyzing the source code or binary (e.g., [2,6,7,18]).
By their nature, black-box and gray-box detectors detect anomalous behavior, i.e., behavior different from "normal" runs, regardless of whether it results from an intrusion or an execution path that was not encountered during training. In contrast, white-box detectors detect actual deviations from the program text, for which an intrusion is virtually the only conceivable explanation (assuming that the program is not self-modifying). As such, white-box detectors can be designed to have a zero false-positive rate, in the sense that an alarm always indicates an intrusion. Since minimizing false positives is a significant factor in gaining user acceptance, this is an important advantage of white-box approaches.
White-box approaches, however, are not always viable. First, source code is often not available, and the complexity of performing static analysis on, e.g., x86 binaries is well documented (see footnote 1). In fact, all white-box intrusion detection systems of which we are aware (including those cited above) have eschewed this common platform. Static analysis is also difficult for programs protected by obfuscation or digital rights management (DRM) technologies that are designed to render static analysis of control flow all but impossible (e.g., [1]). Finally, white-box techniques can fail on self-modifying programs. We thus believe that examination of gray-box and black-box approaches can play an important role where white-box approaches are unavailable.
Footnote 1: This complexity stems from difficulties in code discovery and module discovery [13], with numerous contributing factors, including: variable instruction size (Prasad and Chiueh claim that this renders the problem of distinguishing code from data undecidable [10]); hand-coded assembly routines, e.g., due to statically linked libraries, that may not follow familiar source-level conventions (e.g., that a function has a single entry point) or use recognizable compiler idioms [14]; and indirect branch instructions such as call/jmp reg32 that make it difficult or impossible to identify the target location [10,13]. Due to these issues and others, binary analysis/rewrite tools for the x86 platform have strict restrictions on their applicable targets [9,10,13,14].
In this paper we present a new gray-box model, called an execution graph, that is, to our knowledge, the first gray- or black-box technique for which a positive relationship to what is achievable via common white-box techniques can be proved analytically. Intuitively, the goal that we set for our technique is to build a model that accepts the same system call sequences as would be accepted by a model built from the control flow graph of the program, which is the basis of many white-box techniques. This, of course, is not achievable, since our gray-box technique can train only on observed runs of the program, which may miss entire branches of the program that static analysis would uncover. Nevertheless, using gray-box techniques alone, our approach constructs an execution graph with the following two useful properties. First, the system call sequences (the language) accepted by the execution graph are a subset of those accepted by the control flow graph of the program. Second, the language accepted by the execution graph is maximal for the training sequences it was provided. Specifically, we show that there exists a program whose control flow graph would accept the same language as the execution graph. In other words, if the execution graph were to accept any other system call sequence s, then there is a program that can emit exactly the same training sequences but for which the control flow graph would not accept s.
In some sense, this is the best one can hope to achieve toward using a gray-box technique to mimic the power of a control flow model obtained via white-box analysis. Moreover, the control flow model that we set for our goal is equivalent to the most restrictive white-box models known in the literature: the model is sensitive not only to the sequence of system calls, but also to the sequence of active function calls when each system call occurs. Our approach therefore mimics some of the best white-box techniques known today, using only gray-box analysis. Additionally, we demonstrate through a prototype implementation that monitoring via an execution graph is very efficient.
The rest of the paper is organized as follows. Section 2 discusses related work in this area. Section 3 defines an execution graph, and the language accepted by an execution graph. We show the claimed relationships between execution graphs and control flow graphs in Section 4. Performance evaluations are discussed in Section 5. Finally, we conclude in Section 6.

RELATED WORK
Numerous white-box approaches to intrusion detection have focused on monitoring a process' system-call conformance with the control flow graph of the program it is ostensibly running. One of the earliest works, due to Wagner and Dean [18], generates a range of models based on the control flow graph of the program, generated via static analysis of the source. Their most accurate model, equivalent to the control flow model that we adopt here (Section 4), resulted in very substantial runtime monitoring overheads. This cost, as well as the need for analyzing source code, were addressed in following works due to Giffin et al. [6,7] and Feng et al. [2]. These works included modifying the binary to permit the runtime monitor to perform more efficiently.
Black-box approaches were pioneered by Forrest et al. [4], who introduced an approach to characterize normal program behavior in terms of sequences of system calls. System call sequences are broken into patterns of fixed length, which are learned and stored in a table. Wespi et al. [20,21] extend this approach to permit variable-length patterns of system calls. To our knowledge, Sekar et al. [15] proposed the first gray-box approach, coupling the system call number with the program counter of the process when the system call is made. Feng et al. [3] proposed extending the gray-box information used to include return addresses on the call stack of the process when a system call is made. While the benefits and costs of many of these approaches have been studied [5], the behavior of none of these approaches has been formally related to that of the white-box system call models. In fact it is generally easy to confirm that these prior black- and gray-box models neither contain nor are contained by the white-box models, in terms of the languages of system call sequences they accept. Black-box approaches have also been extended to monitor system call arguments [8]; however, we do not consider arguments in this paper.
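The fixed-length pattern idea described above can be illustrated with a minimal sketch. It is written in the spirit of the approach attributed to Forrest et al. [4], not as a reproduction of it: the window length, the function names and the system call numbers are all assumptions made for illustration.

```python
def train_windows(trace, n):
    """Collect every length-n window of system call numbers seen in a training trace."""
    return {tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)}


def anomalous_windows(trace, table, n):
    """Return the length-n windows of a monitored trace that are absent from the table."""
    windows = (tuple(trace[i:i + n]) for i in range(len(trace) - n + 1))
    return [w for w in windows if w not in table]


# Hypothetical system call numbers, used only for illustration.
normal_trace = [5, 3, 3, 4, 6]
table = train_windows(normal_trace, 3)

suspect_trace = [5, 3, 4, 6]
flagged = anomalous_windows(suspect_trace, table, 3)
```

Such a table records only which short sequences were seen; it carries no information about call sites or the call stack, which is the information the gray-box approaches add.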

EXECUTION GRAPHS
In this section we describe our model, called an execution graph, for anomaly detection, which is built using a gray-box technique. Our technique assumes that the program being monitored is implemented in a programming language for which the runtime utilizes a call stack, where each stack frame corresponds to a function call in the program and includes a return address. Every implementation of the C and C++ programming languages known to us satisfies this criterion, and these languages are the primary motivations for our work.
The execution graph technique we describe in this paper works, during both training and monitoring, by observing system calls along with additional runtime information that it extracts upon each system call, namely the return addresses on the call stack of the monitored process when the system call is made. We define a system call along with the return addresses on the call stack when a system call is made as an observation. Each such observation can be represented by an arbitrary-length vector of integers, each in the range $[0, 2^{32})$ assuming a 32-bit platform. The last element of the vector is the system call number, and the preceding elements are the return addresses on the call stack when the system call is made, with the first address being an address in main(), i.e., an address in the first function executed.
We formally define the concept of observation below, and we call a sequence of observations an execution. An execution is an arbitrary-length sequence of observations. ✷
In particular, for an observation $\langle r_1, r_2, \ldots, r_k \rangle$, $r_1$ is an address in main(), $r_{k-1}$ is the "return address" which corresponds to the instruction that makes the system call (see footnote 2), and $r_k$ is the system call number.
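As a minimal sketch, an observation can be held as a tuple of integers and an execution as a list of such tuples. The type aliases, function names and example addresses below are hypothetical; the sketch also assumes that every observation carries at least one return address before the system call number.

```python
from typing import List, Sequence, Tuple

Observation = Tuple[int, ...]   # return addresses followed by the system call number
Execution = List[Observation]   # observations in the order they were made

WORD_LIMIT = 2 ** 32            # values lie in [0, 2^32) on a 32-bit platform


def make_observation(return_addresses: Sequence[int], syscall_number: int) -> Observation:
    """Build an observation: the return addresses, then the system call number."""
    obs = tuple(return_addresses) + (syscall_number,)
    if len(obs) < 2:
        # Assumption: at least the address of the system call site is present.
        raise ValueError("an observation needs at least one return address")
    if any(not 0 <= value < WORD_LIMIT for value in obs):
        raise ValueError("values must lie in [0, 2^32)")
    return obs


def syscall_number_of(obs: Observation) -> int:
    """Last element: the system call number."""
    return obs[-1]


def syscall_site_of(obs: Observation) -> int:
    """Second-to-last element: the return address of the system call itself."""
    return obs[-2]


# Hypothetical addresses, for illustration only.
example = make_observation([0x08048100, 0x08048200], 3)
execution: Execution = [example]
```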
We next introduce the concept of an execution graph, which is built by observing executions as defined above. The goal we set for this new model, as briefly stated in Section 1, is to build a model that accepts the same system call sequences that would be accepted by most models built from white-box techniques. Informally, we need to extract function call structures from observations so that a graph similar to the control flow graph can be built. To achieve this, we analyze every two consecutive system calls and the return addresses on the call stack when each system call is made, i.e., we analyze pairs of consecutive observations. Since each return address represents a stack frame, consecutive observations reveal some information about the function call structure of the program. In the following definition, we show how this information is used to build an execution graph.
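A first step in comparing two consecutive observations is to find how much of the call stack they share. The sketch below computes the length of the common prefix and splits both observations at the point where they diverge. The function names and example values are hypothetical, and the sketch does not reproduce how the definition turns this split into edges.

```python
def common_prefix_length(prev_obs, next_obs):
    """Number of leading entries on which two observations agree."""
    length = 0
    for a, b in zip(prev_obs, next_obs):
        if a != b:
            break
        length += 1
    return length


def split_at_divergence(prev_obs, next_obs):
    """Return (shared prefix, remainder of prev_obs, remainder of next_obs)."""
    j = common_prefix_length(prev_obs, next_obs)
    return prev_obs[:j], prev_obs[j:], next_obs[j:]


# Hypothetical observations: return addresses followed by a system call number.
first = (100, 210, 320, 4)
second = (100, 250, 5)
shared, old_tail, new_tail = split_at_divergence(first, second)
```

In this example the two observations share only their first entry; everything after it differs between the two system calls.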
The execution graph is one of the most important concepts in this paper, especially the inductive definition of the edges in the graph. Intuitively, we use $E_{rtn}$ to represent the return of a function to its calling location, $E_{crs}$ to represent the execution flow within a function, and $E_{call}$ to represent the call from a function call site to its call target. These three sets of edges are defined in the base case by processing consecutive observations. The inductive part of the definition is used to post-process these sets of edges, and to discover "missing" edges, where the relationship between two nodes could be derived from the executions, but not by processing any individual pair of observations. (This induction is further explained in an example after we formally present the definition.)
Definition 3.2 (Execution graph, leaf node, $\xrightarrow{crs}$). An execution graph for a set of executions $\mathcal{X}$ is a graph $eg(\mathcal{X}) = (V, E_{call}, E_{crs}, E_{rtn})$, where $V$ is a set of nodes, and $E_{call}, E_{crs}, E_{rtn} \subseteq V \times V$ are directed edge sets, defined as follows:
• For each execution $X \in \mathcal{X}$ and each observation $\langle r_1, r_2, \ldots, r_k \rangle \in X$, $V$ contains nodes $r_1, r_2, \ldots, r_k$. $r_k$ is called a leaf node of the execution graph $eg(\mathcal{X})$. In the case where $\langle r_1, r_2, \ldots, r_k \rangle$ is the first observation in an execution, $r_k$ is also denoted as an enter node; in the case where $\langle r_1, r_2, \ldots, r_k \rangle$ is the last observation in an execution, $r_k$ is also denoted as an exit node. (Note that an execution graph could have more than one enter node and more than one exit node.)
• The sets $E_{call}$, $E_{crs}$, $E_{rtn}$ are defined inductively to contain only edges obtained by the following rules:
- (Base case) For each execution $X$ in $\mathcal{X}$ and each pair of consecutive observations $\langle r_1, r_2, \ldots, r_k \rangle$, $\langle r'_1, r'_2, \ldots, r'_{k'} \rangle$ in $X$, [...] if $\langle r_1, r_2, \ldots, r_k \rangle = \langle r'_1, r'_2, \ldots, r'_{k'} \rangle$, and $\arg\max_j : \langle r_1, r_2, \ldots, r_j \rangle = \langle r'_1, r'_2, \ldots, r'_j \rangle + 1$ otherwise. [...] If $r_k$ is an enter node, [...]. If $r_k$ is an exit node, [...].
- (Induction) Define the relation $r \xrightarrow{crs} r'$ to be true if there exists a path from $r$ to $r'$ consisting of only edges in $E_{crs}$.
Footnote 2: Though on most platforms, system calls are implemented differently from function calls, $r_{k-1}$ can be retrieved from the stack in a similar fashion, and we still refer to it as a return address.
✷
Note that the integers in an observation serve as labels for the nodes created. For simplicity, we do not differentiate a node and its label, i.e., in the above definition, r and x denote both the nodes and their labels.
With the inductive definition in Definition 3.2, the execution graph shown in Figure 1 can be obtained even if the values of a and b are fixed in the executions $\mathcal{X}$. ✷
Recall that we want to mimic the power of the most restrictive control flow graph model known in the literature, where not only the system call sequence, but also the sequence of active function calls when each system call occurs, is captured. To do this, we use the notion of execution stack to capture the active function calls allowed in an execution graph.
• $r_1$ corresponds to an address in main(), i.e., an address in the first function executed; and
• $r_n$ is a leaf node.

✷
Intuitively, an execution stack captures a system call and the active functions (functions that have not returned) when the system call is made, which is also what an observation captures. However, an execution stack might or might not have a corresponding observation in the executions X that are used to construct the execution graph eg(X ).
We next define the notion of successor. Intuitively, if observation $x'$ follows another observation $x$ in an execution, then $x'$ corresponds to an execution stack that is a successor of the execution stack corresponding to $x$. The notion of successor in an execution graph defines whether a system call (and the corresponding active functions) is allowed to follow another system call.
Definition 3.6 (Execution path). An execution path $\delta$ is a sequence of execution stacks in an execution graph $eg(\mathcal{X}) = (V, E_{call}, E_{crs}, E_{rtn})$, say $\delta = \langle s_1, s_2, \ldots, s_n \rangle$, $s_i = \langle r_{i,1}, r_{i,2}, \ldots, r_{i,m_i} \rangle$, where
• $r_{1,m_1}$ is an enter node; and [...]
Intuitively, an execution path is a sequence of execution stacks that corresponds to a possible execution of the program that emitted the executions $\mathcal{X}$. Notice that it only requires the sequence of execution stacks to be allowed by the execution graph (captured by the notion of successor, which is defined in Definition 3.5); such a sequence might or might not have appeared in the executions $\mathcal{X}$ from which the execution graph $eg(\mathcal{X})$ is built.
Definition 3.7 (Language accepted by $eg(\mathcal{X})$). The language accepted by $eg(\mathcal{X})$, denoted $L_{eg(\mathcal{X})}$, is the set of all execution paths in $eg(\mathcal{X})$. ✷
Each string in the language accepted by an execution graph is a sequence of execution stacks. Each execution stack consists of a sequence of integers, which intuitively represents a system call and the return addresses of the active functions when the system call is made. Though we have defined execution graphs built from observations including return addresses, they also have a black-box variant: in the case where only the system call number is used to describe a system call (return addresses are not extracted), an execution stack consists of only one integer, which is the system call number. Consequently a string in the language becomes a sequence of system call numbers. We do not discuss this variation further.

PROPERTIES OF EXECUTION GRAPHS
We briefly stated in Section 1 that the goal of our technique is to build a model that accepts system call sequences that would be accepted by a model built from the control flow graph of the program. In this section, we formalize two important properties of an execution graph. First, it accepts only system call sequences that are consistent with the control flow graph of the program. Second, it is maximal given a set of training data, meaning that any extensions to an execution graph could permit some intrusions to go undetected. To formalize these two properties, we first define control flow graphs, the language a control flow graph accepts, and well-behaved executions. With these definitions, the theorems are presented in Section 4.3.

Control Flow Graphs
A control flow graph is an abstract representation of a procedure or program. In this paper, it is convenient to consider a variation on the traditional control flow graph for a program P , denoted cfg(P ). First, cfg(P ) consists of a number of control flow subgraphs, one per function F in P , denoted cfsg(F ). Second, since we are interested only in function calls and system calls in P , each cfsg(F ) has one node per function call and two nodes per system call that it contains, in addition to its entry and exit node, and no other nodes. Though these variations render cfg(P ) different from a traditional control flow graph, we will still refer to it as one.
In this paper, we refer to a jump as a nonsequential transfer of control, distinct from a function call or a system call. With this, we define the relationship between two instructions in a function.

Definition 4.1 (Follow). Instruction $t'$ follows instruction $t$ iff $t$ and $t'$ are in the same function and
• (Base case) $t'$ is at a higher address than $t$, and there is no jump, function call or system call between $t$ and $t'$;
• (Induction) there exists a jump $c$ and a corresponding jump target $c'$, such that $t'$ follows $c'$ and $c$ follows $t$.

✷
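The follow relation of Definition 4.1 can be computed as a fixpoint over the instructions of a single function. The sketch below is a literal reading of the two rules, under assumptions that are not part of the definition: instructions are given as a hypothetical mapping from address to a kind tag, jumps are given as (jump address, target address) pairs, and "between" is read as strictly between.

```python
def follow_relation(instructions, jumps):
    """Return the set of pairs (t, t_prime) such that t_prime follows t.

    instructions: dict mapping address -> 'plain', 'jump', 'call' or 'syscall'
                  (all addresses belong to one function; the tags are hypothetical).
    jumps:        list of (jump address, jump target address) pairs.
    """
    addresses = sorted(instructions)
    barriers = {a for a, kind in instructions.items() if kind != "plain"}

    # Base case: t_prime is at a higher address than t, and no jump,
    # function call or system call lies strictly between them.
    relation = set()
    for i, t in enumerate(addresses):
        for t_prime in addresses[i + 1:]:
            if not any(t < a < t_prime for a in barriers):
                relation.add((t, t_prime))

    # Induction: if c follows t and t_prime follows the target of jump c,
    # then t_prime follows t. Repeat until no new pair is added.
    changed = True
    while changed:
        changed = False
        for c, target in jumps:
            sources = [t for t in addresses if (t, c) in relation]
            sinks = [tp for tp in addresses if (target, tp) in relation]
            for t in sources:
                for tp in sinks:
                    if (t, tp) not in relation:
                        relation.add((t, tp))
                        changed = True
    return relation


# Hypothetical function body: a jump at address 4 targets address 16.
body = {0: "plain", 4: "jump", 8: "call", 12: "plain", 16: "plain", 20: "plain"}
relation = follow_relation(body, [(4, 16)])
```

Here address 20 follows address 0 only through the jump, while address 12 does not follow address 0 because the jump at address 4 lies between them.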
The above definition defines the relative position of two instructions in a function. Next we define the control flow subgraph (cfsg) and the call nodes in a cfsg. To simplify the definition, we assume that there are two no-op instructions in each function F denoting the start and end of F, respectively.
Definition 4.2 (Control flow subgraph, call node). A control flow subgraph for a function F is a directed graph cfsg(F) = (V, E). V contains
• a function call node per function call in F;
• a system call node per system call in F;
• a system call number node per system call in F; and
• a designated F.enter node and a designated F.exit node.
Function call nodes and system call nodes are the call nodes of cfsg(F ). Each node is identified by a label.
• The instruction that corresponds to v follows (as defined in Definition 4.1) the instruction that corresponds to u; or
• u is a system call node and v is the corresponding system call number node.

✷
Each node in a cfsg has a label. The label of a call node could be assigned as the address of the instruction that immediately follows the call if static analysis is applied on binaries, as assumed in Section 4.3 for convenience. The label of a system call number node is the corresponding system call number. As in the definition of execution graphs, we do not differentiate a node and its label, i.e., in the above definition, u and v denote both the nodes and their labels.
The control flow graph cfg(P) of a program P is obtained by connecting control flow subgraphs of each function in P together to form a new graph.
Figure 2 shows the control flow graph of the program in Example 3.1 (the source code is shown in Figure 1).

✷
The path defined in Definition 4.5 is called observable because it induces a system call, and thus intuitively would be visible to an intrusion detection system monitoring system calls. Numerous white-box process monitors additionally keep track of the active function calls in the process running the program, based on information gathered from static analysis of the program. We define active calls on an observable path as follows.
Definition 4.6 (Active calls on an observable path). Let π = v1, v2, . . . , vn be an observable path in cfg(P ) = (V, E). We define the sequence of active calls on π, denoted A(π), to be the result of the following procedure.

✷
Since $v_n$ (the last node on an observable path) does not belong to any call cycle, it is not deleted in the first step of the procedure in Definition 4.6. As such, $v_{i_k} = v_n$ in the second step of the procedure in Definition 4.6, and this node is not deleted in the second step either (since only nodes $v_{i_j}$ for $1 \le j < k$ are eligible to be deleted). In other words, $v_n$ is always the second-to-last element in the output of $A(\pi)$, with the last element being the system call number.
Definition 4.7 (Language accepted by cfg(P)). Let $\Pi$ be the set of all observable paths in cfg(P), and for any $\pi \in \Pi$, let $pre(\pi) = \langle \pi_1, \pi_2, \ldots, \pi_n \rangle$ denote all the observable prefixes of $\pi$ in order of increasing length, where $\pi_n = \pi$. Then, the language accepted by cfg(P) is
$L_{cfg(P)} = \{ \langle A(\pi_1), A(\pi_2), \ldots, A(\pi_n) \rangle : [\exists \pi \in \Pi : pre(\pi) = \langle \pi_1, \ldots, \pi_n \rangle] \}$ ✷
Notice that we define the language accepted by cfg(P) in terms of the system calls it makes and the active functions when each system call is made. A string in the language is a sequence of symbols, each of which describes a system call made by the program.
Example 4.1. Figure 3 shows the source code and the control flow graph of a very simple program, which consists of four functions and makes four different system calls. In the program shown in Figure 3, the second system call made is read, which corresponds to the system call number 3. The following is an observable path from main.enter to the node that makes this system call. [...]
Intuitively, this similarity between the execution graph and the control flow graph is what we are trying to achieve and why execution graphs are very useful in anomaly detection. In Section 4.3 we will formally introduce the relationship between the two, by showing two very useful properties of execution graphs.

Well-Behaved Executions
To this point in the paper, we have not specified the program executions that are useful to build an execution graph (though any execution results in one). However, to prove a relationship with the control flow graph of the program, it is necessary to specify which executions are useful for this purpose. Intuitively, these executions are ones that do not include an attack, and more specifically, for which the return addresses are a reliable reflection of the intended execution of the underlying program. We refer to such executions as well-behaved.
More precisely, denote the execution of program P on input I by P(I). Input string I includes all inputs to the process running P since its initialization, and can include multiple "invocations" if program P is a server program. In this case, the multiple invocations of P are separated in I in a canonical way. The runtime process that executes P(I) maintains a call stack in conformance with certain conventions, induced via the function call and return code emitted by the compiler for the language. While we do not detail these conventions here, we expect that the return address of each stack frame is inserted when the function call occurs and is not modified until return from the function, at which time the stack frame is destroyed. We say that a program P is "well-behaved" on an input I if the execution P(I) conforms strictly to this expectation, i.e., return address fields in stack frames are modified only in this fashion, and stack frames are created only when function calls are made by the program P.

Definition 4.8 (Well-behaved executions). Program P is well-behaved on input I if execution P (I) maintains a call stack consisting of stack frames, one per active function call, and such that the return address in each stack frame is not modified while the corresponding function call is active. ✷
Of course, a common method of exploiting a vulnerable program P involves running P on an input I for which it is not well-behaved, i.e., that modifies a return address on the stack when the function call is still active.
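The well-behaved condition can be illustrated on a toy event trace. The sketch below is only an illustration: the event format ("call", "ret" and "overwrite" tuples) is hypothetical and stands in for whatever a real runtime would expose, and treating a return with no active call as a violation is an assumption.

```python
def is_well_behaved(events):
    """Check a toy trace against the expectation on return addresses.

    events: sequence of tuples in a hypothetical format:
      ("call", return_address)      a function call pushes a frame
      ("ret",)                      the innermost active call returns
      ("overwrite", depth, value)   a write to the return address slot of the
                                    frame at the given depth (0 = outermost)
    """
    stack = []  # return addresses of the active calls, as written at call time
    for event in events:
        kind = event[0]
        if kind == "call":
            stack.append(event[1])
        elif kind == "ret":
            if not stack:
                return False  # assumption: frames exist only for calls made
            stack.pop()
        elif kind == "overwrite":
            _, depth, value = event
            # Changing the return address of an active frame violates the condition.
            if depth < len(stack) and stack[depth] != value:
                return False
    return True


benign = [("call", 0x10), ("call", 0x20), ("ret",), ("ret",)]
attack = [("call", 0x10), ("overwrite", 0, 0xDEADBEEF), ("ret",)]
```

The second trace models the kind of input described above, in which a return address is changed while its function call is still active.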
The anomaly detector that we describe in this paper is assumed to be trained on the observed behaviors (emitted system calls) in executions $P(I_1), \ldots, P(I_k)$ where P is well-behaved on each $I_j$. In this way, the return addresses extracted from the stack (as in [3]) reflect the execution of the program. We denote these executions $P(I) = \{P(I_1), \ldots, P(I_k)\}$.

Properties of Execution Graphs
Recall that an execution graph is a model constructed by a gray-box technique. None of the previous gray-box techniques, to our knowledge, has been formally related to the control flow graph of the underlying program. The execution graph differs from these approaches in the sense that the language accepted by an execution graph can be directly related to the language accepted by the control flow graph of the underlying program. Moreover, this relationship can be proved analytically. This is a significant improvement since goals of many white-box techniques can now be achieved using gray-box techniques, i.e., without static analysis on the source code or binary.
Here we state two theorems relating the execution graph and the control flow graph of a program. Without loss of generality, we assume that the label of a call node in the control flow graph is the address of the instruction that immediately follows the function call or system call, which is easily obtained by static analysis of the binary. If this is not the case, e.g., if static analysis is applied on the source code, there is always a one-to-one mapping between the labels and these addresses. For convenience, we omit this mapping in the following theorems and the proofs in the Appendices.
Theorem 4.1. If P is a program that is well-behaved on input I, then $L_{eg(P(I))} \subseteq L_{cfg(P)}$.
The proof of this theorem is in Appendix A. Theorem 4.1 says that the language accepted by an execution graph is a subset of the language accepted by the control flow graph of the program, which is a property unavailable in most other gray-box techniques. It provides another level of confidence: if some execution is allowed by an execution graph, it is guaranteed that the execution is not only normal ("similar" to past executions), but also valid (allowed by the control flow graph). Such a property could only be achieved previously by white-box techniques.
Theorem 4.1 only says that $L_{eg(P(I))} \subseteq L_{cfg(P)}$. They are not equal because, e.g., the input I might not cover all possible executions of the program, in which case there is no way for eg(P(I)) to safely accept such a missing execution, even with the inductive definition in Definition 3.2.
Theorem 4.2 shows that if the execution graph were to be extended to allow any additional strings in the language, it could accept some intrusions that program P does not allow.
Theorem 4.2. Let I be a set of inputs, and eg(P(I)) be an execution graph where P is well-behaved on I. There exists a program $P'$, which is also well-behaved on I, such that $L_{cfg(P')} = L_{eg(P(I))}$.
Theorem 4.2 states that for any input I and the execution graph obtained on input I, there exists a program $P'$ which is well-behaved on I, such that the language accepted by the control flow graph of this program is the same as the language accepted by the execution graph. This means that the execution graph is the "accurate" model of some program $P'$. Since there exists such a program $P'$, if the execution graph were extended to accept any additional string in its language, it would allow an intrusion into the program $P'$. Informally, this means that the execution graph is a maximal graph given the set of inputs.
Please refer to Appendix B for a proof of Theorem 4.2.

PERFORMANCE EVALUATION
In this section we provide insight into the likely performance of our technique in an anomaly detection system. During program monitoring there are two tasks the anomaly detector needs to perform for each system call: (i) to walk through the stack frames and obtain all return addresses; (ii) to determine whether the current system call is allowed. We previously measured the cost of extracting program return addresses and found that for a Linux kernel compilation it adds less than 6% to the overall execution time. Therefore, extracting return addresses from the running process should introduce only moderate overhead.
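The first of the two per-system-call tasks, walking the stack frames to collect return addresses, can be sketched on a simulated memory image. The sketch assumes a conventional frame-pointer chain (each frame stores the caller's frame pointer, with the return address one word above it) and a zero frame pointer terminating the chain; these conventions, the function name and the example addresses are assumptions for illustration, not details of the prototype.

```python
def walk_return_addresses(memory, frame_pointer, word=4, max_frames=1024):
    """Collect return addresses by following saved frame pointers.

    memory:        dict mapping address -> stored word (a simulated memory image)
    frame_pointer: frame pointer of the innermost frame
    Returns the return addresses ordered from the outermost frame to the innermost.
    """
    addresses = []
    for _ in range(max_frames):           # guard against a corrupted, cyclic chain
        if frame_pointer == 0:            # assumption: zero terminates the chain
            break
        addresses.append(memory[frame_pointer + word])  # return address slot
        frame_pointer = memory[frame_pointer]           # caller's saved frame pointer
    addresses.reverse()
    return addresses


# Simulated two-frame stack with hypothetical addresses.
memory = {
    0x1000: 0x2000, 0x1004: 0x08048222,   # innermost frame
    0x2000: 0x0000, 0x2004: 0x08048111,   # outermost frame
}
stack_addresses = walk_return_addresses(memory, 0x1000)
```

The cost of such a walk grows with the depth of the call stack, which is consistent with the per-program differences discussed below.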
Second, we measure the time it takes to process system calls when using our execution graph model. We observe the executions of four common FTP and HTTP server programs, wu-ftpd, proftpd, Apache httpd, and Apache httpd with a chroot patch, and extract the execution graphs from them. Information, including return addresses, of every system call is recorded into log files, and subsequently processed to detect anomalies. We measure the time it takes to process these system calls by running the anomaly detector on a desktop computer with an Intel Pentium IV 2.2 GHz CPU. Results are shown in Table 1.
Although the average processing time per system call is very different for these four programs (due to the different number of functions in the program and consequently the different number of return addresses to be processed for each system call), results show that program monitoring is extremely efficient when using the execution graph model.

CONCLUSION
We introduce a new model of system call behavior for anomaly detection systems, called an execution graph. The execution graph is the first gray-box model that conforms to the control flow graph of the program. We show that an execution graph accepts only strings (defined as sequences of system calls and the active function calls when each system call occurs) that are consistent with the control flow graph of the program, and that it is maximal given a set of training data, i.e., any extension to the execution graph might allow some intrusions to go undetected. Finally, we provide evidence that program monitoring using the execution graph is very efficient.

A. PROOF OF THEOREM 4.1
We first prove the following lemmas. As stated in Section 4.3, without loss of generality, we assume that the label of any call node in the control flow graph is in fact the address of the instruction that immediately follows the call. If this is not the case, e.g., if the control flow graph is obtained by static analysis of the source code, there is always a one-to-one mapping between the labels and these addresses. For convenience, we omit this mapping in the following proofs.
In the following proofs, we use µ to denote the length of a function call or system call instruction. Since we assume that the label of any call node x in the control flow graph is the address of the instruction that immediately follows the call, x − µ represents the address of the corresponding call instruction. We use $F_v$ to denote the function in P that contains node v.
Lemma A.1. Let P be a program that is well-behaved on input I. Let $eg(P(I)) = (V, E_{call}, E_{crs}, E_{rtn})$ and $cfg(P) = (V', E')$, then $V \subseteq V'$.

Proof.
v ∈ V ∧ v is a leaf node
⇒ P is able to make a system call with system call num-
⇒ v is one of the return addresses observed when P makes a system call
⇒ (v − µ) is the address of a call instruction
⇒ (v − µ) corresponds to some function or system call

Notice that there could be v ∉ V while v ∈ V′, because input I does not necessarily cover all possible executions of P, and some executions allowed by cfg(P) might never appear in actual runs.
Lemma A.2. Let P be a program that is well-behaved on input I. If r1, r2, ..., rl, ... and r′1, r′2, ..., r′l, ... are two observations in P(I), such that for each 1 ≤ i < l, ri = r′i, and rl ≠ r′l, then for some function F ∈ P, rl and r′l are both in cfsg(F).
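As an illustration of the premise of Lemma A.2, the following Python sketch finds the first position at which two observations differ, on the reading that the lemma concerns two observations that agree before position l and differ at l. The function name `first_divergence` and the sample addresses are hypothetical, and observations are modeled simply as lists of return addresses.

```python
from typing import Optional, Sequence


def first_divergence(o: Sequence[int], o_prime: Sequence[int]) -> Optional[int]:
    """Return the 1-based position l such that o and o_prime agree on
    positions 1..l-1 and differ at position l. Return None if no such
    position exists within the shorter observation."""
    for l, (r, r_prime) in enumerate(zip(o, o_prime), start=1):
        if r != r_prime:
            return l
    return None


if __name__ == "__main__":
    # Hypothetical observations sharing a two-element prefix.
    o = [0x100, 0x200, 0x310]
    o_prime = [0x100, 0x200, 0x350, 0x400]
    l = first_divergence(o, o_prime)
    if l is not None:
        # Under Lemma A.2, these two addresses lie in the same function.
        print(l, hex(o[l - 1]), hex(o_prime[l - 1]))
```

The intuition matches the lemma: both addresses at position l are reached from the same call site at position l − 1, so they belong to the function invoked there.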

Proof.
r1, r2, ..., rl, ... and r′1, r′2, ..., r′l, ... are two observations
⇒ (rl − µ) is in a function that is called from (rl−1 − µ); (r′l − µ) is in a function that is called from (r′l−1 − µ)

For each 1 ≤ i < l, ri = r′i, rl ≠ r′l
⇒ rl and r′l are addresses in the same function (Lemma A.2)
⇒ any function call nodes on the path from r to r′ must form a (series of) call cycles (completed function calls)
⇒ there must be a path in cfg(P) from r to r′ that satisfies the claimed properties. ✷

Lemma A.4. Let P be a program that is well-behaved on input I. Let eg(P(I)) = (V, Ecall, Ecrs, Ertn) and cfg(P) = (V′, E′). If (r, r′) ∈ Ecall and r is not a leaf node, then there exists a sequence of nodes v1, v2, ..., vn in cfg(P) such that

Proof. According to Definition 3.2, (r, r′) ∈ Ecall results from at least one of the following three conditions. We prove Lemma A.4 under each of these three conditions.

o = r1, r2, ..., rl, ..., o′ = r′1, r′2, ..., r′l, ..., r, r′, ..., and for each 1 ≤ i < l, ri = r′i and (rl, r′l) ∈ Ecrs
⇒ after the system call corresponding to o is executed, execution has to return to cfsg(Frl), then follow the path described in Lemma A.3, and subsequently enter cfsg(Fr) and cfsg(Fr′) in order to make the system call that corresponds to o′
⇒ the instruction at (r − µ) calls function Fr′, and there must be a path in cfg(P) from Fr′.enter to r′ that satisfies the claimed properties.
• (First induction of Definition 3.2) Given (x0, x1) ∈ Ecall, x1 crs → x2, (x2, x3) ∈ Ertn, x3 = r and x1 = r
(x0, x1) ∈ Ecall
⇒ there exists a path in cfg(P) from Fx1.enter to x1 that satisfies the claimed properties. (Base case in this proof)
Since we have already found the path from Fx1.enter to x1 that satisfies the claimed properties, it only remains to prove that (x3,

- When x3 = r and x1 = r
(x0, x1) ∈ Ecall
⇒ there exists a path in cfg(P) from Fx1.enter to x1 that satisfies the claimed properties. (Base case in this proof)
Since we have already found the path from Fx1.enter to x1 satisfying the claimed properties, it only remains to prove that (x3,

Proof.
s′ is a successor of s
⇒ there exists an integer k such that rm rtn → rk, (rk, r′k) ∈ Ecrs, r′k call → r′m′ and for each 1 ≤ i < k, ri = r′i (Definition 3.5)
⇒ there exist three paths: from rm−1 to rk (Lemma A.5), from rk to r′k (Lemma A.3), and from r′k to r′m′−1 (Lemma A.4)
⇒ connecting the above three paths together forms the sequence of nodes with the claimed properties.

Proof. According to Lemma A.6, there exists a path β0 = main.enter, ..., r1,1, Fr1,2.enter, ..., r1,2, ..., r1,m1−1 in cfg(P). Since for each 1 ≤ i < m1, (r1,i, r1,i+1) ∈ Ecall (Definition 3.6), r1,m1−1 is the only system call node on β0 (Lemma A.4).

B. PROOF OF THEOREM 4.2
To show the existence of such a program P′, we (i) build a graph G′ from the execution graph eg(P(I)); (ii) show that G′ is the control flow graph of some program P′ that is well-behaved on input I, i.e., cfg(P′) = G′; and (iii) show that Lcfg(P′) = Leg(P(I)).
Definition B.1 (E2G). The operation E2G takes as input an execution graph eg(P(I)) = (V, Ecall, Ecrs, Ertn) and performs the following operations:

Operation E2G returns the graph G′. ✷

In the above definition, M is the set of nodes that represent addresses in main(). With Definition B.1, we are done with the first step in our proof. The next step is to prove that cfg(P′) = G′ for some program P′ that is also well-behaved on input I.

Lemma B.1. If graph G′ = (V′, E′) is the output of operation E2G on an execution graph eg(P(I)), then there exists some program P′ which is well-behaved on input I, such that cfg(P′) = G′.
Though we do not provide the proof of Lemma B.1, the following is the intuition. From Definition B.1, one can notice that graph G′ contains a set of subgraphs, which are connected by directed edges from a function call node to the entry node of the function subgraph, and from the exit node of the function subgraph back to the same function call node. In addition, each subgraph contains function call nodes and system call nodes, as well as one entry node and one exit node. Given this graph G′, a programming language such as C or C++ can be used to implement each subgraph as a function, and the entire graph G′ as a program P′. If implemented correctly, the resulting program P′ will be well-behaved on input I, and the control flow graph of P′ will be the same as G′.
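The structure described in this intuition can be sketched in Python. This is a minimal illustration and not the E2G operation itself: the class `FunctionSubgraph`, the function `inter_function_edges`, the node labels, and the `.exit` naming are all hypothetical. The sketch only builds the edges that connect subgraphs, i.e., from each function call node to the callee's entry node and from the callee's exit node back to the same call node.

```python
from dataclasses import dataclass, field


@dataclass
class FunctionSubgraph:
    """One function subgraph with a single entry node and a single exit node."""
    name: str
    # Maps a function call node label to the name of the function it calls.
    call_nodes: dict = field(default_factory=dict)
    # Labels of the system call nodes inside this function.
    syscall_nodes: set = field(default_factory=set)

    @property
    def entry(self) -> str:
        return f"{self.name}.enter"

    @property
    def exit(self) -> str:
        return f"{self.name}.exit"


def inter_function_edges(functions):
    """Build the edges between subgraphs: call node -> callee entry,
    and callee exit -> the same call node."""
    by_name = {f.name: f for f in functions}
    edges = set()
    for f in functions:
        for node, callee in f.call_nodes.items():
            target = by_name[callee]
            edges.add((node, target.entry))
            edges.add((target.exit, node))
    return edges


if __name__ == "__main__":
    # Toy graph: main() calls f() at call node "c1"; each makes one system call.
    main = FunctionSubgraph("main", call_nodes={"c1": "f"}, syscall_nodes={"s0"})
    f = FunctionSubgraph("f", syscall_nodes={"s1"})
    print(sorted(inter_function_edges([main, f])))
```

Each `FunctionSubgraph` in such a representation would correspond to one function in the implemented program, which is the step the intuition above relies on.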
The last step in our proof of Theorem 4.2 is to show that Lcfg(P′) = Leg(P(I)), where cfg(P′) = G′ = (V′, E′). To prove this we need to show that (i) Leg(P(I)) ⊆ Lcfg(P′), and (ii) Lcfg(P′) ⊆ Leg(P(I)). The proof of (i) is very similar to the proof of Theorem 4.1 and is omitted from this paper. We only show the important lemmas for the proof of (ii). Notice that the difference between these two proofs and those in Appendix A is that here V′ and E′ are given as in Definition B.1, whereas in Theorem 4.1 they are not given.
Lemma B.2. Let P be a program that is well-behaved on input I, and E2G(eg(P(I))) = G′. If π is an observable path in G′, then there exists an execution stack s in eg(P(I)) such that s = A(π).