Model Checking C++ Programs

In the last three decades, memory safety issues in system programming languages such as C or C++ have been one of the significant sources of security vulnerabilities. However, there exist only a few attempts with limited success to cope with the complexity of C++ program verification. Here we describe and evaluate a novel verification approach based on bounded model checking (BMC) and satisfiability modulo theories (SMT) to verify C++ programs formally. Our verification approach analyzes bounded C++ programs by encoding into SMT various sophisticated features that the C++ programming language offers, such as templates, inheritance, polymorphism, exception handling, and the Standard C++ Libraries. We formalize these features within our formal verification framework using a decidable fragment of first-order logic and then show how state-of-the-art SMT solvers can efficiently handle that. We implemented our verification approach on top of ESBMC. We compare ESBMC to LLBMC and DIVINE, which are state-of-the-art verifiers to check C++ programs directly from the LLVM bitcode. Experimental results show that ESBMC can handle a wide range of C++ programs, presenting a higher number of correct verification results. At the same time, it reduces the verification time if compared to LLBMC and DIVINE tools. Additionally, ESBMC has been applied to a commercial C++ application in the telecommunication domain and successfully detected arithmetic overflow errors, potentially leading to security vulnerabilities.


INTRODUCTION
Formal verification techniques can substantially improve software reliability as security becomes a major concern [1]. For more than 30 years, memory safety issues in system programming languages such as C or C++ have been among the major sources of security vulnerabilities [2]. For instance, the Microsoft Security Response Center reported that approximately 70% of their security vulnerabilities every year are due to memory safety issues in their C and C++ code [3]. Beyond memory safety, undefined behavior (e.g., signed integer overflow) is another crucial source of errors that could potentially lead to security issues [4].
Software verification plays an essential role in ensuring overall product reliability. Over the last 15 years, formal techniques have evolved dramatically [5], their adoption in industry has been growing [6][7][8], and several tools to formally verify C programs have been proposed [9]. However, there exist only a few attempts with limited success to cope with the complexity of C++ program verification [10][11][12][13][14][15]. The main challenge here is to support sophisticated features that the C++ programming language offers, such as templates, sequential and associative template-based containers, strings & streams,

Bounded Model Checking
In BMC, the program to be analyzed is modeled as a state transition system, which is extracted from the control-flow graph (CFG) [33]. This graph is built as part of a translation process from program code to static single assignment (SSA) form. A node in the CFG represents either a (non-)deterministic assignment or a conditional statement, while an edge in the CFG represents a possible change in the program's control location. Given a transition system M, a property φ, and a bound k, BMC unrolls the system k times and translates it into a verification condition (VC) ψ, such that ψ is satisfiable if and only if φ has a counterexample of length k or less [16]. The associated model checking problem is formulated by constructing the following logical formula:

$$\psi = I(s_0) \wedge \bigvee_{j=0}^{k} \left( \bigwedge_{i=0}^{j-1} T(s_i, s_{i+1}) \wedge \neg\phi(s_j) \right) \qquad (1)$$

where φ is a safety property, I is the set of initial states of M, and T(s_i, s_{i+1}) is the transition relation of M between steps i and i + 1. Hence, I(s_0) ∧ ⋀_{i=0}^{j−1} T(s_i, s_{i+1}) represents the executions of M of length j, and the formula (1) can be satisfied if and only if, for some j ≤ k, there exists a reachable state at step j in which φ is violated. If the formula (1) is satisfiable, then the SMT solver provides a satisfying assignment, from which we can extract the values of the program variables to construct a counterexample. A counterexample for a property φ is a sequence of states s_0, s_1, ..., s_k with s_0 ∈ S_0 and T(s_i, s_{i+1}) with 0 ≤ i < k.
If the formula (1) is unsatisfiable, we can conclude that no error state is reachable in k steps or less. In this case, BMC techniques are not complete because there might still be a counterexample that is longer than k. Completeness can only be ensured if we know an upper bound on the depth of the state space, that is, if we can ensure that we have already explored all the relevant behavior of the system and that searching any deeper only exhibits states that have already been verified [34].
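The bounded search described above can be illustrated with a small explicit-state sketch. It is only an analogy: ESBMC encodes the unrolled system as an SMT formula rather than enumerating states, and the names `State`, `Successors`, `Property`, and `bounded_check` are hypothetical helpers introduced here for illustration.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <utility>
#include <vector>

// Hypothetical toy encoding: a state is an int, T is a successor function,
// and phi is a predicate that must hold in every reachable state.
typedef int State;
typedef std::function<std::vector<State>(State)> Successors;
typedef std::function<bool(State)> Property;

// Returns the smallest j <= k such that a state reachable in exactly j steps
// violates phi, or -1 if no violation exists within k steps.
int bounded_check(const std::set<State>& init, const Successors& next,
                  const Property& phi, int k) {
    std::set<State> frontier = init;          // states reachable in j steps
    for (int j = 0; j <= k; ++j) {
        for (State s : frontier)
            if (!phi(s)) return j;            // counterexample of length j
        std::set<State> successors;
        for (State s : frontier)
            for (State t : next(s)) successors.insert(t);
        frontier = std::move(successors);     // unroll one more step
    }
    return -1;                                // nothing found up to bound k
}
```

A result of -1 only says that no violation occurs within the bound; as noted above, a longer counterexample may still exist unless the bound is known to cover the whole state space.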

STATIC TYPE CHECKING OF C++ PROGRAMS
The first steps when verifying C++ programs are the source-code parser and the type-checker, which are language-specific in ESBMC (see Fig. 1). For C++, the parser is heavily based on the GNU C++ Compiler (GCC) [39], which allows ESBMC to find and report most of the syntax errors already reported by GCC. Type-checking provides all information used by the model; thus, a better type-checker means it is possible to model more programs. The code is statically analyzed during type-checking, including assignment checks, type-cast checks, pointer initialization checks, and function call checks. Furthermore, ESBMC handles three major C++ features during type-checking: template instantiation (i.e., after type-checking, all referenced templates are instantiated with concrete types), compile-time and runtime polymorphism, and inheritance (i.e., it replicates the methods and attributes of the base classes to the derived class, which will have direct access to them). By the end of the type-check, the Intermediate Representation (IR) creation is completed and used by the GOTO converter to generate the GOTO program. The verification of C programs is slightly different as it uses clang as a front-end to parse and type-check the program, as described in our previous work [19,20]; the output, however, is the same: a type-checked IR.
The GOTO converter converts the type-checked IR into GOTO expressions; this conversion simplifies the IR of the original program (e.g., replacing switch and while with if and goto statements). The symbolic engine converts the GOTO program into SSA form [40] by unrolling loops up to bound k. Assertions are inserted into the resulting SSA expressions to verify memory-safety properties (e.g., array out-of-bounds access, arithmetic under- and overflow, memory leaks, double frees, and division by zero). Also, most of the exception handling is carried out in this step, such as the search for a valid catch, the assignment of a thrown object to a valid catch object, and the replacement of throw statements by GOTO expressions and exception specs for function calls (cf. Section 5).
Finally, two sets of quantifier-free formulae are created based on the SSA expressions: C for the constraints and P for the properties, as previously described. The two sets of formulae will be used as input for an SMT solver that will produce a counterexample if there exists a violation of a given property, or an unsatisfiable answer if the property holds.

Template Instantiation
Templates are not runtime objects [41]. When a C++ program is compiled, classes and functions are generated from templates. Those templates are removed from the final executable. ESBMC has a similar process in which templates are only used until the type-checking phase, where all templates are instantiated and the classes and functions are generated. Any instantiated functions and classes are no longer templates. Hence, at the end of the type-checking phase, all templates are completely discarded. In ESBMC, the entire verification process of C++ programs, which make use of templates, is essentially split into two steps: creation of templates and template instantiation. The creation of templates is straightforward. It happens during the parsing step when all generic data types of the generated C++ IR are properly marked as generic and each specialization is paired with its corresponding primary template. No instantiated function or class is created during parsing because ESBMC does not know which template types will be instantiated. A template instantiation happens when a template is used, instantiated with data types (e.g., int, float, or string). ESBMC performs an in-depth search in the C++ IR during the type-checking process to trigger all instantiations. When a template instantiation is found, ESBMC firstly identifies which type of template it is dealing with (i.e., either class or function template) and which template arguments are used. It then searches whether an IR of that type was already created, i.e., whether its arguments have been previously instantiated. If so, no new IR is created; this avoids duplicating the IR, thus reducing the memory requirements of ESBMC. If there is no IR of that type, a new IR is created, used in the instantiation process, and saved for future searches. 
To create a new IR, ESBMC must select the most specialized template for the set of template arguments; therefore, ESBMC performs another search in the IR to select the proper template definition. ESBMC then checks whether there is a (partial or explicit) template specialization matching the set of data types in the instantiation. If ESBMC does not find any template specialization that matches the template arguments, it will select the primary template definition. Once the most specialized template is selected, ESBMC performs a transformation to replace all generic types with the data types specified in the instantiation; this transformation is necessary because, as stated previously, at the end of the C++ type-checking phase, all templates are removed.

To demonstrate the instantiation process in ESBMC concretely, Fig. 2 illustrates an example of function template usage, which is based on the example spec29 extracted from the GCC test suite.¹ The first step, the template creation, happens when the declaration of a template function (lines 5-19) is parsed. At this point, the generic IR of the template is created with a generic type. The second step, template instantiation, happens when the template is used. In Fig. 2, the template is instantiated twice (lines 23 and 24). The type can be determined either implicitly (line 23) or explicitly (line 24). In implicit instantiation, the data type is determined by the types of the arguments used in the call. In contrast, in explicit instantiation, the data type is determined by the type passed between the < and > symbols. Fig. 3 illustrates the generic IR and the instantiated IRs generated from the code in Fig. 2. Fig. 3a illustrates the generic IR generated from the qCompare function template and its specialization, while Fig. 3b illustrates the IRs instantiated with the types float (line 23) and int (line 24). The function body is omitted in this figure, but it follows the same instantiation pattern.
The generic IR is built with the function name, which is used as a key for future searches, together with the IR's arguments and return type, as can be seen in Fig. 3a. Note that the data type is labeled as generic, i.e., it has not yet been bound to a concrete type. In Fig. 3b, the data types that were previously labeled as generic are now labeled as float for the first instantiation and int for the second instantiation, which means that these instantiated IRs are not templates anymore and will not be removed at the end of the type-check phase. Finally, as described earlier, at the end of the type-check phase, the generic IR illustrated in Fig. 3a is discarded.

After the template instantiation, the verification process resumes, as described by Cordeiro et al. [42]. ESBMC is currently able to handle the verification of C++ programs with template functions, class templates, and (partial and explicit) template specialization, according to the C++03 standard [43]. The implementation of template instantiation in ESBMC is based on the formalization previously presented by Siek and Taha [44], who introduced the first proof of type safety of the template instantiation process for C++03 programs.
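The two steps above (template creation, then implicit and explicit instantiation) can be seen in a minimal sketch. The name qCompare and the float and int instantiations come from the text; the function bodies, the C-string specialization, and the wrapper functions are assumptions made for illustration and are not the code of Fig. 2.

```cpp
#include <cassert>
#include <cstring>

// Primary template: creates only a generic definition, no concrete function.
template <typename T>
int qCompare(const T& a, const T& b) {
    if (a < b) return -1;
    if (b < a) return 1;
    return 0;
}

// Explicit specialization (assumed here for C strings): it is preferred over
// the primary template whenever T is exactly const char*.
template <>
int qCompare<const char*>(const char* const& a, const char* const& b) {
    int r = std::strcmp(a, b);
    return (r > 0) - (r < 0);
}

// Implicit instantiation: T is deduced as float from the arguments.
int implicit_float() { return qCompare(1.5f, 2.5f); }

// Explicit instantiation: T is given between < and >.
int explicit_int() { return qCompare<int>(7, 3); }

// Selects the specialization because both arguments have type const char*.
int specialized() {
    const char* x = "abc";
    const char* y = "abd";
    return qCompare(x, y);
}
```

After instantiation, only the three concrete functions are needed; the generic definition plays no further role, which mirrors the removal of the generic IR at the end of type-checking.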

Inheritance
In contrast to Java, which only allows single inheritance [45], where derived classes have only one base class, C++ also allows multiple inheritance, where a class may inherit from one or more unrelated base classes [46]. This particular feature makes C++ programs harder to model check than programs in other object-oriented programming languages (e.g., Java) since it disallows the direct transfer of techniques developed for other, simpler programming languages [47,48]. Multiple inheritance in C++ includes features that raise interesting challenges for model checking, such as repeated and shared inheritance of base classes, object identity distinction, and dynamic dispatch [49].
In ESBMC, inheritance is handled by replicating the methods and attributes of the base classes to the derived class, obeying the rules of inheritance defined in the C++03 standard [43]. In particular, we follow these specifications to handle multiple inheritance and avoid issues such as name clashing when replicating the methods and attributes. For example, if two or more base classes implement a method that is not overridden by the derived class, every call to this method must specify which inherited "version" it refers to. The rules are checked in the type-check step of the verification (cf. Section 3).

The relationship between classes can be described formally by a class hierarchy graph. This graph is represented by a triple ⟨C, ≺_s, ≺_r⟩, where C is the set of classes, ≺_s ⊆ C × C refers to shared inheritance edges (i.e., if there exists a path from class X to class Y whose first edge is virtual), and ≺_r ⊆ C × C are replicated inheritance edges (i.e., if a class inherits from a base class that does not contain virtual methods). We also define the set of all inheritance edges ≺_sr = ≺_s ∪ ≺_r. Thus, (C, ≤_sr) is a partially ordered set [50] and ≤_sr is anti-symmetric (i.e., if one element A of the set precedes B, the opposite relation cannot exist).

Importantly, during the replication process of all methods and attributes from the base classes to the derived ones, the inheritance model considers the access specifiers related to each component (i.e., public, protected, and private) and its friendship [46]; therefore, we define two rules to deal with such restrictions: (i) only public and protected class members from base classes are joined in the derived class and (ii) if class X ∈ C is a friend of class Y ∈ C, all private members in class X are joined in class Y.
As an example, Fig. 4 shows a UML diagram that represents the Vehicle class hierarchy, which contains multiple inheritance. The replicated inheritance in the JetCar class relation can be formalized by ⟨C, ∅, {(JetCar, Car), (JetCar, Jet)}⟩. ESBMC creates an intermediate model for single and multiple inheritance, handling replicated and shared inheritance, where all classes are converted into structures and all methods and attributes of their parent classes are joined. This approach has the advantage of having direct access to the attributes and methods of the derived class and thus allows an easier validation, as the tool does not search for attributes or methods from base classes on each access. However, we replicate information to any new class, thus wasting memory resources.
In addition, we also support indirect inheritance, where a class inherits features from a base class to which it is not directly connected, i.e., through one or more intermediate classes. Indirect inheritance is automatically handled due to our replication method: any derived class will already contain all methods and attributes from its base classes, which will be replicated to any class that derives from it. In Fig. 4, we have JetCar ≤_sr Car and Car ≤_sr Vehicle. Thus, the JetCar class can access features from the Vehicle class, although they are not directly connected.
In object-oriented programming, the use of shared inheritance is very common [46]. In contrast to other approaches (e.g., the one proposed by Blanc, Groce, and Kroening [12]), ESBMC is able to verify this kind of inheritance. A pure virtual class does not implement any method and, if an object tries to create an instance of a pure virtual class, ESBMC will fail with a CONVERSION ERROR message (since it is statically checked during type-checking).
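The name-clash rule and indirect inheritance discussed in this section can be illustrated with a short sketch. The class names follow Fig. 4 as described in the text; the member functions and the assumption that Jet has no base class are hypothetical and chosen only to keep the example small.

```cpp
#include <cassert>

struct Vehicle {
    int id() const { return 1; }             // reachable from JetCar indirectly
};

struct Car : Vehicle {
    int top_speed() const { return 200; }
};

struct Jet {                                  // assumed to have no base class here
    int top_speed() const { return 900; }
};

// Replicated (non-virtual) multiple inheritance: JetCar receives the members
// of both Car and Jet. Because JetCar does not override top_speed(), an
// unqualified call would be ambiguous and must name the inherited version.
struct JetCar : Car, Jet {
    int road_speed() const { return Car::top_speed(); }
    int air_speed() const { return Jet::top_speed(); }
};
```

A call such as jc.Car::top_speed() on a JetCar object jc selects one inherited version explicitly, while jc.id() resolves without qualification because only one path leads to Vehicle in this sketch.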

Polymorphism
In order to handle polymorphism, i.e., allowing variable instances to be bound to references of different types, related by inheritance [51], ESBMC implements a virtual function table (i.e., vtable) mechanism [52]. When a class defines a virtual method, ESBMC creates a vtable, which contains a pointer to each virtual method in the class. If a derived class does not override a virtual

Consider the program in Fig. 5, which contains a simplified version of the class hierarchy presented in Fig. 4. In the program, a class Vehicle is the base for two classes, Motorcycle and Car. The class Vehicle defines a pure virtual method number_of_wheel(), and both classes Motorcycle and Car implement the method, returning 2 and 4, respectively. The program creates an instance of Motorcycle or Car, depending on a nondeterministic choice, and assigns the instance to a Vehicle pointer object v. Finally, through the polymorphic object v, the program calls number_of_wheel() and checks the returned value. We omit a call to delete (that would free the pointer v) to simplify the GOTO instructions.

Fig. 6a shows the GOTO program (resulting from the type-checking phase) generated for the program in Fig. 5. Note that, when building the polymorphic object v, the vtable's pointer for the method number_of_wheel() is first assigned a pointer to the method number_of_wheel() in class Vehicle (see lines 10 and 17 in Fig. 6a); this happens because the constructors for both Car and Motorcycle first call the base constructor in the original program (see lines 13 and 20 in Fig. 5). The pointers are then assigned the correct method address (see lines 12 and 19 in Fig. 6a) in the constructors of the derived classes, i.e., Motorcycle and Car, respectively.
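A program of the kind described for Fig. 5 might look like the following sketch. It is a reconstruction from the prose, not the figure itself: the boolean parameter `choice` stands in for the nondeterministic value, the helper name `wheels_for` is hypothetical, and unlike the original program this version does call delete.

```cpp
#include <cassert>

class Vehicle {
public:
    virtual ~Vehicle() {}
    virtual int number_of_wheel() const = 0;   // pure virtual method
};

class Motorcycle : public Vehicle {
public:
    int number_of_wheel() const { return 2; }
};

class Car : public Vehicle {
public:
    int number_of_wheel() const { return 4; }
};

// 'choice' plays the role of the nondeterministic choice in the text.
int wheels_for(bool choice) {
    Vehicle* v;
    if (choice)
        v = new Motorcycle();   // base constructor runs first, then Motorcycle's
    else
        v = new Car();
    int n = v->number_of_wheel();   // dispatched through the vtable
    delete v;
    return n;
}
```

Whichever branch is taken, the call through v reaches the overriding method of the dynamic type, which is the behavior the vtable assignments in the derived-class constructors are meant to reproduce.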
In the SSA form shown in Fig. 6b, every branch creates a separate variable, which are then combined when the control-flow merges. In Fig. 6b

C++ OPERATIONAL MODEL
The C++ programming language offers a collection of libraries, called STL, to provide most of the functionalities required by a programmer [43]. However, the direct inclusion of the STL into the verification process over-complicates the analysis of C++ programs, as it contains code fragments not relevant for verification (e.g., optimized assembly code) [15,21]. Moreover, its implementation is based on a pointer structure that degrades the verification performance [12]. In particular, existing BMC tools adopt two different memory models: a fully byte-precise [10] or an object-based [53,54] memory model. Note that BMC tools reduce bounded program traces to a decidable fragment of first-order logic, which requires us to eliminate pointers in the model checker. They use static analysis to approximate, for each pointer variable, the set of data objects (i.e., memory chunks) to which it might point at some stage in the program execution. For a fully byte-precise memory model, BMC tools treat all memory as a single byte array, upon which all pointer accesses are decomposed into byte operations. This can lead to performance problems due to the repeated updates to the memory array that need to be reflected in the SMT formula. For an object-based memory model, performance suffers if pointer offsets cannot be statically determined, e.g., if a program reads a byte from an arbitrary offset into a structure. The resulting SMT formula is large and unwieldy, and its construction is error-prone.
To reduce verification complexity, ESBMC uses an abstract representation of the STL, called the C++ Operational Model (COM), which adds function contracts [55] (i.e., pre- and post-conditions). A related approach [12] has been used to verify preconditions on programs. However, ESBMC extends that approach by also checking the post-conditions, which improves its effectiveness, as shown in our experimental evaluation (cf. Section 6). Fig. 7a shows a code snippet considered as the best-accepted answer for a Stack Overflow question.¹ Nevertheless, line 10 could lead to an out-of-bounds violation (CWE-125 vulnerability) [56]. ESBMC detects the erroneous state through the operational model for vector::operator[] (see Fig. 7b), which contains an assertion to check for out-of-bounds accesses. The model also keeps track of the values stored in the container using a buffer (buf), so it also guarantees the post-condition for the operator, i.e., it returns a reference to the element at the specified location i.

Our COM mimics the structure of the STL, as shown in Table I. All ANSI-C libraries are natively supported by ESBMC, as described by Cordeiro et al. [17]. For all libraries under categories General, Language Support, Numeric, and Localization, COM adds pre-conditions extracted directly from documentation [43], specifically designed to detect memory-safety violations (e.g., nullness and out-of-bounds checks).
One of the challenges of modeling COM is the support for containers, strings, and streams, which requires the injection of pre- and post-conditions to check for functional properties correctly, as shown in the example illustrated in Fig. 7b (cf. the pre-conditions in lines 4-5). In this specific example, we check the vector upper and lower bounds before retrieving its content to detect an out-of-bounds read in line 10 of Fig. 7a. COM models sequential and associative containers along with their iterators. In particular, libraries list, bitset, deque, vector, stack, and queue belong to the sequential group, while libraries map, multimap, set, and multiset belong to the associative group. COM models string and stream objects as arrays of bytes to properly encode them using the theory of arrays (cf. Section 2.2); therefore, string and all Stream I/O libraries also belong to the sequential group.
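The idea of an operational model with a contract on operator[] can be sketched as follows. This is not the model of Fig. 7b: the class name `model_vector`, the fixed capacity, and the restriction to int elements are simplifying assumptions; only the bounds assertion and the backing buffer buf are taken from the description above.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-in for an operational model of
// a vector of int. A real model would be generic and cover the full API.
class model_vector {
    static const int CAPACITY = 16;   // assumed fixed bound for the sketch
    int buf[CAPACITY];                // tracks the stored values
    int size_;
public:
    model_vector() : size_(0) {}

    void push_back(int v) {
        assert(size_ < CAPACITY);     // stay within the modeled buffer
        buf[size_++] = v;
    }

    int size() const { return size_; }

    int& operator[](int i) {
        // Pre-conditions: lower and upper bound checks on the index.
        assert(0 <= i);
        assert(i < size_);
        // Post-condition by construction: a reference to the element at i.
        return buf[i];
    }
};
```

With such a model in place of the library implementation, an out-of-bounds read becomes a failing assertion that a verifier can report, instead of undefined behavior buried in optimized library code.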

Core Language
The gist of COM enables ESBMC to encode features of standard containers, strings, and streams using the theory of arrays T_A. To properly formalize the verification of our model, we extend the previous core container language presented by Ramalho et al. [21] to include a representation for keys, which allows us to reason about associative containers as well. The core language defines the syntactic domains values V, keys K, iterators I, pointers P, containers C, and integers N as follows:

Here, v, k, p, i, c, and n are classes of variables of type V, K, P, I, C, and N, respectively. For iterators, we use the notation *i_v to denote the value stored in the memory location i_v. Based on such domains, we also define P(+ | −)P as valid pointer operations and N(+ | * | ...)N as valid integer operations. Each operation shown in the core container syntax (e.g., C.insert(I, V)) is explained in Sections 4.2 and 4.3.
All methods from the sequential and associative groups can be expressed as combinations/variations of three main operations: insertion (C.insert(I, V)), deletion (C.erase(I)), and search (C.search(V)). Each operation is described in our model as a Hoare triple {P} C {Q} that represents the function contract scheme implemented by COM. Normally, all side effects would be stated in the post-condition Q for verification. However, as part of the SSA transformation, side effects on iterators and containers are made explicit. Operations return new iterators and containers with the same contents, except for the fields that have just been updated. Thus, the translation function C contains primed variables (e.g., c′ and i′) to represent the state of model variables after the respective operation. Finally, all models take advantage of the memcpy pattern through lambda terms [37], which enables us to describe array operations over multiple indices in a clear and concise manner (cf. Section 2.2).

Sequential Containers
Sequential containers are built into a structure to store elements in a sequential order [46]. In our model, a sequential container c consists of a pointer c_v that points to a valid memory location and an integer size that stores the number of elements in the container. Similarly, an iterator i is modeled using two variables: an integer i_pos, which contains the index value of the container pointed to by the iterator, and a pointer i_v, which points to the memory location referred to by the iterator. In our model, the defined notation *i is equivalent to select(i_v, i_pos). Fig. 8 gives an overview of our abstraction for all sequential containers.

The statement c.insert(i, v) becomes (c′, i′) = c.insert(i, v), which increases the container size, moves all elements from position i.pos one memory unit forward, and then inserts v into the specified position. Therefore,¹ that induces the following pre- and post-conditions, where null represents an uninitialized pointer/object. Thus, we define as pre-conditions P that v and i cannot be uninitialized objects and that i.pos must be within the bounds of c′.c_v; similarly, we define as post-conditions Q that v was correctly inserted in the position specified by i and that c′.c_v and i′.i_v are equivalent, i.e., both point to the same memory location. Importantly, we implement the memory model for containers essentially as arrays; therefore, the range to select elements from memory varies from 0 to c.size − 1. Furthermore, the main effect of the insert method is captured by Eq. (2), which describes the contents of the container array c′.c_v after the insertion in terms of update operations to the container array c.c_v before the insertion.
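The insertion contract can be mimicked on a plain array, as in the sketch below. It is an executable illustration rather than the SMT encoding: the struct names, the capacity MAX, the in-place update (instead of returning a primed copy c′), and the choice to allow insertion at position c.size are all assumptions made here.

```cpp
#include <cassert>

const int MAX = 8;                    // assumed capacity of the backing array

struct Container { int cv[MAX]; int size; };   // c_v and size from the text
struct Iterator  { int* iv; int pos; };        // i_v and i_pos from the text

// Models (c', i') = c.insert(i, v); c is updated in place and i' is returned.
Iterator seq_insert(Container& c, Iterator i, int v) {
    // Pre-conditions: the iterator refers to this container and is in range.
    assert(i.iv == c.cv);
    assert(0 <= i.pos && i.pos <= c.size);
    assert(c.size < MAX);

    // Move every element from i.pos onward one memory unit forward.
    for (int n = c.size; n > i.pos; --n)
        c.cv[n] = c.cv[n - 1];
    c.cv[i.pos] = v;
    c.size += 1;

    Iterator result = { c.cv, i.pos };
    // Post-conditions: v sits at the requested position and the returned
    // iterator points into the container's memory.
    assert(result.iv == c.cv);
    assert(result.iv[result.pos] == v);
    return result;
}
```

Inserting 3 at position 2 of the container holding 1, 2, 4 yields 1, 2, 3, 4 and an iterator whose position is still 2, which matches the informal description of the operation.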
The erase method works similarly to the insert method. It uses iterator positions, integer values, and pointers, but it does not use values since the exclusion is made by a given position, regardless of the value. It also returns an iterator position (i.e., i′), pointing to the position immediately after the erased part of the container [43]. Therefore, that induces the following pre- and post-conditions, where we assume as pre-conditions P that i must be a valid iterator pointing to a position within the bounds of array c.c_v and c must be non-empty; similarly, we assume as post-conditions Q that i′ must point to the element immediately after the erased one and that c′.c_v and i′.i_v point to the same memory location.

Finally, a container c with a call c.search(v) performs a search for an element v in the container. Then, if such an element is found, it returns an iterator that points to the respective element; otherwise, it returns an iterator that points to the position immediately after the last container's element (i.e., select(c′.c_v, c′.size)). Hence, that induces the following pre- and post-conditions, where we assume as pre-conditions P that v and c cannot be uninitialized objects; similarly, we assume as post-conditions Q that c′ is equivalent to its previous state c, that c′.c_v and i′.i_v point to the same memory location, and that i′ must point to the found element or to select(c′.c_v, c′.size).

¹ Note that SMT theories only have a single equality predicate (for each sort). However, here we use the notation ":=" to indicate an assignment of nested equality predicates on the right-hand side of the formula.

Associative Containers
Associative containers consist of elements with a key k and a value v, where each value is associated with a unique key. All elements are internally sorted by their keys based on a strict weak ordering rule [43]. In our model, an associative container c consists of a pointer c_v, for the container's values, a pointer c_k, for the container's keys, and an integer size, for the container's size. Fig. 9 gives an overview of our abstraction for all associative containers. The relationship between c_k and c_v is established by an index; thus, an element in a given position n in c_k (i.e., select(c.c_k, n)) is the key associated with the value in the same position n in c_v (i.e., select(c.c_v, n)). Similarly, iterators for associative containers consist of a pointer i_k that points to the same memory location as c_k, a pointer i_v that points to the same memory location as c_v, and an integer i_pos that indexes both i_k and i_v.

All operations for associative containers can be expressed as a simplified variation of the three main ones, i.e., insertion (C.insert(K, V)), deletion (C.erase(I)), and search (C.search(K)). The order of keys matters in the insertion operation for associative containers. Therefore, given a container c, the method call c.insert(k, v) inserts the value v associated with the key k in the right order (i.e., obeying a strict weak ordering rule). Here, we use the operator ≺ to represent precedence; thus, x ≺ y means x precedes y. The insertion returns an iterator that points to the inserted position. However, if k exists, the insertion is not performed and the method returns an iterator that points to the existing element. We check three cases, which correspond to each ite condition: (i) the empty case first, then (ii) whether each position contains a corresponding key, or (iii) whether we should insert the value based on its precedence. Thus, that induces the following pre- and post-conditions, where we assume as pre-conditions P that v and k must be initialized objects and that the order of elements obeys a strict weak ordering rule. Similarly, we assume as post-conditions Q that the iterator i′ will point to the container c′ and that the strict weak ordering rule will be maintained. We also check whether the size of the container will grow if the key k was not used before; however, this check is bypassed for containers that allow multiple keys.

Remove operations are represented by c.erase(i), where i is an iterator that points to the element to be removed. Similarly to sequential containers (cf. Section 4.2), the model for such an operation basically shifts backwards all elements that follow that specific position i. Thus, that induces the following pre- and post-conditions, which have similar properties to the ones held by the erase method from sequential containers, except that i′.i_k must point to the position immediately after the erased one and that c′.c_k and i′.i_k are equivalent.

Finally, search operations over associative containers are modeled by a container c with a method call c.search(k). Then, if an element with key k is found, the method returns an iterator that points to the corresponding element; otherwise, it returns an iterator that points to the position immediately after the last container's element. Hence, that induces the following pre- and post-conditions, which are also similar to the properties held by the search operation from sequential containers, except that the search happens over keys.
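The associative model, with parallel key and value arrays kept in key order, can be sketched in executable form. The names `Assoc`, `assoc_insert`, and `assoc_search`, the int key and value types, the capacity CAP, and the use of < as the strict weak ordering are assumptions for illustration; iterators are reduced to their position i_pos, and only unique-key containers are covered.

```cpp
#include <cassert>

const int CAP = 8;                     // assumed capacity of both arrays

// ck[n] is the key associated with the value cv[n], as in the text.
struct Assoc { int ck[CAP]; int cv[CAP]; int size; };

// Returns the position of key k, or c.size (one past the last element).
int assoc_search(const Assoc& c, int k) {
    for (int n = 0; n < c.size; ++n)
        if (c.ck[n] == k) return n;
    return c.size;
}

// Inserts (k, v) in key order and returns the position of the element.
// If k already exists, nothing is inserted and its position is returned.
int assoc_insert(Assoc& c, int k, int v) {
    int pos = 0;
    while (pos < c.size && c.ck[pos] < k) ++pos;      // find place by precedence
    if (pos < c.size && c.ck[pos] == k) return pos;   // key already present

    assert(c.size < CAP);                             // room for one more
    for (int n = c.size; n > pos; --n) {              // shift keys and values
        c.ck[n] = c.ck[n - 1];
        c.cv[n] = c.cv[n - 1];
    }
    c.ck[pos] = k;
    c.cv[pos] = v;
    c.size += 1;

    // Post-condition: the strict ordering of keys is maintained.
    for (int n = 1; n < c.size; ++n)
        assert(c.ck[n - 1] < c.ck[n]);
    return pos;
}
```

The three branches of the insertion (empty container, existing key, insertion by precedence) correspond to the three cases listed above; the empty case falls out of the loop condition when c.size is 0.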

EXCEPTION HANDLING
Exceptions are unexpected circumstances that arise during the execution of a program, e.g., runtime errors [46]. In C++, the exception handling is split into three (basic) elements: a try block, where a thrown exception can be directed to a catch statement; a set of catch statements, where a thrown exception can be handled; and a throw statement that raises an exception.
To accurately define the verification of exception handling in C++, we formally define two syntactic domains, including exceptions E and handlers H as follows:

In this context, e and h are classes of variables of type E and H, respectively. We use the notation e_[] to denote a thrown exception of type array, e_f() for a thrown exception of type function, e_* for a thrown exception of type pointer, and e_null for an empty exception used to track when a throw expression does not throw anything. Similarly, we use the notation h_[] to denote a catch statement of type array, h_f() for a catch statement of type function, h_* for a catch statement of type pointer, h_v for a catch statement of type void pointer (i.e., void *), h_... for a catch statement of type ellipsis [43], and h_null for an invalid catch statement used to track when a thrown exception does not have a valid handler.
Based on such domains, we define a 2-arity predicate M(e, h), which evaluates whether the type of a thrown exception e is compatible with the type of a given handler h, as shown in Eq. (20). Furthermore, we declare the unary function ζ : H * −→ H that removes the qualifiers const, volatile, and restrict from the type of a catch statement h. We also define the 2-arity predicates unambiguous base U(e, h) and implicit conversion Q(e, h). On the one hand, U(e, h) determines whether the type of a catch statement h is an unambiguous base [43] for the type of a thrown exception e, as shown in Eq. (21). On the other hand, Q(e, h) determines whether a thrown exception e can be converted to the type of the catch statement h, either by qualification or standard pointer conversion [43], as shown in Eq. (22).
$$M(e, h) \overset{def}{=} \begin{cases} \top, & \text{the type of } e \text{ matches the type of } h \\ \bot, & \text{otherwise} \end{cases} \tag{20}$$

$$U(e, h) \overset{def}{=} \begin{cases} \top, & h \text{ is an unambiguous base of } e \\ \bot, & \text{otherwise} \end{cases} \tag{21}$$

$$Q(e, h) \overset{def}{=} \begin{cases} \top, & e \text{ can be implicitly converted to } h \\ \bot, & \text{otherwise} \end{cases} \tag{22}$$

The C++ language standard defines rules to connect throw expressions and catch statements [43], which are all described in Table II. Each rule represents a function $r_k : E \to H$ for $k \in [1 .. 9]$, where a thrown exception e is mapped to a valid catch statement h. ESBMC evaluates every thrown exception e against all rules and all catch statements in the program through the (n + 1)-arity function handler H. As shown in Eq. (23), after the evaluation of all rules (i.e., $h_{r_1}, ..., h_{r_9}$), ESBMC returns the first handler $h_{r_k}$ that matched the thrown exception e.
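To make the matching between thrown exceptions and handlers concrete, the following sketch exercises several of these rules with an ordinary C++ compiler. It is only an illustration of the language behavior that the predicates capture, not ESBMC's encoding; the names caught_by and the throw_* functions are hypothetical.

```cpp
#include <cassert>
#include <string>

struct Base { virtual ~Base() {} };
struct Derived : Base {};

static char g_char = 'x';

void throw_int()      { throw 42; }
void throw_array()    { static int a[2] = {1, 2}; throw a; }  // array decays to int*
void throw_derived()  { throw Derived(); }
void throw_char_ptr() { throw &g_char; }                      // thrown type is char*
void throw_double()   { throw 2.5; }
void throw_nothing()  {}

// Runs the given function and reports which handler caught its exception.
// Handlers are tried in order, so the first compatible one wins.
std::string caught_by(void (*thrower)()) {
    try {
        thrower();
    } catch (int) {
        return "int";     // same type as the thrown exception
    } catch (int*) {
        return "int*";    // pointer handler catching a thrown array
    } catch (const Base&) {
        return "Base";    // unambiguous base of the thrown type
    } catch (void*) {
        return "void*";   // void pointer catching a pointer of another type
    } catch (...) {
        return "...";     // ellipsis catches everything else
    }
    return "none";        // nothing was thrown
}
```

Note that a thrown double is not caught by the int handler, since handler matching does not apply arithmetic conversions; it falls through to the ellipsis.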
To support exception handling in ESBMC, we extended our GOTO conversion code and the symbolic engine. In the former, we had to define new instructions and model the throw expression as jumps. In the latter, we implemented the rules for throwing and catching exceptions, as shown in Table II, and the control flows for the unexpected and terminate handlers (cf. Section 5.2).

Table II. Rules to connect throw expressions and catch statements.
$r_3$: Catches an exception if its type is a pointer of a given type x and the type of the thrown exception is an array of the same type x.
$r_4$: Catches an exception if its type is a pointer to function that returns a given type x and the type of the thrown exception is a function that returns the same type x.
$r_5$: Catches an exception if its type is an unambiguous base type for the type of the thrown exception. Formalization: $\mathrm{ite}(\exists h \cdot U(e, h),\ h_{r_5} = h,\ h_{r_5} = h_{null})$
$r_6$: Catches an exception if the type of the thrown exception e can be converted to the type of the catch h, either by qualification or standard pointer conversion [43].
$r_7$: Catches an exception if its type is a void pointer $h_v$ and the type of the thrown exception e is a pointer of any given type.
$r_8$: Catches any thrown exception if its type is ellipsis. Formalization: $\mathrm{ite}(\forall e \cdot \exists h \cdot h = h_{\ldots},\ h_{r_8} = h_{\ldots},\ h_{r_8} = h_{null})$

The GOTO conversion slightly modifies the exception handling blocks H. The following instructions model a try block: a CATCH instruction to represent the start of the try block, the instructions representing the code inside the try block, a CATCH instruction to represent the end of the try block, and a GOTO instruction targeting the instructions after the try block. Each catch statement is represented using a label, the instructions representing the exception handling, and a GOTO instruction targeting the instructions after the catch block.
We use the same CATCH instruction to mark the beginning and end of the try block. However, the two CATCH instructions differ in the information they hold: the one that marks the beginning of a try block holds a map from the types of the catch statements to their labels in the GOTO program, while the one that marks the end holds an empty map. The GOTO instruction that follows the try block, which targets the instructions after the catch blocks, is executed in case no exception is thrown. The GOTO instruction at the end of each catch ensures that only the instructions of the current catch are executed, as shown in Fig. 10.
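The layout described above can be read off a small C++ function. In the sketch below, the comments indicate where each GOTO-level instruction would sit according to the description; the labels L1, L2, and L_end and the function name run are hypothetical, and the comments are an informal reading of the text rather than actual ESBMC output.

```cpp
#include <cassert>
#include <stdexcept>

// Returns 0 when no exception is thrown, 1 when the int handler runs,
// and 2 when the std::runtime_error handler runs.
int run(int mode) {
    int path = 0;
    // CATCH {int -> L1, std::runtime_error -> L2}   (start of the try block)
    try {
        if (mode == 1) throw 7;                          // throw modeled as a jump to L1
        if (mode == 2) throw std::runtime_error("boom"); // throw modeled as a jump to L2
        // CATCH {}                                  (end of the try block, empty map)
        // GOTO L_end                                (taken when nothing was thrown)
    } catch (int) {
        // L1:
        path = 1;
        // GOTO L_end                                (skip the remaining handlers)
    } catch (const std::runtime_error&) {
        // L2:
        path = 2;
        // GOTO L_end
    }
    // L_end:
    return path;
}
```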
During the SSA generation, when the first CATCH instruction is found, the map is stacked because there might be nested try blocks. If an exception is thrown, ESBMC encodes the jump to a catch statement according to the rules defined in Table II; a thrown exception without a matching handler triggers a verification error, i.e., it represents a thrown exception that cannot be caught. If a suitable exception handler is found, then the thrown value is assigned to the catch variable (if any); otherwise, if there exists no valid handler, an error is reported. If the second CATCH instruction is reached and no exception was thrown, the map is freed for memory efficiency. The try block is handled as any other block in a C++ program. Destructors of variables on the stack are called at the end of the scope. Furthermore, by encoding throws as jumps, we also correctly encode memory leaks. For example, suppose an object is allocated inside a try block, and an exception is thrown and handled. In that case, it will leak unless the reference to the allocated memory is somehow tracked and freed. Our symbolic engine also keeps track of function frames, i.e., several pieces of information about the function it is currently evaluating, including arguments, recursion depth, local variables, and others. These pieces of information are essential not only because we want to handle recursion or find memory leaks but also because they allow us to connect exceptions thrown outside the scope of a function and handle exception specifications (as described in Section 5.1).
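The leak scenario mentioned above can be reproduced with a few lines of C++. The following is a minimal sketch with hypothetical names (Tracked, raw_allocation, owned_allocation); the counter live_objects only exists to make the effect observable, and std::unique_ptr assumes C++11 or later.

```cpp
#include <cassert>
#include <memory>

static int live_objects = 0;  // number of Tracked objects currently alive

struct Tracked {
    Tracked()  { ++live_objects; }
    ~Tracked() { --live_objects; }
};

// Raw allocation inside the try block: when fail is true, the throw jumps
// past the delete, so this function never releases the object. The address
// is stored in 'escaped' only so that a demonstration can clean up afterwards.
bool raw_allocation(bool fail, Tracked*& escaped) {
    escaped = 0;
    try {
        Tracked* p = new Tracked();
        escaped = p;
        if (fail) throw 1;  // jump to the handler; 'delete p' is skipped
        delete p;
        escaped = 0;
    } catch (int) {
        return true;        // the exception was handled, but the allocation leaked
    }
    return false;
}

// Same control flow, but the allocation is owned by a scoped object whose
// destructor runs when the try block is left, normally or through a throw.
bool owned_allocation(bool fail) {
    try {
        std::unique_ptr<Tracked> p(new Tracked());
        if (fail) throw 1;
    } catch (int) {
        return true;
    }
    return false;
}
```

After raw_allocation(true, ptr) returns, live_objects is still 1 even though the exception was handled, whereas owned_allocation(true) leaves the counter unchanged because the destructor runs at the end of the scope.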

Exception Specification
The exception specification (illustrated in Fig. 11) defines which exceptions can be thrown by a function or method (including constructors). It is formed by an exception list, which can be empty, i.e., the function or method cannot throw any exception. Exceptions thrown and handled inside a function or method are not affected by the exception specification.

Figure 11. Example of exception specification.
To support the verification of programs with exception specifications, a THROW_DECL instruction is inserted at the beginning of the given function or method. This instruction contains a list of allowed exceptions, which is checked whenever an exception is thrown outside the scope of the function or method. Similar to the catch map, these lists are stacked due to the possibility of nested exception specifications and are freed at the end of the function or method. An exception thrown from inside a function follows the same rules defined in Table II. Exception specifications check any exception thrown outside the function scope. If the type of the exception was not declared in the exception specification, a different exception is raised and a separate path in the program is taken: the unexpected handler.
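The stacking of allowed-exception lists can be sketched as a small data structure. The class below is a hypothetical stand-in (SpecStack and its methods are not ESBMC names), it matches exception types exactly and ignores derived classes, and it uses std::type_index from C++11 because dynamic exception specifications themselves are no longer accepted by recent C++ standards.

```cpp
#include <cassert>
#include <set>
#include <typeindex>
#include <typeinfo>
#include <vector>

// Hypothetical model of the stack of allowed-exception lists.
class SpecStack {
public:
    // Entering a function with an exception specification pushes its list.
    void enter(const std::set<std::type_index>& allowed) {
        stack_.push_back(allowed);
    }

    // Leaving the function frees its list.
    void leave() { stack_.pop_back(); }

    // True if an exception of type t may leave the innermost function.
    // With no active specification, any exception may propagate; an empty
    // list means that no exception may propagate.
    bool allowed(std::type_index t) const {
        if (stack_.empty()) return true;
        return stack_.back().count(t) != 0;
    }

private:
    std::vector<std::set<std::type_index> > stack_;
};
```

When allowed returns false, the verifier would take the separate path described above, i.e., the unexpected handler.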

Terminate and Unexpected Handlers
During the exception handling process, errors can occur, causing the process to be aborted (e.g., throwing an exception outside a try block or not catching a thrown exception). When this happens, the terminate handler is called. Fig. 12a shows the terminate handler implementation. The terminate handler is a function whose default behavior is to call the abort function. However, the developer can change this behavior using the function set_terminate(f), where f is a pointer to a function that has no parameters and no return value (type void). Once set, the new terminate function is called before the abort function.
For the verification of programs that override the terminate handler, we define a function default_terminate(), as illustrated in Fig. 12a, that contains the default termination behavior, i.e., calling abort. ESBMC also keeps a global function pointer to the terminate function, which can point either to the default behavior or to the user-defined behavior. Finally, when the terminate function is called, we must guarantee that the abort function is called, even if the terminate function was replaced (as shown in label E in Fig. 12a).
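A minimal sketch of such a model, written from the description above and not taken from Fig. 12a, could look as follows. The namespace model, the pointer name terminate_pf, and the example handler my_terminate are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdlib>

namespace model {

typedef void (*terminate_handler)();

// Default termination behavior: call abort.
void default_terminate() { std::abort(); }

// Global pointer to the current terminate function; it points either to the
// default behavior or to a user-defined function.
terminate_handler terminate_pf = default_terminate;

// Replaces the terminate function and returns the previous one.
terminate_handler set_terminate(terminate_handler f) {
    terminate_handler previous = terminate_pf;
    terminate_pf = f;
    return previous;
}

// Calls the current terminate function and then aborts, so that abort is
// reached even when a user-defined handler returns.
void terminate() {
    terminate_pf();
    std::abort();
}

}  // namespace model

// Example of a user-defined handler (it would typically log or clean up).
void my_terminate() {}
```

Calling model::set_terminate(my_terminate) installs the user function; a later model::terminate() would run it first and still end in abort.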
However, there is one case where the unexpected handler is called instead of the terminate handler: when a function or method throws an exception that is not allowed by its exception specification (Section 5.1).
The unexpected handler works similarly to the terminate handler. It will either call terminate or re-throw the disallowed exception. Similar to set_terminate, there exists a function set_unexpected(f), where f is a pointer to a function that has no parameters and no return value (type void). Fig. 12b illustrates the unexpected handler implementation. The default behavior is to re-throw the thrown exception, and, as the exception specification already forbids it, we should call terminate to finish the program. ESBMC also keeps a global function pointer to the unexpected function, which points either to the default behavior or to the user-defined behavior. If the unexpected handler was replaced, we must still guarantee that an exception will be thrown, so the forbidden exception will be re-thrown (as shown in line 27 in Fig. 12b). If the replaced unexpected function throws an exception that is not forbidden by the function, the code will not terminate.

Finally, we also need to model the unexpected behavior when using bad_exception. Fig. 13 shows an example of code using bad_exception. In this example, the user replaced the unexpected function with a function containing a re-throw. The code then calls myfunction(), which tries to throw a forbidden char exception. At this moment, the myunexpected function is called and tries to re-throw the char exception, which is forbidden. ESBMC matches the compiler's behavior and checks whether bad_exception is one of the allowed exceptions in the exception specification; if this is true, a bad_exception exception will be thrown instead of the original forbidden exception.
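The decision taken once the unexpected handler throws can be summarized in a few lines. The sketch below uses hypothetical names (Outcome, after_unexpected), compares exception types exactly, and assumes C++11 for std::type_index; it mirrors the cases discussed above rather than ESBMC's implementation.

```cpp
#include <cassert>
#include <exception>
#include <set>
#include <typeindex>
#include <typeinfo>

enum Outcome { PROPAGATE, THROW_BAD_EXCEPTION, TERMINATE };

// allowed:  types listed in the violated exception specification.
// rethrown: type of the exception that leaves the unexpected handler.
Outcome after_unexpected(const std::set<std::type_index>& allowed,
                         std::type_index rethrown) {
    // An exception permitted by the specification simply propagates.
    if (allowed.count(rethrown) != 0) return PROPAGATE;
    // A forbidden exception is replaced by bad_exception if that is allowed.
    if (allowed.count(std::type_index(typeid(std::bad_exception))) != 0)
        return THROW_BAD_EXCEPTION;
    // Otherwise the program ends through the terminate handler.
    return TERMINATE;
}
```

For the scenario of Fig. 13, the re-thrown char is forbidden, so the outcome depends only on whether bad_exception appears in the specification.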

EXPERIMENTAL EVALUATION
Our experimental evaluation compares ESBMC against LLBMC and DIVINE regarding correctness and performance in the verification of C++03 programs; DIVINE was developed by Baranová et al. [14], and LLBMC was developed by Merz, Falke, and Sinz [10]. Section 6.1 describes all tools, scripts, and the benchmark dataset in detail, while Section 6.2 presents the results and our evaluation. Our experiments are based on a set of publicly available benchmarks. All tools, scripts, benchmarks, and results of our evaluation are available in a replication package [57], including all data needed to generate the percentages. More information about ESBMC is also available at the project's webpage http://esbmc.org/.

Experimental Design, Materials and Methods
Our experiments aim at answering two experimental questions regarding correctness and performance of ESBMC:
i. (EQ-I) How accurate is ESBMC when verifying the chosen C++03 programs?
ii. (EQ-II) How does ESBMC performance compare to other existing model checkers?
To answer both questions, we evaluate all benchmarks with ESBMC v2.1, DIVINE v4.3, and LLBMC v2013.1. ESBMC v2.1 contains the last stable version of our C++ front-end, since the changes necessary to introduce a new C front-end in ESBMC v3.0 were disruptive. The new C front-end is based on clang's AST [19], which completely changes the way ESBMC processes source files. Updating the C++ front-end to also use clang's AST is part of our future work (cf. Section 8). We also applied CBMC [22] to the benchmarks but do not include its results in the experimental evaluation because the tool aborts during parsing in 1,500 cases and reproduces false-negative results in the remaining 3. The vast majority of our benchmarks use STL functionalities, which CBMC does not support. The lack of support for C++ features in CBMC was also reported by Merz et al. [10], Monteiro et al. [15], and Ramalho et al. [21].
To tackle modern aspects of the C++ language, the comparison is based on a benchmark dataset that consists of 1,513 C++03 programs. In particular, 290 programs were extracted from the book "C++ How to Program" [46], 432 were extracted from C++ Resources Network [58], 16 were extracted from NEC Corporation [59], 16 programs were obtained from LLBMC [10], 39 programs were obtained from CBMC [22], 55 programs were obtained from the GCC test suite [39], and the others were developed to check several features of the C++ programming language [21]. The benchmarks are split into 18 test suites: algorithm contains 144 benchmarks to check the Algorithm library functionalities; cpp contains 357 general benchmarks, which involve C++03 libraries for general use, such as I/O streams and templates; this category also contains the LLBMC benchmarks and most NEC benchmarks. The test suites deque (43), list (72), queue (14), stack (14), priority_queue (15), stream (66), string (233), vector (146), map (47), multimap (45), set (48), and multiset (43) contain benchmarks related to the standard template containers. The category try_catch contains 81 benchmarks to check exception handling, and the category inheritance contains 51 benchmarks to check inheritance and polymorphism mechanisms. Finally, the test suites cbmc (39), templates (23), and gcc-template (32) contain benchmarks from the GCC and CBMC test suites, which are specific to templates.
Each benchmark is tested and manually inspected in order to identify and label bugs. Thus, 543 out of the 1,513 benchmarks contain bugs (i.e., 35.89%) and 970 are bug-free (i.e., 64.11%). This inspection is essential to compare the verification results from each model checker and properly evaluate whether real errors were found. We evaluate three types of properties: (i) memory-safety violations (e.g., arithmetic overflow, null-pointer dereferences, and array out-of-bounds), (ii) user-specified assertions, and (iii) proper use of C++ features (e.g., exception-handling violations). We only exclude LLBMC from the evaluation of exception handling since the tool does not support this feature. All tools support all the remaining features and properties under evaluation.
All experiments were conducted on a computer with an i7-4790 processor, 3.60GHz clock, with 16GB RAM and Ubuntu 14.04 64-bit OS. ESBMC, LLBMC, and DIVINE were set to a time limit of 900 seconds (i.e., 15 minutes) and a memory limit of 14GB. All presented execution times are CPU times, i.e., only the elapsed periods spent in the allocated CPUs. Furthermore, memory consumption is the amount of memory that belongs to the verification process and is currently present in RAM (i.e., not swapped or otherwise not-resident). Both CPU time and memory consumption were measured with the times system call (POSIX system). Neither swapping nor turbo boost was enabled during experiments and all executed tools were restricted to a single process.
The tools were executed using three scripts: the first one for ESBMC, which reads its parameters from a file and executes the tool; the second one for LLBMC, which first compiles the program to bitcode using clang [60], then reads the parameters from a file and executes the tool; and the last one for DIVINE, which also first pre-compiles the C++ program to bitcode, then performs the verification on it. The loop unrolling defined for ESBMC and LLBMC (i.e., the B value) depends on each benchmark. In order to achieve a fair comparison with ESBMC, an option from LLBMC had to be disabled. LLBMC does not support exception handling, and all bitcodes were generated without exceptions (i.e., with the -fno-exceptions flag of the compiler). If exception handling is enabled, then LLBMC always aborts the verification process.

Results & Discussion
In this section, we present the results using percentages (concerning the 1,513 C++ benchmarks), as shown in Fig. 14. Correct represents the positive results, i.e., the percentage of benchmarks with and without bugs correctly verified. False positives represent the percentage of benchmarks reported as correct, but they are incorrect; similarly, False negatives represent the percentage of benchmarks reported as incorrect, but that are correct. Finally, Unknown represents the benchmarks where each tool aborted the verification process due to internal errors, a timeout (i.e., the tool was killed after 900 seconds), or a memory out (i.e., the tool exhausted the maximum memory allowed of 14GB). In the Exception Handling category, LLBMC is excluded since it does not support this feature; if exception handling is enabled, then LLBMC always aborts the verification process. Furthermore, to better present the results of our experimental evaluation, the test suites were grouped into four categories:
• Standard Containers - formed by the algorithm, deque, vector, list, queue, priority_queue, stack, map, multimap, set, and multiset test suites (631 benchmarks);
• Inheritance & Polymorphism - formed by the inheritance test suite (51 benchmarks);
• Exception Handling - formed by the try_catch test suite (81 benchmarks);
• C++03 - formed by the cpp, string, stream, cbmc, templates, and gcc-template test suites (750 benchmarks).
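The four result classes can be stated as a small function, which may help when reading the percentages that follow. This is a sketch using the definitions given above (a false positive is a buggy benchmark reported as correct); the names Verdict, Outcome, classify, and percentage are hypothetical and do not come from the replication package.

```cpp
#include <cassert>
#include <cmath>

enum Verdict { REPORTED_SAFE, REPORTED_BUG, NO_ANSWER };
enum Outcome { CORRECT, FALSE_POSITIVE, FALSE_NEGATIVE, UNKNOWN };

// has_bug: label obtained from manual inspection of the benchmark.
// v:       answer of the tool (NO_ANSWER covers crashes, timeouts, memory outs).
Outcome classify(bool has_bug, Verdict v) {
    if (v == NO_ANSWER) return UNKNOWN;
    if (v == REPORTED_SAFE) return has_bug ? FALSE_POSITIVE : CORRECT;
    return has_bug ? CORRECT : FALSE_NEGATIVE;  // v == REPORTED_BUG
}

// Share of 'count' benchmarks in a set of 'total' benchmarks, in percent.
double percentage(int count, int total) {
    return 100.0 * count / total;
}
```

For instance, percentage(543, 1513) evaluates to approximately 35.89, the share of benchmarks that contain bugs.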
On the Standard Containers category (see Fig. 14), ESBMC presented the best results and reached a successful verification rate of 78.45%, while LLBMC reported 70.36% and DIVINE 44.69%. ESBMC's strong results for containers are directly related to its COM. The majority of the benchmarks for this category contain standard C++ assertions that exercise container-based operations, e.g., to check whether the operator[] from a vector object is called with an argument out of range, which is undefined behavior [43]; these assertions allow us to evaluate how each verifier handles such operations. ESBMC reports a false-positive rate of 2.54% and a false-negative rate of 8.87%, which is due to internal implementation issues during pointer encoding (cf. Section 4). We are currently working to address them in future versions. ESBMC also reported 10.14% of unknown results due to limitations in template-related features such as SFINAE [43] and nested templates. LLBMC reports a false-positive rate of 2.85% and a false-negative rate of 17.60%, mostly related to erroneously evaluating assertions (e.g., assertions to check whether a container is empty or has a particular size). It also reports an unknown rate of 9.19% regarding timeouts, memory outs, and crashes when performing formula transformation [10]. DIVINE does not report any timeout, memory out, or false-positive results for this category, but it reports a substantial false-negative rate of 49.92%, resulting from errors when checking assertions (similarly to LLBMC). DIVINE also reports an unknown rate of 5.39% due to errors with pointer handling, probably due to imprecise (internal) encoding.

On the Inheritance & Polymorphism category (see Fig. 14), ESBMC presented the best results and reached a successful verification rate of 84.32%, while LLBMC reported 68.63% and DIVINE 54.90%. ESBMC does not report any timeout or memory out, but it reports a false-negative rate of 15.68% due to implementation issues in the pointer encoding. LLBMC does not report any false positives, timeouts, or memory outs. However, it reports a false-negative rate of 5.88%, which is related to failed assertions representing functional aspects of inherited classes. It also reported an unknown rate of 25.49% regarding multiple inheritance. DIVINE does not report any timeout, memory out, or false-positive results for this category, but it reports a false-negative rate of 23.53% and an unknown rate of 21.57%, which result from errors when handling dynamic casting, virtual inheritance, multiple inheritance, and even basic cases of inheritance and polymorphism.
On the Exception Handling category (see Fig. 14), ESBMC presented the best results and reached a successful verification rate of 87.66%, while DIVINE reported 62.96%. ESBMC does not report any timeout or memory out, but it reports a false-positive rate of 3.70% and a false-negative rate of 2.47%. These bugs are related to the implementation of rule $r_6$ from Table II in ESBMC, i.e., "catches an exception if the type of the thrown exception e can be converted to the type of the catch h, either by qualification or standard pointer conversion"; we are currently working on fixing these issues. ESBMC also presents an unknown rate of 3.70% due to the previously mentioned template limitations. DIVINE does not report any timeout or memory out. However, it reports a false-positive rate of 7.40% and a false-negative rate of 17.30%. It incorrectly handles re-throws, exception specifications, and the unexpected and terminate function handlers. DIVINE also presents an unknown rate of 12.34% due to errors when dealing with exceptions thrown by derived classes instantiated as base classes, which is probably related to the imprecise encoding of vtables.
To evaluate how these model checkers perform when applied to general C++03 benchmarks, we evaluate them against the category C++03. In this category, model checkers deal with benchmarks that make use of the features discussed in this paper (e.g., exception handling and containers) and a wider range of libraries from the STL, manipulation of strings and streams, among other C++03 features. ESBMC presented the highest successful verification rate, 89.20%, followed by DIVINE with 67.20% and LLBMC with 62.27%. The high success rate of ESBMC in this category results not only from its support for core C++03 features (i.e., templates, inheritance, polymorphism, and exception handling) and its ability to check functional aspects of the standard containers but also from the fact that COM contains abstractions for all standard libraries shown in Table I. For instance, the operational model for the string library enables ESBMC to achieve a success rate of 99.14% in the string test suite, which contains benchmarks that target all methods provided in C++03 for string objects. Note that when ESBMC runs over the benchmarks without COM, 98.08% of them fail since the majority use at least one standard template library. ESBMC does not report any memory out, but it reports a false-positive rate of 1.26%, a false-negative rate of 3.00%, and an unknown rate of 6.54%, which are all due to the same issues pointed out in the previous experiments. DIVINE does not report any timeout or memory out, but it reports a false-negative rate of 22.27%, which is a result of errors when checking assertions representing functional properties of objects across the whole STL (similar to LLBMC). DIVINE reports one false positive regarding the instantiation of a function template specialization and an unknown rate of 10.13% due to crashes when handling pointers.
LLBMC reports a false-positive rate of 1.73% and a false-negative rate of 26.00%, which is related to errors when checking assertions that represent functional properties of objects (e.g., asserting the size of a string object after an operation) or dealing with stream objects in general. It also reported an unknown rate of 10.00%, mainly regarding operator overloading errors and the ones mentioned in the previous categories.

A small number of counterexamples generated by the three tools were manually checked, but we understand that this is far from ideal. The best approach is to use an automated method to validate the counterexample, such as the witness format proposed by Beyer et al. [61]; however, the available witness checkers do not support the validation of C++ programs. Implementing such a witness checker for C++ would represent a significant development effort, which we leave for future work.

Fig. 15 illustrates the accumulative verification time and memory consumption for the tools under evaluation. All the tools take more time to verify the test suites algorithm, string, and cpp, due to a large number of test cases and the presence of pointers and iterators. ESBMC is the fastest of the three tools, 3.2 times faster than LLBMC and only 155.7 seconds faster than DIVINE. In terms of verification time, DIVINE is the only tool that did not use more than the defined limit of 900 seconds, while ESBMC and LLBMC aborted due to timeout in 4 and 25 benchmarks, respectively. In terms of memory consumption, DIVINE is also the only tool that did not use more than the defined limit of 14GB per benchmark, while ESBMC and LLBMC aborted due to exhaustion of the memory resources in 3 and 11 benchmarks, respectively. Even so, LLBMC consumes less memory overall (614.92GB) when compared to DIVINE (627.97GB) and ESBMC (2,210.91GB).
Overall, ESBMC achieved the highest success rate of 84.27% in 15,761.90 seconds (approximately 4 hours and 23 minutes), faster than LLBMC and DIVINE, which positively answers our experimental questions EQ-I and EQ-II. LLBMC correctly verified 62.52% in 50,564.10 seconds (approximately 14 hours) and can only verify the programs if exception handling is disabled, which is not a problem for ESBMC or DIVINE. DIVINE correctly verified 57.17% in 15,917.60 seconds (approximately 4 hours and 25 minutes). Regarding memory usage, ESBMC has the highest usage among the three tools, which is approximately 3.5 times higher than that of DIVINE and of LLBMC. This high consumption is due to the generation process of SSA forms (cf. Section 3). However, its optimization is under development for future versions.
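The speed-up and the rounded durations quoted above follow directly from the accumulated times. The helper below is a small arithmetic sketch with hypothetical names (speedup, HoursMinutes, to_hours_minutes); std::lround assumes C++11.

```cpp
#include <cassert>
#include <cmath>

// Ratio between the accumulated time of a baseline tool and another tool.
double speedup(double baseline_seconds, double tool_seconds) {
    return baseline_seconds / tool_seconds;
}

struct HoursMinutes {
    int hours;
    int minutes;
};

// Converts seconds to hours and minutes, rounding to the nearest minute.
HoursMinutes to_hours_minutes(double seconds) {
    const long total_minutes = std::lround(seconds / 60.0);
    HoursMinutes hm = { static_cast<int>(total_minutes / 60),
                        static_cast<int>(total_minutes % 60) };
    return hm;
}
```

With the reported figures, speedup(50564.10, 15761.90) is about 3.2, and to_hours_minutes(15761.90) yields 4 hours and 23 minutes.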
In conclusion, our experimental evaluation indicates that ESBMC outperforms two state-of-the-art model checkers, DIVINE and LLBMC, regarding the verification of inheritance, polymorphism, exception handling, and standard containers. The support for templates in ESBMC needs improvements. However, the current work-in-progress clang front-end will not only cover this gap (because clang will instantiate all the templates in the program) but will also allow ESBMC to handle new versions of the language (e.g., C++11). Even with its current support for templates, our experimental results allow us to conclude that ESBMC represents the state of the art in applying model checking to C++ programs.

Sniffer Application
This section describes the results of the verification process using ESBMC and LLBMC in a sniffer program. We were unable to use DIVINE to verify the code because the tool does not offer support for the verification of some libraries used in the program (e.g., boost [62]), which makes the verification process an infeasible task, i.e., DIVINE would report incorrect results.

The following properties were verified in the sniffer program: arithmetic under- and overflow, division by zero, and array bounds violation. Due to confidentiality issues, we were only able to verify 50 of 85 methods since INdT did not provide some classes required by the unverified methods. From the verified code, ESBMC was able to identify five errors related to arithmetic under- and overflow, while LLBMC was able to identify only three of them. All errors were reported to the developers, who confirmed them. As an example of an error found, Fig. 16 shows the getPayloadSize method from the PacketM3UA class. In this method, an arithmetic overflow can occur. The method returns ntohs, an unsigned int, but the getPayloadSize method must return a signed int. In this case, a possible solution is to change the return type of the getPayloadSize method to unsigned int.
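The kind of defect described above, an unsigned value returned through a signed return type, can be illustrated in isolation. The code below is not the sniffer code from Fig. 16; fits_in_int, payload_size_narrowing, and payload_size_fixed are hypothetical names used only to contrast the two return types.

```cpp
#include <cassert>
#include <climits>

// True when an unsigned value can be represented as int without changing it.
bool fits_in_int(unsigned int value) {
    return value <= static_cast<unsigned int>(INT_MAX);
}

// Mirrors the reported pattern: an unsigned quantity is returned as a signed
// int, so values above INT_MAX cannot be represented faithfully.
int payload_size_narrowing(unsigned int raw) {
    return static_cast<int>(raw);
}

// Mirrors the suggested fix: the return type matches the unsigned source.
unsigned int payload_size_fixed(unsigned int raw) {
    return raw;
}
```

A verifier that checks arithmetic overflow flags the narrowing version for any input where fits_in_int is false, while the fixed version preserves every input value.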

RELATED WORK
Conversion of C++ programs into another language makes the verification process easier since C++ model checkers are still in the early development stages. There are more stable verification tools written for other programming languages, such as C [9]. This conversion, however, can unintentionally introduce or hide errors in the original program. In particular, the converted program's verification may present different results if compared to the verification of the original C++ program, unless we check the equivalence of both the original and the modified program [64], which can become undecidable in the presence of unbounded memory usage.
When it comes to the verification of C++ programs, most of the model checkers available in the literature focus their verification approach on specific C++ features, such as exception handling, and end up neglecting other features of equal importance, such as the verification of the STL [66,67]. Table III shows a comparison among other studies available in the literature and our approach.
Merz, Falke, and Sinz [10,65] describe LLBMC, a tool that uses BMC to verify C++ programs. The tool first converts the program into LLVM intermediate representation, using clang [60] as an off-the-shelf front-end. This conversion removes high-level information about the structure of C++ programs (e.g., the relationship between classes). However, the code fragments that use the STL are inlined, which simplifies the verification process. From the LLVM intermediate representation, LLBMC generates a quantifier-free logical formula based on bit-vectors. This formula is further simplified and passed to an SMT solver for verification. The tool does not verify programs with exception handling, and its latest version [68] still uses an old version of LLVM (v3.4) due to the significant effort required to update its internal structure.

Blanc, Groce, and Kroening [12] describe the verification of C++ programs using containers via predicate abstraction. A simplified operational model using Hoare logic is proposed to support C++ programs that make use of the STL. The purpose of the operational model is to simplify the verification process using the SATABS tool [69]. SATABS is a verification tool for C and C++ programs that supports classes, operator overloading, references, and templates (but without supporting partial specialization). In order to verify the correctness of a program, the authors show that it is sufficient to use an operational model by proving that, if the pre- and postconditions hold, the implementation model also holds. The approach is efficient in finding trivial errors in C++ programs. The preconditions are modeled to verify the library containers using an operational model similar to the ESBMC tool's model for the same purpose. Regarding the operational model, the authors present only preconditions. In contrast, our operational model verifies preconditions and replicates the STL behavior (i.e., postconditions), which increases the range of applications that can be adequately verified by the tool.
Clarke, Kroening, and Lerda [22] present CBMC, which implements BMC for C/C++ programs using SAT/SMT solvers. CBMC uses its own parser, based on Flex/Bison [17], to build an AST. The typechecker of CBMC's front-end annotates this AST with types and generates a language-independent intermediate representation of the original source code. The intermediate representation is then converted into an equivalent GOTO-program (i.e., control-flow graphs) that the symbolic execution engine will process. ESBMC improves the front-end, the GOTO conversion, and the symbolic execution engine to handle the C++03 standard. CBMC and ESBMC use two functions C and P that compute the constraints (i.e., assumptions and variable assignments) and properties (i.e., safety conditions and user-defined assertions), respectively. Both tools automatically generate safety conditions that check for arithmetic overflow and underflow, array bounds violations, and null pointer dereferences, in the spirit of Sites' clean termination [70]. Both functions accumulate the control-flow predicates to each program point and use these predicates to guard both the constraints and the properties so that they properly reflect the semantics of the program. A VC generator (VCG) then derives the verification conditions from them. CBMC is a well-known model checker for C programs, but its support for C++ is rather incomplete (cf. Section 6). In particular, CBMC has problems instantiating templates correctly and lacks support for the STL, exception specifications, and terminate/unexpected functions.

Copyright © 2021 John Wiley & Sons, Ltd. (2021) Prepared using stvrauth.cls DOI: 10.1002/stvr

Baranová et al. [14] present DIVINE, an explicit-state model checker to verify single- and multithreaded programs written in C/C++ (and other input formats, such as UPPAAL and DVE). Another language supported by DIVINE is the LLVM intermediate representation; for this reason, the base of its verification process is the translation of C++ programs into that representation. Using clang [60] as front-end, DIVINE translates C++ programs into the LLVM intermediate representation, thereby applying its implementation of the C and C++ standard libraries in order to ensure a consistent translation. Nonetheless, this translation process might cause some irregularities in the verification process since it loses high-level information about the C++ program structure (i.e., the relationship between the classes). To tackle such issues in the verification process of exception handling structures, Štill, Ročkai and Barnat [67] propose a new API for DIVINE to properly map and deal with exception handling in C++ programs, based on a study about the C++ and LLVM exception handling mechanisms [66]. The authors also claim DIVINE as the first model checker that can verify exception handling in C++ programs, as opposed to what has been stated by Ramalho et al. [21]. However, ESBMC v1.23 (i.e., the version used by Ramalho et al. [21]) is able to correctly verify the example presented by Ročkai, Barnat and Brim [67], generating and verifying 10 VCs in less than one second. Our experimental evaluation shows that ESBMC outperforms DIVINE in handling exceptions as well as in the support of standard containers, inheritance, and polymorphism (cf. Section 6).

CONCLUSIONS & FUTURE WORK
We have described a novel SMT-based BMC approach to verify C++ programs using ESBMC. We started with an overview of ESBMC's type-checking engine, which includes our approach to support templates: similar to conventional compilers, it replaces the templates with their instantiated versions before the encoding phase. We also described our type-checking mechanism to handle single and multiple inheritance and polymorphism in C++ programs. We then presented the significant contributions of this work: the C++ operational model and the support for exception handling. We described an abstraction of the Standard Template Library, which replaces it during the verification process. The purpose is twofold: to reduce complexity and to check whether a given program uses the STL correctly. Finally, we presented novel approaches to handle critical features of exception handling in C++ (e.g., unexpected and termination function handlers).
To evaluate our approach, we extended our experimental evaluation by approximately 36% compared to our prior work [21]. ESBMC is able to correctly verify 84.27% of the benchmarks in approximately 4 hours, outperforming two state-of-the-art verifiers, DIVINE and LLBMC (cf. Section 6). ESBMC and DIVINE were also able to verify programs with exceptions enabled, a feature missing from LLBMC, which decreases its verification accuracy on real-world C++ programs. Besides, ESBMC was able to find previously undiscovered bugs in the Sniffer code, a medium-sized commercial application used in the telecommunications domain. The developers later confirmed the respective bugs. LLBMC was able to discover a subset of the bugs discovered by ESBMC, while DIVINE was unable to verify the application due to a lack of support for the Boost C++ library [62].
Our verification method depends on the fact that COM correctly represents the original STL. Indeed, the correctness of such a model, which is needed to trust the verification results, is a significant concern [15, 71–76]. The STL is specified by the ISO International Standard ISO/IEC 14882:2003(E), Programming Language C++ [43]. Similar to conformance testing [77, 78], to certify the correlation between STL and COM, we rely on the translation of the specification into assertions, which represent the pre- and post-conditions of each method/function in the SCL. Although COM is an entirely new implementation, it consists of (reliably) building a simplified model of the related STL, using the C/C++ programming language through the ESBMC intrinsic functions (e.g., assert and assume) and the original specification, which thus tends to reduce the number of programming errors. Besides, Cordeiro et al. [17, 79, 80] presented the soundness of such intrinsic functions already supported by ESBMC. Although proofs regarding the soundness of the entire operational model could be carried out, this represents a laborious task due to the (adopted) memory model [81]. Conformance testing concerning operational models would be a suitable approach [15, 78] and represents a promising direction for future research.
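To make the idea of translating the specification into assertions more concrete, the following is a minimal sketch of what an operational model of a sequence container could look like. It is not the actual COM code: the class name model_vector, its fixed capacity, and the use of the standard assert macro in place of the ESBMC intrinsic functions are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified model of a vector of int with a fixed bound.
class model_vector {
  static const std::size_t kCapacity = 8;  // assumed bound for this sketch
  int buf_[kCapacity];
  std::size_t size_;

public:
  model_vector() : buf_(), size_(0) {}

  std::size_t size() const { return size_; }
  bool empty() const { return size_ == 0; }

  void push_back(int value) {
    assert(size_ < kCapacity);         // bound of the model, not an STL rule
    const std::size_t old_size = size_;
    buf_[size_++] = value;
    assert(size_ == old_size + 1);     // post-condition: size grows by one
    assert(buf_[size_ - 1] == value);  // post-condition: last element is value
  }

  void pop_back() {
    assert(!empty());                  // pre-condition: container is not empty
    --size_;
  }

  int &back() {
    assert(!empty());                  // pre-condition: container is not empty
    return buf_[size_ - 1];
  }

  int &operator[](std::size_t i) {
    assert(i < size_);                 // pre-condition: index is in range
    return buf_[i];
  }
};
```

In such a model, a program that calls pop_back on an empty container or indexes past the current size violates an assertion, so incorrect use of the container surfaces as a property violation rather than as undefined behavior inside a full library implementation.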
For future work, we intend to extend ESBMC coverage in order to verify C++11 programs. The new standard is a huge improvement over C++03; among other changes, it replaces exception specifications with the new keyword noexcept, which works in the same fashion as an empty exception specification. The standard also introduces new sequential containers (array and forward_list), new unordered associative containers (unordered_set, unordered_multiset, unordered_map and unordered_multimap), and new multithreading libraries (e.g., thread), which our COM does not yet support. Finally, we will develop a conformance testing procedure to ensure that our COM conservatively approximates the STL semantics.
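The role of noexcept can be sketched as follows. The function names are hypothetical, and the example assumes a C++11 or later compiler; the C++03 spelling with an empty exception specification appears only in a comment, since later standards deprecate and eventually remove it.

```cpp
#include <stdexcept>

// C++03 would declare this as: bool is_even(int x) throw();
bool is_even(int x) noexcept { return x % 2 == 0; }

// No exception specification: the function is allowed to throw.
int parse_digit(char c) {
  if (c < '0' || c > '9')
    throw std::invalid_argument("not a digit");
  return c - '0';
}

// The noexcept operator reports at compile time whether a call
// expression is declared not to throw.
static_assert(noexcept(is_even(0)), "is_even is declared noexcept");
static_assert(!noexcept(parse_digit('0')), "parse_digit may throw");
```

One difference is worth keeping in mind when modeling both forms: an exception that escapes a noexcept function leads to a call to std::terminate, whereas violating a C++03 exception specification first invokes std::unexpected.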
Furthermore, we intend to improve the general verification of C++ programs, including improved support for templates. Although the current support for templates was sufficient to verify real-world C++ applications (cf. Section 6), it is still a work in progress. For instance, the handling of SFINAE [43] in ESBMC is limited, and limitations on the support of nested templates, as shown in the experiments, directly affect the verification process. These limitations arise because template instantiation is notoriously hard, especially if we consider recent standards. Although our front-end can handle many real-world C++ programs, maintaining the C++ front-end in ESBMC is a herculean task. For that reason, we decided to rewrite our front-end using clang [60] to generate the program AST. Importantly, we do not intend to use the LLVM intermediate representation but the AST generated by clang. In particular, using clang to generate the AST solves several problems: (i) the AST generated by clang contains all the instantiated templates, so we only need to convert the instantiated classes/functions and can ignore the generic version; (ii) supporting new features will be as easy as adding a new AST conversion node from the clang representation to the ESBMC representation; (iii) we do not need to maintain a full C++ front-end since ESBMC will contain all libraries from clang. Thus, we can focus on the main goal of ESBMC, the SMT encoding of C/C++ programs.
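Point (i) can be illustrated with a small, hypothetical example (the names max_of and larger are not from the paper). The generic template describes a family of functions, while only the instantiated versions are concrete functions that a verifier needs to encode.

```cpp
// Generic version: a pattern, not yet a concrete function.
template <typename T>
T max_of(T a, T b) {
  return (a < b) ? b : a;
}

// Explicit instantiation: max_of<int> becomes a concrete function.
template int max_of<int>(int, int);

// Implicit instantiation: the call below causes the compiler to
// generate max_of<double>.
double larger(double a, double b) {
  return max_of(a, b);
}
```

With clang, one way to inspect the instantiated versions is to dump the AST (for example with clang++ -Xclang -ast-dump -fsyntax-only), where the specializations of max_of are listed under the template declaration.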
We have already taken the first step in that direction by rewriting the C front-end [19]; the C++ front-end is currently under development.