Definition: Abstract Domain and Abstraction Relation

We call *abstract domain* a pair made of a set 𝔸 and an ordering relation ⊑ over that set. Given a concrete domain (ℂ, ⊆), an *abstraction* is defined by an abstract domain (𝔸, ⊑) and an abstraction relation (⊧) ⊆, ℂ × 𝔸, such that,

for all \(c \in ℂ, a_0, a_1 \in 𝔸\), if \(c \models a_0\) and \(a_0 \sqsubseteq a_1\), then \(c \models a_1\)

for all \(c_0, c_1 ∈ ℂ, a ∈ 𝔸\), if \(c_0 \subseteq c_1\) and \(c_1 \models a\), then \(c_0 \models a\)

The first one interpretation: if \(c\) satisfies the property described by the abstract element \(a_0\), and if the abstract element \(a_1\) expresses a weaker property than that expressed by \(a_0\), then \(c\) should also satisfy the abstract property \(a_1\).

A relation \(\models\) is sometimes replaced by "concretization function" and "abstraction function".

Definition: Concretization Function

A *concretization function* (or, for short, *concretization*) is a function\(\gamma: 𝔸 → ℂ\) such that, for any abstract element \(a\), \(\gamma(a)\) satisfies \(a\) (i.e., \(\gamma(a) \models a\)) and \(\gamma(a)\) is the maximum element of \(ℂ\) that satisfies \(a\).

Definition: Abstract Function

Let \(c\) be a concrete element. We say that \(c\) has a best abstraction if and only if there exists an abstract element \(a\) such that

\(a\) is an abstraction of \(c\), and

any other abstraction of \(c\) is greater than \(a\)

If it exists, this element is unique and called the *best abstration* of \(c\).

An *abstraction function* (or, for short, *abstraction*) is a function \(\alpha: ℂ \rightarrow 𝔸\) that maps each concrete element to its best abstraction.

Notes:

either of \(\gamma\) and \(\alpha\) fully describes the underlying abstraction relation

the existence of a best abstraction function is

**not**guaranteed

Definition: Galois Connection

A *Galois connection* is a pair made of a concretization function \(\gamma\) and an abstraction function \(\alpha\) such that

Galois connection's interesting properties:

α and γ are monotone functions , which means that they map logically comparable inputs into logically comparable outputs

\(\forall c \in ℂ, c \subseteq \gamma(\alpha(c))\) , which means that applying the abstraction function and concretization the result back yield a less precise result (or, equivalently, a conservative approximation)

\(\forall a \in 𝔸, \alpha(\gamma(a)) \sqsubseteq a\) , which means that concretizing an abstract element and then abstracting the result back refines the information available in the initial abstract element; this refinement is known as

*reduction*

Intuitively, this abstraction

treats each variable independently

applies the value abstraction to each variable separated from the others

defines the order relation by the

**pointwise extension**to abstract elements

In general, the choice of an efficient computer representation for abstract domains that describe relational constraints is more difficult than in the case of non-releational abstract domains.

Examples:

linear equalities

convex polyhedra

octagons

\([\text{p}]^{\#}_𝒫\): the static analysis function (

*abstract semantics*)

Abstraction of a sequence of commands: compose each abstraction

\[ [C_0;C_1]^{\#}_𝒫(M^{\#}) = [C_1]^{\#}_𝒫([C_0]^{\#}_𝒫(M^{\#})) \]This can be generalized into the following theorem:

Theorem: Approximation of Compositions

Let \(F_0, F_1: 𝒫(𝕄) \rightarrow 𝒫(𝕄)\) be two monotone functions, and let \(F_0^{\#}, F_1^{\#}: 𝔸 \rightarrow 𝔸\) be two functions that over-approximate them, that is such that \(F_0 \circ \gamma \subseteq \gamma \circ F_0^{\#}\) and \(F_1 \circ \gamma \subseteq \gamma \circ F_1^{\#}\).

Then \(F_0 \circ F_1\) can be over-approximated by \(F_0^{\#} \circ F_1^{\#}\).

proof

\(F_1 \circ \gamma \subseteq \gamma \circ F_1^{\#}\) (by the soundness assumption on \(F_1\))

\(F_0 \circ F_1 \circ \gamma \subseteq F_0 \circ \gamma \circ F_1^{\#}\) (\(F_0\) is monotone)

\(F_0 \circ F_1 \circ \gamma \subseteq \gamma \circ F_0^{\#} \circ F_1^{\#}\) (by the soundness assumption on \(F_0\))

Intuitive understanding: we can decompose the over-approximation of a composition of operations into the composition of over-approximations of each operation.

Semantics of conditional branching:

\[ [\texttt{if}(B)\{C_0\} \texttt{else}\{C_1\}]_𝒫(M) = [C_0]_𝒫(ℱ_B(M)) \cup [C_1]_𝒫(ℱ_{\neg B}(M)) \]\(ℱ_{B}\): returns the memory states in \(M\) such that the condition \(B\) evaluates to **true**.

The abstract join operator \(\sqcup^{\#}\) should satisfy the following soundness property

\[ \gamma(M_0^{\#}) \cup \gamma(M_1^{\#}) \subseteq \gamma(M_0^{\#} \sqcup^{\#} M_1^{\#}) \]\(\sqcup^{\#}\) can be defined as a pointwise application of a join operator \(\sqcup^{\#}_𝒱\), which works in the value abstract domain:

\[ \text{Fro all variables } \texttt{x}, (M_0^{\#} \sqcup^{\#} M_1^{\#})(x) = M_0^{\#}(x) \sqcup^{\#}_𝒱 M_1^{\#}(x) \]And the definition of \(\sqcup^{\#}_𝒱\) will really depend on the abstract domain.

Semantics of conditional branching

\[ [\texttt{if}(B)\{C_0\} \texttt{else}\{C_1\}]^{\#}_𝒫(M^{\#}) = [C_0]^{\#}_𝒫(ℱ_B^{\#}(M^{\#})) \sqcup^{\#} [C_1]^{\#}_𝒫(ℱ_{\neg B}^{\#}(M^{\#})) \]Semantics of loops:

\[ [\texttt{while}(B)\{C\}]_𝒫(M) = ℱ_{\neg B} (\cup_{i \ge 0}([C]_𝒫 \circ ℱ_B)^i(M)) \]Let \(F = [C]_𝒫 \circ ℱ_B\)

if the sequence of abstract iterates converges (which can be observed simply by checking that two consecutive iterates are equal), the its final values over-approximates

*all*the concrete behaviors of the program \(\texttxt{while}(B){C}\)**under the assumption that the sequence converges**(<- not always true)

=> ensure convergence

Convergence:

make the abstract domain has

*finite height*: ensure \(M_k^{\#} ⊏ M_{k+1}^{\#}\) cannot hold infinitely many times=> converges when the abstract iteration doesn't discover any new behavior

*widening*: necessary when the abstract domain doesn't have finite heightDefinition: Widening OperatorA widening operator over an abstract domain 𝔸 is a binary operator \(\triangledown\), such that

for all abstract elements \(a_0, a_1\), we have

for all sequences \((a_n)_{n \in ℕ}\) of abstract elements, the sequence \((a^\prime_n)_{n \in ℕ}\) defined below is ultimately stationary

\(a^\prime_0 = a_0\)

\(a^\prime_{n+1} = a^\prime_n \triangledown a_n\)

widening could be built by

*dropping constraints that are not stable*e.g. interval abstract domain

\([n, p] \triangledown [n, q] = [n, p] (\text{if } p \ge q), [n, +\infty) (\text{if } p < q)\)

fix the reference concrete semantics

select the abstraction

derive analysis algorithms

© Shuhei Kadowaki. Last modified: 2022-01-17. Website built with Franklin.jl and the Julia programming language.