Realistic application for C programs:
where it dereference a null pointer
where it writes over a dangling pointer
Abstraction is not unique – some abstractions yield simpler computer representations and less costly algorithms than others:
We say that $a$ is the best abstraction of the concrete set $S$ if and only if
$S \in \gamma(a)$, and
for any $a^{\prime}$ that is an abstraction of $S$ (i.e., $S \in \gamma(a^{\prime})$), then $a^{\prime}$ is a coarser abstraction than $a$
If $S$ has a best abstraction, then the best abstraction is unique. When it is defined, we let α denote the function that maps any concrete set of states into the best abstraction of that set of states.
The best abstraction may not be available. But the impossibility to define or compute the best abstraction is in no way a serious flow for the analysis, as it just lead to conservative but sound results.
Think of the "reachability" semantic property of a program that acts on 2D space:
abstraction  idea  

intervals abstraction  the abstract elements of the interval abstract are defined by constraints of the form $l_{\rm{X}} ≤ \rm{X}, \rm{X} ≤ h_{\rm{X}}, l_{\rm{Y}} ≤ \rm{Y}, \rm{Y} ≤ h_{\rm{Y}}$ 

convex polyhedra abstraction  the abstract elements of the convex polyhedra abstract domain are conjunctions of linear inequality constraints, e.g.


A compositional approach to static analysis: to analyze a sequence of commands, "composes" the analyses of each subcommand
abstract precondition: an abstraction of the states that can be observed before a program fragment
abstract postcondition: an abstraction of the states that can be observed after that program fragment
transfer function: an abstract operation that accounts for the effect of a basic program statement
analysis
that inputs a program and an abstract precondition and returns an abstract postcondition. We say that analysis
is sound if and only if the following condition holds: p
from a state (x
, y
) generates that state (x′
, y′
),x
, y
) ∈ γ($a$),x′
, y′
) ∈ γ(analysis
(p
, $a$))The definition above entails that:
the analysis will produce sound results in the sense of the soundness definition when considering the property 𝒫 of interest. Since the analysis overapproximates the states the program may reach, if it claims that 𝒫 is not reachable, then we are sure that the program cannot reach 𝒫.
the analysis is not complete in the sense of the soundness definition, since it accepts analyses that produce coarse overapproximations
Abstract interpretation will produce an overapproximation of both cases, as the union
of two sets of abstract elements.
setup:
program p
consists of a loop with body b
: p
$::=$ iter{ b }
$\texttt{b}_k$: program that iterates b
$k$ times
$\texttt{p}_k$: program that iterates b
at most $k$ times
implies: $\texttt{p}_{k+1}$ is equivalent to $\texttt{p}_k$ or $\{\texttt{p}_k; \texttt{b}\}$
idea: recursively applies analysis
$\texttt{analysis}(\texttt{p}_{k+1}, a) = \texttt{union}(\texttt{analysis}(\texttt{p}_k, a), \texttt{analysis}(\texttt{b}, \texttt{analysis}(\texttt{p}_k, a)))$
$\texttt{R} \leftarrow \texttt{union}(\texttt{R}, \texttt{analysis}(\texttt{b}, \texttt{R}))$
analysis converges if R
stabilizes
approach: force the number of abstract elements to decrease over iteration
widen
ing: overapproximates union
s, enforces convergence
$\texttt{widen}(a_0, a_1)$:
keeps all constraints of $a_0$ that are also satisfied in $a_1$ and
discards all constraints of $a_0$ that are not satisfied in $a_1$ (hence to subsume $a_1$)
inclusion
: inputs abstract elements $a_0, a_1$ and returns true only when it can prove that $γ(a_0) ⊆ γ(a_1)$
Algorithm: $\texttt{analysis(iter\{p\}}, a)$
R ← a;
repeat
T ← R;
R ← widen(R, analysis(p, R));
until inclusion(R, T)
return T;
Widening is another source of potential incompleteness, but fortunately there are many techniques to make the analysis of loops more precise.
To unroll the first iteration of the loop into the union of {{}; b}
can ease the effect of the succeeding widenings.
program that acts on 2D space  abstract iteration  

original   
loop unrolled  
The analysis discussed so far actually computes as intermediate results overapproximations for all the interesting states of the input program, and thus we can just monitor them and use it for the verification of the property of interest.
Idea: compute, from the outset, all occurring intermediate states
The goal of the analysis: to collect all the states occurring in all possible transition sequences of the input program.
program: a collection of statements with a well defined execution order
program counter (program point): a unique label assigned to each statement of the program
control flow: the execution order, specified by a relation between the labels (from current program points to next program points)
The algorithm works very similarly to the "compositional style", but it will compute states per statement and iterates until all the statement states get converged.
The threestage approach for static analysis:
selection of the semantics and properties of interest
choice of the abstraction
derivation of the analysis algorithms from the semantics and from the abstraction