Internals of JET.jl
Abstract Interpretation
In order to perform type-level program analysis, JET.jl uses the Core.Compiler.AbstractInterpreter interface, and customizes its abstract interpretation by overloading a subset of Core.Compiler functions. These functions were originally developed for the Julia compiler's type inference and optimizations, which aim at generating efficient native code for CPU execution.
JET.AbstractAnalyzer overloads a set of Core.Compiler functions to implement the "core" functionalities of JET's analysis, including inter-procedural error report propagation and caching of the analysis result. Each plugin analyzer (e.g. JET.JETAnalyzer) overloads more Core.Compiler functions so that it can perform its own program analysis on top of the core AbstractAnalyzer infrastructure.
Most overloads use the invoke reflection, which allows AbstractAnalyzer to dispatch to the original AbstractInterpreter's abstract interpretation methods while still passing AbstractAnalyzer to the subsequent (maybe overloaded) callees.
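The invoke pattern described above can be illustrated with a minimal sketch that uses only Base. All type and function names here (ToyInterpreter, ToyAnalyzer, ToyJETAnalyzer, infer_call, note!) are hypothetical stand-ins and are not part of Core.Compiler or JET; the point is only that the more general method runs while the analyzer object itself keeps flowing into later, possibly overloaded, callees.

```julia
# Hypothetical stand-ins for AbstractInterpreter / AbstractAnalyzer (not JET's real types).
abstract type ToyInterpreter end
abstract type ToyAnalyzer <: ToyInterpreter end

struct ToyJETAnalyzer <: ToyAnalyzer
    log::Vector{String}
end

# A callee that the analyzer overloads.
note!(::ToyInterpreter, msg) = nothing
note!(analyzer::ToyAnalyzer, msg) = push!(analyzer.log, msg)

# The "original" method, written against the general interpreter type.
function infer_call(interp::ToyInterpreter, f)
    note!(interp, "infer $f")  # dispatches on the runtime type of `interp`
    return :inferred
end

# The overload: run the original method via `invoke`, then add extra work.
function infer_call(analyzer::ToyAnalyzer, f)
    ret = invoke(infer_call, Tuple{ToyInterpreter,Any}, analyzer, f)
    push!(analyzer.log, "post-process $f")
    return ret
end

analyzer = ToyJETAnalyzer(String[])
result = infer_call(analyzer, :sin)
```

Because invoke only selects which method body runs and does not change the argument, the call to note! inside the original method still reaches the analyzer's overload.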
How AbstractAnalyzer manages caches
JET.AnalysisResult — Type
AnalysisResult
analyzer::AbstractAnalyzer manages InferenceErrorReport by associating it with InferenceResult. InferenceErrorReports found within the currently-analyzed result::InferenceResult can be accessed using get_reports(analyzer, result).
JET.CachedAnalysisResult — Type
CachedAnalysisResult
AnalysisResult is transformed into CachedAnalysisResult when it is cached into a global cache maintained by AbstractAnalyzer. That means codeinf::CodeInstance = Core.Compiler.code_cache(analyzer::AbstractAnalyzer)[mi::MethodInstance] is expected to have its field codeinf.inferred::CachedAnalysisResult.
JET.AnalysisCache — Type
AnalysisCache
JET's internal representation of a global analysis cache.
Core.Compiler.inlining_policy — Function
inlining_policy(analyzer::AbstractAnalyzer, @nospecialize(src), ...) -> source::Any
Implements inlining policy for AbstractAnalyzer. Since AbstractAnalyzer works on InferenceResult whose src field keeps AnalysisResult or CachedAnalysisResult, this implementation needs to forward their wrapped source to inlining_policy(::AbstractInterpreter, ::Any, ::UInt8).
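The wrap-then-forward idea can be sketched with toy types. This is not JET's implementation: ToyAnalysisResult, ToyCachedAnalysisResult, cache_result, base_policy and toy_inlining_policy are all hypothetical names, and the field layout is an assumption made for illustration only.

```julia
# Hypothetical wrappers that carry reports alongside the underlying source.
struct ToyAnalysisResult
    reports::Vector{String}
    src::Any
end

struct ToyCachedAnalysisResult
    reports::Vector{String}
    src::Any
end

# Assumed transformation applied when a result enters the global cache.
cache_result(r::ToyAnalysisResult) = ToyCachedAnalysisResult(copy(r.reports), r.src)

# A generic policy that only understands raw sources (here: `Expr`s).
base_policy(@nospecialize(src)) = src isa Expr ? src : nothing

# The wrapper-aware policy unwraps and forwards to the generic one.
toy_inlining_policy(@nospecialize(src)) = base_policy(src)
toy_inlining_policy(r::Union{ToyAnalysisResult,ToyCachedAnalysisResult}) = base_policy(r.src)

fresh = ToyAnalysisResult(["some report"], :(x + 1))
cached = cache_result(fresh)
```

Without the unwrapping method, the generic policy would see only the wrapper and would not recognize the source it contains.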
Top-level Analysis
JET.virtual_process — Function
virtual_process(s::AbstractString,
                filename::AbstractString,
                analyzer::AbstractAnalyzer,
                config::ToplevelConfig) -> res::VirtualProcessResult
Simulates Julia's toplevel execution and collects error points, and finally returns res::VirtualProcessResult:
- res.included_files::Set{String}: files that have been analyzed
- res.defined_modules::Set{Module}: module contexts created during this top-level analysis
- res.toplevel_error_reports::Vector{ToplevelErrorReport}: toplevel errors found during the text parsing or partial (actual) interpretation; these reports are "critical" and should have precedence over inference_error_reports
- res.inference_error_reports::Vector{InferenceErrorReport}: possible error reports found by AbstractAnalyzer
- res.toplevel_signatures: signatures of methods defined within the analyzed files
- res.actual2virtual::Pair{Module, Module}: keeps actual and virtual module
This function first parses s::AbstractString into toplevelex::Expr and then iterates the following steps on each code block (blk) of toplevelex:
- if blk is a :module expression, recursively enters analysis into a newly defined virtual module
- lowers blk into a :thunk expression lwr (macros are also expanded in this step)
- if the context module is virtualized, replaces self-references of the original context module with the virtualized one: see fix_self_references
- ConcreteInterpreter partially interprets some statements in lwr that should not be abstracted away (e.g. a :method definition); see also partially_interpret!
- finally, AbstractAnalyzer analyzes the remaining statements by abstract interpretation
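The first two steps above (splitting the parsed code into blocks and lowering each one) can be sketched with Base's Meta.parseall and Meta.lower. The helper split_toplevel is hypothetical and far simpler than virtual_process: it does not create virtual modules, fix self-references, interpret, or analyze anything; it only shows the shape of the per-block loop.

```julia
# Hypothetical helper: parse a script and classify each top-level block.
function split_toplevel(s::AbstractString, filename::AbstractString, mod::Module)
    # `Meta.parseall` returns an `Expr(:toplevel, ...)` holding every block.
    toplevelex = Meta.parseall(s; filename = filename)
    steps = Tuple{Symbol,Any}[]
    for blk in toplevelex.args
        blk isa LineNumberNode && continue  # skip line annotations
        if Meta.isexpr(blk, :module)
            # A real implementation would recurse into a new (virtual) module here.
            push!(steps, (:module, blk))
        else
            # Lowering also expands macros in the context of `mod`.
            lwr = Meta.lower(mod, blk)
            push!(steps, (:lowered, lwr))
        end
    end
    return steps
end

script = """
module A end
f(x) = x + 1
y = f(1)
"""
steps = split_toplevel(script, "demo.jl", Main)
```

Note that nothing is evaluated in this sketch: each block is parsed and lowered only, which is why a later block cannot rely on values that an earlier block would have produced at run time.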
In order to process the toplevel code sequentially as the Julia runtime does, virtual_process splits the entire code, and then iterates a simulation process on each code block. With this approach, we can't track the inter-code-block level dependencies, and so a partial interpretation of toplevel definitions will fail if it needs access to global variables defined in other code blocks that are not interpreted but just abstracted. We can circumvent this issue using JET's concretization_patterns configuration, which allows us to customize JET's concretization strategy. See ToplevelConfig for more details.
JET.VirtualProcessResult — Type
res::VirtualProcessResult
- res.included_files::Set{String}: files that have been analyzed
- res.defined_modules::Set{Module}: module contexts created during this top-level analysis
- res.toplevel_error_reports::Vector{ToplevelErrorReport}: toplevel errors found during the text parsing or partial (actual) interpretation; these reports are "critical" and should have precedence over inference_error_reports
- res.inference_error_reports::Vector{InferenceErrorReport}: possible error reports found by AbstractAnalyzer
- res.toplevel_signatures: signatures of methods defined within the analyzed files
- res.actual2virtual::Pair{Module, Module}: keeps actual and virtual module
JET.virtualize_module_context — Function
virtualize_module_context(actual::Module)
HACK to return a module where the context of actual is virtualized.
The virtualization is done in the 2 steps below:
- loads the module context of actual into a sandbox module, and exports the whole context from there
- then uses names exported from the sandbox
This way, JET's runtime simulation in the virtual module context will be able to define a name that is already defined in actual without causing a "cannot assign a value to variable ... from module ..." error, etc. It allows JET to virtualize the context of an already-existing module other than Main.
Currently this function relies on Base.names, and thus it can't restore the names brought in by using.
JET.ConcreteInterpreter — Type
ConcreteInterpreter
The trait to inject code into JuliaInterpreter's interpretation process; JET.jl overloads:
- JuliaInterpreter.step_expr! to add an error report pass for module usage expressions and support package analysis
- JuliaInterpreter.evaluate_call_recurse! to special-case include calls
- JuliaInterpreter.handle_err to wrap an error that happened during interpretation into ActualErrorWrapped
JET.partially_interpret! — Function
partially_interpret!(interp::ConcreteInterpreter, mod::Module, src::CodeInfo)
Partially interprets statements in src using JuliaInterpreter.jl:
- concretizes "toplevel definitions", i.e. :method, :struct_type, :abstract_type and :primitive_type expressions and their dependencies
- concretizes user-specified toplevel code (see ToplevelConfig)
- directly evaluates module usage expressions and reports errors of invalid module usages (TODO: enter into the loaded module and keep JET analysis)
- special-cases include calls so that top-level analysis recursively enters the included file
How top-level analysis is bridged to AbstractAnalyzer
JET.AbstractGlobal — Type
mutable struct AbstractGlobal
    t::Any         # analyzed type
    isconst::Bool  # is this abstract global variable declared as constant or not
end
Wraps a global variable whose type is analyzed by abstract interpretation. An AbstractGlobal object will actually be evaluated into the context module, and a later analysis may refer to or alter its type on future load and store operations.
The type of the wrapped global variable will be propagated only when in a toplevel frame, and thus we don't care about the analysis cache invalidation on a refinement of the wrapped global variable, since JET doesn't cache the toplevel frame.
Analysis Result
JET.JETToplevelResult — Type
res::JETToplevelResult
Represents the result of JET's analysis on a top-level script.
- res.analyzer::AbstractAnalyzer: AbstractAnalyzer used for this analysis
- res.res::VirtualProcessResult: VirtualProcessResult collected from this analysis
- res.source::AbstractString: the identity key of this analysis
- res.jetconfigs: configurations used for this analysis
JETToplevelResult implements show methods for each different frontend. An appropriate show method will be automatically chosen to render the analysis result.
JET.JETCallResult — Type
res::JETCallResult
Represents the result of JET's analysis on a function call.
- res.result::InferenceResult: the result of this analysis
- res.analyzer::AbstractAnalyzer: AbstractAnalyzer used for this analysis
- res.source::AbstractString: the identity key of this analysis
- res.jetconfigs: configurations used for this analysis
JETCallResult implements show methods for each different frontend. An appropriate show method will be automatically chosen to render the analysis result.
Splitting and filtering reports
Both JETToplevelResult and JETCallResult can be split into individual failures for integration with tools like Cthulhu:
JET.get_reports — Function
rpts = JET.get_reports(result::JETCallResult)
Split result into a vector of reports, one per issue.
JET.reportkey — Function
reportkey(report::InferenceErrorReport)
Returns an identifier for the runtime-dispatched call site of report.
If you have a long list of reports to analyze, urpts = unique(reportkey, rpts) may remove "duplicates" that arrive at the same runtime dispatch from different entry points.
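The deduplication idiom relies on Base's unique(f, itr), which keeps the first element for each distinct value of f. A minimal sketch with a stand-in report type follows; ToyReport and toy_reportkey are hypothetical, and the assumption that a key is a (file, line) pair is for illustration only and says nothing about what JET.reportkey actually returns.

```julia
# Hypothetical report type: an entry point plus the call site it ended up at.
struct ToyReport
    entry::String
    callsite::Tuple{String,Int}  # assumed key shape: (file, line)
end

# Stand-in for `reportkey`: identify a report by its call site only.
toy_reportkey(r::ToyReport) = r.callsite

rpts = [
    ToyReport("f", ("a.jl", 10)),
    ToyReport("g", ("a.jl", 10)),  # same call site, different entry point
    ToyReport("h", ("b.jl", 3)),
]

# Keep the first report seen for each distinct key.
urpts = unique(toy_reportkey, rpts)
```

Since only the first report per key survives, the order of rpts determines which entry point is shown for each deduplicated call site.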
Error Report Interface
JET.VirtualFrame — Type
VirtualFrame
Stack information representing virtual execution context:
- file::Symbol: the path to the file containing the virtual execution context
- line::Int: the line number in the file containing the virtual execution context
- sig::Signature: a signature of this frame
- linfo::MethodInstance: the MethodInstance containing the execution context
This type is very similar to Base.StackTraces.StackFrame, but its execution context is collected during abstract interpretation, not from actual execution.
JET.VirtualStackTrace — Type
VirtualStackTrace
Represents a virtual stack trace in the form of a vector of VirtualFrame. The vector holds VirtualFrames in order of "from entry call site to error point", i.e. the first element is the VirtualFrame of the entry call site, and the last element is the one that contains the error.
JET.Signature — Type
Signature
Represents an expression signature. print_signature implements a frontend functionality to show this type.
JET.InferenceErrorReport — Type
abstract type InferenceErrorReport end
An interface type of error reports collected by JET's abstract interpretation based analysis. All InferenceErrorReports have the following fields, which explain where and how the error is reported:
- vst::VirtualStackTrace: a virtual stack trace of the error
- sig::Signature: a signature of the error point
Note that some InferenceErrorReports may have additional fields other than vst and sig to explain why they are reported.
JET.ToplevelErrorReport — Type
ToplevelErrorReport
An interface type of error reports that JET collects during top-level concrete interpretation. All ToplevelErrorReports should have the following fields:
- file::String: the path to the file containing the interpretation context
- line::Int: the line number in the file containing the interpretation context
See also: virtual_process, ConcreteInterpreter