Internals of JET.jl

Abstract Interpretation

In order to perform type-level program analysis, JET.jl uses Core.Compiler.AbstractInterpreter interface, and customizes its abstract interpretation by overloading a subset of Core.Compiler functions, that are originally developed for Julia compiler's type inference and optimizations that aim at generating efficient native code for CPU execution.

JET.AbstractAnalyzer overloads a set of Core.Compiler functions to implement the "core" functionalities of JET's analysis, including inter-procedural error report propagation and caching of the analysis result. And each plugin analyzer (e.g. JET.JETAnalyzer) will overload more Core.Compiler functions so that it can perform its own program analysis on top of the core AbstractAnalyzer infrastructure.

Most overloads use the invoke reflection, which allows AbstractAnalyzer to dispatch to the original AbstractInterpreter's abstract interpretation methods while still passing AbstractAnalyzer to the subsequent (maybe overloaded) callees.

JET.islineageFunction
islineage(parent::MethodInstance, current::MethodInstance) ->
    (report::InferenceErrorReport) -> Bool

Returns a function that checks if a given InferenceErrorReport

  • is generated from current, and
  • is "lineage" of parent (i.e. entered from it).

This function is supposed to be used when additional analysis with extended lattice information happens in order to filter out reports collected from current by analysis without using that extended information. When a report should be filtered out, the first virtual stack frame represents parent and the second does current.

Example:

entry
└─ linfo1 (report1: linfo1->linfo2)
   ├─ linfo2 (report1: linfo2)
   ├─ linfo3 (report2: linfo3->linfo2)
   │  └─ linfo2 (report2: linfo2)
   └─ linfo3′ (~~report2: linfo3->linfo2~~)

In the example analysis above, report2 should be filtered out on re-entering into linfo3′ (i.e. when we're analyzing linfo3 with constant arguments), nevertheless report1 shouldn't because it is not detected within linfo3 but within linfo1 (so it's not a "lineage of linfo3"):

  • islineage(linfo1, linfo3)(report2) === true
  • islineage(linfo1, linfo3)(report1) === false
source
Core.Compiler.bail_out_toplevel_callFunction

By default AbstractInterpreter implements the following inference bail out logic:

  • bail_out_toplevel_call(::AbstractInterpreter, sig, ::InferenceState): bail out from inter-procedural inference when inferring top-level and non-concrete call site callsig
  • bail_out_call(::AbstractInterpreter, rt, ::InferenceState): bail out from inter-procedural inference when return type rt grows up to Any
  • bail_out_apply(::AbstractInterpreter, rt, ::InferenceState): bail out from _apply_iterate inference when return type rt grows up to Any

It also bails out from local statement/frame inference when any lattice element gets down to Bottom, but AbstractInterpreter doesn't provide a specific interface for configuring it.

source
Core.Compiler.bail_out_callFunction

By default AbstractInterpreter implements the following inference bail out logic:

  • bail_out_toplevel_call(::AbstractInterpreter, sig, ::InferenceState): bail out from inter-procedural inference when inferring top-level and non-concrete call site callsig
  • bail_out_call(::AbstractInterpreter, rt, ::InferenceState): bail out from inter-procedural inference when return type rt grows up to Any
  • bail_out_apply(::AbstractInterpreter, rt, ::InferenceState): bail out from _apply_iterate inference when return type rt grows up to Any

It also bails out from local statement/frame inference when any lattice element gets down to Bottom, but AbstractInterpreter doesn't provide a specific interface for configuring it.

source
Core.Compiler.add_call_backedges!Function
add_call_backedges!(analyzer::JETAnalyzer, ...)

An overload for abstract_call_gf_by_type(analyzer::JETAnalyzer, ...), which always add backedges (even if a new method can't refine the return type grew up to Any). This is because a new method definition always has a potential to change JETAnalyzer's analysis result.

source
JET.analyze_task_parallel_code!Function
analyze_task_parallel_code!(analyzer::AbstractAnalyzer, arginfo::ArgInfo, sv::InferenceState)

Adds special cased analysis pass for task parallelism. In Julia's task parallelism implementation, parallel code is represented as closure and it's wrapped in a Task object. Core.Compiler.NativeInterpreter doesn't infer nor optimize the bodies of those closures when compiling code that creates parallel tasks, but JET will try to run additional analysis pass by recurring into the closures.

See also: https://github.com/aviatesk/JET.jl/issues/114

Note

JET won't do anything other than doing JET analysis, e.g. won't annotate return type of wrapped code block in order to not confuse the original AbstractInterpreter routine track https://github.com/JuliaLang/julia/pull/39773 for the changes in native abstract interpretation routine.

source

How AbstractAnalyzer manages caches

JET.CachedAnalysisResultType
CachedAnalysisResult

AnalysisResult is transformed into CachedAnalysisResult when it is cached into a global cache maintained by AbstractAnalyzer. That means, codeinf::CodeInstance = Core.Compiler.code_cache(analyzer::AbstractAnalyzer)[mi::MethodInstance]) is expected to have its field codeinf.inferred::CachedAnalysisResult.

InferenceErrorReports found within already-analyzed result::InferenceResult can be accessed with get_cached_reports(analyzer, result).

source
Core.Compiler.inlining_policyFunction
inlining_policy(analyzer::AbstractAnalyzer, @nospecialize(src), ...) -> source::Any

Implements inlining policy for AbstractAnalyzer. Since AbstractAnalyzer works on InferenceResult whose src field keeps AnalysisResult or CachedAnalysisResult, this implementation needs to forward their wrapped source to inlining_policy(::AbstractInterpreter, ::Any, ::UInt8).

source

Top-level Analysis

JET.virtual_processFunction
virtual_process(s::AbstractString,
                filename::AbstractString,
                analyzer::AbstractAnalyzer,
                config::ToplevelConfig) -> res::VirtualProcessResult

Simulates Julia's toplevel execution and collects error points, and finally returns nothing

This function first parses s::AbstractString into toplevelex::Expr and then iterate the following steps on each code block (blk) of toplevelex:

  1. if blk is a :module expression, recursively enters analysis into an newly defined virtual module
  2. lowers blk into :thunk expression lwr (macros are also expanded in this step)
  3. if the context module is virtualized, replaces self-references of the original context module with virtualized one: see fix_self_references
  4. ConcreteInterpreter partially interprets some statements in lwr that should not be abstracted away (e.g. a :method definition); see also partially_interpret!
  5. finally, AbstractAnalyzer analyzes the remaining statements by abstract interpretation
Warning

In order to process the toplevel code sequentially as Julia runtime does, virtual_process splits the entire code, and then iterate a simulation process on each code block. With this approach, we can't track the inter-code-block level dependencies, and so a partial interpretation of toplevle definitions will fail if it needs an access to global variables defined in other code blocks that are not interpreted but just abstracted. We can circumvent this issue using JET's concretization_patterns configuration, which allows us to customize JET's concretization strategy. See ToplevelConfig for more details.

source
JET.VirtualProcessResultType
res::VirtualProcessResult
  • res.included_files::Set{String}: files that have been analyzed
  • res.defined_modules::Set{Module}: module contexts created while this top-level analysis
  • res.toplevel_error_reports::Vector{ToplevelErrorReport}: toplevel errors found during the text parsing or partial (actual) interpretation; these reports are "critical" and should have precedence over inference_error_reports
  • res.inference_error_reports::Vector{InferenceErrorReport}: possible error reports found by AbstractAnalyzer
  • res.toplevel_signatures: signatures of methods defined within the analyzed files
  • res.actual2virtual::Pair{Module, Module}: keeps actual and virtual module
source
JET.virtualize_module_contextFunction
virtualize_module_context(actual::Module)

HACK to return a module where the context of actual is virtualized.

The virtualization will be done by 2 steps below:

  1. loads the module context of actual into a sandbox module, and export the whole context from there
  2. then uses names exported from the sandbox

This way, JET's runtime simulation in the virtual module context will be able to define a name that is already defined in actual without causing "cannot assign a value to variable ... from module ..." error, etc. It allows JET to virtualize the context of already-existing module other than Main.

TODO

Currently this function relies on Base.names, and thus it can't restore the usinged names.

source
JET.ConcreteInterpreterType
ConcreteInterpreter

The trait to inject code into JuliaInterpreter's interpretation process; JET.jl overloads:

  • JuliaInterpreter.step_expr! to add error report pass for module usage expressions and support package analysis
  • JuliaInterpreter.evaluate_call_recurse! to special case include calls
  • JuliaInterpreter.handle_err to wrap an error happened during interpretation into ActualErrorWrapped
source
JET.partially_interpret!Function
partially_interpret!(interp::ConcreteInterpreter, mod::Module, src::CodeInfo)

Partially interprets statements in src using JuliaInterpreter.jl:

  • concretizes "toplevel definitions", i.e. :method, :struct_type, :abstract_type and :primitive_type expressions and their dependencies
  • concretizes user-specified toplevel code (see ToplevelConfig)
  • directly evaluates module usage expressions and report error of invalid module usages (TODO: enter into the loaded module and keep JET analysis)
  • special-cases include calls so that top-level analysis recursively enters the included file
source

How top-level analysis is bridged to AbstractAnalyzer

JET.AbstractGlobalType
mutable struct AbstractGlobal
    t::Any     # analyzed type
    isconst::Bool # is this abstract global variable declarared as constant or not
end

Wraps a global variable whose type is analyzed by abstract interpretation. AbstractGlobal object will be actually evaluated into the context module, and a later analysis may refer to or alter its type on future load and store operations.

Note

The type of the wrapped global variable will be propagated only when in a toplevel frame, and thus we don't care about the analysis cache invalidation on a refinement of the wrapped global variable, since JET doesn't cache the toplevel frame.

source

Analysis Result

JET.JETToplevelResultType
res::JETToplevelResult

Represents the result of JET's analysis on a top-level script.

  • res.analyzer::AbstractAnalyzer: AbstractAnalyzer used for this analysis
  • res.res::VirtualProcessResult: VirtualProcessResult collected from this analysis
  • res.source::AbstractString: the identity key of this analysis
  • res.jetconfigs: configurations used for this analysis

JETToplevelResult implements show methods for each different frontend. An appropriate show method will be automatically chosen and render the analysis result.

source
JET.JETCallResultType
res::JETCallResult

Represents the result of JET's analysis on a function call.

  • res.result::InferenceResult: the result of this analysis
  • res.analyzer::AbstractAnalyzer: AbstractAnalyzer used for this analysis
  • res.source::AbstractString: the identity key of this analysis
  • res.jetconfigs: configurations used for this analysis

JETCallResult implements show methods for each different frontend. An appropriate show method will be automatically chosen and render the analysis result.

source

Error Report Interface

JET.VirtualFrameType
VirtualFrame

Stack information representing virtual execution context:

  • file::Symbol: the path to the file containing the virtual execution context
  • line::Int: the line number in the file containing the virtual execution context
  • sig::Signature: a signature of this frame
  • linfo::MethodInstance: The MethodInstance containing the execution context

This type is very similar to Base.StackTraces.StackFrame, but its execution context is collected during abstract interpration, not collected from actual execution.

source
JET.VirtualStackTraceType
VirtualStackTrace

Represents a virtual stack trace in the form of a vector of VirtualFrame. The vector holds VirtualFrames in order of "from entry call site to error point", i.e. the first element is the VirtualFrame of the entry call site, and the last element is that contains the error.

source
JET.SignatureType
Signature

Represents an expression signature. print_signature implements a frontend functionality to show this type.

source
JET.InferenceErrorReportType
abstract type InferenceErrorReport end

An interface type of error reports collected by JET's abstract interpretation based analysis. All InferenceErrorReports have the following fields, which explains where and how this error is reported:

Note that some InferenceErrorReport may have additional fields other than vst and sig to explain why they are reported.

source
JET.ToplevelErrorReportType
ToplevelErrorReport

An interface type of error reports that JET collects while top-level concrete interpration. All ToplevelErrorReport should have the following fields:

  • file::String: the path to the file containing the interpretation context
  • line::Int: the line number in the file containing the interpretation context

See also: virtual_process, ConcreteInterpreter

source