-
Notifications
You must be signed in to change notification settings - Fork 4
The Monomorphizer
This document describes the monomorphizer [3] (specialization of polymorphic expressions) in the Scilla compiler. The compiler pass consists of the following components: (1) AST initialization (2) Type-flow analysis (3) Monomorphization
We associate with each AST node, a TFA (type-flow-analysis) index. This
index is the offset into a global array of tfa_el
s (short for TFA
element). Each tfa_el
contains the data-flow information we're
tracking for that AST node.
The goal of the type-flow analysis is to compute, for each tfun 'A
expression, the set of closed types that may be substituted for 'A
at
runtime (via the @
operator). To achieve this, the analysis keeps
track of
- The flow of closed types from the argument of a type instantiation
expression
@foo
, to the type variable'A
of thetfun 'A
expressions thatfoo
may resolve to at runtime. - The flow of
tfun
expressions. - Control-flow analysis: the flow of
fun
expressions to improve the precision of the analysis.
(2) is carried out using the same constraints as done for control-flow analysis (3) is performed without context sensitivity. (1) is context sensitive.
(* The context of free variables in a (type) closure. *)
type context_env = Context.t IntMap.t
(* We track the flow of Funs and TFuns as a pair of tfa_data index
* and the context environment of its free type variables. *)
type clo_env = int * context_env
type rev_ref = ExprRef of expr_annot ref | VarRef of eannot Identifier.t
(* Data element propagated in the type-flow analysis. *)
type tfa_el = {
(* The TFun expressions that reach a program point or variable. *)
reaching_tfuns : CloSet.t;
(* The Fun expressions that reach a program point or variable. *)
reaching_funs : CloSet.t;
(* The closed types that reach a type variable, in a given context. *)
reaching_ctyps : TypSet.t Context.Map.t;
(* A back reference to who this information belongs to. *)
elof : rev_ref;
(* The free type variables at a TFun.
* Useful in building the context_env for reaching_tfuns. *)
free_tvars : int list;
(* Is fun being analyzed already? *)
on_analysis_stack : bool;
}
The terms context environment is used as in [1]. The
flow of both tfun
and fun
expressions track the flow of clo_env
,
which is a pair of the closure itself (identified by its TFA index) and
the environment for its free type variables.
The analysis is an AST walk, repeated until there is no change to the
tfa_el
associated with any of the AST nodes. TFA elements are updated
during the analysis by an
include_in : ~a:tfa_el -> ~b:tfa_el -> (tfa_bol * bool)
function which includes the sets in b
into that of a
. This is the
data-flow analysis merge operation (compiler text books typically use a
lattice meet operation to represent this, while program analyses texts
denote it as a lattice join operation). The include_in
function simply
performs set inclusion for each of the three constituents of the
analysis and returns if anything changed in a
.
The environment for the analysis itself is defined as
type tfa_env = {
(* The contexts of currently free type variables. *)
ctx_env : context_env;
(* The current calling context. *)
cctx : Context.t;
}
I next outline what the analysis does for relevant AST node types.
In each case, whenever a set inclusion is computed, changes, if any
are noted, and will lead to one more iteration of the analysis. For
an expression e
:
-
App (f, actual_args)
: Three operations are performed at function applications. For eachtfun
expressionf'
that may flow intof
:- The TFA information from the actual arguments are included in
the TFA information of the corresponding formal arguments of
f'
. - The body of
f'
is analyzed. - The TFA information from the body of
f'
is included in the TFA information of function application we are currently analyzing.
- The TFA information from the actual arguments are included in
the TFA information of the corresponding formal arguments of
-
TFun _ | Fun _
: The expresssione
is included in its own corresponding TFA information (reaching_tfuns
andreaching_funs
respectively), with the right context environment for the free type variables (recall that we track the reaching closures as a pair of the closure itself and the context environment for its free type variables). -
TApp (tf, targs)
:- For each free type variable
'A
in the current context, the set of types that may flow into'A
is queried. - For each
targ
intargs
, substitute the free type variables intarg
with the set of types that may flow into them. This involves computing all possible permutations into a final set of closed typestarg_specls
for eachtarg
. For example, if we have@foo 'X -> 'Y
and in the current context,'X
hasInt32
,Int64
flowing into it, and'Y
hasString
andByStr20
flowing into it, then the actual parameter (there is only one in this example) will have the set of closed typestarg_specls
computed asInt32 -> String
,Int32 -> ByStr20
,Int64 -> String
andInt64 -> ByStr20
. - The current context is appened with this
TApp
site -
apply env tf [targ :: remaining_args]
: For eachtfun 'A
expressiontf'
that flows intotf
, do the following:-
targ
's specializations computed earlier (targ_specls
) is included into the formal argument'A
. -
tf'
's sub-expressiontf'_subexpr
is analyzed, with an updated current context that includes'A
as a free type variable. - The environment is updated to include
'A
as a free type variable. Let's call thisenv'
. -
apply
recursively:apply env' tf'_subexpr remaining_args
. - As a precision improvement heuristic, before applying
tf'_subexpr
, we remove thosereaching_tfuns
in it whose context environments are incompatible with the current one. This is helpful becausereaching_tfuns
is not context sensitive.
-
- For each free type variable
- Principles of Program Analysis by Flemming Nielson, Hanne R. Nielson, Chris Hankin.
- An Efficient Type- and Control-Flow Analysis for System F - Adsit and Fluet.
- https://github.com/Zilliqa/scilla-compiler/blob/master/src/astlowering/Monomorphize.ml