Source processing flow

An overview of the operations fortran-src does on Fortran source.

Note that the online fortran-src Haddock documentation may be useful as a reference while reading this. Click on one of the modules to be taken to its documentation page. Many of the types mentioned here are defined in Language.Fortran.AST.

As of fortran-src 0.10.1

At the highest level, fortran-src parses Fortran source code in Language.Fortran.Parser. This module exports various functions for parsing Fortran, along with primitives for defining your own parser.

Language.Fortran.Parser.byVer selects the default parser for the given Fortran version (see Language.Fortran.Version). A default parser does the following:

Lex & parse (simultaneous)
Post-parse transform

A successful initial parse generates a ProgramFile. There are also "smaller" parsers which will only parse a subset of the ProgramFile, like a single Block or Expression.

After parsing, you may wish to run analyses on the code. (The variable renaming pass in Language.Fortran.Analysis.Renaming is generally run before any of these.) fortran-src provides three top-level analyses:

Type analysis: Language.Fortran.Analysis.Types
Basic block analysis: Language.Fortran.Analysis.BBlocks
Dataflow analysis (requires basic blocks): Language.Fortran.Analysis.DataFlow

Parse

Parsing is handled by parsers generated with the Happy parser generator. See e.g. Language.Fortran.Parser.Free.Fortran90: programParser is not written by hand, but generated for us by Happy. Lexing occurs alongside parsing. Parsers inside Language.Fortran.Parser.Free use the free-form Fortran syntax (and appropriate lexer), while those in Language.Fortran.Parser.Fixed use fixed-form.

Post-parse transform

Due to Fortran's idiosyncrasies, some syntax is ambiguous until we inspect the AST more closely. Rather than doing this in the parser (awkward with a parser generator like Happy), we separate these passes out into a set of post-parse transformations. By default, which transformations are applied depends on the Fortran version you're parsing. These transformations include:

Function call disambiguation: Language.Fortran.Transformation.Disambiguation.Function
- The syntax f(x) is ambiguous: it can mean calling function f with argument x, or accessing the element at index x in the array f. Disambiguating requires determining whether f is a function (perhaps an intrinsic) or an array.
Turn statement-based syntax blocks into actual delimited blocks: Language.Fortran.Transformation.Grouping
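To make the function call disambiguation concrete, here is a minimal sketch on a toy AST. It is not the fortran-src implementation: the Expr type, the disambiguate function and the rule "anything not declared as an array is a function" are all assumptions made for illustration.

```haskell
-- Toy expression type for illustration; NOT the fortran-src AST.
data Expr
  = Var String
  | IntLit Integer
  | Ambiguous String [Expr]  -- f(x) as first parsed: call or array access?
  | Call String [Expr]       -- resolved: function call
  | Index String [Expr]      -- resolved: array element access
  deriving (Eq, Show)

-- Resolve every ambiguous node, given the names known to be arrays.
-- Assumption: any name not declared as an array is treated as a function.
disambiguate :: [String] -> Expr -> Expr
disambiguate arrays expr = case expr of
  Ambiguous name args
    | name `elem` arrays -> Index name (map go args)
    | otherwise          -> Call name (map go args)
  Call name args  -> Call name (map go args)
  Index name args -> Index name (map go args)
  _               -> expr
  where
    go = disambiguate arrays
```

For example, with "a" declared as an array, disambiguate ["a"] turns a(sqrt(x)) into an Index node whose argument is a Call node. The real transformation needs the declaration information in the first place, which is why it runs after parsing rather than inside the parser.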

Post-parse transformation alters the AST, but not node annotations. Internally, it does the following:

perform temporary renaming pass
perform temporary type analysis pass
perform transformations in sequence with annotations from temporary passes
discard analysis and renamings, return transformed AST

Variable renaming

Analyses variables through the whole AST and produces unique names, which allow ignoring scoping during later analyses. The original names are retained.

analyseRenames generates these renamings and places them in the relevant nodes' annotations.
rename then substitutes these renamings in, over the original names.
unrename puts the original names back.
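The three steps above can be sketched on a toy model where a variable occurrence is just a (scope, name) pair. The function names analyseToy, renameToy and unrenameToy and the scope_name_N naming scheme are invented here and are not what fortran-src uses; the point is only that unique names remove the need to track scoping, while the original names stay recoverable.

```haskell
import Data.List (nub)
import Data.Maybe (fromMaybe)

type Name  = String
type Scope = String

-- A variable occurrence: the scope it appears in and its name in the source.
type Occurrence = (Scope, Name)

-- Build a table giving each distinct (scope, name) pair a unique name.
-- The scheme scope_name_N is hypothetical, chosen for readability.
analyseToy :: [Occurrence] -> [(Occurrence, Name)]
analyseToy occs =
  [ (occ, s ++ "_" ++ n ++ "_" ++ show i)
  | (i, occ@(s, n)) <- zip [1 :: Int ..] (nub occs) ]

-- Substitute the unique name for a source name.
-- Occurrences missing from the table keep their source name.
renameToy :: [(Occurrence, Name)] -> Occurrence -> Name
renameToy table occ@(_, n) = fromMaybe n (lookup occ table)

-- Recover the source name from a unique name.
unrenameToy :: [(Occurrence, Name)] -> Name -> Maybe Name
unrenameToy table unique =
  lookup unique [ (u, n) | ((_, n), u) <- table ]
```

With occurrences of x in both a main program and a function f, the two get different unique names, and unrenameToy maps either one back to x.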

Type analysis

Type checking depends on some post-parse transformations for correctness (the subscript disambiguation and the intrinsics disambiguation). With type checking enabled (--typecheck on the CLI, analyseTypes in the library), the following happens after parsing:

Gathers type information in 4 separate traversals through the entire ProgramFile
Uses gathered information to "annotate" Expressions and ProgramUnits: involves evaluating expression types e.g. real + int = real
Types are a mix of BaseType + other syntax tags, and fortran-vars SemanticTypes which include kind
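As an illustration of "evaluating expression types", the sketch below infers the type of a toy arithmetic expression, so that real + int = real. The Ty and NumExpr types and the typeOf function are assumptions for this example only; in particular, kinds are ignored here, whereas the real analysis tracks them.

```haskell
-- Toy numeric types, ordered so that the "wider" type compares greater.
data Ty = TInteger | TReal | TComplex
  deriving (Eq, Ord, Show)

-- Toy expression type; NOT the fortran-src Expression type.
data NumExpr
  = IntLit Integer
  | RealLit Double
  | Var String
  | Add NumExpr NumExpr
  deriving Show

-- Infer the type of an expression from declared variable types.
-- Returns Nothing if some variable has no known type.
typeOf :: [(String, Ty)] -> NumExpr -> Maybe Ty
typeOf _   (IntLit _)  = Just TInteger
typeOf _   (RealLit _) = Just TReal
typeOf env (Var v)     = lookup v env
-- Mixed-mode arithmetic: the result takes the wider operand type.
typeOf env (Add l r)   = max <$> typeOf env l <*> typeOf env r
```

Given an integer variable i, typeOf assigns TReal to 1.5 + i. Annotating the AST then amounts to storing such a result on each expression node.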

Basic block analysis

Analyses control flow and produces basic block graphs. Each ProgramUnit has its basic block graph inserted into its annotation.
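For readers unfamiliar with basic blocks, here is a minimal sketch of the first half of the idea: partitioning a flat statement list into blocks. The Stmt type and the basicBlocks function are hypothetical, and the sketch stops short of building the graph edges between blocks, which the real analysis does produce.

```haskell
-- Toy statements; NOT the fortran-src Statement type.
data Stmt
  = Assign String   -- some straight-line statement
  | Label Int       -- a possible jump target
  | Goto Int        -- unconditional jump
  | CondGoto Int    -- conditional jump
  deriving (Eq, Show)

isLabel :: Stmt -> Bool
isLabel (Label _) = True
isLabel _         = False

isJump :: Stmt -> Bool
isJump (Goto _)     = True
isJump (CondGoto _) = True
isJump _            = False

-- Split statements into basic blocks: a label starts a new block,
-- and a jump ends the current one. Empty blocks are dropped.
basicBlocks :: [Stmt] -> [[Stmt]]
basicBlocks = filter (not . null) . go []
  where
    -- acc holds the current block in reverse order
    go acc [] = [reverse acc]
    go acc (s : rest)
      | isLabel s = reverse acc : go [s] rest
      | isJump s  = reverse (s : acc) : go [] rest
      | otherwise = go (s : acc) rest
```

Within each resulting block, control enters only at the first statement and leaves only at the last, which is the property later dataflow analyses rely on.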

Dataflow analysis

Live variable analysis
Constant expression analysis
- Gathers explicit constants (PARAMETER variables)
- Evaluates a handful of intrinsics and binops
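The constant expression analysis can be pictured with the following sketch: gather PARAMETER-style constants in declaration order, then evaluate expressions built from literals, known constants and a few binary operators. The CExpr type and the evalConst and gatherConsts functions are assumptions for illustration; integer-only arithmetic is a simplification, and intrinsics are left out.

```haskell
-- Toy constant expressions; NOT the fortran-src Expression type.
data Op = Plus | Minus | Times
  deriving Show

data CExpr
  = Lit Integer
  | Name String            -- reference to a variable or constant
  | BinOp Op CExpr CExpr
  deriving Show

-- Known constants: name and evaluated value.
type Consts = [(String, Integer)]

-- Evaluate an expression if every name in it is a known constant.
evalConst :: Consts -> CExpr -> Maybe Integer
evalConst _  (Lit n)        = Just n
evalConst cs (Name v)       = lookup v cs
evalConst cs (BinOp op l r) = apply op <$> evalConst cs l <*> evalConst cs r
  where
    apply Plus  = (+)
    apply Minus = (-)
    apply Times = (*)

-- Gather constants from (name, defining expression) pairs in order.
-- A later definition may refer to earlier ones; definitions that
-- cannot be evaluated are skipped.
gatherConsts :: [(String, CExpr)] -> Consts
gatherConsts = foldl step []
  where
    step cs (name, rhs) = case evalConst cs rhs of
      Just v  -> (name, v) : cs
      Nothing -> cs
```

For instance, given n = 4 followed by m = n * 2, gatherConsts records both constants, and any expression that mentions an unknown name evaluates to Nothing rather than a guess.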

Points of contention

The implicit (temporary) type analysis run during post-parse transformation isn't ideal. We should be able to identify the exact set of syntax analysis required to perform these post-parse transformations, and do only that.
- Or perhaps these transformations could take place in-line, inside the main syntax analysis?
- It gets even worse as we do more work during type analysis - perhaps we could make these analyses a bit configurable to work around that.
Parts of the constant expression analysis should be done earlier. Explicit constant variables (PARAMETERs) may be used in types as kind parameters, so we should gather that info during type analysis. #192
Constant expression analysis could be expanded to cover more operations, and have behaviour closer to Fortran compilers in use. #192
