Martijn Schrage edited this page Mar 2, 2017 · 1 revision

The module $instantiation/src/ProxParser.hs declares the parsers for all nonterminals (also called document nodes) that have a parsing presentation. The parsers are written using the uulib parsing-combinator library. The parsing algorithm in Proxima and the interaction between the scanner and the parser are explained in the paper "Beyond ASCII - Parsing programs with graphical presentations" (pdf). Slides for a research presentation of this paper are available at Publications. The parsing algorithm has since undergone major changes, which are not yet reflected in the Dazzle and Helium editors: the parsers for these editors still explicitly parse structurally-presented parts of the presentation. The declaration form, in contrast, uses the new automatic structure recognition algorithm.

ProxParser must define the top-level parser recognizeEnrichedDoc, as well as every parser that is mentioned in the presentation sheet. If a node has a structural presentation, its parser is simply pStructural, but for a parsing presentation, a parser needs to be specified. These parsers are similar to ordinary combinator parsers, except that they operate on the Proxima Token datatype, and that they must store the IDP of every token that is parsed.

As an example, consider a UserToken datatype that contains a token for keywords:

data UserToken = KeyTk String | ...

The pToken combinator constructs a parser for KeyTk tokens.

pKey str  = pToken (KeyTk str)

An example parser that parses an if expression may look as follows:

parseIfExp = (\tk1 b tk2 t tk3 e -> IfExp (getTokenIDP tk1) (getTokenIDP tk2) (getTokenIDP tk3) b t e)
         <$> pKey "if" <*> parseExp <*> pKey "then" <*> parseExp <*> pKey "else" <*> parseExp

The difference from an ordinary combinator parser is that the returned value also receives the IDPs of the parsed tokens. The fields for the IDPs are specified in the declaration of the IfExp constructor in $instantiation/src/DocumentType.prx:

data Exp = IfExp exp1:Exp exp2:Exp exp3:Exp              { idP0:IDP idP1:IDP idP2:IDP }
         | ...
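
The pieces above can be put together in a small self-contained sketch. It does not use uulib or the real Proxima Token datatype. Instead it assumes a toy Parser type built on base only, a simplified Token that pairs an IDP with a UserToken, and two hypothetical additions (IdentTk and VarExp) so that there is something to parse between the keywords. It is meant only to illustrate how the IDPs of the keyword tokens end up in the IfExp node, not how ProxParser is actually implemented.

```haskell
import Control.Applicative (Alternative (..))

-- Assumption: an IDP is modelled as a plain wrapped Int.
newtype IDP = IDP Int deriving (Eq, Show)

-- IdentTk is a hypothetical extra token, added for this sketch.
data UserToken = KeyTk String | IdentTk String deriving (Eq, Show)

-- Assumption: a simplified token carrying only its IDP and user token.
data Token = Token IDP UserToken deriving (Eq, Show)

getTokenIDP :: Token -> IDP
getTokenIDP (Token idp _) = idp

-- The IDP fields come first, matching the argument order used by parseIfExp.
-- VarExp is a hypothetical second alternative.
data Exp
  = IfExp IDP IDP IDP Exp Exp Exp
  | VarExp IDP String
  deriving (Eq, Show)

-- Toy stand-in for a uulib parser: consume a prefix of the token list or fail.
newtype Parser a = Parser { runParser :: [Token] -> Maybe (a, [Token]) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \ts -> case p ts of
    Nothing       -> Nothing
    Just (a, ts') -> Just (f a, ts')

instance Applicative Parser where
  pure a = Parser $ \ts -> Just (a, ts)
  Parser pf <*> Parser pa = Parser $ \ts -> case pf ts of
    Nothing       -> Nothing
    Just (f, ts') -> case pa ts' of
      Nothing        -> Nothing
      Just (a, ts'') -> Just (f a, ts'')

instance Alternative Parser where
  empty = Parser (const Nothing)
  -- Left-biased choice: try the second parser only if the first one fails.
  Parser p <|> Parser q = Parser $ \ts -> case p ts of
    Nothing -> q ts
    result  -> result

-- Match on the user token only; the IDP is kept in the returned token.
pToken :: UserToken -> Parser Token
pToken expected = Parser $ \ts -> case ts of
  (tk@(Token _ ut) : rest) | ut == expected -> Just (tk, rest)
  _                                         -> Nothing

pKey :: String -> Parser Token
pKey str = pToken (KeyTk str)

-- Hypothetical parser for identifiers.
pVar :: Parser Exp
pVar = Parser $ \ts -> case ts of
  (Token idp (IdentTk name) : rest) -> Just (VarExp idp name, rest)
  _                                 -> Nothing

parseExp :: Parser Exp
parseExp = parseIfExp <|> pVar

parseIfExp :: Parser Exp
parseIfExp =
  (\tk1 b tk2 t tk3 e ->
      IfExp (getTokenIDP tk1) (getTokenIDP tk2) (getTokenIDP tk3) b t e)
    <$> pKey "if" <*> parseExp
    <*> pKey "then" <*> parseExp
    <*> pKey "else" <*> parseExp

-- Tokens for "if c then x else y", numbered 0..5 in presentation order.
sampleTokens :: [Token]
sampleTokens =
  zipWith Token (map IDP [0 ..])
    [ KeyTk "if", IdentTk "c", KeyTk "then", IdentTk "x", KeyTk "else", IdentTk "y" ]
```

Evaluating runParser parseExp sampleTokens in GHCi should yield an IfExp whose three IDP fields are IDP 0, IDP 2 and IDP 4, the identities of the "if", "then" and "else" tokens, with no tokens left over.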

A special mechanism is available to parse presentations that do not contain all the information necessary to construct the enriched document. For example, in a source editor, this occurs when parsing the right-hand side of a collapsed function definition. Such missing information (also called interpretation extra state) can be recovered by employing reuse<TYPE> functions, which are generated for each nonterminal in the enriched document type. The information is taken from the previous value of the enriched document. The reuse mechanism is explained in Sections 4.2 and 7.2.3 of the Proxima PhD thesis.
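
The idea behind the reuse functions can be illustrated with a hand-written sketch. The Decl type, its fields, and the exact signature of reuseDecl are assumptions made for this example. The generated functions in Proxima obtain the previous value through the parsed tokens rather than as a direct argument, so this shows only the principle: each field is either supplied by the parser or taken from the previous value of the node.

```haskell
import Data.Maybe (fromMaybe)

-- Assumption: an IDP is modelled as a plain wrapped Int.
newtype IDP = IDP Int deriving (Eq, Show)

data Exp = IntExp Int | PlusExp Exp Exp deriving (Eq, Show)

-- Hypothetical node: a function declaration with a name, an "expanded" flag,
-- and a body that is absent from the presentation while collapsed.
data Decl = Decl IDP String Bool Exp deriving (Eq, Show)

-- Hand-written stand-in for a generated reuse function. The first argument
-- is the previous value of the node. For every field, Just v means the
-- parser supplies a new value and Nothing means the old value is kept.
reuseDecl :: Decl -> Maybe IDP -> Maybe String -> Maybe Bool -> Maybe Exp -> Decl
reuseDecl (Decl idp name expanded body) mIdp mName mExpanded mBody =
  Decl (fromMaybe idp mIdp)
       (fromMaybe name mName)
       (fromMaybe expanded mExpanded)
       (fromMaybe body mBody)

-- Previous value: a collapsed declaration "f" whose body is 1 + 2.
previousDecl :: Decl
previousDecl = Decl (IDP 7) "f" False (PlusExp (IntExp 1) (IntExp 2))

-- After the user renames the collapsed function to "g", the parser sees only
-- the name. The IDP, flag and body are recovered from the previous value.
reparsedDecl :: Decl
reparsedDecl = reuseDecl previousDecl Nothing (Just "g") Nothing Nothing
```

Here reparsedDecl keeps the body PlusExp (IntExp 1) (IntExp 2) even though the collapsed presentation contained no tokens for it.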

-- Main.MartijnSchrage - 05 Mar 2010
