Scanner Sheet
For the scanner sheet ($instantiation/src/ScannerSheet.x), Proxima uses the Haskell scanner generator Alex. A Proxima scanner sheet is very similar to a normal Alex scanner, except for some import statements and two combinators that are used to create Proxima tokens and to handle whitespace automatically. More information about Alex can be found in the Alex user guide.
The editor designer needs to declare a datatype for the tokens: UserToken. To prevent cyclic imports and even more type parameters, UserToken must be defined in $instantiation/src/DocTypes_Generated.hs (the first part of which is not generated).
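As an illustration of what such a declaration could look like, here is a minimal sketch of a token type for a small expression editor. The constructor names (KeywordToken, IdentToken, IntToken) and the derived instances are assumptions made for this example, not names taken from Proxima, and a real editor may need additional instances.

```haskell
module Main where

-- Hypothetical token type; the constructors are illustrative only.
-- In an actual editor this declaration would live in the hand-written
-- first part of the file named above, not in a Main module.
data UserToken
  = KeywordToken String  -- a reserved word, carrying its text
  | IdentToken String    -- an identifier, carrying its text
  | IntToken String      -- an integer literal, kept as the scanned string
  deriving (Show, Eq, Ord)

-- Small standalone check that the type can be constructed and shown.
main :: IO ()
main = print (IdentToken "x")
```

Each constructor here takes the scanned string, which matches the String -> UserToken shape that scanner actions are expected to have.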
The scanner sheet needs to import $instantiation/src/DocTypes_Generated for the UserToken type, as well as the module Layout.ScanLib for the two Proxima-specific combinators and several functions that are used internally by the Alex scanner. The module name specified in the scanner sheet should be ScannerSheetHS. This means that the following lines are part of every scanner sheet:
{
module ScannerSheetHS where
import DocTypes_Generated
import Layout.ScanLib
}
In Alex, a token is normally constructed with an action, which is a Haskell expression of type String -> Token for some Token type. For example, we could have: { \str -> SomeToken str }. In a Proxima scanner sheet, a token is constructed in almost the same way, but because Proxima tokens keep track of whitespace and focus information (see the SBLP paper for more information), we need to use a special combinator in the action: mkToken. The argument to mkToken is a Haskell expression of type String -> UserToken. Thus, the example in Proxima would be: { mkToken $ \str -> SomeToken str } (where SomeToken is a constructor of UserToken).
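To show how such actions might sit in the rules section of a scanner sheet, here is a small sketch. The macros $digit and $alpha, the rule-set name tokens, and the constructors IntToken and IdentToken are hypothetical and chosen only for this example; a real sheet may need further definitions that are omitted here.

```alex
-- Character set macros (hypothetical names).
$digit = 0-9
$alpha = [a-zA-Z]

tokens :-
  -- Each action wraps a String -> UserToken function in mkToken.
  $digit+                  { mkToken $ \str -> IntToken str }
  $alpha [$alpha $digit]*  { mkToken $ \str -> IdentToken str }
```

The function passed to mkToken only decides which UserToken constructor to use; according to the description above, the whitespace and focus bookkeeping is left to the combinator.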
In order to let the scanner automatically keep track of whitespace, add the following rule:
[\n \ ]+ { collectWhitespace }
With this rule, whitespace is tracked automatically and restored when the document is presented. The only thing the editor designer needs to do is to declare, in the document type definition, the IDP fields for each of the tokens used in the presentation of a nonterminal and, in the parsing sheet, to set these fields with the IDPs from the tokens that are parsed.
Often, an editor has different lexical structures in one document, for example when it shows both a program source and natural text. We can specify different lexers in one scanner sheet using Alex's mechanism of start codes. When a start code identifier is placed between angle brackets in front of a rule, that rule is only used while the scanner is in that start code.
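A sketch of how rules for two lexical structures could be separated with start codes is given below. The start code name text and the constructors IdentToken and WordToken are hypothetical, and how Proxima switches between start codes is not covered by this example.

```alex
$alpha = [a-zA-Z]

tokens :-
  -- Used only in the default start code 0, e.g. for program source.
  <0>    $alpha+      { mkToken $ \str -> IdentToken str }
  -- Used only in the (hypothetical) start code "text", e.g. for natural text.
  <text> [^\n \ ]+    { mkToken $ \str -> WordToken str }
  -- No start code prefix, so Alex enables this rule in every start code.
  [\n \ ]+            { collectWhitespace }
```

Leaving the whitespace rule without a prefix means it does not have to be repeated for each lexer.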
-- Main.MartijnSchrage - 16 Jan 2008