Why can't we use dynamic lexing with the postlex option? #1537

nchammas · 2025-06-05T22:21:25Z

I am referring to this restriction:

Line 383 in 87bb8ef:

    raise ConfigurationError("Can't use postlex with a dynamic lexer. Use basic or contextual instead")

I am developing an indentation-sensitive language so I need to use postlex to generate indent/dedent tokens. I am using Earley because that is the default and because I understand it is much more flexible and forgiving than LALR(1).

My problem is that I am trying to add a rule to my grammar that is just syntactic sugar for another rule. Lark matches the rule for the syntactic sugar before the rule for the base expression, which causes parsing to fail.

Here is an example derived from the docs:

from lark import Lark
from lark.indenter import Indenter

tree_grammar = r"""
    %import common.CNAME -> NAME
    %import common.WS_INLINE
    %import common.SH_COMMENT
    %ignore WS_INLINE
    %ignore SH_COMMENT
    %declare _INDENT _DEDENT

    ?start: _NL* tree
    tree: node _NL [_INDENT tree+ _DEDENT]
    node: "node" NAME | special_node
    special_node: "special"
    _NL: (/\r?\n[\t ]*/ | SH_COMMENT)+
"""

class TreeIndenter(Indenter):
    NL_type = '_NL'
    OPEN_PAREN_types = []
    CLOSE_PAREN_types = []
    INDENT_type = '_INDENT'
    DEDENT_type = '_DEDENT'
    tab_len = 8

parser = Lark(
    tree_grammar,
    # parser='lalr',
    parser='earley',
    # lexer='dynamic',
    postlex=TreeIndenter(),
)

test_tree = """
node a
    # check this comment out
    node special
    node c
"""

def test():
    print(parser.parse(test_tree).pretty())

if __name__ == '__main__':
    test()

The key feature here is that special is syntactic sugar for node special. When special occurs after node it is just a NAME. Otherwise, it is its own thing, special_node. At least, that's my intention.

This fails with:

lark.exceptions.UnexpectedToken: Unexpected token Token('SPECIAL', 'special') at line 4, column 10.
Expected one of: 
        * NAME

If I switch to parser='lalr' it works fine:

tree
  node  a
  tree
    node        special
  tree
    node        c

My actual grammar is not a valid LALR(1) grammar, but has this same problem. I believe it's because Lark is using the basic lexer instead of the dynamic one.
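
To make the suspected cause concrete, here is a toy model of the two lexing strategies. It uses only Python's `re` module and is not Lark's actual implementation; `basic_lex`, `contextual_lex`, and the `KEYWORDS` table are hypothetical names made up for this sketch. The assumption is that a context-free lexer always turns the word `special` into the keyword token, while a lexer that knows what the parser expects (as I understand Lark's default contextual lexer for LALR does) can still read it as a NAME after `node`.

```python
import re

# Hypothetical keyword table for the toy grammar in this post.
KEYWORDS = {"node": "NODE", "special": "SPECIAL"}
WORD = re.compile(r"[A-Za-z_]\w*")


def basic_lex(line):
    """Context-free lexing: a word equal to a keyword always becomes that keyword."""
    return [(KEYWORDS.get(word, "NAME"), word) for word in WORD.findall(line)]


def contextual_lex(line):
    """Context-aware lexing: right after NODE only a NAME is acceptable,
    so keywords are not considered in that position."""
    tokens = []
    for word in WORD.findall(line):
        after_node = bool(tokens) and tokens[-1][0] == "NODE"
        kind = "NAME" if after_node else KEYWORDS.get(word, "NAME")
        tokens.append((kind, word))
    return tokens


print(basic_lex("node special"))       # [('NODE', 'node'), ('SPECIAL', 'special')]
print(contextual_lex("node special"))  # [('NODE', 'node'), ('NAME', 'special')]
print(contextual_lex("special"))       # [('SPECIAL', 'special')]
```

The first result mirrors the `Token('SPECIAL', 'special')` in the error above: the parser wanted a NAME, but the token type had already been fixed before the parser saw it.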

What are my options for solving this problem? Do I basically need to refine my grammar into a LALR(1) grammar? And why does using postlex prevent us from using dynamic lexing with Earley?

erezsh · 2025-06-05T23:16:54Z

erezsh
Jun 5, 2025
Maintainer

You can use the "basic" lexer with Earley. Did you try doing that? Maybe it will work, and then you can use postlex.

5 replies

nchammas Jun 5, 2025
Author

I believe the basic lexer is chosen automatically by Lark when the parser is set to Earley and the postlex option is configured. But this leads to exactly the problem demonstrated in my post above.

To double check, I specified the basic lexer explicitly in the example script I posted and confirmed the error is the same:

parser = Lark(
    tree_grammar,
    parser='earley',
    lexer='basic',
    postlex=TreeIndenter(),
)

Output:

lark.exceptions.UnexpectedToken: Unexpected token Token('SPECIAL', 'special') at line 4, column 10.
Expected one of: 
        * NAME

nchammas Jun 5, 2025
Author

And why does using postlex prevent us from using dynamic lexing with Earley?

An alternate way of getting at the same point is to observe that Lark allows indentation-sensitive grammars to be more flexible when they are LALR(1)-compatible.

This is surprising because LALR(1) is supposed to be more stringent than Earley. However, in the case of indentation-sensitive grammars, Lark requires you to use the postlex option, which in turn disables Earley's dynamic lexer, which prevents you from expressing the kind of syntactic sugar rule in my example above.

Am I understanding correctly? If so, I suppose my only option then is to change my grammar to make it LALR(1)-compatible.

erezsh Jun 6, 2025
Maintainer

Yes, that seems about right. It should be possible to implement a contextual lexer for Earley, or a postlex step for the dynamic lexer, but as far as I know, no one has tried, and it wouldn't be trivial to do.

Answer selected by nchammas

nchammas Jun 6, 2025
Author

Thankfully, yesterday I was able to update my grammar to make it LALR(1)-compatible, so this is no longer an issue for me. It was much easier than I expected! Turns out I had what can only be described as some mistakes in my grammar that Earley graciously accepted; fixing them made the grammar compatible with LALR(1).

Side note: GitHub Copilot with GPT 4.1 was very unhelpful in this task, confidently sending me down dead-end paths. It was my first time using it from within my IDE, so I thought it would work really well with all that context available to it. But it just doesn't understand Lark well enough, which I found surprising given Lark's popularity. It couldn't even get Lark's syntax for rule/terminal priority correct!

erezsh Jun 6, 2025
Maintainer

It's hard to predict when the LLM will get it right and when it won't. But you should try different models; sometimes one is better than the others at specific tasks. Also maybe try Windsurf, imo their agent is a bit better.
