I am referring to this restriction:

Line 383 in 87bb8ef

I am developing an indentation-sensitive language so I need to use

My problem is that I am trying to add a rule to my grammar that is just syntactic sugar for another rule. Lark matches the rule for the syntactic sugar before the rule for the base expression, which causes parsing to fail. Here is an example derived from the docs:

from lark import Lark
from lark.indenter import Indenter
tree_grammar = r"""
%import common.CNAME -> NAME
%import common.WS_INLINE
%import common.SH_COMMENT
%ignore WS_INLINE
%ignore SH_COMMENT
%declare _INDENT _DEDENT
?start: _NL* tree
tree: node _NL [_INDENT tree+ _DEDENT]
node: "node" NAME | special_node
special_node: "special"
_NL: (/\r?\n[\t ]*/ | SH_COMMENT)+
"""
class TreeIndenter(Indenter):
    NL_type = '_NL'
    OPEN_PAREN_types = []
    CLOSE_PAREN_types = []
    INDENT_type = '_INDENT'
    DEDENT_type = '_DEDENT'
    tab_len = 8
parser = Lark(
    tree_grammar,
    # parser='lalr',
    parser='earley',
    # lexer='dynamic',
    postlex=TreeIndenter(),
)
test_tree = """
node a
# check this comment out
node special
node c
"""
def test():
    print(parser.parse(test_tree).pretty())
if __name__ == '__main__':
    test()

The key feature here is that

This fails with:
If I switch to
My actual grammar is not a valid LALR(1) grammar, but has this same problem. I believe it's because Lark is using the basic lexer instead of the dynamic one. What are my options for solving this problem? Do I basically need to refine my grammar into a LALR(1) grammar? And why does using
Replies: 1 comment 5 replies
You can use the "basic" lexer with Earley. Did you try doing that? Maybe it will work, and then you can use postlex.
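For context, here is a rough sketch of what a postlex step does with a token stream. It is a simplified illustration rather than Lark's actual `Indenter`: the function name `indent_postlex` and the `(type, value)` tuples are assumptions made for this example, and error handling for inconsistent dedents is left out.

```python
def indent_postlex(tokens, tab_len=8):
    """Insert _INDENT/_DEDENT tokens after newline tokens (illustrative sketch only)."""
    stack = [0]  # indentation widths of the currently open blocks
    for kind, value in tokens:
        yield kind, value
        if kind != "_NL":
            continue
        # Only the whitespace after the last line break sets the indentation.
        indent_str = value.rsplit("\n", 1)[1]
        width = indent_str.count(" ") + indent_str.count("\t") * tab_len
        if width > stack[-1]:
            stack.append(width)
            yield "_INDENT", ""
        while width < stack[-1]:
            stack.pop()
            yield "_DEDENT", ""
    # Close any blocks that are still open at the end of the input.
    while len(stack) > 1:
        stack.pop()
        yield "_DEDENT", ""


# Hypothetical token stream for two lines, the second indented by four spaces.
tokens = [
    ("NODE", "node"), ("NAME", "a"), ("_NL", "\n    "),
    ("NODE", "node"), ("NAME", "b"), ("_NL", "\n"),
]
result = list(indent_postlex(tokens))
```

A step like this consumes a flat sequence of tokens, which is why it pairs naturally with a lexer that produces the whole stream independently of the parser, such as the basic lexer.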
Yes, that seems about right. It should be possible to implement a contextual lexer for Earley, or a postlex step for the dynamic lexer, but as far as I know, no one has tried, and it wouldn't be trivial to do.
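To illustrate the difference being discussed, here is a toy comparison of context-free and context-aware tokenization for the input `node special`. It is a sketch under stated assumptions, not how Lark implements either lexer: the names `basic_lex`, `contextual_lex`, `KEYWORDS`, and `EXPECTED_AFTER` are hypothetical, and the "parser state" is reduced to the type of the previous token.

```python
import re

# Hypothetical terminal table for: node: "node" NAME | "special"
KEYWORDS = {"node": "NODE", "special": "SPECIAL"}
WORD_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# Which terminals the parser could accept after a given previous token type.
EXPECTED_AFTER = {
    None: {"NODE", "SPECIAL"},
    "NODE": {"NAME"},
    "NAME": set(),
    "SPECIAL": set(),
}


def basic_lex(text):
    """Context-free: a word that equals a keyword is always lexed as that keyword."""
    return [(KEYWORDS.get(word, "NAME"), word) for word in WORD_RE.findall(text)]


def contextual_lex(text, expected_after):
    """Context-aware: fall back to NAME when the keyword cannot appear here."""
    tokens = []
    prev = None
    for word in WORD_RE.findall(text):
        kind = KEYWORDS.get(word, "NAME")
        allowed = expected_after[prev]
        if kind not in allowed and "NAME" in allowed:
            kind = "NAME"
        tokens.append((kind, word))
        prev = kind
    return tokens


basic_tokens = basic_lex("node special")
contextual_tokens = contextual_lex("node special", EXPECTED_AFTER)
```

In this toy model the context-free version hands the parser a `SPECIAL` token where only `NAME` is acceptable, while the context-aware version yields `NAME`. Doing the same for Earley would require tracking every terminal that any live parse could accept, which is part of why it would not be trivial.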