-
This is an excellent question! The java/java/ grammar is the "optimized" grammar for Java; the other three are derived from the Java Language Specification (JLS). The java/java/ grammar was written long ago by Parr to demonstrate the then-new Antlr4 capabilities. I don't know whether he started from the JLS, but it seems plausible. Unfortunately, the refactorings performed to optimize the grammar were never documented. I've been trying to reverse-engineer them, but that work is incomplete and I haven't touched it for a long time. Generally, though, the optimizations fall into several broad categories.
-
Indeed, an unoptimized grammar can often be 1 to 2 orders of magnitude slower than the optimized grammar. This is because the Antlr parser engine is a straightforward implementation of an NFA graph interpreter with some DFA graph caching. Ambiguous and non-left-factored grammars cause many more DFA states to be computed than optimized grammars do. But, as you say, this isn't the whole story. Even with an optimized grammar, parsing will still be slow because the interpreter starts with an empty DFA graph. There has been some discussion of preloading the DFA cache before parsing: antlr/antlr4#3682. This would speed up parse times, but it is unclear by how much over a complete parse, as the gain would be offset by the time required to preload the cache. Also, in my opinion, the current data structures are not the best representation of the DFA graph. For the Java grammars, I think it is essential that When a grammar changed, I added code to test and output ambiguity, but most contributors don't look at it.
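To make the "empty DFA graph" point concrete, here is a minimal toy sketch of lazy DFA construction with caching. It is not Antlr's actual algorithm: Antlr's prediction works over ATN configurations with lookahead, whereas this uses plain NFA state sets for the regular language (a|b)*abb. The names `NFA`, `LazyDfa`, and `misses` are hypothetical, chosen only for this illustration. The idea it shows is that every DFA edge must be computed from the NFA the first time it is needed, and only later lookups are cheap.

```python
from typing import Dict, FrozenSet, Set, Tuple

# Toy NFA for (a|b)*abb: state 0 is the start state, state 3 accepts.
# (Hypothetical example data, unrelated to any real Antlr grammar.)
NFA: Dict[Tuple[int, str], Set[int]] = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "b"): {3},
}
START: FrozenSet[int] = frozenset({0})
ACCEPTING: Set[int] = {3}


class LazyDfa:
    """Simulates the NFA and caches each computed subset transition as a DFA edge."""

    def __init__(self) -> None:
        # The cache starts empty, like the DFA graph of a freshly created parser.
        self.edges: Dict[Tuple[FrozenSet[int], str], FrozenSet[int]] = {}
        self.misses = 0  # transitions that had to be computed from the NFA

    def step(self, state: FrozenSet[int], symbol: str) -> FrozenSet[int]:
        key = (state, symbol)
        if key not in self.edges:
            # Cache miss: do the expensive subset computation once and store it.
            self.misses += 1
            target: Set[int] = set()
            for s in state:
                target |= NFA.get((s, symbol), set())
            self.edges[key] = frozenset(target)
        return self.edges[key]

    def accepts(self, text: str) -> bool:
        state = START
        for ch in text:
            state = self.step(state, ch)
        return bool(state & ACCEPTING)


dfa = LazyDfa()
first = dfa.accepts("ababb")             # cold run: edges are built on demand
cold_misses = dfa.misses
second = dfa.accepts("ababb")            # warm run: every edge is already cached
warm_misses = dfa.misses - cold_misses
print(first, second, cold_misses, warm_misses)  # prints: True True 4 0
```

In this toy, the cold run pays for four subset computations and the warm run pays for none. Preloading the cache would amount to filling `edges` before the first input arrives, which is where the load-time trade-off mentioned above comes from.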
-
This repository contains various Java grammars: java/java8, java/java9, java/java20 and java/java. I was playing around with them and discovered that the only one which is really usable in terms of performance is java/java. In fact, that one seems to be at least 30 or 40 times faster than the other Java grammars. (I mean, obviously, the parsers generated from them!) In fact, other README files refer to the java/java grammar as the optimized Java grammar.

What I would love for somebody to explain to me is what the specific diffs are that make the optimized grammar so much faster. (Or conversely, why are the other grammars so much slower?) Surely, this is crying out for some explanation, no?