Commit 9cef558

Use incremental parsing and add error recovery
1 parent f4122fd commit 9cef558

8 files changed: +1044 -24 lines changed

ERROR_RECOVERY.md

Lines changed: 174 additions & 0 deletions
@@ -0,0 +1,174 @@
1+
# Error Recovery Implementation - Summary
2+
3+
## Overview
4+
5+
Stellogen now performs **error recovery** using Menhir's incremental parsing API. Instead of stopping at the first syntax error, the parser collects multiple parse errors (up to a configurable limit) and reports them in a single pass.
6+
7+
## Key Features
8+
9+
### ✅ Multiple Error Collection
10+
11+
- Collects up to 20 errors per file (configurable)
12+
- Fewer fix-compile-fix cycles
13+
- See all problems at once
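
The capped collection described above could look roughly like the following. This is a minimal sketch, not the code in `src/parse_error.ml`: the names `collector`, `create`, `add`, and `to_list` are hypothetical, and only the default of 20 comes from this document.

```ocaml
(* Hypothetical error collector with a configurable cap.
   All names here are illustrative assumptions. *)
type collector = {
  max_errors : int;
  mutable errors : string list;  (* most recent first *)
  mutable count : int;
}

let create ?(max_errors = 20) () = { max_errors; errors = []; count = 0 }

(* Record one error. Returns [true] while there is still room for more
   errors, and [false] once the cap is reached, so the caller can stop
   parsing. Errors offered after the cap are dropped. *)
let add c msg =
  if c.count >= c.max_errors then false
  else begin
    c.errors <- msg :: c.errors;
    c.count <- c.count + 1;
    c.count < c.max_errors
  end

(* Errors in the order they were reported. *)
let to_list c = List.rev c.errors
```

Returning a boolean from `add` lets the parsing loop decide when to give up without inspecting the collector's internals.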
14+
15+
### ✅ Context-Aware Error Messages
16+
17+
```
18+
error: no opening delimiter for ')'
19+
--> test.sg:2:12
20+
21+
2 | (:= bad1 x))
22+
| ^
23+
24+
hint: remove this delimiter or add a matching opening delimiter
25+
```
26+
27+
Each error includes:
28+
- Exact position from parser state
29+
- Clear message
30+
- Source context with visual pointer
31+
- Helpful hint (when applicable)
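
As an illustration of how these four pieces fit together, here is a small sketch of a structured error and a renderer for it. The record fields, the function name `format_error`, and the exact spacing of the output are assumptions made for this example; the real layout is whatever `src/parse_error.ml` produces.

```ocaml
(* Hypothetical structured error; field names are illustrative. *)
type parse_error = {
  file : string;
  line : int;            (* 1-based line number *)
  column : int;          (* 1-based column number *)
  message : string;
  hint : string option;  (* not every error has a hint *)
}

(* Render an error with its source line and a caret under the column. *)
let format_error ~source_line e =
  let gutter = string_of_int e.line in
  let pad = String.make (String.length gutter) ' ' in
  let caret = String.make (max 0 (e.column - 1)) ' ' ^ "^" in
  let hint_lines =
    match e.hint with
    | Some h -> [ ""; "hint: " ^ h ]
    | None -> []
  in
  String.concat "\n"
    ([ "error: " ^ e.message;
       Printf.sprintf "%s--> %s:%d:%d" pad e.file e.line e.column;
       "";
       Printf.sprintf "%s | %s" gutter source_line;
       Printf.sprintf "%s | %s" pad caret ]
     @ hint_lines)
```

Keeping the hint as an `option` mirrors the "when applicable" wording: errors without a hint simply omit the last two lines.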
32+
33+
### ✅ Smart Recovery Strategies
34+
35+
The parser attempts to continue after errors using context-aware strategies:
36+
37+
- **Extra closing delimiter** → Skip and continue
38+
- **Unexpected token** → Skip to next expression start
39+
- **Nested errors** → Skip to matching delimiter level
40+
- **EOF with unclosed delimiter** → Abort (cannot recover)
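
One way to picture the strategies listed above is as a variant type plus a function that picks a strategy from the offending token and the current nesting depth. The sketch below is an assumption about how such a choice might be encoded; the `token` type, the `recovery` constructors, and the functions `choose` and `skip_to_depth` are hypothetical and do not come from the Stellogen sources.

```ocaml
(* Toy token type, for illustration only. *)
type token = LPAREN | RPAREN | SYM of string | EOF

type recovery =
  | Skip_token           (* drop the offending token and continue *)
  | Skip_to_next_open    (* discard tokens until the next expression start *)
  | Skip_to_depth of int (* discard tokens until nesting returns to this depth *)
  | Abort                (* give up: nothing sensible left to parse *)

(* [depth] is the number of delimiters open when the error occurred. *)
let choose ~depth tok =
  match tok with
  | EOF -> Abort
  | RPAREN when depth = 0 -> Skip_token
  | _ when depth > 0 -> Skip_to_depth (depth - 1)
  | _ -> Skip_to_next_open

(* Drop tokens until nesting falls to [target]; return what remains.
   Stops at EOF so the caller can still observe it. *)
let rec skip_to_depth ~target depth = function
  | [] -> []
  | EOF :: _ as rest -> rest
  | LPAREN :: rest -> skip_to_depth ~target (depth + 1) rest
  | RPAREN :: rest ->
    if depth - 1 <= target then rest
    else skip_to_depth ~target (depth - 1) rest
  | SYM _ :: rest -> skip_to_depth ~target depth rest
```

Separating the decision (`choose`) from the token skipping keeps each heuristic small, which is what makes adding a new strategy cheap.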
41+
42+
### ✅ Leverages Parser State
43+
44+
Uses `Parser.MenhirInterpreter.positions env` for accurate error locations instead of relying on global mutable state.
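
Menhir's incremental engine reports positions as a pair of `Lexing.position` values (start and end). The sketch below shows how such a pair could be turned into the `file:line:column` form used in the messages above. The helper names `line_col` and `describe` are hypothetical, and the 1-based column convention is an assumption inferred from the sample output.

```ocaml
(* Convert a stdlib lexing position to a 1-based (line, column) pair.
   [pos_cnum - pos_bol] is the 0-based offset within the current line. *)
let line_col (p : Lexing.position) =
  (p.pos_lnum, p.pos_cnum - p.pos_bol + 1)

(* Describe the start of a (start, end) position pair, such as the one
   returned for a parser environment. [file] is passed explicitly here;
   it could equally be read from [pos_fname]. *)
let describe ~file ((start_pos, _end_pos) : Lexing.position * Lexing.position) =
  let line, col = line_col start_pos in
  Printf.sprintf "%s:%d:%d" file line col
```

Because the positions travel with the parser environment, no global "current position" variable needs to be consulted when an error is reported.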
45+
46+
## Files Added/Modified
47+
48+
### New Files
49+
- **`src/parse_error.ml`** - Error collection, recovery strategies, and contextualization
50+
- **`docs/error_recovery.md`** - Comprehensive documentation
51+
- **`examples/error_recovery_demo.md`** - Usage examples
52+
53+
### Modified Files
54+
- **`src/sgen_parsing.ml`** - Integrated error recovery into incremental parser
55+
- **`docs/incremental_parsing.md`** - Updated to document error recovery
56+
57+
## Example Usage
58+
59+
```bash
60+
# File with multiple errors
61+
$ cat test.sg
62+
(:= good1 42)
63+
(:= bad1 x))
64+
(:= good2 100)
65+
66+
# All errors are reported in one run (the second is a cascade from the first; see Known Limitations)
67+
$ sgen run test.sg
68+
error: no opening delimiter for ')'
69+
--> test.sg:2:12
70+
71+
2 | (:= bad1 x))
72+
| ^
73+
74+
hint: remove this delimiter or add a matching opening delimiter
75+
76+
error: unexpected symbol ':='
77+
--> test.sg:3:2
78+
79+
3 | (:= good2 100)
80+
| ^
81+
82+
hint: check if this symbol is in the right place
83+
84+
found 2 error(s)
85+
```
86+
87+
## Benefits for Maintainers
88+
89+
### Improved Developer Experience
90+
- See all syntax errors in one pass
91+
- Helpful hints guide toward fixes
92+
- Visual context makes errors easy to locate
93+
94+
### Better Error Quality
95+
- Accurate positions from parser state
96+
- Context-aware messages
97+
- Reduced reliance on global state
98+
99+
### Maintainable Implementation
100+
- Clean separation: `parse_error.ml` handles error logic
101+
- Recovery strategies are clearly defined
102+
- Easy to extend with new recovery heuristics
103+
104+
### Foundation for Future Features
105+
- REPL: Can recover from partial input
106+
- IDE: Real-time error checking
107+
- Batch processing: Continue despite errors
108+
109+
## Known Limitations
110+
111+
### Cascading Errors
112+
Recovery attempts may generate secondary errors. This is a known challenge in error recovery systems.
113+
114+
**Example**:
115+
```stellogen
116+
(:= x ))
117+
' Primary: extra )
118+
' Cascade: parser sees := at top level after recovery
119+
```
120+
121+
### EOF Recovery
122+
The parser cannot recover once it reaches end-of-file with unclosed delimiters; it aborts at that point (by design).
123+
124+
## Testing
125+
126+
All existing tests pass:
127+
```bash
128+
dune test # ✓ All tests pass
129+
```
130+
131+
Error recovery tested with:
132+
- Single errors
133+
- Multiple independent errors
134+
- Unclosed delimiters
135+
- Extra closing delimiters
136+
- Mixed valid and invalid code
137+
138+
## Implementation Quality
139+
140+
### Code Organization
141+
- **Modular**: Error logic separated from parsing logic
142+
- **Type-safe**: Structured error types
143+
- **Configurable**: Max errors, recovery strategies
144+
145+
### Performance
146+
- Minimal overhead for valid files
147+
- Reasonable performance even with many errors
148+
- Early abort on unrecoverable situations
149+
150+
## Future Enhancements
151+
152+
Potential improvements:
153+
1. Reduce cascading errors with smarter recovery
154+
2. Add error message customization (Menhir `.messages` files)
155+
3. Implement warning suppression for known cascades
156+
4. Generate fix suggestions programmatically
157+
5. IDE integration for real-time checking
158+
159+
## Documentation
160+
161+
- **`docs/error_recovery.md`** - Full technical documentation
162+
- **`examples/error_recovery_demo.md`** - Usage examples and demonstrations
163+
- **`docs/incremental_parsing.md`** - Incremental parser overview
164+
165+
## Conclusion
166+
167+
The error recovery implementation fully leverages Menhir's incremental parsing API to provide:
168+
169+
✅ **Better maintainer experience** through comprehensive error reporting
170+
✅ **Maintainable code** with clean separation of concerns
171+
✅ **Foundation for growth** (REPL, IDE features)
172+
✅ **Production ready** - all tests pass, valid code unaffected
173+
174+
The parser now uses **incremental parsing for error handling**: it reports multiple errors per run with positions taken from parser state, while valid input parses as before.

INCREMENTAL_PARSER.md

Lines changed: 73 additions & 0 deletions
@@ -0,0 +1,73 @@
1+
# Incremental Parser Implementation
2+
3+
This document provides a quick reference for the incremental parser implementation in Stellogen.
4+
5+
## Overview
6+
7+
**The Stellogen parser now uses Menhir's incremental API by default.** The traditional parser has been completely replaced with the incremental parser in `src/sgen_parsing.ml`.
8+
9+
## Files Modified
10+
11+
- **`src/sgen_parsing.ml`** - Main parser now uses incremental API (replaced traditional parser)
12+
- **`docs/incremental_parsing.md`** - Comprehensive documentation
13+
14+
## Quick Start
15+
16+
The parser is used automatically by all Stellogen code:
17+
18+
```ocaml
19+
(* Standard usage - automatically uses incremental parser *)
20+
let lexbuf = Sedlexing.Utf8.from_string "(:= x 42)"
21+
let exprs = Sgen_parsing.parse_with_error "<input>" lexbuf
22+
```
23+
24+
## Key Components
25+
26+
### Checkpoint Type
27+
The parser state is represented by `Parser.MenhirInterpreter.checkpoint`:
28+
- `InputNeeded` - needs more input
29+
- `Shifting` / `AboutToReduce` - intermediate states; continue with `resume`
30+
- `Accepted result` - success
31+
- `HandlingError` / `Rejected` - a syntax error was detected / parsing failed
32+
33+
### API Functions
34+
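- `Parser.Incremental.expr_file` - create the initial checkpoint from a starting position
35+
- `Parser.MenhirInterpreter.offer` - supply the next token, with its start and end positions, to an `InputNeeded` checkpoint
36+
- `Parser.MenhirInterpreter.resume` - continue parsing from an intermediate checkpoint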
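
The checkpoint states and API functions above are typically combined in a driver loop. The sketch below imitates that loop with a toy engine so that it runs without a generated parser: the `checkpoint` type, `offer`, and `resume` defined here are simplified stand-ins, not Menhir's. The real checkpoints carry a parser environment, and the real `offer` takes a token together with its positions. The toy engine only checks that parentheses are balanced.

```ocaml
type token = LPAREN | RPAREN | ATOM | EOF

(* Toy stand-in for Menhir's checkpoint; the payload is the nesting depth.
   The real [Accepted] carries the parsed value. *)
type checkpoint =
  | InputNeeded of int
  | Shifting of int
  | HandlingError of int
  | Accepted
  | Rejected

(* Toy [offer]: only valid on [InputNeeded], as in the real API. *)
let offer cp tok =
  match cp, tok with
  | InputNeeded d, LPAREN -> Shifting (d + 1)
  | InputNeeded d, RPAREN -> if d > 0 then Shifting (d - 1) else HandlingError d
  | InputNeeded d, ATOM -> Shifting d
  | InputNeeded d, EOF -> if d = 0 then Accepted else HandlingError d
  | _ -> invalid_arg "offer: checkpoint is not InputNeeded"

(* Toy [resume]: advances intermediate states; errors end in [Rejected]. *)
let resume = function
  | Shifting d -> InputNeeded d
  | HandlingError _ -> Rejected
  | _ -> invalid_arg "resume: nothing to resume"

(* The driver loop: feed tokens on demand, resume otherwise. *)
let rec drive next_token cp =
  match cp with
  | InputNeeded _ -> drive next_token (offer cp (next_token ()))
  | Shifting _ | HandlingError _ -> drive next_token (resume cp)
  | Accepted -> true
  | Rejected -> false

(* Token supplier over a list; yields EOF once the list is exhausted. *)
let tokens_of_list toks =
  let rest = ref toks in
  fun () ->
    match !rest with
    | [] -> EOF
    | t :: tl -> rest := tl; t
```

In a recovering parser, the `HandlingError` branch is where the loop would record an error and choose how to continue instead of simply resuming toward `Rejected`.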
37+
38+
## Configuration
39+
40+
Already enabled in `src/dune`:
41+
```lisp
42+
(menhir
43+
(modules parser)
44+
(flags --table --dump --explain))
45+
```
46+
47+
The `--table` flag selects Menhir's table-based back-end, which is required for the incremental API.
48+
49+
## Testing
50+
51+
All existing tests now use the incremental parser:
52+
53+
```bash
54+
# Run all tests
55+
dune test
56+
57+
# Run specific example
58+
dune exec sgen run -- examples/nat.sg
59+
```
60+
61+
## Use Cases
62+
63+
1. **REPL** - parse partial input interactively
64+
2. **IDE features** - syntax highlighting, error recovery
65+
3. **Incremental compilation** - reparse only changed sections
66+
4. **Better error messages** - access to parser state
67+
68+
## See Also
69+
70+
- `docs/incremental_parsing.md` - Full documentation
71+
- [Menhir Manual](https://gallium.inria.fr/~fpottier/menhir/manual.html)
72+
- `src/sgen_parsing.ml` - Incremental parser implementation
73+
- `src/parser.mly` - Parser grammar

bin/dune

Lines changed: 3 additions & 3 deletions
@@ -1,6 +1,6 @@
-(executables
- (public_names sgen)
- (names sgen)
+(executable
+ (public_name sgen)
+ (name sgen)
  (libraries stellogen base cmdliner))

 (env
