174 changes: 174 additions & 0 deletions ERROR_RECOVERY.md
@@ -0,0 +1,174 @@
# Error Recovery Implementation - Summary

## Overview

Stellogen now features **comprehensive error recovery** powered by Menhir's incremental parsing API. This significantly improves the developer experience by collecting and reporting multiple parse errors in a single pass.

## Key Features

### ✅ Multiple Error Collection

- Collects up to 20 errors per file (configurable)
- Fewer fix-compile-fix cycles
- See all problems at once
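
The capped collection described above can be sketched roughly as follows. This is a minimal illustration, not the code in `src/parse_error.ml`: the `parse_error` and `collector` types and the `create`, `add`, and `to_list` functions are hypothetical names, and only the default cap of 20 comes from this document.

```ocaml
(* Hypothetical sketch: names and types are illustrative, not taken from
   src/parse_error.ml. *)

type parse_error = {
  line : int;
  column : int;
  message : string;
}

type collector = {
  max_errors : int;                   (* cap on stored errors *)
  mutable errors : parse_error list;  (* most recent first *)
  mutable count : int;
}

(* 20 mirrors the documented default; callers may override it. *)
let create ?(max_errors = 20) () = { max_errors; errors = []; count = 0 }

(* Record an error if there is room. The result is [true] while more errors
   can still be recorded, and [false] once the cap has been reached, so the
   caller knows when to stop attempting recovery. *)
let add collector err =
  if collector.count >= collector.max_errors then false
  else begin
    collector.errors <- err :: collector.errors;
    collector.count <- collector.count + 1;
    collector.count < collector.max_errors
  end

(* Errors in the order they were encountered. *)
let to_list collector = List.rev collector.errors
```

Returning a boolean from `add` is one possible design: it lets the parsing loop decide to abort without inspecting the collector's internals.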

### ✅ Context-Aware Error Messages

```
error: no opening delimiter for ')'
--> test.sg:2:12

2 | (:= bad1 x))
  |            ^

hint: remove this delimiter or add a matching opening delimiter
```

Each error includes:
- Exact position from parser state
- Clear message
- Source context with visual pointer
- Helpful hint (when applicable)
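
As an illustration of how such a message could be assembled from its parts, here is a small sketch. The `render_error` and `nth_line` helpers are hypothetical and the exact spacing is an assumption; the real formatter may differ.

```ocaml
(* Hypothetical sketch: [render_error] and [nth_line] are illustrative
   helpers, not the project's actual formatter. *)

(* Return the 1-based [n]-th line of [source], or "" if out of range. *)
let nth_line source n =
  if n < 1 then ""
  else
    match List.nth_opt (String.split_on_char '\n' source) (n - 1) with
    | Some l -> l
    | None -> ""

let render_error ?hint ~file ~source ~line ~column ~message () =
  let gutter = string_of_int line in
  (* Blank gutter of the same width, used on the caret line. *)
  let pad = String.make (String.length gutter) ' ' in
  let buf = Buffer.create 128 in
  Buffer.add_string buf (Printf.sprintf "error: %s\n" message);
  Buffer.add_string buf (Printf.sprintf "--> %s:%d:%d\n\n" file line column);
  Buffer.add_string buf
    (Printf.sprintf "%s | %s\n" gutter (nth_line source line));
  (* Place the caret under the 1-based column. *)
  Buffer.add_string buf
    (Printf.sprintf "%s | %s^\n" pad (String.make (max 0 (column - 1)) ' '));
  (match hint with
   | Some h -> Buffer.add_string buf (Printf.sprintf "\nhint: %s\n" h)
   | None -> ());
  Buffer.contents buf
```

The hint is an optional argument because, as noted above, not every error has one.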

### ✅ Smart Recovery Strategies

The parser attempts to continue after errors using context-aware strategies:

- **Extra closing delimiter** → Skip and continue
- **Unexpected token** → Skip to next expression start
- **Nested errors** → Skip to matching delimiter level
- **EOF with unclosed delimiter** → Abort (cannot recover)
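
One way to picture the strategy table above is as a function from the offending token and the current nesting depth to a recovery action. The sketch below is an assumption about the shape of such logic; the `token` and `action` types and both functions are hypothetical and simplified to a single delimiter kind.

```ocaml
(* Hypothetical sketch: types and names are illustrative, not the project's. *)

type token = LPAREN | RPAREN | SYMBOL of string | EOF

type action =
  | Skip_token            (* drop the offending token and continue *)
  | Skip_to_next_expr     (* discard tokens until the next expression start *)
  | Skip_to_depth of int  (* discard tokens until nesting returns to this depth *)
  | Abort                 (* give up: no input left to resynchronize on *)

(* [depth] is the number of currently open delimiters when the error occurs. *)
let choose_action ~depth token =
  match token with
  | EOF -> Abort
  | RPAREN when depth = 0 -> Skip_token
  | _ when depth > 0 -> Skip_to_depth (depth - 1)
  | _ -> Skip_to_next_expr

(* Drop tokens until nesting falls to [target]; return the remaining tokens.
   Stops early at EOF or when the input runs out. *)
let rec skip_to_depth ~target ~depth tokens =
  if depth <= target then tokens
  else
    match tokens with
    | [] | EOF :: _ -> tokens
    | LPAREN :: rest -> skip_to_depth ~target ~depth:(depth + 1) rest
    | RPAREN :: rest -> skip_to_depth ~target ~depth:(depth - 1) rest
    | SYMBOL _ :: rest -> skip_to_depth ~target ~depth rest
```

Keeping the decision (`choose_action`) separate from the token skipping makes each strategy easy to test in isolation.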

### ✅ Leverages Parser State

Uses `Parser.MenhirInterpreter.positions env` for accurate error locations instead of relying on global mutable state.
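
Menhir positions are standard `Lexing.position` records, so a start/end pair such as the one returned by `positions env` can be turned into the `file:line:column` form used in the messages. The sketch below assumes a hypothetical helper named `format_location` and builds the position by hand so it runs without the generated parser.

```ocaml
(* Hypothetical helper: converts a (start, end) pair of positions into a
   "file:line:column" string. Only the start position is used. *)
let format_location ((start_pos : Lexing.position), (_end_pos : Lexing.position)) =
  let line = start_pos.Lexing.pos_lnum in
  (* pos_cnum and pos_bol are both offsets from the start of the input;
     their difference is the 0-based column, so add 1 for display. *)
  let column = start_pos.Lexing.pos_cnum - start_pos.Lexing.pos_bol + 1 in
  Printf.sprintf "%s:%d:%d" start_pos.Lexing.pos_fname line column

(* Example position: line 2, column 12, where line 1 occupies 14 characters
   including its newline. *)
let example_position =
  { Lexing.pos_fname = "test.sg"; pos_lnum = 2; pos_bol = 14; pos_cnum = 25 }

let example_location = format_location (example_position, example_position)
```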

## Files Added/Modified

### New Files
- **`src/parse_error.ml`** - Error collection, recovery strategies, and contextualization
- **`docs/error_recovery.md`** - Comprehensive documentation
- **`examples/error_recovery_demo.md`** - Usage examples

### Modified Files
- **`src/sgen_parsing.ml`** - Integrated error recovery into incremental parser
- **`docs/incremental_parsing.md`** - Updated to document error recovery

## Example Usage

```bash
# File with multiple errors
$ cat test.sg
(:= good1 42)
(:= bad1 x))
(:= good2 100)

# See all errors at once
$ sgen run test.sg
error: no opening delimiter for ')'
--> test.sg:2:12

2 | (:= bad1 x))
  |            ^

hint: remove this delimiter or add a matching opening delimiter

error: unexpected symbol ':='
--> test.sg:3:2

3 | (:= good2 100)
  |  ^

hint: check if this symbol is in the right place

found 2 error(s)
```

Note that the second error is a cascade from recovering after the extra `)`: line 3 is valid on its own (see Known Limitations).

## Benefits for Maintainers

### Improved Developer Experience
- See all syntax errors in one pass
- Helpful hints guide toward fixes
- Visual context makes errors easy to locate

### Better Error Quality
- Accurate positions from parser state
- Context-aware messages
- Reduced reliance on global state

### Maintainable Implementation
- Clean separation: `parse_error.ml` handles error logic
- Recovery strategies are clearly defined
- Easy to extend with new recovery heuristics

### Foundation for Future Features
- REPL: Can recover from partial input
- IDE: Real-time error checking
- Batch processing: Continue despite errors

## Known Limitations

### Cascading Errors
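After recovering from one error, the parser may resume in a state that misreads the valid tokens that follow and report secondary (cascading) errors. This is a common limitation of parser error recovery.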

**Example**:
```stellogen
(:= x ))
' Primary: extra )
' Cascade: parser sees := at top level after recovery
```

### EOF Recovery
Cannot recover past end-of-file with unclosed delimiters (by design).

## Testing

All existing tests pass:
```bash
dune test # ✓ All tests pass
```

Error recovery tested with:
- Single errors
- Multiple independent errors
- Unclosed delimiters
- Extra closing delimiters
- Mixed valid and invalid code

## Implementation Quality

### Code Organization
- **Modular**: Error logic separated from parsing logic
- **Type-safe**: Structured error types
- **Configurable**: Max errors, recovery strategies

### Performance
- Minimal overhead for valid files
- Bounded work on badly broken files, since collection stops at the error cap (20 by default)
- Early abort on unrecoverable situations

## Future Enhancements

Potential improvements:
1. Reduce cascading errors with smarter recovery
2. Add error message customization (Menhir `.messages` files)
3. Implement warning suppression for known cascades
4. Generate fix suggestions programmatically
5. IDE integration for real-time checking

## Documentation

- **`docs/error_recovery.md`** - Full technical documentation
- **`examples/error_recovery_demo.md`** - Usage examples and demonstrations
- **`docs/incremental_parsing.md`** - Incremental parser overview

## Conclusion

The error recovery implementation fully leverages Menhir's incremental parsing API to provide:

✅ **Better maintainer experience** through comprehensive error reporting
✅ **Maintainable code** with clean separation of concerns
✅ **Foundation for growth** (REPL, IDE features)
✅ **Production ready** - all tests pass, valid code unaffected

Error handling now builds on the incremental parser's own state rather than on global mutable state, which is what makes multi-error reporting with accurate positions possible.
73 changes: 73 additions & 0 deletions INCREMENTAL_PARSER.md
@@ -0,0 +1,73 @@
# Incremental Parser Implementation

This document provides a quick reference for the incremental parser implementation in Stellogen.

## Overview

**The Stellogen parser now uses Menhir's incremental API by default.** The traditional parser has been completely replaced with the incremental parser in `src/sgen_parsing.ml`.

## Files Modified

- **`src/sgen_parsing.ml`** - Main parser now uses incremental API (replaced traditional parser)
- **`docs/incremental_parsing.md`** - Comprehensive documentation

## Quick Start

The parser is used automatically by all Stellogen code:

```ocaml
(* Standard usage - automatically uses incremental parser *)
let lexbuf = Sedlexing.Utf8.from_string "(:= x 42)" in
let exprs = Sgen_parsing.parse_with_error "<input>" lexbuf
```

## Key Components

### Checkpoint Type
The parser state is represented by `Parser.MenhirInterpreter.checkpoint`:
- `InputNeeded` - needs more input
- `Shifting` / `AboutToReduce` - internal states
- `Accepted result` - success
- `HandlingError` / `Rejected` - errors

### API Functions
- `Parser.Incremental.expr_file` - create initial checkpoint
- `Parser.MenhirInterpreter.offer` - supply token
- `Parser.MenhirInterpreter.resume` - continue parsing
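
To show the control flow these pieces imply, here is a toy model of a checkpoint-driven loop. It is deliberately not Menhir: it uses a hand-written parenthesis counter, and every name in it (`Input_needed`, `offer`, `drive`, `tokens_of_list`) is hypothetical. The real API also threads positions and parser environments through each step.

```ocaml
(* Toy model only: mimics the shape of an incremental parser's checkpoint
   loop with a hand-written parenthesis counter. It does not use Menhir. *)

type token = LPAREN | RPAREN | ATOM of string | EOF

(* [depth] = open delimiters; [exprs] = completed top-level expressions. *)
type state = { depth : int; exprs : int }

type checkpoint =
  | Input_needed of state   (* waiting for the next token *)
  | Accepted of int         (* finished; payload = top-level expression count *)
  | Rejected of string      (* failed; payload = error message *)

let start = Input_needed { depth = 0; exprs = 0 }

(* Supply one token to a checkpoint that is waiting for input.
   Finished checkpoints are returned unchanged. *)
let offer checkpoint token =
  match checkpoint with
  | Accepted _ | Rejected _ -> checkpoint
  | Input_needed st ->
    (match token with
     | LPAREN -> Input_needed { st with depth = st.depth + 1 }
     | ATOM _ when st.depth = 0 -> Input_needed { st with exprs = st.exprs + 1 }
     | ATOM _ -> checkpoint
     | RPAREN when st.depth = 0 -> Rejected "no opening delimiter for ')'"
     | RPAREN when st.depth = 1 -> Input_needed { depth = 0; exprs = st.exprs + 1 }
     | RPAREN -> Input_needed { st with depth = st.depth - 1 }
     | EOF when st.depth > 0 -> Rejected "unclosed delimiter at end of input"
     | EOF -> Accepted st.exprs)

(* Driver: the caller owns the loop and pulls tokens on demand. *)
let rec drive next_token checkpoint =
  match checkpoint with
  | Input_needed _ -> drive next_token (offer checkpoint (next_token ()))
  | Accepted n -> Ok n
  | Rejected msg -> Error msg

(* Token source backed by a list; yields EOF once the list is exhausted. *)
let tokens_of_list tokens =
  let remaining = ref tokens in
  fun () ->
    match !remaining with
    | [] -> EOF
    | t :: rest -> remaining := rest; t
```

Because the caller owns the loop, it can inspect each checkpoint between tokens, which is the hook that error recovery and a REPL would rely on.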

## Configuration

Already enabled in `src/dune`:
```lisp
(menhir
(modules parser)
(flags --table --dump --explain))
```

The `--table` flag selects Menhir's table-based back-end, which the incremental API requires.

## Testing

All existing tests now use the incremental parser:

```bash
# Run all tests
dune test

# Run specific example
dune exec sgen run -- examples/nat.sg
```

## Use Cases

1. **REPL** - parse partial input interactively
2. **IDE features** - syntax highlighting, error recovery
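3. **Incremental reparsing** - resume from a saved checkpoint instead of reparsing a file from the start (checkpoints are immutable values, so they can be stored and reused)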
4. **Better error messages** - access to parser state

## See Also

- `docs/incremental_parsing.md` - Full documentation
- [Menhir Manual](https://gallium.inria.fr/~fpottier/menhir/manual.html)
- `src/sgen_parsing.ml` - Incremental parser implementation
- `src/parser.mly` - Parser grammar
6 changes: 3 additions & 3 deletions bin/dune
@@ -1,6 +1,6 @@
-(executables
- (public_names sgen)
- (names sgen)
+(executable
+ (public_name sgen)
+ (name sgen)
  (libraries stellogen base cmdliner))

(env