-
Notifications
You must be signed in to change notification settings - Fork 485
Open
Labels
Description
Issue: symbolic_file.py Test Failure - Incompatibility Between Symbolic Files and libc Functions
Summary
The symbolic_file.py
test consistently fails because Manticore cannot properly handle symbolic file content when it's read through libc functions like getline()
. The test expects multiple execution paths but only generates one.
Test Details
- Location:
examples/script/symbolic_file.py
- Purpose: Demonstrate symbolic file analysis by having Manticore explore both branches of a password check
- Expected: Multiple execution paths (password matches and doesn't match)
- Actual: Only one path explored, assertion fails
Root Cause Analysis
The Core Problem
Manticore's symbolic file implementation is incompatible with libc's buffered I/O functions:
- Type Mismatch: When a symbolic file is opened, Manticore returns a
SymbolicFile
Python object, butgetline()
expects a CFILE*
structure - No Function Modeling:
getline()
is not modeled/hooked by Manticore, so it executes natively without understanding symbolic data - Early Termination: The program exits with code 127 (command/library failure) when trying to use the incompatible file handle
Technical Details
How Symbolic Files Work in Manticore
- Files marked with
--file
flag oradd_symbolic_file()
become symbolic - '+' characters in the file are treated as symbolic bytes
- Manticore creates a
SymbolicFile
object (Python) with a symbolic array - Works at the syscall level (read/write/open)
Why getline() Fails
// In fileio.c
FILE *infile = fopen(fname, "r"); // Returns SymbolicFile, not FILE*
getline(&line, &line_size, infile); // Expects FILE*, crashes
Evidence from Investigation
- State lifecycle: States are killed rather than terminated normally
- No branching: Fork callbacks never trigger
- Hook analysis: File operation hooks don't execute, indicating early failure
- Exit code: 127 indicates library/command failure
- Workspace: Only generates 1 testcase instead of expected 2+
Code Flow
1. Program starts with symbolic file argument
2. fopen() called -> Manticore returns SymbolicFile object
3. getline() called -> Expects FILE* structure
4. Type mismatch causes crash/undefined behavior
5. State is killed with exit code 127
6. No symbolic data reaches strcmp()
7. No branching occurs
Reproduction Steps
cd manticore/examples/script
python symbolic_file.py
# AssertionError: Should have found more than 1 path through the program
Three Options for Fixing
Option 1: Model getline() in Manticore (Recommended Long-term)
Effort: High
Impact: Would fix all similar issues
Implementation would require:
- Hook
getline()
at the binary level - Detect when file handle is a SymbolicFile
- Manually implement getline's behavior with symbolic data
- Return symbolic bytes to the buffer
# Pseudocode for manticore/native/models.py
def getline(state, lineptr, n, stream):
if isinstance(stream, SymbolicFile):
# Read symbolic bytes until newline or EOF
symbolic_data = stream.read_until_newline()
state.cpu.write_bytes(lineptr, symbolic_data)
return len(symbolic_data)
else:
# Fall back to native execution
return state.invoke_model(getline_native, lineptr, n, stream)
Option 2: Rewrite Test to Use Compatible Functions (Recommended Short-term)
Effort: Low
Impact: Fixes this specific test
Replace getline()
with direct read()
syscall:
// Instead of:
getline(&line, &line_size, infile);
// Use:
int fd = open(fname, O_RDONLY);
read(fd, buffer, 11);
This works because Manticore properly handles read() syscalls with symbolic files.
Option 3: Document as Known Limitation (Immediate)
Effort: Minimal
Impact: Prevents CI failures, sets expectations
Mark the test as expected to fail with clear documentation:
- Add to known limitations in docs
- Skip test in CI with explanation
- Document that symbolic files only work with syscalls, not libc buffered I/O
Impact
- CI/CD: Test causes false failures in continuous integration
- User Experience: Users may think symbolic file analysis is broken
- Documentation: Feature appears to work but has undocumented limitations
Related Code
manticore/platforms/linux.py:540-663
- SymbolicFile classmanticore/platforms/linux.py:3847-3852
- _sys_open_get_file()manticore/platforms/linux.py:4155-4160
- generate_workspace_files()manticore/native/cli.py:33-34
- add_symbolic_file initialization
Workaround
Users can work around this by:
- Using programs that read files with syscalls (read/write) instead of libc functions
- Writing custom models for specific libc functions they need
- Using concrete files instead of symbolic for programs using buffered I/O
Test History
- Test was already marked as "flaky" in CI
- Has been a known intermittent issue
- Original
fileio.c
had uninitialized variable bug (fixed in ef8677a)
Recommendation
- Immediate: Disable test in CI with skip marker and explanation
- Short-term: Create alternative test using syscall-based file reading
- Long-term: Implement proper modeling for common libc file functions
References
- Original issue with uninitialized variables: Fixed in commit ef8677a
- Similar limitations exist for other libc functions that use FILE* structures
- Symbolic file feature works correctly with syscall-level operations