A tool for analyzing Rust codebases and extracting code structure information (atoms and dependencies) using verus-analyzer and SCIP.
├── Cargo.toml # Rust project configuration
├── Dockerfile # Docker container configuration
├── docker-compose.yml # Docker Compose configuration
├── run_compose.sh # Main entry script
├── run.sh # Internal Docker script
├── src/ # Rust source code
│ ├── lib.rs # Library root
│ ├── bin/
│ │ └── write_atoms.rs # Main binary for SCIP processing
│ └── scip_to_call_graph_json.rs # Core SCIP parsing logic
├── scripts/ # Python scripts
│ └── populate_atomsdeps_grouped_rust.py # Database population script
├── logs/ # Generated log files
│ ├── atomizer_*_*.log # Rust atomizer logs
│ └── populate_atoms_*.log # Python script logs
- SCIP Generation: Uses verus-analyzer to generate SCIP files from Rust source code
- JSON Conversion: Converts SCIP data to a structured JSON format containing atoms (code elements) and the relationships between them
- Logging: Both Rust and Python components generate detailed logs for debugging and auditing
- Database Population: Stores the extracted code structure in a MySQL database
- Docker and Docker Compose
- MySQL database accessible at 127.0.0.1 with database name verilib
- Environment variable DB_PASSWORD set for the MySQL connection
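A quick preflight check can catch a missing prerequisite before the container is built. The following is a minimal sketch and not part of the tool: the helper name check_prerequisites is hypothetical, and the port 3306 is an assumption (MySQL's default), since only the host is given above.

```python
import os
import socket


def check_prerequisites(env=None, host="127.0.0.1", port=3306, timeout=2.0):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper. The port is an assumption (MySQL's default);
    the prerequisites above only state the host.
    """
    env = os.environ if env is None else env
    problems = []
    if not env.get("DB_PASSWORD"):
        problems.append("DB_PASSWORD is not set")
    try:
        # Only verifies that something accepts TCP connections on host:port;
        # it does not authenticate or confirm the verilib database exists.
        with socket.create_connection((host, port), timeout=timeout):
            pass
    except OSError as exc:
        problems.append(f"cannot reach MySQL at {host}:{port}: {exc}")
    return problems
```

Calling check_prerequisites() with no arguments reads the current environment; printing each returned string before invoking run_compose.sh gives an early warning rather than a failure deep inside the container.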
Run the analysis on a Rust repository:
./run_compose.sh <rust_repo_path> <repo_id> [user_id]
- <rust_repo_path>: Path to the Rust repository you want to atomize
- <repo_id>: Numeric identifier for the repository in the database
- [user_id]: Optional user identifier (defaults to 460176 if not provided)
# Using default user_id (460176)
./run_compose.sh /path/to/my-rust-project 123
# With custom user_id
./run_compose.sh /path/to/my-rust-project 123 789
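The argument rules above (two required values, one optional value with a default) can be sketched in Python as follows. This is an illustration only: run_compose.sh is a shell script whose actual implementation is not shown here, and parse_cli_args is a hypothetical name.

```python
DEFAULT_USER_ID = 460176  # default stated in the usage description


def parse_cli_args(argv):
    """Parse [rust_repo_path, repo_id, user_id?] into a validated tuple.

    Hypothetical helper mirroring the documented arguments.
    """
    if len(argv) not in (2, 3):
        raise ValueError("usage: <rust_repo_path> <repo_id> [user_id]")
    repo_path = argv[0]
    try:
        repo_id = int(argv[1])
        # Fall back to the documented default when user_id is omitted.
        user_id = int(argv[2]) if len(argv) == 3 else DEFAULT_USER_ID
    except ValueError:
        raise ValueError("repo_id and user_id must be numeric") from None
    return repo_path, repo_id, user_id
```

Validating that both identifiers are numeric up front matches the expectation that they are stored as numeric values in the database.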
- Docker Build: Builds the analysis container with Rust toolchain, verus-analyzer, and SCIP tools
- SCIP Analysis:
  - Runs verus-analyzer scip on the target repository
  - Converts SCIP output to JSON format using the write_atoms binary
  - Generates timestamped log files in logs/atomizer_{repo_id}_{timestamp}.log
- Database Population:
  - Parses the generated JSON file
  - Populates the database with code atoms (functions, files, folders)
  - Establishes dependency relationships between atoms
  - Associates all operations with the specified user_id
  - Generates timestamped log files in logs/populate_atoms_{timestamp}.log
- Log Aggregation:
  - Combines both Rust and Python logs into a single database entry
  - Stores comprehensive execution logs in the atomizerlogs table with repo_id and user_id
- JSON File: <repo_name>.json containing structured code analysis
- Log Files:
  - logs/atomizer_{repo_id}_{timestamp}.log: Rust processing logs
  - logs/populate_atoms_{timestamp}.log: Python processing logs
- Database Records: Code atoms and dependencies stored in MySQL tables:
  - atoms: Individual code elements with user_id association
  - atomsdependencies: Dependencies between elements in the atoms table
  - reposfolders: Folder structure
  - codes: File contents and metadata
  - atomizerlogs: Combined execution logs for debugging and auditing (includes repo_id and user_id)
The tool provides comprehensive logging at multiple levels:
- Real-time progress updates during execution
- Immediate feedback for debugging
- Rust Component: Timestamped logs with session metadata including repo_id and user_id
- Python Component: Detailed database operation logs
- Persistent Storage: All logs preserved for later analysis
- Combined Logs: Rust and Python logs merged into single database entries
- User Association: All log entries tagged with repo_id and user_id for tracking
- Structured Format: Clear separation between different execution phases
- Audit Trail: Complete record of all operations per repository and user
Each database log entry contains:
=== RUST ATOMIZER LOGS ===
[timestamp] [level] message
...
=== END RUST ATOMIZER LOGS ===
=== PYTHON POPULATE SCRIPT LOGS ===
timestamp - component - level - message
...
=== END PYTHON POPULATE SCRIPT LOGS ===
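The combined entry format above can be produced and taken apart again with a few lines of Python. This is a sketch under assumptions, not the tool's own aggregation code: combine_logs and split_logs are hypothetical names, and only the four marker lines are taken from the format shown.

```python
RUST_BEGIN = "=== RUST ATOMIZER LOGS ==="
RUST_END = "=== END RUST ATOMIZER LOGS ==="
PY_BEGIN = "=== PYTHON POPULATE SCRIPT LOGS ==="
PY_END = "=== END PYTHON POPULATE SCRIPT LOGS ==="


def combine_logs(rust_log, python_log):
    """Wrap both logs in their marker lines and join them into one entry."""
    parts = [RUST_BEGIN, rust_log.strip("\n"), RUST_END,
             PY_BEGIN, python_log.strip("\n"), PY_END]
    return "\n".join(parts)


def split_logs(combined):
    """Recover (rust_log, python_log) from a combined entry."""
    def between(begin, end):
        # Locate the text strictly between a begin marker and its end marker.
        start = combined.index(begin) + len(begin)
        stop = combined.index(end, start)
        return combined[start:stop].strip("\n")

    return between(RUST_BEGIN, RUST_END), between(PY_BEGIN, PY_END)
```

split_logs is useful when reading an entry back out of the atomizerlogs table and only one component's output is of interest.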
- Debian bookworm-slim base
- Rust toolchain (latest stable)
- verus-analyzer
- SCIP v0.5.2
- Python 3 with mysql-connector-python
- chrono crate for Rust timestamping
Expects the MySQL tables atoms, codes, reposfolders, and atomizerlogs, each with a specific schema for code analysis storage. All relevant tables should support a user_id field for user association.
- Ensure the target repository has valid Rust code
- Check that MySQL is running and accessible
- Verify the DB_PASSWORD environment variable is set
- For repositories without Cargo.toml, the tool will automatically create one for standalone .rs files
- Check log files in the logs/ directory for detailed error information
- Review the atomizerlogs table for complete execution history filtered by repo_id and user_id
- Enable debug mode by setting the DEBUG=true environment variable for verbose logging
- If using a custom user_id, ensure it's a valid numeric identifier in your system
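When checking the logs/ directory, it often helps to jump straight to the most recent file for a given run. The sketch below is a hypothetical helper, not something shipped with the tool; it relies only on the file name patterns listed in the project structure (atomizer_*_*.log and populate_atoms_*.log) and picks the newest file by modification time.

```python
from pathlib import Path


def latest_log(log_dir, pattern):
    """Return the most recently modified file matching pattern, or None."""
    candidates = list(Path(log_dir).glob(pattern))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


def tail(path, n=20):
    """Return the last n lines of a log file as a list of strings."""
    lines = Path(path).read_text(errors="replace").splitlines()
    return lines[-n:]
```

For example, latest_log("logs", "atomizer_123_*.log") would select the newest Rust atomizer log for repository 123, and tail(...) on the result shows where the run stopped.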