A comprehensive framework for generating spatial reasoning traces using Large Language Models (LLMs) and computer vision tools. This system enables step-by-step spatial reasoning with integrated verification, experiment management, and detailed analysis capabilities.
Spatial Trace provides a complete pipeline for spatial reasoning research:
- Trace Generation: Step-by-step spatial reasoning using LLMs and computer vision tools (SAM2, DAV2, TRELLIS)
- Verification System: Built-in trace verification with configurable strictness levels
- Evaluation Framework: Comprehensive evaluation on datasets like CLEVR with automated grading
- Experiment Management: Organized experiment structure with result tracking and analysis
- Tool Analysis: Detailed analysis of tool usage patterns and distributions
- Visualization: Rich visualizations for tool usage, accuracy metrics, and reasoning patterns
- Multi-Tool Integration: SAM2 (segmentation), DAV2 (depth estimation), TRELLIS (3D generation)
- LLM Interface: Structured communication with OpenAI's GPT models
- Reasoning Traces: Generation of interpretable step-by-step spatial reasoning processes
- Verification System: Configurable verification thresholds (τ = 4.0, τ = 5.0) for quality control; a minimal sketch of the loop follows this list
- Automated Evaluation: Processing of large datasets with accuracy tracking
- Tool Usage Analysis: Comprehensive analysis of which tools are used when and where
- Quality Assessment: Automated grading and quality metrics
- Modular Architecture: Clean separation of concerns across components
- Extensible Design: Easy to add new tools and reasoning capabilities
- CLI Interface: Command-line tools for batch processing and analysis
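To make the generate-and-verify loop concrete, here is a minimal Python sketch. All names here (`propose_step`, `rate_step`, `generate_trace`) are illustrative assumptions rather than the repository's actual API; the point is only the control flow: each proposed reasoning step is rated by a verifier model and accepted only if its rating meets the threshold τ.

```python
# Minimal sketch of threshold-gated trace generation.
# Names and stub logic are illustrative assumptions, not the actual API.

def propose_step(question: str, history: list[str]) -> str:
    """Placeholder for the generator LLM proposing the next reasoning step."""
    return f"step {len(history) + 1} toward answering: {question}"

def rate_step(question: str, history: list[str], step: str) -> float:
    """Placeholder for the verifier LLM rating a step on a 1-5 scale."""
    return 5.0

def generate_trace(question: str, tau: float = 4.0,
                   max_steps: int = 10, max_retries: int = 3) -> list[str]:
    """Build a trace, accepting each step only if the verifier rates it >= tau.

    tau = 0 corresponds to the no-verification baseline: every step passes.
    """
    trace: list[str] = []
    for _ in range(max_steps):
        for _ in range(max_retries):
            step = propose_step(question, trace)
            if tau == 0 or rate_step(question, trace, step) >= tau:
                trace.append(step)  # accept the step and move on
                break               # otherwise retry up to max_retries times
    return trace

if __name__ == "__main__":
    print(generate_trace("Is the red cube left of the sphere?", tau=4.0))
```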
1. Clone the Repository

```bash
git clone https://github.com/your-repo/SpatialTraceGen.git  # Replace with your repo URL
cd SpatialTraceGen
```

2. Create and Activate Conda Environment

All tool execution and generation scripts are managed through a single, unified Conda environment.

```bash
# Create the environment. The name 'models' is recommended for consistency.
conda create -n models python=3.10
conda activate models
```

3. Install Dependencies

Install the necessary Python packages from the requirements file.

```bash
# Ensure you are in the root directory of the repository
pip install -r requirements.txt
```

4. Set Environment Variables

The framework requires an OpenAI API key for the Generator and Verifier models. Export it as an environment variable:

```bash
export OPENAI_API_KEY='your-key-here'
```

You will also need to ensure the paths to the external vision tools (SAM2, DAV2, TRELLIS) are correctly configured in the respective tool implementation files (`spatial_trace/spatial_trace/tools/*.py`).
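For illustration, this kind of edit usually amounts to pointing a path constant at your local model weights. The file name and variable below are assumptions, not the repository's actual code:

```python
# Hypothetical excerpt from spatial_trace/spatial_trace/tools/sam2_tool.py.
# Edit the equivalent path constant in your checkout to point at your
# locally downloaded SAM2 checkpoint.
SAM2_CHECKPOINT_PATH = "/path/to/checkpoints/sam2_checkpoint.pt"
```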
The reasoning traces generated by this framework are stored in `spatial_trace/spatial_trace/evaluation/experiments/`. Each subdirectory corresponds to a specific experimental condition described in our paper:
- `clevr_human_traces_WITHOUT_verification_large/`: Corresponds to the No Verification baseline (τ=0) condition in the paper
- `clevr_human_traces_WITH_basic_verification_large/`: Corresponds to the Basic Verification (τ=4.0) condition in the paper
- `clevr_human_traces_WITH_strict_verification_large/`: Corresponds to the Strict Verification (τ=5.0) condition in the paper
This section provides the exact commands and locations needed to reproduce the findings in our paper.
Run the following commands from the repository's root directory.
1. No Verification (Baseline, τ=0)

```bash
python -m spatial_trace.evaluation.quality_generator \
    --dataset clevr_human_subset \
    --experiment clevr_human_traces_WITHOUT_verification_large \
    --max_samples 30 \
    --no-verification
```

2. Basic Verification (τ=4.0)

```bash
python -m spatial_trace.evaluation.quality_generator \
    --dataset clevr_human_subset \
    --experiment clevr_human_traces_WITH_basic_verification_large \
    --max_samples 30 \
    --min_rating 4.0
```

3. Strict Verification (τ=5.0)

```bash
python -m spatial_trace.evaluation.quality_generator \
    --dataset clevr_human_subset \
    --experiment clevr_human_traces_WITH_strict_verification_large \
    --max_samples 30 \
    --min_rating 5.0
```

To calculate the final answer accuracy for each condition as reported in our paper, run the provided evaluation scripts from the repository's root directory.
- For the no-verification condition (τ=0): `python baseline.py`
- For the verification conditions (τ=4.0 and τ=5.0): `python main.py`

Note: You may need to modify the minimum rating parameter in `main.py` to match the verification condition you are evaluating.
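As a purely illustrative sketch of that change (the variable name and its location are assumptions, not the actual contents of `main.py`):

```python
# main.py (hypothetical excerpt)
# Set to 4.0 when evaluating the Basic Verification condition,
# or 5.0 for the Strict Verification condition.
MIN_RATING = 4.0
```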
