ScenarioMax

A high-performance toolkit for autonomous vehicle scenario-based testing and dataset conversion

ScenarioMax is an extension to ScenarioNet that transforms various autonomous driving datasets into standardized formats. Like ScenarioNet, it first converts different datasets (Waymo, nuPlan, nuScenes) to a unified pickle format. ScenarioMax then extends this process with additional pipelines to convert this unified data into formats compatible with Waymax, V-Max, and GPUDrive.

🚀 Key Features

Multi-Dataset Support: Unified interface for Waymo Open Motion Dataset, nuScenes, nuPlan, and OpenScenes
Flexible Output Formats: Convert to TFExample (Waymax/V-Max), JSON (GPUDrive), or unified pickle format
High Performance: Parallel processing with memory optimization and progress monitoring
Two-Stage Architecture: Raw → Unified → Target format pipeline for maximum flexibility
Enhanced Scenarios: Optional scenario enhancement with customizable processing steps

🛠️ Installation

Prerequisites

Python 3.10
uv for fast dependency management
Access to at least one supported dataset (Waymo, nuPlan, or nuScenes)
Sufficient disk space for dataset processing

Basic Installation

# Clone the repository
git clone https://github.com/valeoai/ScenarioMax.git
cd ScenarioMax

# Create and activate virtual environment
uv venv -p 3.10
source .venv/bin/activate

# Install ScenarioMax with dataset support
make womd          # Waymo Open Motion Dataset
make nuplan        # nuPlan dataset
make nuscenes      # nuScenes dataset
make all           # All datasets
make dev           # Development environment

Manual Installation

# For specific datasets
uv pip install -e ".[womd]"      # Waymo support
uv pip install -e ".[nuplan]"    # nuPlan support
uv pip install -e ".[nuscenes]"  # nuScenes support
uv pip install -e ".[dev]"       # Development tools
uv pip install -e ".[all]"       # All datasets support

Environment Setup

For the nuPlan dataset, set the required environment variables:

export NUPLAN_MAPS_ROOT=/path/to/nuplan/maps
export NUPLAN_DATA_ROOT=/path/to/nuplan/data
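
A missing variable may only surface once a conversion is already running, so a quick preflight check can help. The sketch below is not part of ScenarioMax: `missing_nuplan_env` is a hypothetical helper that only checks that the two variables above are set and non-empty. It does not validate the paths themselves.

```python
import os

# The two variables named in the Environment Setup section.
REQUIRED_NUPLAN_VARS = ("NUPLAN_MAPS_ROOT", "NUPLAN_DATA_ROOT")


def missing_nuplan_env(env=None):
    """Return the required nuPlan variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_NUPLAN_VARS if not env.get(name)]


# Report which variables (if any) still need to be exported.
missing = missing_nuplan_env()
print("missing:", ", ".join(missing) if missing else "none")
```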

🚀 Quick Start

Basic Dataset Conversion

# Convert Waymo dataset to TFRecord format
scenariomax-convert \
  --waymo_src /path/to/waymo/data \
  --dst /path/to/output \
  --target_format tfexample \
  --num_workers 8

# Convert nuScenes to GPUDrive format
scenariomax-convert \
  --nuscenes_src /path/to/nuscenes \
  --dst /path/to/output \
  --target_format gpudrive

# Multi-dataset conversion
scenariomax-convert \
  --waymo_src /data/waymo \
  --nuscenes_src /data/nuscenes \
  --dst /output \
  --target_format tfexample

📊 Usage Examples

Use Case 1: Raw Data to Pickle (Unified Format)

# Create unified format for later processing
scenariomax-convert \
  --waymo_src /data/waymo \
  --dst /unified_output \
  --target_format pickle \
  --num_workers 8

Use Case 2: Enhanced Processing Pipeline

# Raw → Enhanced → TFRecord in a single run
scenariomax-convert \
  --waymo_src /data/waymo \
  --dst /output \
  --target_format tfexample \
  --enable_enhancement \
  --num_workers 8

Use Case 3: Batch Processing with Multiple Datasets

# Process multiple datasets with sharding
scenariomax-convert \
  --waymo_src /data/waymo \
  --nuplan_src /data/nuplan \
  --nuscenes_src /data/nuscenes \
  --dst /output \
  --target_format tfexample \
  --shard 1000 \
  --num_workers 16

Use Case 4: Two-Stage Processing

# Stage 1: Raw → Pickle
scenariomax-convert \
  --waymo_src /data/waymo \
  --dst /intermediate \
  --target_format pickle

# Stage 2: Pickle → Enhanced → TFRecord
scenariomax-convert \
  --pickle_src /intermediate \
  --dst /final_output \
  --target_format tfexample \
  --enable_enhancement
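
When the two stages are scripted, the Stage 1 `--dst` and the Stage 2 `--pickle_src` must point to the same directory. The sketch below builds both argument lists from the flags shown above. `two_stage_commands` is a hypothetical helper, not part of ScenarioMax, and it only prints the commands rather than running them.

```python
import shlex


def two_stage_commands(waymo_src, intermediate, final_dst, enhance=True):
    """Build the Stage 1 and Stage 2 argument lists shown above."""
    stage1 = [
        "scenariomax-convert",
        "--waymo_src", waymo_src,
        "--dst", intermediate,
        "--target_format", "pickle",
    ]
    stage2 = [
        "scenariomax-convert",
        "--pickle_src", intermediate,  # same directory Stage 1 wrote to
        "--dst", final_dst,
        "--target_format", "tfexample",
    ]
    if enhance:
        stage2.append("--enable_enhancement")
    return stage1, stage2


for cmd in two_stage_commands("/data/waymo", "/intermediate", "/final_output"):
    # To execute instead of print: subprocess.run(cmd, check=True)
    print(shlex.join(cmd))
```

Keeping each command as a list avoids shell quoting problems when dataset paths contain spaces.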

🗂️ Supported Datasets

Dataset	Version	Link	Status
Waymo Open Motion Dataset	v1.3.0	Site	✅ Full Support
nuPlan	v1.1	Site	✅ Full Support
nuScenes	v1.0	Site	🚧 WIP
Argoverse	v2.0	Site	🚧 WIP

Dataset-Specific Options

# nuScenes with specific split
scenariomax-convert \
  --nuscenes_src /data/nuscenes \
  --split v1.0-trainval \
  --dst /output \
  --target_format tfexample

# nuPlan with direct log parsing
scenariomax-convert \
  --nuplan_src /data/nuplan \
  --nuplan_direct_from_logs \
  --dst /output \
  --target_format gpudrive

📤 Output Formats

TFRecord (TensorFlow/Waymax)

--target_format tfexample

Use Case: Training neural networks with Waymax/V-Max
Output: training.tfrecord files (name set by --tfrecord_name), with optional sharding via --shard

GPUDrive JSON

--target_format gpudrive

Use Case: GPU-accelerated simulation and training
Output: JSON files compatible with GPUDrive simulator

Unified Pickle Format

--target_format pickle

Use Case: Intermediate format for custom processing
Features: Full scenario data preservation, Python-native
Output: .pkl files with complete scenario information
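
For custom processing, a first step is often just to look at what a converted file contains. The sketch below assumes each `.pkl` file holds one pickled Python object, typically a dictionary. `summarize_pickles` is a hypothetical helper, and the keys in the demo file are invented for illustration, so they do not describe the actual ScenarioMax schema.

```python
import pickle
import tempfile
from pathlib import Path


def summarize_pickles(directory):
    """Map each .pkl file name to the type and top-level keys of its content."""
    summary = {}
    for path in sorted(Path(directory).glob("*.pkl")):
        with path.open("rb") as handle:
            obj = pickle.load(handle)  # only unpickle files you trust
        keys = sorted(map(str, obj)) if isinstance(obj, dict) else []
        summary[path.name] = (type(obj).__name__, keys)
    return summary


# Demo on a stand-in file; these keys are placeholders, not the real schema.
with tempfile.TemporaryDirectory() as tmp:
    demo = {"id": "demo-scenario", "tracks": {}, "map_features": {}}
    (Path(tmp) / "demo.pkl").write_bytes(pickle.dumps(demo))
    print(summarize_pickles(tmp))
```

Point `summarize_pickles` at the `--dst` directory of a pickle conversion to see the real top-level structure before writing downstream code against it.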

🏗️ Architecture

ScenarioMax uses a two-stage pipeline architecture, with an optional enhancement step applied to the unified format:

Raw Data → Unified Format → Target Format
    ↓            ↓              ↓
[Dataset]   [Enhancement]  [ML Ready]

Pipeline Stages

Raw to Unified: Dataset-specific parsers convert native formats to standardized Python dictionaries
Enhancement (Optional): Apply transformations, filtering, or augmentation
Unified to Target: Convert to training-ready formats (TFRecord, JSON, etc.)
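
The stages above can be sketched as a chain of plain functions. This is a minimal illustration of the raw → unified → target flow, not the code in pipeline.py. `run_pipeline` and the toy stage functions are hypothetical, and treating enhancement as a step that may drop a scenario is an assumption.

```python
from typing import Callable, Iterable, Iterator, Optional

Scenario = dict  # stand-in for the unified, dictionary-based representation


def run_pipeline(
    raw_items: Iterable[object],
    to_unified: Callable[[object], Scenario],
    to_target: Callable[[Scenario], object],
    enhance: Optional[Callable[[Scenario], Optional[Scenario]]] = None,
) -> Iterator[object]:
    """Lazily apply raw -> unified -> (optional enhancement) -> target."""
    for raw in raw_items:
        scenario = to_unified(raw)
        if enhance is not None:
            scenario = enhance(scenario)
            if scenario is None:  # the enhancement step filtered this one out
                continue
        yield to_target(scenario)


# Toy stages, invented for illustration only.
def parse(raw):
    return {"id": raw, "num_agents": len(raw)}


def drop_small(scenario):
    return scenario if scenario["num_agents"] >= 3 else None


def export(scenario):
    return f"{scenario['id']}:{scenario['num_agents']}"


print(list(run_pipeline(["ab", "abcd"], parse, export, enhance=drop_small)))
```

Because every stage sees only the unified dictionary, adding a new source dataset or a new target format means writing one function rather than one converter per dataset-format pair.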

Key Components

pipeline.py: Main orchestrator with multi-dataset support
dataset_registry.py: Dynamic dataset configuration system
raw_to_unified/: Dataset-specific extractors and converters
unified_to_*/: Target format converters
core/write.py: Parallel processing with memory management
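
To give a sense of what a dynamic dataset registry can look like, here is a minimal sketch. It is an assumption about the general pattern, not the contents of dataset_registry.py. `DatasetConfig`, `register`, and `enabled_datasets` are hypothetical names; only the `waymo_src` and `nuscenes_src` flag names come from the CLI examples in this README.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class DatasetConfig:
    name: str
    src_flag: str                    # CLI flag that supplies the source path
    convert: Callable[[str], dict]   # raw item -> unified dictionary


_REGISTRY: Dict[str, DatasetConfig] = {}


def register(config: DatasetConfig) -> None:
    """Add a dataset, refusing silent overwrites of an existing name."""
    if config.name in _REGISTRY:
        raise ValueError(f"dataset already registered: {config.name}")
    _REGISTRY[config.name] = config


def enabled_datasets(cli_args: Dict[str, str]) -> List[DatasetConfig]:
    """Return configs whose source flag was given a non-empty value."""
    return [c for c in _REGISTRY.values() if cli_args.get(c.src_flag)]


# Placeholder converters; real ones would parse the dataset's native format.
register(DatasetConfig("waymo", "waymo_src", lambda raw: {"source": "waymo", "raw": raw}))
register(DatasetConfig("nuscenes", "nuscenes_src", lambda raw: {"source": "nuscenes", "raw": raw}))

print([c.name for c in enabled_datasets({"waymo_src": "/data/waymo"})])
```

With this pattern, a multi-dataset run amounts to iterating over whichever datasets had a source path supplied.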

🔧 Configuration

Command Line Options

# Processing options
--num_workers 8              # Parallel workers (default: 8)
--shard 1000                 # Output sharding
--num_files 100              # Limit files processed
--enable_enhancement         # Enable scenario enhancement

# Dataset options
--split v1.0-trainval        # nuScenes data split
--nuplan_direct_from_logs    # Alternative nuPlan parsing

# Output options
--tfrecord_name training     # TFRecord filename
--log_level INFO             # Logging verbosity
--log_file /path/to/log      # Log file location

📚 Additional Resources

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.
