
ScenarioMax

Python 3.10+ | License: MIT

A high-performance toolkit for autonomous vehicle scenario-based testing and dataset conversion

ScenarioMax is an extension to ScenarioNet that transforms various autonomous driving datasets into standardized formats. Like ScenarioNet, it first converts different datasets (Waymo, nuPlan, nuScenes) to a unified pickle format. ScenarioMax then extends this process with additional pipelines to convert this unified data into formats compatible with Waymax, V-Max, and GPUDrive.

(Scheme: pipeline overview diagram)

🚀 Key Features

  • Multi-Dataset Support: Unified interface for Waymo Open Motion Dataset, nuScenes, nuPlan, and OpenScenes
  • Flexible Output Formats: Convert to TFExample (Waymax/V-Max), JSON (GPUDrive), or unified pickle format
  • High Performance: Parallel processing with memory optimization and progress monitoring
  • Two-Stage Architecture: Raw → Unified → Target format pipeline for maximum flexibility
  • Enhanced Scenarios: Optional scenario enhancement with customizable processing steps

📋 Table of Contents

  • 🛠️ Installation
  • 🚀 Quick Start
  • 📊 Usage Examples
  • 🗂️ Supported Datasets
  • 📀 Output Formats
  • 🏗️ Architecture
  • 🔧 Configuration
  • 📚 Additional Resources
  • 📄 License

πŸ› οΈ Installation

Prerequisites

  • Python 3.10
  • uv for fast dependency management
  • Access to at least one supported dataset (Waymo, nuPlan, or nuScenes)
  • Sufficient disk space for dataset processing

Basic Installation

# Clone the repository
git clone https://github.com/valeoai/ScenarioMax.git
cd ScenarioMax

# Create and activate virtual environment
uv venv -p 3.10
source .venv/bin/activate

# Install ScenarioMax with dataset support
make womd          # Waymo Open Motion Dataset
make nuplan        # nuPlan dataset
make nuscenes      # nuScenes dataset
make all           # All datasets
make dev           # Development environment

Manual Installation

# For specific datasets
uv pip install -e ".[womd]"      # Waymo support
uv pip install -e ".[nuplan]"    # nuPlan support
uv pip install -e ".[nuscenes]"  # nuScenes support
uv pip install -e ".[dev]"       # Development tools
uv pip install -e ".[all]"       # All datasets support

Environment Setup

For the nuPlan dataset, set the required environment variables:

export NUPLAN_MAPS_ROOT=/path/to/nuplan/maps
export NUPLAN_DATA_ROOT=/path/to/nuplan/data

🚀 Quick Start

Basic Dataset Conversion

# Convert Waymo dataset to TFRecord format
scenariomax-convert \
  --waymo_src /path/to/waymo/data \
  --dst /path/to/output \
  --target_format tfexample \
  --num_workers 8

# Convert nuScenes to GPUDrive format
scenariomax-convert \
  --nuscenes_src /path/to/nuscenes \
  --dst /path/to/output \
  --target_format gpudrive

# Multi-dataset conversion
scenariomax-convert \
  --waymo_src /data/waymo \
  --nuscenes_src /data/nuscenes \
  --dst /output \
  --target_format tfexample

📊 Usage Examples

Use Case 1: Raw Data to Pickle (Unified Format)

# Create unified format for later processing
scenariomax-convert \
  --waymo_src /data/waymo \
  --dst /unified_output \
  --target_format pickle \
  --num_workers 8

Use Case 2: Enhanced Processing Pipeline

# Raw → Enhanced → TFRecord with scenario enhancement
scenariomax-convert \
  --waymo_src /data/waymo \
  --dst /output \
  --target_format tfexample \
  --enable_enhancement \
  --num_workers 8

Use Case 3: Batch Processing with Multiple Datasets

# Process multiple datasets with sharding
scenariomax-convert \
  --waymo_src /data/waymo \
  --nuplan_src /data/nuplan \
  --nuscenes_src /data/nuscenes \
  --dst /output \
  --target_format tfexample \
  --shard 1000 \
  --num_workers 16

Use Case 4: Two-Stage Processing

# Stage 1: Raw → Pickle
scenariomax-convert \
  --waymo_src /data/waymo \
  --dst /intermediate \
  --target_format pickle

# Stage 2: Pickle → Enhanced → TFRecord
scenariomax-convert \
  --pickle_src /intermediate \
  --dst /final_output \
  --target_format tfexample \
  --enable_enhancement

πŸ—‚οΈ Supported Datasets

Dataset                     Version   Link   Status
Waymo Open Motion Dataset   v1.3.0    Site   ✅ Full Support
nuPlan                      v1.1      Site   ✅ Full Support
nuScenes                    v1.0      Site   🚧 WIP
Argoverse                   v2.0      Site   🚧 WIP

Dataset-Specific Options

# nuScenes with specific split
scenariomax-convert \
  --nuscenes_src /data/nuscenes \
  --split v1.0-trainval \
  --dst /output \
  --target_format tfexample

# nuPlan with direct log parsing
scenariomax-convert \
  --nuplan_src /data/nuplan \
  --nuplan_direct_from_logs \
  --dst /output \
  --target_format gpudrive

📀 Output Formats

TFRecord (TensorFlow/Waymax)

--target_format tfexample
  • Use Case: Training neural networks with Waymax/V-Max
  • Output: training.tfrecord files with sharding support
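
To sanity-check the generated shards, they can be read back with TensorFlow's tf.data API. This is a minimal sketch: the glob pattern assumes the default --tfrecord_name training, and the feature keys inside each tf.train.Example follow the Waymax/V-Max schema, which is not spelled out here.

import tensorflow as tf

# Glob the shards produced with --target_format tfexample; the filename
# pattern assumes the default --tfrecord_name "training".
files = tf.io.gfile.glob("/path/to/output/training*.tfrecord*")
dataset = tf.data.TFRecordDataset(files)

# Parse the first record to confirm it is a valid tf.train.Example.
for raw_record in dataset.take(1):
    example = tf.train.Example()
    example.ParseFromString(raw_record.numpy())
    print("features per record:", len(example.features.feature))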

GPUDrive JSON

--target_format gpudrive
  • Use Case: GPU-accelerated simulation and training
  • Output: JSON files compatible with GPUDrive simulator
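
For a quick look at the converted scenarios, the files can be opened with the standard json module. A minimal sketch; the output directory layout is an assumption, and the contents of each file follow the GPUDrive scenario schema.

import json
from pathlib import Path

# Load one converted scenario; the output layout here is illustrative.
scenario_file = next(Path("/path/to/output").glob("*.json"))
with open(scenario_file) as f:
    scenario = json.load(f)

print(scenario_file.name, "->", type(scenario).__name__)
if isinstance(scenario, dict):
    print("top-level keys:", sorted(scenario)[:10])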

Unified Pickle Format

--target_format pickle
  • Use Case: Intermediate format for custom processing
  • Features: Full scenario data preservation, Python-native
  • Output: .pkl files with complete scenario information
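
Because the pickle output is plain Python, it can be loaded directly for custom processing. A minimal sketch, assuming scenarios are written as individual .pkl files somewhere under the destination directory (the dictionary keys follow the unified, ScenarioNet-style description):

import pickle
from pathlib import Path

# Open one unified scenario; the directory layout is an assumption.
pkl_file = next(Path("/unified_output").rglob("*.pkl"))
with open(pkl_file, "rb") as f:
    scenario = pickle.load(f)

print(pkl_file.name, "->", type(scenario).__name__)
if isinstance(scenario, dict):
    print("top-level keys:", list(scenario)[:10])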

πŸ—οΈ Architecture

ScenarioMax uses a two-stage pipeline architecture:

Raw Data → Unified Format → Target Format
    ↓            ↓              ↓
[Dataset]   [Enhancement]  [ML Ready]

Pipeline Stages

  1. Raw to Unified: Dataset-specific parsers convert native formats to standardized Python dictionaries
  2. Enhancement (Optional): Apply transformations, filtering, or augmentation (a standalone sketch follows this list)
  3. Unified to Target: Convert to training-ready formats (TFRecord, JSON, etc.)
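
To make the optional enhancement stage concrete, here is a hypothetical standalone filter over the unified pickle output. It is not the built-in --enable_enhancement step; the directory layout and the "length" key (number of recorded timesteps) are illustrative assumptions, so check them against the unified dictionaries your version produces.

import pickle
from pathlib import Path

MIN_TIMESTEPS = 50               # illustrative threshold

src = Path("/intermediate")      # pickle output from stage 1
dst = Path("/enhanced")          # filtered copy for the final conversion
dst.mkdir(parents=True, exist_ok=True)

for pkl_file in src.rglob("*.pkl"):
    with open(pkl_file, "rb") as f:
        scenario = pickle.load(f)
    # Drop short scenarios; "length" is an assumed key name.
    if scenario.get("length", 0) < MIN_TIMESTEPS:
        continue
    with open(dst / pkl_file.name, "wb") as f:
        pickle.dump(scenario, f)

The filtered directory can then be fed to the final conversion through --pickle_src, exactly as in Use Case 4.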

Key Components

  • pipeline.py: Main orchestrator with multi-dataset support
  • dataset_registry.py: Dynamic dataset configuration system
  • raw_to_unified/: Dataset-specific extractors and converters
  • unified_to_*/: Target format converters
  • core/write.py: Parallel processing with memory management

🔧 Configuration

Command Line Options

# Processing options
--num_workers 8              # Parallel workers (default: 8)
--shard 1000                 # Output sharding
--num_files 100              # Limit files processed
--enable_enhancement         # Enable scenario enhancement

# Dataset options
--split v1.0-trainval        # nuScenes data split
--nuplan_direct_from_logs    # Alternative nuPlan parsing

# Output options
--tfrecord_name training     # TFRecord filename
--log_level INFO             # Logging verbosity
--log_file /path/to/log      # Log file location

📚 Additional Resources

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.
