Official implementation of VLMLight: Traffic Signal Control via Vision-Language Meta-Control and Dual-Branch Reasoning.
- [June 2025] Codebase open-sourced.
- [May 2025] Initial VLMLight preprint released on arXiv.
VLMLight presents a novel vision-language multimodal framework for adaptive traffic signal control, featuring:
- The first vision-based traffic signal control system to use visual foundation models for scene understanding;
- A dual-branch architecture combining fast RL policies with deliberative LLM reasoning;
- Enhanced handling of safety-critical scenarios through multi-agent collaboration.
First multi-view visual traffic simulator enabling context-aware decision making:
| BEV | North | East | South | West |
|---|---|---|---|---|
| *(image)* | *(image)* | *(image)* | *(image)* | *(image)* |
- Fast RL Policy: Efficient handling of routine traffic
- Deliberative Reasoning: Structured analysis for complex scenarios
- Meta-Controller: Dynamic branch selection based on real-time context
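As a rough illustration of how these pieces fit together (a hypothetical sketch, not code from this repository), the meta-controller can be viewed as a dispatcher between the two branches:

```python
# Hypothetical sketch of the dual-branch meta-control idea described above.
# `scene`, `is_routine`, `rl_policy`, and `llm_reasoner` are illustrative
# placeholders, not names from this repository.
def select_action(scene, is_routine, rl_policy, llm_reasoner):
    """Route a traffic scene to the fast or deliberative branch."""
    if is_routine(scene):          # e.g., normal flow, no emergency vehicles
        return rl_policy(scene)    # fast branch: learned RL policy
    return llm_reasoner(scene)     # slow branch: structured LLM reasoning
```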
A specialized pipeline prioritizes emergency vehicles in safety-critical scenarios.

Deliberative reasoning policy handling complex traffic in Massy.
- Install TransSimHub:

```bash
git clone https://github.com/Traffic-Alpha/TransSimHub.git
cd TransSimHub
pip install -e ".[all]"
```
- Install Qwen-Agent:

```bash
pip install -U "qwen-agent[gui,rag,code_interpreter,mcp]"
# Or use `pip install -U qwen-agent` for the minimal requirements.
# The optional extras, specified in brackets, are:
#   [gui] for Gradio-based GUI support;
#   [rag] for RAG support;
#   [code_interpreter] for Code Interpreter support;
#   [mcp] for MCP support.
```
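To verify the Qwen-Agent installation, you can run a quick smoke test with its `Assistant` class (a minimal sketch; the endpoint, model, and API key below are placeholders you should point at your own deployment):

```python
# Minimal Qwen-Agent smoke test: create an Assistant against a local
# OpenAI-compatible endpoint and print the final reply. The endpoint,
# model, and api_key are placeholders for your own deployment.
from qwen_agent.agents import Assistant

llm_cfg = {
    'model': 'Qwen/Qwen2.5-72B-Instruct-AWQ',
    'model_type': 'oai',
    'model_server': 'http://localhost:5070/v1',
    'api_key': 'token-abc123',
}
bot = Assistant(llm=llm_cfg)
messages = [{'role': 'user', 'content': 'Hello!'}]
for responses in bot.run(messages=messages):  # streams partial responses
    pass
print(responses[-1]['content'])
```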
VLMLight provides both English and Chinese implementations. The examples below use the English version; for the Chinese version, simply replace `vlm_tsc_en` with `vlm_tsc_zh` in all paths and commands.
Configure your LLM/VLM endpoints in `vlm_tsc_en/vlmlight_decision.py`:
```python
# Language model (text responses)
llm_cfg = {
    'model': 'Qwen/Qwen2.5-72B-Instruct-AWQ',
    'model_type': 'oai',
    'model_server': 'http://localhost:5070/v1',
    'api_key': 'token-abc123',
    'generate_cfg': {
        'top_p': 0.8,
    },
}

# Language model (JSON-formatted responses)
llm_cfg_json = {
    'model': 'Qwen/Qwen2.5-72B-Instruct-AWQ',
    'model_type': 'oai',
    'model_server': 'http://localhost:5070/v1',
    'api_key': 'token-abc123',
    'generate_cfg': {
        'top_p': 0.8,
        'response_format': {"type": "json_object"},
    },
}

# Vision-language model (multi-view scene understanding)
vlm_cfg = {
    'model': 'Qwen/Qwen2.5-VL-32B-Instruct-AWQ',
    'model_type': 'qwenvl_oai',
    'model_server': 'http://localhost:5030/v1',
    'api_key': 'token-abc123',
    'generate_cfg': {
        'top_p': 0.8,
    },
}
```
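The `model_server` values above point at an OpenAI-compatible API (e.g., one served locally by vLLM). As an optional sanity check before running the pipeline (a minimal sketch; the endpoint, model name, and key must match your own deployment), you can query the language-model endpoint directly:

```python
# Optional connectivity check for the OpenAI-compatible LLM endpoint.
# Requires the `openai` package; the values mirror the config above and
# must match your actual deployment.
from openai import OpenAI

client = OpenAI(base_url='http://localhost:5070/v1', api_key='token-abc123')
resp = client.chat.completions.create(
    model='Qwen/Qwen2.5-72B-Instruct-AWQ',
    messages=[{'role': 'user', 'content': 'Reply with OK if you are up.'}],
)
print(resp.choices[0].message.content)
```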
Train RL policies for baseline control:
```bash
cd rl_tsc
python train_rl_tsc.py
```
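To evaluate a trained policy, the repository also provides `rl_tsc/eval_rl_tsc.py` (see the project structure below).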
Pretrained models are available in `rl_tsc/results`:
| Hongkong YMT | France Massy | SouthKorea Songdo |
|---|---|---|
| *(image)* | *(image)* | *(image)* |
Execute the decision pipeline:
```bash
cd vlm_tsc_en
python vlmlight_decision.py
```
```
.
├── assets/                      # Visual assets for documentation
├── result_analysis/             # Trip information analysis tools
│   └── analysis_tripinfo.py     # Performance metric calculation
├── rl_tsc/                      # Reinforcement learning components
│   ├── _config.py               # RL training configuration
│   ├── eval_rl_tsc.py           # RL policy evaluation
│   ├── train_rl_tsc.py          # RL policy training
│   └── utils/                   # RL helper functions
├── sim_envs/                    # Traffic simulation scenarios
│   ├── France_Massy/            # Massy, France intersection
│   ├── Hongkong_YMT/            # YMT, Hong Kong intersection
│   └── SouthKorea_Songdo/       # Songdo, South Korea intersection
├── vlm_tsc_en/                  # English version implementation
│   ├── _config.py               # English agent configuration
│   ├── utils/                   # English processing utilities
│   └── vlmlight_decision.py     # English decision pipeline
└── vlm_tsc_zh/                  # Chinese version implementation
    ├── _config.py               # Chinese agent configuration
    ├── utils/                   # Chinese processing utilities
    └── vlmlight_decision.py     # Chinese decision pipeline
```
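For post-hoc evaluation, `result_analysis/analysis_tripinfo.py` computes performance metrics from simulation trip logs. As a rough illustration of this kind of computation (a hypothetical sketch, not the repository's actual code; it assumes SUMO-style `tripinfo` XML output, and the file path is an example):

```python
# Hypothetical sketch of trip-log metric computation; assumes SUMO-style
# tripinfo XML with `duration` and `waitingTime` attributes per trip.
import xml.etree.ElementTree as ET

def summarize_tripinfo(path: str) -> dict:
    trips = ET.parse(path).getroot().findall('tripinfo')
    n = len(trips)
    return {
        'num_trips': n,
        'avg_travel_time': sum(float(t.get('duration')) for t in trips) / n,
        'avg_waiting_time': sum(float(t.get('waitingTime')) for t in trips) / n,
    }

print(summarize_tripinfo('tripinfo.xml'))  # example path
```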
If you find this work useful, please cite our paper:
```bibtex
@article{wang2025vlmlight,
  title={VLMLight: Traffic Signal Control via Vision-Language Meta-Control and Dual-Branch Reasoning},
  author={Wang, Maonan and Chen, Yirong and Pang, Aoyu and Cai, Yuxin and Chen, Chung Shue and Kan, Yuheng and Pun, Man-On},
  journal={arXiv preprint arXiv:2505.19486},
  year={2025}
}
```
We thank our collaborators from SenseTime and Shanghai AI Lab (in alphabetical order) for their contributions to the TransSimHub simulator development:
- Yuheng Kan (阚宇衡)
- Zian Ma (马子安)
- Chengcheng Xu (徐承成)
If you have any questions, please open an issue in this repository. We will respond as soon as possible.