📃 Confidence-Guided Human-AI Collaboration: Reinforcement Learning with Distributional Proxy Value Propagation for Autonomous Driving
- This work introduces Distributional Proxy Value Propagation (D-PVP), which integrates human intention into distributional reinforcement learning, enabling efficient policy learning with minimal human intervention.
- A shared control mechanism and a policy confidence evaluation algorithm dynamically balance human-guided and self-learning policies, ensuring both safety and performance in autonomous driving.
- The proposed method is validated in both the MetaDrive simulator and real-world urban driving using a sensor-equipped unmanned ground vehicle (UGV). Extensive experiments demonstrate superior performance in terms of sample efficiency, safety, and generalization across diverse traffic scenarios.
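To make the idea of balancing the two policies by confidence more concrete, here is a minimal sketch. It is not the confidence evaluation algorithm from this work: the function name `select_action`, the scalar `confidence` score, and the fixed `threshold` are all hypothetical, and a simple threshold rule is assumed purely for illustration.

```python
from typing import Sequence, Tuple


def select_action(
    human_guided_action: Sequence[float],
    self_learning_action: Sequence[float],
    confidence: float,
    threshold: float = 0.5,
) -> Tuple[str, Sequence[float]]:
    """Pick one of two candidate actions from a confidence score.

    Assumption: `confidence` is a scalar in [0, 1] describing how much the
    self-learning policy is trusted in the current state. How that score is
    computed is not shown here.
    """
    if confidence >= threshold:
        # High confidence: let the self-learning policy act.
        return "self_learning", self_learning_action
    # Low confidence: fall back to the human-guided policy.
    return "human_guided", human_guided_action


if __name__ == "__main__":
    # Hypothetical [steering, throttle] actions proposed by each policy.
    action_h = [0.0, 0.3]
    action_s = [0.1, 0.8]
    print(select_action(action_h, action_s, confidence=0.9))
    print(select_action(action_h, action_s, confidence=0.2))
```

The sketch only shows the control flow of a confidence-gated switch; see the training code in the repository for how the two policies are actually combined.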
Email: lizeqiao@tju.edu.cn
Change to your workspace directory and clone the repository:
git clone https://github.com/lzqw/C-HAC.git
Create and activate a conda environment:
conda create -n CHAC python=3.9
conda activate CHAC
Install PyTorch, selecting the build that matches your CUDA version and device (CPU/GPU). The command below targets CUDA 11.8:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
# Install the requirements.
pip install -r requirements.txt
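After installation, it can help to confirm that PyTorch is importable and whether it sees a GPU. The snippet below is a small sketch for that check; the helper name `describe_torch_install` is hypothetical and not part of this repository. It uses only `torch.__version__` and `torch.cuda.is_available()`.

```python
import importlib.util


def describe_torch_install() -> dict:
    """Summarize the local PyTorch install without failing if it is missing."""
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed in the active environment.
        return {"installed": False, "version": None, "cuda_available": False}

    import torch

    return {
        "installed": True,
        "version": torch.__version__,
        # False on CPU-only builds or when no compatible GPU/driver is found.
        "cuda_available": bool(torch.cuda.is_available()),
    }


if __name__ == "__main__":
    print(describe_torch_install())
```

If `cuda_available` is `False` on a GPU machine, the installed wheel probably does not match the local CUDA setup, and reinstalling with a different index URL may be needed.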
Modify the sys path in the example_train file so that it points to your local copy of the repository, then run:
python train_dsact_pvp_rl.py
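For readers unfamiliar with the sys path step, the following sketch shows the general pattern of putting a local checkout on Python's import path. The exact line used in this repository's training script is not reproduced here; the helper name `add_repo_to_sys_path` and the example path are assumptions for illustration.

```python
import sys
from pathlib import Path


def add_repo_to_sys_path(repo_root: str) -> str:
    """Put `repo_root` at the front of sys.path if it is not already there."""
    root = str(Path(repo_root).expanduser().resolve())
    if root not in sys.path:
        # Inserting at index 0 makes local modules take priority on import.
        sys.path.insert(0, root)
    return root


if __name__ == "__main__":
    # Hypothetical location of the cloned repository; adjust to your workspace.
    print(add_repo_to_sys_path("~/workspace/C-HAC"))
```

Resolving the path first avoids duplicate entries when the same directory is given in different forms, such as relative and absolute.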