Uncertainty-Aware Rank-One MIMO Q Network Framework for Accelerated Offline Reinforcement Learning
PyTorch implementation of
Uncertainty-Aware Rank-One MIMO Q Network Framework for Accelerated Offline Reinforcement Learning
[Paper]
Offline reinforcement learning (RL) eliminates the need for interactive data collection, making it a safe and scalable alternative for real-world applications. However, it suffers from extrapolation errors caused by out-of-distribution (OOD) samples that are not covered by the static dataset.
This work proposes the Uncertainty-Aware Rank-One MIMO Q Network Framework, which addresses these challenges through:
- Uncertainty quantification over OOD samples using a single network
- Lower confidence bound (LCB) optimization to train robust policies
- A Rank-One Multi-Input Multi-Output (MIMO) architecture that mimics ensemble behavior at minimal computational cost (see the sketch below)
This framework significantly improves sample efficiency, robustness, and speed while outperforming prior methods on the D4RL benchmark.
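To make the architecture concrete, here is a minimal PyTorch sketch of how a rank-one MIMO critic can be assembled: shared weights perturbed per ensemble member by rank-one factors, with one Q-value head per member evaluated in a single forward pass. The module names (`RankOneLinear`, `RankOneMimoQNetwork`) and all hyperparameters are illustrative assumptions, not the repository's actual implementation.

```python
import torch
import torch.nn as nn


class RankOneLinear(nn.Module):
    """Linear layer shared across M ensemble members, perturbed per member by
    rank-one factors (BatchEnsemble-style): y_i = s_i * (W (r_i * x)) + b_i."""

    def __init__(self, in_dim, out_dim, num_members):
        super().__init__()
        self.shared = nn.Linear(in_dim, out_dim, bias=False)      # weights shared by all members
        self.r = nn.Parameter(torch.ones(num_members, in_dim))    # per-member input scaling
        self.s = nn.Parameter(torch.ones(num_members, out_dim))   # per-member output scaling
        self.bias = nn.Parameter(torch.zeros(num_members, out_dim))

    def forward(self, x):
        # x: (batch, num_members, in_dim)
        out = self.shared(x * self.r)          # one shared matmul, member-specific input perturbation
        return out * self.s + self.bias        # member-specific output perturbation and bias


class RankOneMimoQNetwork(nn.Module):
    """Critic that emulates an ensemble of M Q-functions in a single forward pass."""

    def __init__(self, state_dim, action_dim, hidden_dim=256, num_members=4):
        super().__init__()
        self.num_members = num_members
        self.net = nn.Sequential(
            RankOneLinear(state_dim + action_dim, hidden_dim, num_members), nn.ReLU(),
            RankOneLinear(hidden_dim, hidden_dim, num_members), nn.ReLU(),
            RankOneLinear(hidden_dim, 1, num_members),
        )

    def forward(self, state, action):
        # At evaluation the same (state, action) pair is tiled for every member;
        # during training a MIMO scheme may feed different transitions to different members.
        x = torch.cat([state, action], dim=-1)
        x = x.unsqueeze(1).expand(-1, self.num_members, -1)
        return self.net(x).squeeze(-1)         # (batch, num_members) Q-value estimates
```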
- Uncertainty-aware training objective based on the lower confidence bound of Q-values (illustrated after this list)
- Lightweight Rank-One MIMO Q-network for modeling aleatoric and epistemic uncertainty
- Plug-and-play architecture compatible with standard offline RL pipelines (e.g., CQL, TD3+BC)
- State-of-the-art performance on D4RL tasks with high computational efficiency
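The lower-confidence-bound objective listed above amounts to penalizing the mean Q-value by a multiple of its spread across ensemble members. The sketch below is illustrative only: the names `lcb_q_value` and `actor_loss` and the penalty coefficient `beta` are placeholders, not the repository's actual API.

```python
import torch


def lcb_q_value(q_values, beta=1.0):
    """Lower confidence bound over per-member Q estimates.

    q_values: tensor of shape (batch, num_members) produced by the MIMO critic.
    The spread across members serves as an epistemic-uncertainty penalty.
    """
    q_mean = q_values.mean(dim=1)
    q_std = q_values.std(dim=1)
    return q_mean - beta * q_std


def actor_loss(critic, actor, state, beta=1.0):
    """Example policy objective: maximize the conservative (LCB) Q estimate.

    `critic` is a RankOneMimoQNetwork-style module and `actor(state)` returns
    an action; both are placeholders for whatever the actual pipeline uses.
    """
    action = actor(state)
    q = critic(state, action)                  # (batch, num_members)
    return -lcb_q_value(q, beta).mean()        # ascend the lower confidence bound
```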
To install the dependencies:

```bash
git clone https://github.com/thanhkaist/mimo_Q_network.git
cd mimo_Q_network
pip install -r requirements.txt
```
If you use this codebase or refer to the method in your research, please cite:
```bibtex
@article{nguyen2024uncertainty,
  title={Uncertainty-Aware Rank-One MIMO Q Network Framework for Accelerated Offline Reinforcement Learning},
  author={Nguyen, Thanh and Luu, Tung M and Ton, Tri and Kim, Sungwoong and Yoo, Chang D},
  journal={IEEE Access},
  volume={12},
  pages={100972--100982},
  year={2024},
  publisher={IEEE}
}
```