If you are interested in this project, please feel free to read the project report. The code is mainly derived from https://github.com/HxLyn3/ADMPO.