Tinysimpler/papers-and-codes-for-MARL

This is a collection of Multi-Agent Reinforcement Learning (MARL) papers with code. I selected relatively important papers that have open-source code and categorized them by time and method, building on Chen Hao's list "Multi-Agent Reinforcement Learning Papers with Code". Many thanks to him.

I will keep updating this collection for deeper study.

Classic Papers

Algorithms

Category Paper Code Accepted at Year
Independent Learning IQL:Multi-Agent Reinforcement Learning: Independent vs. Cooperative Agents https://github.com/oxwhirl/pymarl ICML 1993
Value Decomposition VDN:Value-Decomposition Networks For Cooperative Multi-Agent Learning https://github.com/oxwhirl/pymarl AAMAS 2017
Value Decomposition QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning https://github.com/oxwhirl/pymarl ICML 2018
Value Decomposition QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning https://github.com/oxwhirl/pymarl ICML 2019
Policy Gradient COMA:Counterfactual Multi-Agent Policy Gradients https://github.com/oxwhirl/pymarl AAAI 2018
Policy Gradient MADDPG:Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments https://github.com/openai/maddpg NIPS 2017
Communication BiCNet:Multiagent Bidirectionally-Coordinated Nets: Emergence of Human-level Coordination in Learning to Play StarCraft Combat Games https://github.com/Coac/CommNet-BiCnet 2017
Communication CommNet:Learning Multiagent Communication with Backpropagation https://github.com/facebookarchive/CommNet NIPS 2016
Communication IC3Net:Learning when to Communicate at Scale in Multiagent Cooperative and Competitive Tasks https://github.com/IC3Net/IC3Net 2018
Communication RIAL/DIAL:Learning to Communicate with Deep Multi-Agent Reinforcement Learning https://github.com/iassael/learning-to-communicate NIPS 2016
Exploration MAVEN:Multi-Agent Variational Exploration https://github.com/starry-sky6688/MARL-Algorithms NIPS 2019

Environments

Environment Paper KeyWords Code Accepted at Year Others
StarCraft The StarCraft Multi-Agent Challenge https://github.com/oxwhirl/smac NIPS 2019
StarCraft SMACv2: A New Benchmark for Cooperative Multi-Agent Reinforcement Learning https://github.com/oxwhirl/smacv2 2022
StarCraft Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks https://github.com/uoe-agents/epymarl NIPS 2021
Football Google Research Football: A Novel Reinforcement Learning Environment https://github.com/google-research/football AAAI 2020
PettingZoo PettingZoo: Gym for Multi-Agent Reinforcement Learning https://github.com/Farama-Foundation/PettingZoo NIPS 2021
Melting Pot Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot https://github.com/deepmind/meltingpot ICML 2021
MuJoCo MuJoCo: A physics engine for model-based control https://github.com/deepmind/mujoco IROS 2012
MALib MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning Population-based MARL, auto-curriculum learning, scalable, 5x faster than RLlib, at least 3x faster than OpenSpiel https://github.com/sjtu-marl/malib 2021
MAgent MAgent: A many-agent reinforcement learning platform for artificial collective intelligence https://github.com/Farama-Foundation/MAgent AAAI 2018
Neural MMO Neural MMO: A Massively Multiagent Game Environment for Training and Evaluating Intelligent Agents https://github.com/openai/neural-mmo 2019
MPE Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments https://github.com/openai/multiagent-particle-envs NIPS 2017
Pommerman Pommerman: A multi-agent playground https://github.com/MultiAgentLearning/playground 2018
HFO Half Field Offense: An Environment for Multiagent Learning and Ad Hoc Teamwork https://github.com/LARG/HFO AAMAS Workshop 2016
Multiagent Coordination Simulator Control based on local information, multiple multi-agent coordination techniques, flocking behavior, pinning control, dynamic encirclement, tracking of arbitrary closed curves, shepherding control, control against malicious agents, Python https://github.com/tjards/multi-agent_sim?tab=readme-ov-file 2024
A collection of reference environments for offline reinforcement learning, Python https://github.com/Farama-Foundation/D4RL
A collection of discrete grid-world environments for reinforcement learning research https://github.com/Farama-Foundation/Minigrid

Other Papers

Category Paper Code Accepted at Year
Graph Neural Network Multi-Agent Game Abstraction via Graph Attention Neural Network https://github.com/starry-sky6688/MARL-Algorithms AAAI 2020
Curriculum Learning From Few to More: Large-Scale Dynamic Multiagent Curriculum Learning https://github.com/starry-sky6688/MARL-Algorithms AAAI 2020
Curriculum Learning EPC:Evolutionary Population Curriculum for Scaling Multi-Agent Reinforcement Learning https://github.com/qian18long/epciclr2020 ICLR 2020
Curriculum Learning/Emergent Emergent Tool Use From Multi-Agent Autocurricula https://github.com/openai/multi-agent-emergence-environments ICLR 2020
Curriculum Learning Cooperative Multi-agent Control using deep reinforcement learning https://github.com/sisl/MADRL AAMAS 2017
Role ROMA: Multi-Agent Reinforcement Learning with Emergent Roles https://github.com/TonghanWang/ROMA ICML 2020
Role RODE: Learning Roles to Decompose Multi-Agent Tasks https://github.com/TonghanWang/RODE ICLR 2021
Role Scaling Multi-Agent Reinforcement Learning with Selective Parameter Sharing https://github.com/uoe-agents/seps ICML 2021
Opponent Modeling Opponent Modeling in Deep Reinforcement Learning https://github.com/hhexiy/opponent ICML 2016
Selfish Agent M3RL: Mind-aware Multi-agent Management Reinforcement Learning https://github.com/facebookresearch/M3RL ICLR 2019
Communication Emergence of grounded compositional language in multi-agent populations https://github.com/bkgoksel/emergent-language AAAI 2018
Communication Fully decentralized multi-agent reinforcement learning with networked agents https://github.com/cts198859/deeprl_network ICML 2018
Policy Gradient DOP: Off-Policy Multi-Agent Decomposed Policy Gradients https://github.com/TonghanWang/DOP ICLR 2021
Policy Gradient MAAC:Actor-Attention-Critic for Multi-Agent Reinforcement Learning https://github.com/shariqiqbal2810/MAAC ICML 2019
Environment Emergent Complexity via Multi-Agent Competition (https://arxiv.org/pdf/1710.03748.pdf) https://github.com/openai/multiagent-competition ICLR 2018
Exploration EITI/EDTI:Influence-Based Multi-Agent Exploration https://github.com/TonghanWang/EITI-EDTI ICLR 2020
Exploration LIIR: Learning Individual Intrinsic Reward in Multi-Agent Reinforcement Learning https://github.com/yalidu/liir NIPS 2019
From Single-Agent to Multi-Agent MAPPO:The Surprising Effectiveness of MAPPO in Cooperative, Multi-Agent Games https://github.com/marlbenchmark/on-policy 2021
Diversity Q-DPP:Multi-Agent Determinantal Q-Learning https://github.com/QDPP-GitHub/QDPP ICML 2020
Ad Hoc Teamwork CollaQ:Multi-Agent Collaboration via Reward Attribution Decomposition https://github.com/facebookresearch/CollaQ 2020
Value Decomposition NDQ: Learning Nearly Decomposable Value Functions Via Communication Minimization https://github.com/TonghanWang/NDQ ICLR 2020
Value Decomposition QPLEX: Duplex Dueling Multi-Agent Q-Learning https://github.com/wjh720/QPLEX ICLR 2021
Self-Play TLeague: A Framework for Competitive Self-Play based Distributed Multi-Agent Reinforcement Learning https://github.com/tencent-ailab/TLeague 2020
Transformer UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers https://github.com/hhhusiyi-monash/UPDeT ICLR 2021
Sparse Reward Individual Reward Assisted Multi-Agent Reinforcement Learning https://github.com/MDrW/ICML2022-IRAT ICML 2022
Ad Hoc Open Ad Hoc Teamwork using Graph-based Policy Learning https://github.com/uoe-agents/GPL ICML 2021
Generalization UNMAS: Multiagent Reinforcement Learning for Unshaped Cooperative Scenarios https://github.com/James0618/unmas TNNLS 2021
Other SIDE: State Inference for Partially Observable Cooperative Multi-Agent Reinforcement Learning https://github.com/deligentfool/SIDE AAMAS 2022
Other Context-Aware Sparse Deep Coordination Graphs https://github.com/TonghanWang/CASEC-MACO-benchmark ICLR 2022

Surveys

Recent Reviews (Since 2019)

Other Reviews (Before 2019)

Environments

Environment Paper KeyWords Code Accepted at Year Others
StarCraft The StarCraft Multi-Agent Challenge https://github.com/oxwhirl/smac NIPS 2019
StarCraft SMACv2: A New Benchmark for Cooperative Multi-Agent Reinforcement Learning https://github.com/oxwhirl/smacv2 2022
StarCraft Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks https://github.com/uoe-agents/epymarl NIPS 2021
Football Google Research Football: A Novel Reinforcement Learning Environment https://github.com/google-research/football AAAI 2020
PettingZoo PettingZoo: Gym for Multi-Agent Reinforcement Learning https://github.com/Farama-Foundation/PettingZoo NIPS 2021
Melting Pot Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot https://github.com/deepmind/meltingpot ICML 2021
MuJoCo MuJoCo: A physics engine for model-based control https://github.com/deepmind/mujoco IROS 2012
MALib MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning https://github.com/sjtu-marl/malib 2021
MAgent MAgent: A many-agent reinforcement learning platform for artificial collective intelligence https://github.com/Farama-Foundation/MAgent AAAI 2018
Neural MMO Neural MMO: A Massively Multiagent Game Environment for Training and Evaluating Intelligent Agents https://github.com/openai/neural-mmo 2019
MPE Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments https://github.com/openai/multiagent-particle-envs NIPS 2017
Pommerman Pommerman: A multi-agent playground https://github.com/MultiAgentLearning/playground 2018
HFO Half Field Offense: An Environment for Multiagent Learning and Ad Hoc Teamwork https://github.com/LARG/HFO AAMAS Workshop 2016
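A unified official code release of MARL research by TJU-RL-Lab Agent scalability, credit assignment, exploration-exploitation trade-off, hybrid actions, partial observability, non-stationarity (self-imitation + opponent modeling), Python https://github.com/TJU-DRL-LAB/Multiagent-RL 2022
Self-play reinforcement learning environments, multiple players https://github.com/davidADSP/SIMPLE 2021

Multi-Objective Optimization

Surveys

Environments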

Paper Key Words Code Accepted at Year others
A Toolkit for Reliable Benchmarking and Research in Multi-Objective Reinforcement Learning Open-source multi-objective RL algorithm library, multi-objective RL environments (MO-Gymnasium), single-policy and multi-policy methods https://github.com/LucasAlegre/morl-baselines 2023
Paper Key Words Code Accepted at Year others
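A Generalized Algorithm for Multi-Objective Reinforcement Learning and Policy Adaptation Algorithmic framework, unknown preferences, linear preferences, outperforms scalarized MORL algorithms, infers hidden preferences, Bellman equation, parameterized policy representation https://github.com/RunzheYang/MORL NeurIPS'19 2019
Lexicographic Multi-Objective Reinforcement Learning Lexicographic multi-objective problems (objectives with explicit priorities, limited resources, multi-stage tasks), scalability, practical applications, policy-based, value-based, single-agent https://github.com/lrhammond/lmorl 2022
Multi-objective Conflict-based Search for Multi-agent Path Finding. Subdimensional Expansion for Multi-objective Multi-agent Path Finding. Pareto-optimal solution set, conflict-based search https://github.com/wonderren/public_pymomapf 2021
Unifying All Species: LLM-based Hyper-Heuristics for Multi-objective Optimization TSP, multi-objective optimization 2024
Multi-objective Evolution of Heuristic Using Large Language Model TSP, BPP, multi-objective 2024
Thresholded Lexicographic Ordered Multiobjective Reinforcement Learning Gradient projection, multi-agent 2024
PA2D-MORL: Pareto Ascent Directional Decomposition Based Multi-Objective Reinforcement Learning Pareto policy set, flexibility, adaptability, stable results 2024
Multi-Objective Deep Reinforcement Learning Optimisation in Autonomous Systems Simultaneous optimization of multiple objectives, adaptive server configuration https://github.com/JuanK120/RL_EWS 2024
A practical guide to multi-objective reinforcement learning and planning Linear scalarization into a single objective vs. sweeping the weight space; storing vector-valued Q-values (Q-learning); each policy corresponds to a different combination of objectives (Pareto Q-learning); dynamically adjusting preferences during learning (Q-steering) 2021
CM3: Cooperative Multi-goal Multi-stage Multi-agent Reinforcement Learning Balancing individual and collective goals, two-stage curriculum learning (smooth transition), multi-goal multi-agent policy gradient (credit assignment mechanism), improved learning efficiency https://github.com/011235813/cm3 2021
MO-MIX: Multi-Objective Multi-Agent Cooperative Decision-Making With Deep Reinforcement Learning Weight vector: balances the importance of different objectives; mixing network: integrates local information from all agents to coordinate cooperation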
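
Several keyword entries in this table mention linear scalarization and vector-valued Q-values. As a rough illustration only (not code from any of the listed papers), the sketch below keeps one Q-estimate per objective and collapses it to a scalar with a weight vector before choosing a greedy action. The action names, the two objectives, and all numbers are made-up assumptions.

```python
from typing import Dict, List, Sequence

# Hypothetical vector-valued Q-table: action -> one estimate per objective
# (here: [task reward, energy saving]). All numbers are invented.
VectorQ = Dict[str, List[float]]


def scalarize(q_vec: Sequence[float], weights: Sequence[float]) -> float:
    """Linear scalarization: weighted sum of the per-objective values."""
    if len(q_vec) != len(weights):
        raise ValueError("q_vec and weights must have the same length")
    return sum(w * q for w, q in zip(weights, q_vec))


def greedy_action(q_table: VectorQ, weights: Sequence[float]) -> str:
    """Pick the action with the highest scalarized value."""
    return max(q_table, key=lambda a: scalarize(q_table[a], weights))


q_table: VectorQ = {
    "fast": [1.0, 0.0],
    "balanced": [0.6, 0.6],
    "frugal": [0.0, 1.0],
}

# Sweeping the weight vector changes which action looks best.
for weights in ([1.0, 0.0], [0.5, 0.5], [0.0, 1.0]):
    print(weights, greedy_action(q_table, weights))
```

With these numbers the three weight vectors select "fast", "balanced", and "frugal" respectively, which is why sweeping the weight space recovers different trade-offs from the same vector-valued estimates.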

Credit Assignment

Value Decomposition

Paper Code Accepted at Year
VDN:Value-Decomposition Networks For Cooperative Multi-Agent Learning https://github.com/oxwhirl/pymarl AAMAS 2017
QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning https://github.com/oxwhirl/pymarl ICML 2018
QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning https://github.com/oxwhirl/pymarl ICML 2019
NDQ: Learning Nearly Decomposable Value Functions Via Communication Minimization https://github.com/TonghanWang/NDQ ICLR 2020
CollaQ:Multi-Agent Collaboration via Reward Attribution Decomposition https://github.com/facebookresearch/CollaQ 2020
SQDDPG:Shapley Q-Value: A Local Reward Approach to Solve Global Reward Games https://github.com/hsvgbkhgbv/SQDDPG AAAI 2020
QPD:Q-value Path Decomposition for Deep Multiagent Reinforcement Learning ICML 2020
Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning https://github.com/oxwhirl/wqmix NIPS 2020
QTRAN++: Improved Value Transformation for Cooperative Multi-Agent Reinforcement Learning 2020
QPLEX: Duplex Dueling Multi-Agent Q-Learning https://github.com/wjh720/QPLEX ICLR 2021
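
To make the idea behind the first two entries concrete, here is a minimal sketch, not taken from any of the linked repositories. It contrasts an additive, VDN-style joint value with a monotonic mixer in the spirit of QMIX. The real QMIX mixer is a state-conditioned network whose weights come from hypernetworks; the single fixed-weight layer, the Q-values, and the names `AGENT_Q_TABLES` and `RAW_WEIGHTS` below are simplifying assumptions.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple


def vdn_total(agent_qs: Sequence[float]) -> float:
    """VDN-style factorisation: Q_tot is the plain sum of per-agent values."""
    return sum(agent_qs)


def monotonic_mix(agent_qs: Sequence[float],
                  raw_weights: Sequence[float],
                  bias: float = 0.0) -> float:
    """Single-layer monotonic mixer (a simplification of the QMIX idea).

    abs() keeps every weight non-negative, so Q_tot never decreases when
    any individual Q_i increases.
    """
    return sum(abs(w) * q for w, q in zip(raw_weights, agent_qs)) + bias


# Made-up per-agent Q-values: AGENT_Q_TABLES[i][a] = Q_i(a).
AGENT_Q_TABLES: List[List[float]] = [
    [0.2, 1.5, -0.3],
    [0.9, 0.1, 0.4],
]
RAW_WEIGHTS = [-2.0, 0.5]  # hypothetical mixer weights before abs()


def joint_argmax(mixer: Callable[[Sequence[float]], float]) -> Tuple[int, ...]:
    """Brute-force the joint action that maximises the mixed value."""
    joint_actions = product(*(range(len(t)) for t in AGENT_Q_TABLES))
    return max(
        joint_actions,
        key=lambda ja: mixer([t[a] for t, a in zip(AGENT_Q_TABLES, ja)]),
    )


# Each agent acts greedily on its own Q-values (decentralised execution).
decentralised = tuple(
    max(range(len(t)), key=t.__getitem__) for t in AGENT_Q_TABLES
)

print(decentralised)
print(joint_argmax(vdn_total))
print(joint_argmax(lambda qs: monotonic_mix(qs, RAW_WEIGHTS)))
```

All three lines print `(1, 0)`: under both mixers the joint maximiser coincides with each agent's own greedy choice, which is the property that lets these methods train centrally and act decentrally.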

Other Methods

Paper KeyWords Code Accepted at Year
COMA:Counterfactual Multi-Agent Policy Gradients https://github.com/oxwhirl/pymarl AAAI 2018
LiCA:Learning Implicit Credit Assignment for Cooperative Multi-Agent Reinforcement Learning https://github.com/mzho7212/LICA NIPS 2020
Evaluating Memory and Credit Assignment in Memory-Based RL Decoupling Memory from Credit Assignment https://github.com/twni2016/Memory-RL 2023

Policy Gradient

Paper Code Accepted at Year
MADDPG:Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments https://github.com/openai/maddpg NIPS 2017
COMA:Counterfactual Multi-Agent Policy Gradients https://github.com/oxwhirl/pymarl AAAI 2018
IPPO:Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge? 2020
MAPPO:The Surprising Effectiveness of MAPPO in Cooperative, Multi-Agent Games https://github.com/marlbenchmark/on-policy 2021
MAAC:Actor-Attention-Critic for Multi-Agent Reinforcement Learning https://github.com/shariqiqbal2810/MAAC ICML 2019
DOP: Off-Policy Multi-Agent Decomposed Policy Gradients https://github.com/TonghanWang/DOP ICLR 2021
M3DDPG:Robust Multi-Agent Reinforcement Learning via Minimax Deep Deterministic Policy Gradient AAAI 2019

Multi-Task

Paper KeyWords Code Accepted at Year Others
Switch Trajectory Transformer with Distributional Value Approximation for Multi-Task Reinforcement Learning Multi-task RL, sparse reward ExpEnv: MiniGrid
HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning ExpEnv: MetaWorld 2024
Elastic Decision Transformer Offline RL, trajectory stitching, multi-task 2023
Learning to Modulate pre-trained Models in RL Multi-task learning, continual learning, fine-tuning 2023
Multi Task RL Baselines https://github.com/facebookresearch/mtrl
A PyTorch Library for Multi-Task Learning https://github.com/median-research-group/LibMTL 2024
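Discovering Generalizable Multi-agent Coordination Skills from Multi-task Offline Data Limited-source offline data, MARL, cross-task coordination, generalization to unseen tasks https://github.com/LAMDA-RL/ODIS 2023
Few is More: Task-Efficient Skill-Discovery for Multi-Task Offline Multi-Agent Reinforcement Learning Avoids retraining on new tasks, multi-task offline MARL algorithm, reconstruct observations -> evaluate fixed + variable actions -> regularize conservative actions, from limited small-scale source tasks -> strong multi-task generalization 2025

Communication

Bandwidth-Limited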

Paper KeyWords Code Accepted at Year Others
SchedNet:Learning to Schedule Communication in Multi-Agent Reinforcement learning 2019
Learning Multi-agent Communication under Limited-bandwidth Restriction for Internet Packet Routing 2019
Gated-ACML:Learning Agent Communication under Limited Bandwidth by Message Pruning AAAI 2020
Learning Efficient Multi-agent Communication: An Information Bottleneck Approach ICML 2020
Coordinating Multi-Agent Reinforcement Learning with Limited Communication AAMAS 2013
Learning Efficient Diverse Communication for Cooperative Heterogeneous Teaming Communication messages, heterogeneity, bandwidth limits https://github.com/CORE-Robotics-Lab/HetNet 2022

No Bandwidth Limit

Paper KeyWords Code Accepted at Year Others
CommNet:Learning Multiagent Communication with Backpropagation https://github.com/facebookarchive/CommNet NIPS 2016
BiCNet:Multiagent Bidirectionally-Coordinated Nets: Emergence of Human-level Coordination in Learning to Play StarCraft Combat Games https://github.com/Coac/CommNet-BiCnet 2017
VAIN: Attentional Multi-agent Predictive Modeling NIPS 2017
IC3Net:Learning when to Communicate at Scale in Multiagent Cooperative and Competitive Tasks https://github.com/IC3Net/IC3Net 2018
VBC:Efficient Communication in Multi-Agent Reinforcement Learning via Variance Based Control NIPS 2019
Graph Convolutional Reinforcement Learning for Multi-Agent Cooperation 2018
NDQ: Learning Nearly Decomposable Value Functions Via Communication Minimization https://github.com/TonghanWang/NDQ ICLR 2020
RIAL/DIAL:Learning to Communicate with Deep Multi-Agent Reinforcement Learning https://github.com/iassael/learning-to-communicate NIPS 2016
ATOC:Learning Attentional Communication for Multi-Agent Cooperation NIPS 2018
Fully decentralized multi-agent reinforcement learning with networked agents https://github.com/cts198859/deeprl_network ICML 2018
TarMAC: Targeted Multi-Agent Communication ICML 2019

Uncategorized

Paper Key Words Code Accepted at Year Others
Responsive Regulation of Dynamic UAV Communication Networks Based on Deep Reinforcement Learning UAV communication networks, dynamic regulation, asynchronous DDPG, Python https://github.com/ducmngx/DDPG-UAV-Efficiency 2021
eQMARL: Entangled Quantum Multi-Agent Reinforcement Learning for Distributed Cooperation over Quantum Channels Distributed MARL, information sharing, quantum channels, faster convergence, better performance, reduced computational burden https://github.com/news-vt/eqmarl 2025
Learning to Communicate Through Implicit Communication Channels Implicit communication protocol (ICP) framework, more efficient 2024
Scaling Large Language Model-based Multi-Agent Collaboration LLM, collaborative emergence, communication patterns, directed acyclic graph, 1000 agents https://github.com/OpenBMB/ChatDev 2024

Emergence

Paper KeyWords Code Accepted at Year
Multiagent Cooperation and Competition with Deep Reinforcement Learning PloS one 2017
Multi-agent Reinforcement Learning in Sequential Social Dilemmas 2017
Emergent preeminence of selfishness: an anomalous Parrondo perspective Nonlinear Dynamics 2019
Emergent Coordination Through Competition 2019
Biases for Emergent Communication in Multi-agent Reinforcement Learning NIPS 2019
Towards Graph Representation Learning in Emergent Communication 2020
Emergent Tool Use From Multi-Agent Autocurricula https://github.com/openai/multi-agent-emergence-environments ICLR 2020
On Emergent Communication in Competitive Multi-Agent Teams AAMAS 2020
QED:Quasi-Equivalence Discovery for Zero-Shot Emergent Communication 2021
Incorporating Pragmatic Reasoning Communication into Emergent Language NIPS 2020
Scaling Large Language Model-based Multi-Agent Collaboration LLM, collaborative emergence, communication patterns, directed acyclic graph, 1000 agents https://github.com/OpenBMB/ChatDev 2024

Opponent Modeling

Paper Code Accepted at Year
Bayesian Opponent Exploitation in Imperfect-Information Games IEEE Conference on Computational Intelligence and Games 2018
LOLA:Learning with Opponent-Learning Awareness AAMAS 2018
Variational Autoencoders for Opponent Modeling in Multi-Agent Systems 2020
Stable Opponent Shaping in Differentiable Games 2018
Opponent Modeling in Deep Reinforcement Learning https://github.com/hhexiy/opponent ICML 2016
Game Theory-Based Opponent Modeling in Large Imperfect-Information Games AAMAS 2011
Agent Modelling under Partial Observability for Deep Reinforcement Learning NIPS 2021

Game Theory

Paper KeyWords Code Accepted at Year Others
α-Rank: Multi-Agent Evaluation by Evolution Scientific reports 2019
α^α-Rank: Practically Scaling α-Rank through Stochastic Optimisation AAMAS 2020
A Game Theoretic Framework for Model Based Reinforcement Learning ICML 2020
Fictitious Self-Play in Extensive-Form Games ICML 2015
Combining Deep Reinforcement Learning and Search for Imperfect-Information Games NIPS 2020
Real World Games Look Like Spinning Tops NIPS 2020
PSRO: A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning NIPS 2017
Pipeline PSRO: A Scalable Approach for Finding Approximate Nash Equilibria in Large Games NIPS 2020
A Game-Theoretic Model and Best-Response Learning Method for Ad Hoc Coordination in Multiagent Systems AAMAS 2013
Neural Replicator Dynamics: Multiagent Learning via Hedging Policy Gradients AAMAS 2020
ASP: Learn a Universal Neural Solver! https://github.com/LOGO-CUHKSZ/ASP IEEE 2023
Self-play code framework, Python https://github.com/davidADSP/SIMPLE 2021
TimeChamber: A Massively Parallel Large Scale Self-Play Framework Massively parallel self-play framework, Python https://github.com/inspirai/TimeChamber 2022
Reproductions of papers related to multi-agent games https://github.com/BaoyiCui/MAS-Game
Minimizing Weighted Counterfactual Regret with Optimistic Online Mirror Descent Counterfactual Regret Minimization (CFR), imperfect-information games, fast convergence https://github.com/rpSebastian/PDCFRPlus 2024
Dynamic Discounted Counterfactual Regret Minimization First equilibrium-finding framework that discounts prior iterations with a dynamic, automatically learned scheme; generalization; convergence speed https://github.com/rpSebastian/DDCFR 2024
Awesome Game AI materials of Multi-Agent Reinforcement Learning https://github.com/datamllab/awesome-game-ai
Pipeline PSRO: A Scalable Approach for Finding Approximate Nash Equilibria in Large Games Zero-sum imperfect-information games, fast convergence https://github.com/JBLanier/pipeline-psro 2020
Neural Auto-Curricula Neural auto-curricula, automatically selects opponent policies and finds best responses, meta-gradient descent, general MARL algorithm, scalability https://github.com/waterhorse1/NAC 2021
Temporal Induced Self-Play for Stochastic Bayesian Games Dynamic games, RL-based algorithmic framework, policy-gradient-based, scalability https://github.com/laonahongchen/Temporal-Induced-Self-Play-for-Stochastic-Bayesian-Games 2020
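
The CFR entries in this table build on regret matching, which turns accumulated regrets into a strategy at each decision point. The sketch below shows only that basic step for a single decision, assuming a rock-paper-scissors payoff and a simple sampled-action regret update. It is not the weighted, discounted, or optimistic variant from the listed papers, and the function names are illustrative.

```python
from typing import List, Sequence


def regret_matching(cumulative_regrets: Sequence[float]) -> List[float]:
    """Play each action in proportion to its positive cumulative regret.

    Falls back to the uniform strategy when no action has positive regret.
    """
    positive = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    n = len(cumulative_regrets)
    return [1.0 / n] * n


def update_regrets(cumulative_regrets: Sequence[float],
                   action_utilities: Sequence[float],
                   played: int) -> List[float]:
    """Add, for every action, how much better it would have done than the
    action that was actually played (one simple variant of the update)."""
    baseline = action_utilities[played]
    return [r + (u - baseline)
            for r, u in zip(cumulative_regrets, action_utilities)]


# Rock-paper-scissors from our side, opponent showed rock:
# utilities for [rock, paper, scissors] are [0, 1, -1]; we played scissors.
regrets = update_regrets([0.0, 0.0, 0.0], [0.0, 1.0, -1.0], played=2)
strategy = regret_matching(regrets)
print(regrets)   # [1.0, 2.0, 0.0]
print(strategy)  # paper gets twice the probability of rock, scissors gets 0
```

After one losing round, the next strategy shifts toward the actions that would have done better; iterating this update and averaging the strategies over time is the core loop that CFR applies at every information set.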

Hierarchical

Paper Code Accepted at Year
Hierarchical multi-agent reinforcement learning AAMAS 2006
Hierarchical Cooperative Multi-Agent Reinforcement Learning with Skill Discovery AAMAS 2020
Hierarchical Critics Assignment for Multi-agent Reinforcement Learning 2019
Hierarchical Reinforcement Learning for Multi-agent MOBA Game 2019
Hierarchical Deep Multiagent Reinforcement Learning with Temporal Abstraction 2018
HAMA:Multi-Agent Actor-Critic with Hierarchical Graph Attention Network AAAI 2020

Roles

Paper KeyWords Code Accepted at Year
ROMA: Multi-Agent Reinforcement Learning with Emergent Roles https://github.com/TonghanWang/ROMA ICML 2020
RODE: Learning Roles to Decompose Multi-Agent Tasks https://github.com/TonghanWang/RODE ICLR 2021
Scaling Multi-Agent Reinforcement Learning with Selective Parameter Sharing https://github.com/uoe-agents/seps ICML 2021
MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework Meta-programming framework: assembly-line paradigm, agent roles; task decomposition: subtasks, more coherent solutions https://github.com/geekan/MetaGPT 2023

Large-Scale

Paper Key Words Code Accepted at Year
From Few to More: Large-Scale Dynamic Multiagent Curriculum Learning https://github.com/starry-sky6688/MARL-Algorithms AAAI 2020
PooL: Pheromone-inspired Communication Framework for Large Scale Multi-Agent Reinforcement Learning 2022
Factorized Q-learning for large-scale multi-agent systems ICDAI 2019
EPC:Evolutionary Population Curriculum for Scaling Multi-Agent Reinforcement Learning https://github.com/qian18long/epciclr2020 ICLR 2020
Mean Field Multi-Agent Reinforcement Learning ICML 2018
A Study of AI Population Dynamics with Million-agent Reinforcement Learning AAMAS 2018
Plan Better Amid Conservatism: Offline Multi-Agent Reinforcement Learning with Actor Rectification Offline RL, getting stuck in local optima as the number of agents grows; method: offline MARL with actor rectification https://github.com/ling-pan/OMAR 2022

Ad Hoc Teamwork

Paper Key Words Code Accepted at Year
CollaQ:Multi-Agent Collaboration via Reward Attribution Decomposition https://github.com/facebookresearch/CollaQ 2020
A Game-Theoretic Model and Best-Response Learning Method for Ad Hoc Coordination in Multiagent Systems AAMAS 2013
Half Field Offense: An Environment for Multiagent Learning and Ad Hoc Teamwork https://github.com/LARG/HFO AAMAS Workshop 2016
Open Ad Hoc Teamwork using Graph-based Policy Learning https://github.com/uoe-agents/GPL ICML 2021
A Survey of Ad Hoc Teamwork: Definitions, Methods, and Open Problems 2022
Towards open ad hoc teamwork using graph-based policy learning ICML 2021
Learning with generated teammates to achieve type-free ad-hoc teamwork IJCAI 2021
Online ad hoc teamwork under partial observability ICLR 2022
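Discovering Generalizable Multi-agent Coordination Skills from Multi-task Offline Data Limited-source offline data, MARL, cross-task coordination, generalization to unseen tasks https://github.com/LAMDA-RL/ODIS 2023
Few is More: Task-Efficient Skill-Discovery for Multi-Task Offline Multi-Agent Reinforcement Learning Avoids retraining on new tasks, multi-task offline MARL algorithm, reconstruct observations -> evaluate fixed + variable actions -> regularize conservative actions, from limited small-scale source tasks -> strong multi-task generalization 2025
MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework Meta-programming framework: assembly-line paradigm, agent roles; task decomposition: subtasks, more coherent solutions https://github.com/geekan/MetaGPT 2023

Evolutionary Algorithms

Surveys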

Types Paper Key Words Code Accepted at Year Others
Socialized Learning: Making Each Other Better Through Multi-Agent Collaboration Socialized Learning (SL), collective collaboration, reciprocal altruism module https://github.com/yxjdarren/SL 2024
RACE: Improve Multi-Agent Reinforcement Learning with Representation Asymmetry and Collaborative Evolution MARL + evolutionary algorithms + representation learning https://github.com/yeshenpy/RACE 2023
MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning Population-based MARL, auto-curriculum learning, scalable, 5x faster than RLlib, at least 3x faster than OpenSpiel https://github.com/sjtu-marl/malib 2021
EvoRainbow: Combining Improvements in Evolutionary Reinforcement Learning for Policy Search Evolutionary algorithms (EAs) combined with reinforcement learning (RL), explores 5 combination mechanisms, Python https://github.com/yeshenpy/EvoRainbow 2024
1. EA-assisted RL - parameter search Reinforcement Learning beyond The Bellman Equation: Exploring Critic Objectives using Evolution https://github.com/ajleite/RLBeyondBellman 2020
Genetic Soft Updates for Policy Evolution in Deep Reinforcement Learning 2021
Improving Deep Policy Gradients with Value Function Search 2023
1. EA-assisted RL - action search Scalable deep reinforcement learning for vision-based robotic manipulation https://github.com/quantumiracle/QT_Opt 2018
Q-learning for continuous actions with cross-entropy guided policies RL4RealLife Workshop 2019
Evolutionary Action Selection for Gradient-based Policy Learning 2022
Soft Actor-Critic with Cross-entropy Policy Optimization https://github.com/wcgcyx/SAC-CEPO 2021
GRAC: Self-guided and Self-regularized Actor-critic https://github.com/stanford-iprl-lab/GRAC 2021
Plan better amid conservatism: Offline multi-agent reinforcement learning with actor rectification https://github.com/ling-pan/OMAR 2022
Deep Multi-agent Reinforcement Learning for Decentralized Continuous Cooperative Control https://github.com/oxwhirl/comix 2020
1. EA-assisted RL - hyperparameter optimization Online Meta-learning by Parallel Algorithm Competition 2018
Population Based Training of Neural Networks https://github.com/voiler/PopulationBasedTraining 2017
Sample-efficient Automated Deep Reinforcement Learning https://github.com/automl/SEARL 2021
GA+DDPG+HER: Genetic Algorithm-based Function Optimizer in Deep Reinforcement Learning for Robotic Manipulation Tasks https://github.com/aralab-unr/ga-drl-aubo-ara-lab 2022
Towards Automatic Actor-critic Solutions to Continuous Control https://github.com/jakegrigsby/deep_control 2021
Online Hyper-parameter Tuning in Off-policy Learning via Evolutionary Strategies 2020
1. EA-assisted RL - other Evolving Reinforcement Learning Algorithms https://github.com/google/brain_autorl/tree/main/evolving_rl 2021
Discovered Policy Optimisation https://github.com/luchris429/discovered-policy-optimisation 2022
Discovering Temporally-Aware Reinforcement Learning Algorithms Temporally-aware RL? https://github.com/EmptyJackson/groove 2024
Behaviour Distillation https://github.com/FLAIROx/behaviour-distillation 2024
Adversarial Cheap Talk https://github.com/luchris429/adversarial-cheap-talk 2023
PNS: Population-guided Novelty Search for Reinforcement Learning in Hard Exploration Environments 2021
Go explore: A New Approach for Hard-exploration Problems https://github.com/uber-research/go-explore Nature 2021
Genetic-gated Networks for Deep Reinforcement Learning 2018
Evo-rl: Evolutionary-driven Reinforcement Learning 2021
Robust Multi-agent Coordination via Evolutionary Generation of Auxiliary Adversarial Attackers https://github.com/zzq-bot/ROMANCE 2023
Communication-robust Multiagent Learning by Adaptable Auxiliary Multi-agent Adversary Generation 2023
Evolutionary Population Curriculum for Scaling Multi-agent Reinforcement Learning https://github.com/qian18long/epciclr2020 2020
MAPPER: Multi-agent Path Planning with Evolutionary Reinforcement Learning in Mixed Dynamic Environments 2020
2. RL-assisted EA - population initialization Symbolic Regression Via Neural-guided Genetic Programming Population Seeding https://github.com/dso-org/deep-symbolic-optimization 2021
Rule-based Reinforcement Learning Methodology To Inform Evolutionary Algorithms For Constrained Optimization Of Engineering Applications https://github.com/mradaideh/neorl 2021
DeepACO: Neural-enhanced Ant Systems For Combinatorial Optimization https://github.com/henry-yeh/DeepACO 2023
2. RL-assisted EA - population evaluation ERL-Re2: Efficient Evolutionary Reinforcement Learning with Shared State Representation and Individual Policy Representation https://github.com/yeshenpy/ERL-Re2 2023
A Surrogate-Assisted Controller for Expensive Evolutionary Reinforcement Learning https://github.com/Yuxing-Wang-THU/Surrogate-assisted-ERL
PGPS: Coupling Policy Gradient with Population-based Search https://github.com/NamKim88/PGPS/blob/master/Main.py 2021
2. RL-assisted EA - mutation operators Policy Optimization By Genetic Distillation https://www.catalyzex.com/paper/policy-optimization-by-genetic-distillation/code 2018
Guiding Evolutionary Strategies With Off-policy Actor-critic 2021
Population Based Reinforcement Learning https://github.com/jjccero/pbrl 2021
Efficient Novelty Search Through Deep Reinforcement Learning https://github.com/shilx001/NoveltySearch_Improvement 2020
Diversity Evolutionary Policy Deep Reinforcement Learning 2021
QD-RL: Efficient Mixing Of Quality And Diversity In Reinforcement Learning https://openreview.net/forum?id=5Dl1378QutR 2020
Policy Gradient Assisted Map-elites https://github.com/ollebompa/PGA-MAP-Elites 2021
Approximating Gradients For Differentiable Quality Diversity In Reinforcement Learning https://github.com/icaros-usc/dqd-rl 2022
Sample-efficient Quality-diversity By Cooperative Coevolution https://openreview.net/forum?id=JDud6zbpFv 2024
Neuroevolution is a Competitive Alternative to Reinforcement Learning for Skill Discovery https://github.com/instadeepai/qd-skill-discovery-benchmark 2023
CEM-RL: Combining evolutionary and gradient-based methods for policy search https://github.com/apourchot/CEM-RL 2019
2. RL-assisted EA - hyperparameter configuration Reinforcement learning for online control of evolutionary algorithms 2006
Learning step-size adaptation in CMA-ES https://github.com/automl/LTO-CMA 2020
Dynamic algorithm configuration: Foundation of a new meta-algorithmic framework Framework https://github.com/automl/DAC 2020
Variational reinforcement learning for hyper-parameter tuning of adaptive evolutionary algorithm 2022
Controlling sequential hybrid evolutionary algorithm by q-learning https://github.com/xiaomeiabc/Controlling-Sequential-Hybrid-Evolutionary-Algorithm-by-Q-Learning 2023
Multiagent dynamic algorithm configuration https://github.com/lamda-bbo/madac 2022
Q-learning-based parameter control in differential evolution for structural optimization 2021
Reinforcement learning-based differential evolution for parameters extraction of photovoltaic models 2021
2. RL-assisted EA - other Model-predictive control via cross-entropy and gradient-based optimization Model predictive control https://github.com/homangab/gradcem 2020
Learning off-policy with online planning https://github.com/hari-sikchi/LOOP 2021
Temporal difference learning for model predictive control Model predictive control https://github.com/nicklashansen/tdmpc 2022
3. RL and EA complementing each other - single-agent optimization EvoRainbow: Combining Improvements in Evolutionary Reinforcement Learning for Policy Search https://github.com/yeshenpy/EvoRainbow 2024
Value-Evolutionary-Based Reinforcement Learning https://github.com/yeshenpy/VEB-RL 2024
ERL-Re2: Efficient Evolutionary Reinforcement Learning with Shared State Representation and Individual Policy Representation https://github.com/yeshenpy/ERL-Re2 2023
PGPS: Coupling Policy Gradient with Population-based Search https://github.com/NamKim88/PGPS/blob/master/Main.py 2021
Off-policy evolutionary reinforcement learning with maximum mutations (Maximum Mutation Reinforcement Learning for Scalable Control) https://github.com/karush17/esac 2022
Evolutionary action selection for gradient-based policy learning 2022
Competitive and cooperative heterogeneous deep reinforcement learning 2020
Guiding Evolutionary Strategies with Off-Policy Actor-Critic
PDERL: Proximal Distilled Evolutionary Reinforcement Learning https://github.com/crisbodnar/pderl 2020
Gradient Bias to Solve the Generalization Limit of Genetic Algorithms Through Hybridization with Reinforcement Learning https://github.com/ricordium/Gradient-Bias 2020
Collaborative Evolutionary Reinforcement Learning https://github.com/intelai/cerl 2019
Evolution-Guided Policy Gradient in Reinforcement Learning https://github.com/apourchot/CEM-RL 2018
3. RL and EA in synergy (multi-agent optimization) RACE: Improve Multi-Agent Reinforcement Learning with Representation Asymmetry and Collaborative Evolution https://github.com/yeshenpy/RACE 2023
Novelty Seeking Multiagent Evolutionary Reinforcement Learning 2023
Evolution Strategies Enhanced Complex Multiagent Coordination 2023
MAEDyS: multiagent evolution via dynamic skill selection 2021
Evolutionary Reinforcement Learning for Sample-Efficient Multiagent Coordination https://anonymous.4open.science/repository/1590ffb0-aa6b-4838-9d59-ae20cdd8df11/README.md https://github.com/ShawK91/MERL 2020
3. RL and EA in synergy (morphology evolution) Evolution gym: A large-scale benchmark for evolving soft robots http://evogym.csail.mit.edu 2021
Embodied Intelligence via Learning and Evolution https://github.com/agrimgupta92/derl Nature Communications 2021
Task-Agnostic Morphology Evolution https://github.com/jhejna/morphology-opt 2021
3. RL and EA in synergy (interpretable AI) Interpretable-AI Policies using Evolutionary Nonlinear Decision Trees for Discrete Action Systems https://github.com/yddhebar/NLDT 2024
Interpretable AI for policy-making in pandemics 2022
A co-evolutionary approach to interpretable reinforcement learning in environments with continuous action spaces 2021
Quality diversity evolutionary learning of decision trees 2023
Social Interpretable Reinforcement Learning 2024
Symbolic regression methods for reinforcement learning 2021
3. RL and EA in synergy (learning classifier systems) Classifier fitness based on accuracy https://github.com/hosford42/xcs Evolutionary computation 1995
Dynamical genetic programming in XCSF 2013
XCSF with tile coding in discontinuous action-value landscapes Evolutionary Intelligence 2015
Classifiers that approximate functions Natural Computing 2002

Team Training

Paper Code Accepted at Year
AlphaStar:Grandmaster level in StarCraft II using multi-agent reinforcement learning Nature 2019

Curriculum Learning

Paper KeyWords Code Accepted at Year
Diverse Auto-Curriculum is Critical for Successful Real-World Multiagent Learning Systems AAMAS 2021
From Few to More: Large-Scale Dynamic Multiagent Curriculum Learning https://github.com/starry-sky6688/MARL-Algorithms AAAI 2020
EPC:Evolutionary Population Curriculum for Scaling Multi-Agent Reinforcement Learning https://github.com/qian18long/epciclr2020 ICLR 2020
Emergent Tool Use From Multi-Agent Autocurricula https://github.com/openai/multi-agent-emergence-environments ICLR 2020
Learning to Teach in Cooperative Multiagent Reinforcement Learning AAAI 2019
StarCraft Micromanagement with Reinforcement Learning and Curriculum Transfer Learning IEEE Transactions on Emerging Topics in Computational Intelligence 2018
Cooperative Multi-agent Control using deep reinforcement learning https://github.com/sisl/MADRL AAMAS 2017
Variational Automatic Curriculum Learning for Sparse-Reward Cooperative Multi-Agent Problems NIPS 2021
Bootstrapped Transformer for Offline Reinforcement Learning Generation model; uses the learned model to automatically generate more offline data, improving sequence-model training https://seqml.github.io/bootorl

Mean Field

Paper KeyWords Code Accepted at Year Others
Mean Field Multi-Agent Reinforcement Learning ICML 2018
Efficient Ridesharing Order Dispatching with Mean Field Multi-Agent Reinforcement Learning The world wide web conference 2019
Bayesian Multi-type Mean Field Multi-agent Imitation Learning NIPS 2020
Bridging mean-field games and normalizing flows with trajectory regularization connects mean-field games with normalizing flows; Python https://github.com/Whalefishin/MFG_NF 2023
Python https://github.com/hsvgbkhgbv/Mean-field-Fictitious-Play-in-Potential-Games
MFGLib A Library for Mean Field Games https://github.com/radar-research-lab/MFGLib 2023
APAC-Net: Alternating the population and agent control via two neural networks to solve high-dimensional stochastic mean field games solves stochastic mean field games; 100 dimensions https://github.com/atlin23/apac-net 2021

Transfer Learning

Paper Code Accepted at Year
A Survey on Transfer Learning for Multiagent Reinforcement Learning Systems Journal of Artificial Intelligence Research 2019
Parallel Knowledge Transfer in Multi-Agent Reinforcement Learning 2020

Meta-Learning

Paper keyWords Code Accepted at Year
A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning ICML 2021
Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments 2017
MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework meta-programming framework: pipeline paradigm; agent roles; task decomposition into subtasks; more coherent solutions https://github.com/geekan/MetaGPT 2023

Fairness

Paper Code Accepted at Year
FEN:Learning Fairness in Multi-Agent Systems NIPS 2019
Fairness in Multiagent Resource Allocation with Dynamic and Partial Observations AAMAS 2018
Fairness in Multi-agent Reinforcement Learning for Stock Trading 2019

Reward Exploration

Dense-Reward Exploration

Paper Code Accepted at Year
MAVEN:Multi-Agent Variational Exploration https://github.com/starry-sky6688/MARL-Algorithms NIPS 2019
Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning ICML 2019
Episodic Multi-agent Reinforcement Learning with Curiosity-driven Exploration NIPS 2021
Celebrating Diversity in Shared Multi-Agent Reinforcement Learning https://github.com/lich14/CDS NIPS 2021

Sparse-Reward Exploration

Paper Code Accepted at Year
EITI/EDTI:Influence-Based Multi-Agent Exploration https://github.com/TonghanWang/EITI-EDTI ICLR 2020
Cooperative Exploration for Multi-Agent Deep Reinforcement Learning ICML 2021
Centralized Model and Exploration Policy for Multi-Agent 2021
REMAX: Relational Representation for Multi-Agent Exploration AAMAS 2022

Uncategorized

Paper Code Accepted at Year
CM3: Cooperative Multi-goal Multi-stage Multi-agent Reinforcement Learning ICLR 2020
Coordinated Exploration via Intrinsic Rewards for Multi-Agent Reinforcement Learning 2019
Exploration by Maximizing Renyi Entropy for Reward-Free RL Framework AAAI 2021
Exploration-Exploitation in Multi-Agent Learning: Catastrophe Theory Meets Game Theory AAAI 2021
LIIR: Learning Individual Intrinsic Reward in Multi-Agent Reinforcement Learning https://github.com/yalidu/liir NIPS 2019

Sparse Rewards

Paper KeyWords Code Accepted at Year
Variational Automatic Curriculum Learning for Sparse-Reward Cooperative Multi-Agent Problems NIPS 2021
Individual Reward Assisted Multi-Agent Reinforcement Learning https://github.com/MDrW/ICML2022-IRAT ICML 2022
Switch Trajectory Transformer with Distributional Value Approximation for Multi-Task Reinforcement Learning Multi-Task RL; Sparse Reward; ExpEnv: MINIGRID

Graph Neural Networks

Paper Code Accepted at Year
Multi-Agent Game Abstraction via Graph Attention Neural Network https://github.com/starry-sky6688/MARL-Algorithms AAAI 2020
Graph Convolutional Reinforcement Learning for Multi-Agent Cooperation ICLR 2020
Multi-Agent Reinforcement Learning with Graph Clustering 2020
Learning to Coordinate with Coordination Graphs in Repeated Single-Stage Multi-Agent Decision Problems ICML 2018
Distributed constrained combinatorial optimization leveraging hypergraph neural networks https://github.com/nasheydari/HypOp 2023
Learning Scalable Policies over Graphs for Multi-Robot Task Allocation using Capsule Attention Networks https://github.com/iamstevepaul/MRTA-Graph_RL

Model-Based

Paper Code Accepted at Year
Model-based Multi-Agent Reinforcement Learning with Cooperative Prioritized Sweeping 2020

Neural Architecture Search (NAS)

Paper Code Accepted at Year
MANAS: Multi-Agent Neural Architecture Search 2019

Safe Learning

Paper Code Accepted at Year
MAMPS: Safe Multi-Agent Reinforcement Learning via Model Predictive Shielding 2019
Safer Deep RL with Shallow MCTS: A Case Study in Pommerman 2019

From Single-Agent to Multi-Agent

Paper Code Accepted at Year
IQL:Multi-Agent Reinforcement Learning: Independent vs. Cooperative Agents https://github.com/oxwhirl/pymarl ICML 1993
IPPO:Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge? 2020
MAPPO:The Surprising Effectiveness of MAPPO in Cooperative, Multi-Agent Games https://github.com/marlbenchmark/on-policy 2021
MADDPG:Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments https://github.com/openai/maddpg NIPS 2017

Action Space

Paper Code Accepted at Year
Deep Reinforcement Learning in Parameterized Action Space 2015
DMAPQN: Deep Multi-Agent Reinforcement Learning with Discrete-Continuous Hybrid Action Spaces IJCAI 2019
H-PPO: Hybrid actor-critic reinforcement learning in parameterized action space IJCAI 2019
P-DQN: Parametrized Deep Q-Networks Learning: Reinforcement Learning with Discrete-Continuous Hybrid Action Space 2018
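Few is More: Task-Efficient Skill-Discovery for Multi-Task Offline Multi-Agent Reinforcement Learning avoids retraining on new tasks; multi-task offline MARL algorithm; reconstruct observations -> evaluate fixed actions + variable actions -> regularize conservative actions; from limited small-scale source tasks -> strong multi-task generalization 2025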

Diversity

Paper KeyWords Code Accepted at Year
Diverse Auto-Curriculum is Critical for Successful Real-World Multiagent Learning Systems AAMAS 2021
Q-DPP:Multi-Agent Determinantal Q-Learning https://github.com/QDPP-GitHub/QDPP ICML 2020
Diversity is All You Need: Learning Skills without a Reward Function 2018
Modelling Behavioural Diversity for Learning in Open-Ended Games ICML 2021
Diverse Agents for Ad-Hoc Cooperation in Hanabi CoG 2019
Generating Behavior-Diverse Game AIs with Evolutionary Multi-Objective Deep Reinforcement Learning IJCAI 2020
Quantifying environment and population diversity in multi-agent reinforcement learning 2021
POMO: Policy Optimization with Multiple Optima for Reinforcement Learning REINFORCE algorithm; combinatorial optimization problems; diverse trajectories; Python https://github.com/yd-kwon/POMO 2021
HIQL: Offline Goal-Conditioned RL with Latent States as Actions Hierarchical Goal-Conditioned RL; Offline Reinforcement Learning; Value Function Estimation 2023

Distributed Training, Distributed Execution

Paper Code Accepted at Year
Networked Multi-Agent Reinforcement Learning in Continuous Spaces IEEE conference on decision and control 2018
Value Propagation for Decentralized Networked Deep Multi-agent Reinforcement Learning NIPS 2019
Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents ICML 2018

Offline Multi-Agent Reinforcement Learning

Paper Key Words Code Accepted at Year
Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Conquers All StarCraftII Tasks 2021
Believe what you see: Implicit constraint approach for offline multi-agent reinforcement learning NIPS 2021
FOCAL: Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization https://github.com/LanqingLi1993/FOCAL-ICLR 2020
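ComaDICE: Offline Cooperative Multi-Agent Reinforcement Learning with Stationary Distribution Shift Regularization distribution shift
Discovering Generalizable Multi-agent Coordination Skills from Multi-task Offline Data offline data from limited sources; MARL; cross-task coordination; generalization to unseen tasks https://github.com/LAMDA-RL/ODIS 2023
Few is More: Task-Efficient Skill-Discovery for Multi-Task Offline Multi-Agent Reinforcement Learning avoids retraining on new tasks; multi-task offline MARL algorithm; reconstruct observations -> evaluate fixed actions + variable actions -> regularize conservative actions; from limited small-scale source tasks -> strong multi-task generalization 2025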

Adversarial

Single-Agent

Paper Code Accepted at Year
Robust Adversarial Reinforcement Learning Unofficial implementations on GitHub ICML 2017
Robust Deep Reinforcement Learning against Adversarial Perturbations on State Observations https://github.com/chenhongge/StateAdvDRL NIPS 2020
Robust Reinforcement Learning as a Stackelberg Game via Adaptively-Regularized Adversarial Training 2022
Risk Averse Robust Adversarial Reinforcement Learning ICRA 2019
Robust Deep Reinforcement Learning with Adversarial Attacks 2017
Robust Reinforcement Learning on State Observations with Learned Optimal Adversary https://github.com/huanzhang12/ATLA_robust_RL ICLR 2021
Exploring the Training Robustness of Distributional Reinforcement Learning against Noisy State Observations 2021
RoMFAC: A Robust Mean-Field Actor-Critic Reinforcement Learning against Adversarial Perturbations on States 2022
Adversary Agnostic Robust Deep Reinforcement Learning TNNLS 2021
Learning to Cope with Adversarial Attacks 2019
Adversarial Attack on Graph Structured Data ICML 2018
Characterizing Attacks on Deep Reinforcement Learning AAMAS 2022
Adversarial policies: Attacking deep reinforcement learning https://github.com/HumanCompatibleAI/adversarial-policies ICLR 2020
Learning Robust Policy against Disturbance in Transition Dynamics via State-Conservative Policy Optimization AAAI 2022
On the Robustness of Safe Reinforcement Learning under Observational Perturbations 2022
Robust Reinforcement Learning using Adversarial Populations 2020
Robust Deep Reinforcement Learning through Adversarial Loss https://github.com/tuomaso/radial_rl_v2 NIPS 2021

Multi-Agent

Paper Code Accepted at Year
Certifiably Robust Policy Learning against Adversarial Communication in Multi-agent Systems 2022
Distributed Multi-Agent Deep Reinforcement Learning for Robust Coordination against Noise 2022
On the Robustness of Cooperative Multi-Agent Reinforcement Learning IEEE Security and Privacy Workshops 2020
Towards Comprehensive Testing on the Robustness of Cooperative Multi-agent Reinforcement Learning CVPR workshop 2022
Robust Multi-Agent Reinforcement Learning via Minimax Deep Deterministic Policy Gradient AAAI 2019
Multi-agent Deep Reinforcement Learning with Extremely Noisy Observations NIPS Deep Reinforcement Learning Workshop 2018
Policy Regularization via Noisy Advantage Values for Cooperative Multi-agent Actor-Critic methods 2021

Adversarial Communication

Paper Code Accepted at Year
Certifiably Robust Policy Learning against Adversarial Communication in Multi-agent Systems 2022

Evaluation

Paper Code Accepted at Year
Towards Comprehensive Testing on the Robustness of Cooperative Multi-agent Reinforcement Learning CVPR workshop 2022

Imitation Learning

Paper Key Words Code Accepted at Year Others
MAPF-GPT: Imitation Learning for Multi-Agent Pathfinding at Scale pathfinding; no communication; no heuristics; imitation learning; partially observable; Python https://github.com/CognitiveAISystems/MAPF-GPT 2025

Training Data

Paper Key Words Code Accepted at Year
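INS: Interaction-aware Synthesis to Enhance Offline Multi-agent Reinforcement Learning data scarcity; inter-agent interaction data; diffusion model; synthesizes high-quality multi-agent datasets; sparse attention mechanism https://github.com/fyqqyf/INS 2025
Discovering Generalizable Multi-agent Coordination Skills from Multi-task Offline Data offline data from limited sources; MARL; cross-task coordination; generalization to unseen tasks https://github.com/LAMDA-RL/ODIS 2023
Few is More: Task-Efficient Skill-Discovery for Multi-Task Offline Multi-Agent Reinforcement Learning avoids retraining on new tasks; multi-task offline MARL algorithm; reconstruct observations -> evaluate fixed actions + variable actions -> regularize conservative actions; from limited small-scale source tasks -> strong multi-task generalization 2025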

Optimizers

Paper Key Words Code Accepted at Year
Conformal Symplectic Optimization for Stable Reinforcement Learning RL-specific neural network optimizer; RAD reaches 2.5x the performance of the Adam optimizer; score improved by 155.1% https://github.com/TobiasLv/RAD 2025

To Be Categorized

Paper KeyWords Code Accepted at Year
Mind-aware Multi-agent Management Reinforcement Learning https://github.com/facebookresearch/M3RL ICLR 2019
Emergence of grounded compositional language in multi-agent populations https://github.com/bkgoksel/emergent-language AAAI 2018
[Emergent Complexity via Multi-Agent Competition](https://arxiv.org/pdf/1710.03748.pdf) https://github.com/openai/multiagent-competition ICLR 2018
TLeague: A Framework for Competitive Self-Play based Distributed Multi-Agent Reinforcement Learning https://github.com/tencent-ailab/TLeague 2020
UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers https://github.com/hhhusiyi-monash/UPDeT ICLR 2021
SIDE: State Inference for Partially Observable Cooperative Multi-Agent Reinforcement Learning https://github.com/deligentfool/SIDE AAMAS 2022
UNMAS: Multiagent Reinforcement Learning for Unshaped Cooperative Scenarios https://github.com/James0618/unmas TNNLS 2021
Context-Aware Sparse Deep Coordination Graphs https://github.com/TonghanWang/CASEC-MACO-benchmark ICLR 2022
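Neural Spline Flows flow models; probability density evaluation and sampling; model flexibility; PyTorch; Python https://github.com/bayesiains/nflows https://github.com/bayesiains/nsf 2021
Domain knowledge graph: data collection, data processing, and visualization https://github.com/Louis-tiany/Military-KG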

Acknowledgements

Chen, Hao, Multi-Agent Reinforcement Learning Papers with Code

Chen, Hao, Multi Agent Reinforcement Learning papers

Chen, Hao, MARL Resources Collection
