mpmg
====

`mpmg` is a modular environment for studying the Minimum Price Markov Game (MPMG), a concept in game theory and algorithmic game theory. It provides an easy-to-use framework for conducting multi-agent experiments on collusion and cooperation dynamics, aimed at researchers and developers interested in game theory, reinforcement learning, and multi-agent systems.
The paper introducing the MPMG can be found here: https://arxiv.org/abs/2407.03521
If you want to contribute to the MPMG project, please make sure to read the paper, this README.md, and the CONTRIBUTING.md file. Thank you.
Features
--------

- Customizable Multi-Agent Environment: Supports different numbers of agents and heterogeneous vs. homogeneous settings.
Installation
------------

To install the package locally, run the following command from the root directory:

```bash
pip install -e .
```

This installs the package in "editable" mode, meaning any changes made to the source code are immediately reflected in the installed package.
Requirements
------------

- Python 3.6+
The MPMG
--------

The MPMG involves `n` agents who repeatedly face the same binary choice: defect (action 0) by bidding the minimum price, or cooperate (action 1) by bidding a collusive price inflated by the multiplier `alpha > 1`.

All players have symmetric strategy sets. Each agent `i` selects an action `a_i in {0, 1}` at every stage of the game.

Heterogeneity
-------------

Each agent `i` is characterized by a power parameter `beta_i`, drawn from a distribution with standard deviation `sigma_beta`. The notion of strong and weak agents is relative to the distribution of the power parameters: with `sigma_beta = 0` all agents are identical (homogeneous setting), while larger values of `sigma_beta` widen the gap between strong and weak agents (heterogeneous setting).

Payoffs
-------

Let `a = (a_1, ..., a_n)` denote a joint action. When all agents play the same action, the market is shared according to the power parameters `beta_i`; full cooperation additionally inflates payoffs by the factor `alpha`, making it the Pareto-optimal outcome. However, defection must always yield a higher payoff than cooperation when at least one opponent defects, leading to a social dilemma in which full defection is the unique Nash equilibrium. In fact, we set a cooperator's payoff to zero whenever at least one opponent defects, since defectors bid the minimum price and win the market. For two homogeneous agents, the qualitative payoff structure is:

Strategy Profile | Agent 1 | Agent 2 | Outcome
---|---|---|---
(Defect, Defect) | market share at minimum price | market share at minimum price | Nash equilibrium
(Cooperate, Cooperate) | inflated market share | inflated market share | Pareto-optimal
(Defect, Cooperate) | whole market | nothing | suboptimal
(Cooperate, Defect) | nothing | whole market | suboptimal
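To make these rules concrete, here is a minimal sketch of a payoff function consistent with the description above. It is an illustration, not the package's internal implementation: the `compute_payoffs` helper, the proportional sharing rule, and the normalization are assumptions; see the paper for the exact formulation.

```python
import numpy as np

def compute_payoffs(actions, beta, alpha):
    """Illustrative MPMG payoff rule (an assumption, not the package's code).

    actions: binary array, 1 = cooperate, 0 = defect
    beta:    power parameters, assumed to sum to 1
    alpha:   collusive bid multiplier (> 1)
    """
    actions = np.asarray(actions)
    beta = np.asarray(beta, dtype=float)
    payoffs = np.zeros_like(beta)

    if actions.all():
        # Full cooperation: everyone shares the inflated market (Pareto-optimal)
        payoffs = alpha * beta
    elif not actions.any():
        # Full defection: everyone shares the market at the minimum price (Nash)
        payoffs = beta
    else:
        # Mixed play: defectors undercut the cooperators and split the market
        # in proportion to their power; cooperators earn nothing
        defectors = actions == 0
        payoffs[defectors] = beta[defectors] / beta[defectors].sum()
    return payoffs

# Two homogeneous agents: defecting against a cooperator pays the most,
# so defection is always tempting even though joint cooperation pays more.
print(compute_payoffs([0, 1], beta=[0.5, 0.5], alpha=1.3))  # [1. 0.]
print(compute_payoffs([1, 1], beta=[0.5, 0.5], alpha=1.3))  # [0.65 0.65]
print(compute_payoffs([0, 0], beta=[0.5, 0.5], alpha=1.3))  # [0.5 0.5]
```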
Parameters
----------

num_agents (int): Number of agents. Must be a positive integer; defaults to 2.
sigma_beta (float): Heterogeneity level, i.e., the standard deviation of the power parameters' distribution. Must be in [0, 1]; defaults to 0.
alpha (float): Collusive bid multiplier. Must be > 1.
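For instance, a heterogeneous three-agent setting with a 20% collusive markup could be configured as follows (the parameter values are arbitrary examples):

```python
from mpmg import MPMGEnv

# Three agents, moderately heterogeneous power parameters, 20% collusive markup
env = MPMGEnv(num_agents=3, sigma_beta=0.5, alpha=1.2)
```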
The `MPMGEnv` class provides methods for resetting the environment, taking steps, and observing the state, rewards, and dynamics of multi-agent interactions.
Methods
-------
reset():
Resets the environment and returns the initial state.
Input: None
Output: np.ndarray
step(actions):
Executes the agents' joint action and returns the rewards, the next state, and the "done" flag used in episodic tasks.
Input: List[int]
Output: (np.ndarray, np.ndarray, bool)
Attributes
----------
num_agents (int): Number of agents.
sigma_beta (float): Heterogeneity level.
alpha (float): Collusive bid multiplier.
action_size (int): Action space size, which is always 2.
joint_action_size (int): Size of the joint action space, equal to action_size ** num_agents.
beta_size (int): Size of the beta (power) parameters array, equal to num_agents.
state_size (int): Size of the observation space, equal to num_agents + joint_action_size + beta_size. May change if the state space is customized.
state_space: The observation space, composed of 'action_frequencies', 'joint_action_frequencies', and 'beta_parameters', with total size state_size.
action_frequencies (np.ndarray of shape (num_agents,)): Frequency with which each agent has played action 1.
joint_action_frequencies (np.ndarray of shape (joint_action_size,)): Observed frequency of each joint action.
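Assuming the observation vector is a flat array that concatenates these three components in the order listed (an assumption for illustration; check the source for the exact layout), it can be split back into its parts:

```python
from mpmg import MPMGEnv

env = MPMGEnv(num_agents=2, sigma_beta=0.0, alpha=1.3)
state = env.reset()

# The observation size should match the sum of its three components
assert env.state_size == env.num_agents + env.joint_action_size + env.beta_size
assert state.shape == (env.state_size,)  # assumes a flat observation vector

# Split the flat observation into its named components
# (assumes they are concatenated in the order listed above)
action_freqs = state[:env.num_agents]
joint_action_freqs = state[env.num_agents:env.num_agents + env.joint_action_size]
beta_params = state[env.num_agents + env.joint_action_size:]
```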
Example
-------

Example use:

```python
# Import the environment
from mpmg import MPMGEnv

# Create an instance of the environment
env = MPMGEnv(num_agents=2, sigma_beta=0.0, alpha=1.3)

# Reset the environment
state = env.reset()

# Interaction loop
num_steps = 100  # arbitrary horizon
for _ in range(num_steps):
    # Sample actions (here, a fixed joint action for 2 players)
    actions = [1, 0]

    # Take a step in the environment
    rewards, next_state, done = env.step(actions)

    # Your learning / logging code goes here

    # Update state
    state = next_state
    if done:
        break
```
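In practice, actions would come from the agents' policies rather than being hard-coded. As a placeholder policy, uniformly random actions can be sampled with NumPy, reusing the `env` from the example above:

```python
import numpy as np

# One binary action per agent, sampled uniformly at random
actions = np.random.randint(0, env.action_size, size=env.num_agents).tolist()
rewards, next_state, done = env.step(actions)
```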
Game Dynamics
-------------

`MPMGEnv` is a social dilemma based on the Prisoner's Dilemma:

- Full Defection: All agents choose to defect (action 0), which is the Nash equilibrium.
- Full Cooperation: All agents cooperate (action 1), achieving the Pareto-optimal outcome.
- Asymmetric Play: The agents split into a set of defectors and a set of cooperators, leading to a suboptimal outcome.
License
-------

This project is licensed under the MIT License. See the LICENSE file for details.
Contributing
------------

Contributions are welcome! Feel free to open an issue or submit a pull request for improvements, bug fixes, or new features.
Author
------

Igor Sadoune - igor.sadoune@polymtl.ca