Agent for RoomEnv-v0

This repo trains an agent that interacts with RoomEnv-v0. The agent uses handcrafted heuristics rather than reinforcement learning (RL). See the paper "A Machine With Human-Like Memory Systems" for more information.

Prerequisites

A Unix or Unix-like x86 machine.
Python 3.10 or higher.
Running in a virtual environment (e.g., conda or virtualenv) is highly recommended so that you don't interfere with the system Python.
Install the requirements by running pip install -r requirements.txt.

Run training

python train.py

The hyperparameters can be configured in train.yaml. The results will be saved in ./figures.

Results

Figure panels: Handcrafted 1, Handcrafted 2, Handcrafted 3, Handcrafted 4.

Total rewards with respect to different handcrafted policies and memory capacities.

Total rewards with respect to the number of agents. The lighter and narrower bars show the single-agent results.

Heuristics

Below are the heuristics for the single-agent and multi-agent setups.

Single Agent Policies

Inspired by theories of explicit human memory, we designed the following four handcrafted policies (models).

Handcrafted 1: Only episodic, forget the oldest and answer the latest. This agent only has an episodic memory system. When the episodic memory system is full, it forgets the oldest episodic memory. When a question is asked and more than one relevant episodic memory is found, it uses the latest relevant episodic memory to answer the question.

Handcrafted 2: Only semantic, forget the weakest and answer the strongest. This agent only has a semantic memory system. When the semantic memory system is full, it forgets the weakest semantic memory. When a question is asked and more than one relevant semantic memory is found, it uses the strongest relevant semantic memory to answer the question.

Handcrafted 3: Both episodic and semantic. This agent has both episodic and semantic memory systems. When the episodic memory system is full, it forgets similar episodic memories that can be compressed into one semantic memory. When the semantic memory system is full, it forgets the weakest semantic memory. When a question is asked, it first tries to answer with the latest relevant episodic memory. If it cannot, it uses the strongest relevant semantic memory to answer the question.

Handcrafted 4: Both episodic and pretrained semantic. At the beginning of an episode, the semantic memory system is populated with ConceptNet commonsense knowledge. When the episodic memory system is full, it forgets the oldest episodic memory. When a question is asked, it first tries to answer with the latest relevant episodic memory. If it cannot, it uses the strongest relevant semantic memory to answer the question.
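The forgetting and answering rules above can be sketched in a few lines of Python. This is a minimal illustration, not the code in agent.py: the class names, the (head, relation, tail, timestamp) tuple layout, and the choice to strengthen a semantic memory when it is seen again are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class EpisodicMemory:
    """Hypothetical episodic store of (head, relation, tail, timestamp) tuples."""
    capacity: int
    entries: list = field(default_factory=list)

    def add(self, head, relation, tail, timestamp):
        if self.capacity <= 0:
            return
        if len(self.entries) >= self.capacity:
            # Forget the oldest memory (smallest timestamp).
            self.entries.remove(min(self.entries, key=lambda m: m[3]))
        self.entries.append((head, relation, tail, timestamp))

    def answer(self, head, relation):
        relevant = [m for m in self.entries if m[:2] == (head, relation)]
        # Answer with the latest relevant memory, or None if nothing matches.
        return max(relevant, key=lambda m: m[3])[2] if relevant else None


@dataclass
class SemanticMemory:
    """Hypothetical semantic store mapping (head, relation, tail) to a strength."""
    capacity: int
    strengths: dict = field(default_factory=dict)

    def add(self, head, relation, tail, strength=1):
        key = (head, relation, tail)
        if key in self.strengths:
            # Assumption: seeing the same fact again strengthens it.
            self.strengths[key] += strength
            return
        if self.capacity <= 0:
            return
        if len(self.strengths) >= self.capacity:
            # Forget the weakest memory.
            del self.strengths[min(self.strengths, key=self.strengths.get)]
        self.strengths[key] = strength

    def answer(self, head, relation):
        relevant = {k: s for k, s in self.strengths.items() if k[:2] == (head, relation)}
        # Answer with the strongest relevant memory, or None if nothing matches.
        return max(relevant, key=relevant.get)[2] if relevant else None


def answer_question(episodic, semantic, head, relation):
    """Try episodic memory first, then fall back to semantic memory."""
    answer = episodic.answer(head, relation)
    return answer if answer is not None else semantic.answer(head, relation)


episodic = EpisodicMemory(capacity=2)
episodic.add("laptop", "AtLocation", "desk", 1)
episodic.add("laptop", "AtLocation", "sofa", 2)
episodic.add("mug", "AtLocation", "kitchen", 3)  # evicts the memory from timestamp 1

semantic = SemanticMemory(capacity=2)
semantic.add("laptop", "AtLocation", "desk", strength=3)
semantic.add("laptop", "AtLocation", "bag", strength=1)
semantic.add("phone", "AtLocation", "pocket", strength=2)  # evicts the weakest entry

print(answer_question(episodic, semantic, "laptop", "AtLocation"))  # sofa
print(answer_question(episodic, semantic, "phone", "AtLocation"))   # pocket
```

EpisodicMemory alone corresponds to the Handcrafted 1 rules, SemanticMemory alone to Handcrafted 2, and answer_question to the episodic-first fallback shared by Handcrafted 3 and 4.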

For a fair comparison, every agent has the same total memory capacity. For the Handcrafted 3 agent, the episodic and semantic memory systems have the same capacity, since this agent does not know a priori which one is more important. For the Handcrafted 4 agent, if space is left in the semantic memory system after it has been populated, the remaining space is given to the episodic memory system. To validate our handcrafted agents, we compare them with agents that forget and answer uniformly at random.
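The capacity rules can be written out as simple arithmetic. The sketch below is an illustration under stated assumptions: the function name, the policy labels, and the semantic_budget parameter (the space initially reserved for semantic memory in Handcrafted 4) are hypothetical, and odd totals are rejected for Handcrafted 3 only to keep the example simple.

```python
def split_capacity(total, policy, semantic_budget=0, num_pretrained=0):
    """Return the episodic/semantic split of a fixed total capacity.

    All parameter names are hypothetical; they are not taken from train.yaml.
    """
    if policy == "handcrafted_1":
        return {"episodic": total, "semantic": 0}
    if policy == "handcrafted_2":
        return {"episodic": 0, "semantic": total}
    if policy == "handcrafted_3":
        if total % 2 != 0:
            raise ValueError("this sketch expects an even total for an equal split")
        return {"episodic": total // 2, "semantic": total // 2}
    if policy == "handcrafted_4":
        if semantic_budget > total:
            raise ValueError("semantic_budget cannot exceed the total capacity")
        # Semantic memory keeps only what the pretrained facts actually occupy;
        # any unused part of its budget goes to episodic memory.
        semantic = min(semantic_budget, num_pretrained)
        return {"episodic": total - semantic, "semantic": semantic}
    raise ValueError(f"unknown policy: {policy}")


print(split_capacity(32, "handcrafted_3"))
print(split_capacity(32, "handcrafted_4", semantic_budget=16, num_pretrained=10))
```

In every branch the two values sum to the total, which is the fairness condition stated above.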

Multiple Agent Policies

The multi-agent policies work in the same way as the single-agent policies, except that the agents can use their combined memory systems to answer questions.
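One way to picture answering from combined memory is to pool the agents' episodic memories and apply the "answer the latest" rule to the pooled set. The following is a minimal sketch under that assumption; the function name and the (head, relation, tail, timestamp) tuple layout are hypothetical and may differ from how agent.py combines memories.

```python
def answer_from_combined(agent_memories, head, relation):
    """Answer from the pooled episodic memories of several agents.

    agent_memories is a list with one list of
    (head, relation, tail, timestamp) tuples per agent.
    """
    combined = [m for memories in agent_memories for m in memories]
    relevant = [m for m in combined if m[:2] == (head, relation)]
    if not relevant:
        return None
    # Same rule as the single agent: use the latest relevant memory.
    return max(relevant, key=lambda m: m[3])[2]


agent_a = [("laptop", "AtLocation", "desk", 1)]
agent_b = [("laptop", "AtLocation", "sofa", 4), ("mug", "AtLocation", "kitchen", 2)]

print(answer_from_combined([agent_a, agent_b], "laptop", "AtLocation"))  # sofa
```

Here agent_a alone would answer "desk", but the pooled memories contain a more recent observation from agent_b.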

pdoc documentation

Click on this link to see the HTML rendered docstrings

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

Fork the Project
Create your Feature Branch (git checkout -b feature/AmazingFeature)
Run make test && make style && make quality in the root repo directory, to ensure code quality.
Commit your Changes (git commit -m 'Add some AmazingFeature')
Push to the Branch (git push origin feature/AmazingFeature)
Open a Pull Request

Cite our paper

@misc{https://doi.org/10.48550/arxiv.2204.01611,
  doi = {10.48550/ARXIV.2204.01611},
  url = {https://arxiv.org/abs/2204.01611},
  author = {Kim, Taewoon and Cochez, Michael and Francois-Lavet, Vincent and Neerincx, Mark and Vossen, Piek},
  keywords = {Artificial Intelligence (cs.AI), FOS: Computer and information sciences, FOS: Computer and information sciences},
  title = {A Machine With Human-Like Memory Systems},
  publisher = {arXiv},
  year = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}

Authors

License

MIT
