This is the source code for my master's dissertation project, Multi-Agent Cooperation in Hanabi with Policy Optimisation. Improvements have been made since the dissertation was submitted, so please see tag:final-report for the version used in my dissertation final report.
The implementation is based on Proximal Policy Optimisation (PPO) and the Hanabi Learning Environment; a minimal sketch of the PPO objective is shown after the dependency list below.
- Python 3.7+ (because I can't survive without f-strings)
- PyTorch (currently CPU only) (someone buy me an NVIDIA laptop plz?)
- Hanabi Learning Environment
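For readers unfamiliar with PPO, its core is the clipped surrogate objective. The snippet below is a minimal, self-contained PyTorch sketch of that loss for illustration only; the function and variable names are mine, not the exact implementation in ppo/.

```python
import torch

def ppo_clip_loss(log_probs_new, log_probs_old, advantages, clip_eps=0.2):
    """Clipped surrogate objective from the PPO paper (Schulman et al., 2017).

    All arguments are 1-D tensors over a batch of sampled (state, action) pairs.
    Returns the loss to *minimise* (the negative of the clipped objective).
    """
    ratio = torch.exp(log_probs_new - log_probs_old)  # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()

# Tiny usage example with dummy data.
if __name__ == "__main__":
    lp_new = torch.randn(8, requires_grad=True)
    lp_old = lp_new.detach() + 0.1 * torch.randn(8)
    adv = torch.randn(8)
    loss = ppo_clip_loss(lp_new, lp_old, adv)
    loss.backward()
    print(loss.item())
```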
- configs/: JSON configuration files for agent training. Some are outdated, use for reference only.
- figures/: figures for this README.
- ppo/: main source files.
- scripts/: plotting scripts. Very volatile, use at own risk (I detest matplotlib).
Hanabi-Small is a smaller version of Hanabi with a maximum score of 10. Currently we can train an agent on Hanabi-Small in under 8 hours on an 8-core CPU machine.
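If you just want to poke at Hanabi-Small without the training code, the rl_env wrapper in the Hanabi Learning Environment can build it directly. A rough sketch (key names follow the HLE observation dict; this is independent of anything in ppo/):

```python
from hanabi_learning_environment import rl_env

# Hanabi-Small is a reduced game, hence the maximum score of 10.
env = rl_env.make(environment_name="Hanabi-Small", num_players=2)
observations = env.reset()

# Each player gets its own observation dict with an encoded vector and legal moves.
obs0 = observations["player_observations"][0]
print("vectorized observation length:", len(obs0["vectorized"]))
print("legal moves for player 0:", obs0["legal_moves"])
```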
Hanabi-Full is the full version with a maximum score of 25. We are still tuning the hyperparameters for Hanabi-Full, but these are our current results.
We can also do ad hoc evaluation using trained agents.
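As a sketch of what ad hoc (cross-play) evaluation looks like, two independently trained policies can be assigned to different seats of the same game. Below, a random placeholder policy stands in for a loaded PPO checkpoint; the real evaluation code and checkpoint format in ppo/ and scripts/ may differ.

```python
import random
from hanabi_learning_environment import rl_env

def placeholder_policy(observation):
    # Stand-in for a trained agent: a real agent would map observation["vectorized"]
    # to an action via its policy network instead of choosing at random.
    return random.choice(observation["legal_moves"])

# In a real run these would be two separately trained PPO agents loaded from checkpoints.
policies = {0: placeholder_policy, 1: placeholder_policy}

env = rl_env.make(environment_name="Hanabi-Small", num_players=2)
observations = env.reset()
done, episode_return = False, 0
while not done:
    seat = observations["current_player"]
    action = policies[seat](observations["player_observations"][seat])
    observations, reward, done, _ = env.step(action)
    episode_return += reward
print("cross-play episode return:", episode_return)
```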