What?

galvanise is a General Game Player, where games are written in GDL. The original galvanise code was converted to a library ggplib and galvanise_zero adds AlphaZero style learning. Much inspiration was from Deepmind’s related papers, and the excellent Expert Iteration paper. A number of Alpha*Zero open source projects were also inspirational: LeelaZero and KataGo (XXX add links).

Features

there is no game specific code other than the GDL description of the games, a high level python configuration file describing GDL symbols to state mapping and symmetries (see here for more information).
multiple policies - train assymetric games
fully automated, put in oven and strong model is baked
network replaced during self play games
training is very fast using proper coroutines at the C level. 1000s of concurrent games are trained using large batch sizes on GPU (for small networks).
uses a post processed replay buffer, which uses the excellent bcolz (XXX link) project. Training can allow arbitrary sampling from the buffer (giving emphasis to most recent data).
initially project used expert iteration. This was deprecated in favour of oscillating sampling (similar to KataGo).
3 value heads for games with draws

Training

used same setting for training all games types (cpuct 0.85, fpu 0.25).
uses smaller number of evaluations (200) than A0, oscillating sampling during training (75% of moves are skipped, using much less evals to do so).
policy squashing and extra noise to prevent overfitting
models use dropout, global average pooling and squeeze excite blocks (these are optional)
in general, takes 3-5 days in many of the trained game types below to become super human strength

See gzero_bot for how to play on Little Golem.

Status

Games with significant training, links to elo graphs and models:

Little Golem Champion in last attempts @ Connect6, Hex13, Amazons and Breakthrough, winning all matches. Retired from further Championships. Connect6 and Hex 13 are currently rated 1st and 2nd respectively on active users.

Amazons and Breakthrough won gold medals at ICGA 2018 Computer Olympiad. 👏 👏

Reversi is also strong relative to humans on LG, yet performs a bit worse than top AB programs (about ntest level 20 the last time I tested).

Also trained Baduk 9x9, it had a rating ~2900 elo on CGOS after 2-3 week of training.

Running

The code is in fairly good shape, but could do with some refactoring and documentation (especially a how to guide on how to train a game). It would definitely be good to have an extra pair of eyes on it. I’ll welcome and support anyone willing to try training a game for themselves. Some notes:

python is 2.7
requires a GPU/tensorflow
good starting point is https://github.com/richemslie/ggp-zero/blob/dev/src/ggpzero/defs

How to run and install instruction coming soon!

Name		Name	Last commit message	Last commit date
Latest commit History 654 Commits
bin		bin
doc		doc
src		src
.gitignore		.gitignore
LICENSE		LICENSE
readme.org		readme.org
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

What?

Features

Training

Status

Running

About

Releases

Packages

Languages

License

richemslie/galvanise_zero

Folders and files

Latest commit

History

Repository files navigation

What?

Features

Training

Status

Running

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages