AbhijeetKrishnan/sokoban-solvability-predictor

Repository files navigation

Sokoban Solvability Prediction Model

Built with TensorFlow.

This is a machine learning project attempting to build a model to predict the solvability of a Sokoban level.

A Sokoban level
Ole from the Haikemono collection of Sokoban levels by Jordi Domènech

Instructions

  1. Download the JD_Sokoban level collection (requires a browser).

  2. Extract contents into a levels/ folder.

  3. Initialize generator submodules.

    git submodule init
    git submodule update
  4. Download and install the FastDownward v21.12 planner.

    tar -xvzf fast-downward-21.12.tar.gz
    cd fast-downward-21.12
    ./build.py release
  5. Build train dataset (WARNING: takes a long time)

    python level_parser.py
    python level_solver.py
  6. Train model

    python model.py
  7. Predict the solvability of a custom level (TODO:)

Motivation

Puzzle game levels must be solvable for the player to have a satisfying experience. Puzzle level designers usually have to verify this by solving the levels they design themselves. Automated tools for solving levels, such as planners, exist, but they typically take a long time to return a verdict on large levels.

Having a tool which can quickly return a verdict for a level would make it easier for puzzle level designers to iterate on their designs while ensuring their levels remain solvable.

This is also an exercise for me in training an ML model for a toy task, and to see whether ML models are actually capable of such a task out-of-the-box. This is probably a wildly impractical way to build a level solvability checker.

Approach

Dataset

The levels used in this project have been sourced from -

  • Jordi Domènech's entire Sokoban collection (link)

All levels were either originally in, or were modified to fit, the level description format described in the Sokoban Wiki. Levels were padded with walls to fit a $50 \times 50$ grid. The input to the model is thus a $50 \times 50 \times 7$ tensor, with the seven tile types one-hot encoded along the channel dimension.
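Using the standard Sokoban Wiki character set, the encoding step might look like the sketch below; the channel ordering and the exact padding behavior of level_parser.py are assumptions.

```python
import numpy as np

# Assumed mapping from Sokoban Wiki level characters to channel indices;
# the ordering actually used by level_parser.py may differ.
TILE_CHANNELS = {'#': 0, ' ': 1, '$': 2, '.': 3, '@': 4, '*': 5, '+': 6}

def encode_level(rows, size=50):
    """One-hot encode a level into a (size, size, 7) tensor, padding with walls."""
    tensor = np.zeros((size, size, len(TILE_CHANNELS)), dtype=np.float32)
    tensor[:, :, TILE_CHANNELS['#']] = 1.0  # pad the whole grid with walls first
    for r, row in enumerate(rows):
        for c, ch in enumerate(row):
            tensor[r, c] = 0.0                     # clear the wall padding
            tensor[r, c, TILE_CHANNELS[ch]] = 1.0  # set the actual tile type
    return tensor

level = ["#####",
         "#@$.#",
         "#####"]
x = encode_level(level)
```

Every cell holds exactly one active channel, so `x.sum()` equals 2500 for a 50×50 grid.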

Solvability of a level was determined using the FastDownward planner (v21.12). Using the IPC-2011 Sokoban domain as a basis, levels are converted into a PDDL problem file and passed as input to the planner.
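The conversion can be sketched as below; the predicate and object names here are illustrative and only loosely follow the IPC-2011 sokoban-sequential domain, so the exact encoding produced by level_solver.py may differ.

```python
def level_to_pddl(problem_name, player, boxes, goals, floor):
    """Render a parsed level as a PDDL problem string.

    `player` is an (r, c) cell; `boxes`, `goals`, and `floor` are lists of
    (r, c) cells. Predicate names are illustrative, not the exact IPC-2011
    encoding (which also models move directions between locations).
    """
    loc = lambda rc: "loc-{}-{}".format(*rc)
    objects = " ".join(loc(p) for p in floor)
    init = ["(at player-01 {})".format(loc(player))]
    init += ["(at box-{:02d} {})".format(i, loc(b)) for i, b in enumerate(boxes)]
    init += ["(is-goal {})".format(loc(g)) for g in goals]
    goal = " ".join("(at-goal box-{:02d})".format(i) for i in range(len(boxes)))
    return (
        "(define (problem {})\n".format(problem_name)
        + "  (:domain sokoban-sequential)\n"
        + "  (:objects {} - location)\n".format(objects)
        + "  (:init {})\n".format(" ".join(init))
        + "  (:goal (and {})))\n".format(goal)
    )

pddl = level_to_pddl("ole-01", player=(1, 1), boxes=[(1, 2)],
                     goals=[(1, 3)], floor=[(1, 1), (1, 2), (1, 3)])
```

The resulting problem file, together with the domain file, is then handed to the planner; whether Fast Downward finds a plan within its limits determines the solvability label.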

The lack of unsolvable levels was addressed by augmenting the dataset: for every level, each block tile was turned, one at a time, into an empty tile, and the resulting unsolvable levels were added to the dataset. This produces a somewhat more even distribution of the two classes (solvable/unsolvable), since every initially sourced level was designed to be solvable.
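The augmentation can be sketched as follows; removing a single box leaves fewer boxes than goal squares, so the goal condition can no longer be satisfied (assuming the original level had exactly as many boxes as goals; this sketch only handles plain box tiles, not boxes already on goals).

```python
def unsolvable_variants(rows):
    """For each box tile '$', yield a copy of the level with that box
    replaced by an empty tile. With fewer boxes than goal squares, no plan
    can cover every goal, so each variant is unsolvable."""
    for r, row in enumerate(rows):
        for c, ch in enumerate(row):
            if ch == '$':
                variant = list(rows)
                variant[r] = row[:c] + ' ' + row[c + 1:]
                yield variant

rows = ["#@$.#",
        "#.$ #"]
variants = list(unsolvable_variants(rows))  # one variant per box, two here
```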

Additional levels were obtained using the following level generators -

The data was split into train/test in an $80:20$ ratio. The train data was further split into train/valid in an $80:20$ ratio. The class balance was -

|           | Train | Test |
| --------- | ----- | ---- |
| Positive  | 876   | 232  |
| Negative  | 407   | 88   |
| %Positive | 68.3  | 72.5 |
| Total     | 1283  | 320  |
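The two-stage split can be sketched at the index level; the exact class counts above depend on shuffling and rounding, so none of them are reproduced here.

```python
import numpy as np

def split_indices(n, seed=0):
    """80:20 train/test split, then 80:20 train/valid within the train part."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    cut = int(0.8 * n)
    train_all, test = idx[:cut], idx[cut:]
    cut2 = int(0.8 * len(train_all))
    return train_all[:cut2], train_all[cut2:], test

train, valid, test = split_indices(100)  # 64 / 16 / 20 examples
```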

Model

Model Architecture Diagram
Model Architecture Diagram created using NN SVG

The model uses a conv block to flatten the input channels from 7 to 1. It then uses another conv block to find useful features in the flattened image. Two fully-connected layers are then used to make the prediction from the calculated features. The model has a total of $270,715$ trainable parameters.
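A Keras sketch of this architecture is below; the kernel sizes, filter counts, and hidden-layer width are guesses, so the parameter count will not match the reported 270,715 exactly.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_model(height=50, width=50, channels=7):
    """One conv block to collapse 7 channels to 1, a second conv block for
    feature extraction, then two fully-connected layers for the prediction."""
    return tf.keras.Sequential([
        layers.Input(shape=(height, width, channels)),
        layers.Conv2D(1, 3, padding="same", activation="relu"),  # 7 -> 1 channels
        layers.Conv2D(8, 3, padding="same", activation="relu"),  # feature extraction
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(50, activation="relu"),
        layers.Dense(1, activation="sigmoid"),  # P(level is solvable)
    ])

model = build_model()
```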

Training

The network is trained using SGD with a batch size of 8 for 10 epochs. The loss function used is binary cross-entropy.
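The training setup amounts to a few lines of Keras; the toy model and random tensors below are stand-ins for the real architecture and dataset, and only the stated hyperparameters (SGD, batch size 8, 10 epochs, binary cross-entropy) are taken from this section.

```python
import numpy as np
import tensorflow as tf

# Random stand-ins for the real dataset; shapes match the 50x50x7 input tensors.
x_train = np.random.rand(32, 50, 50, 7).astype("float32")
y_train = np.random.randint(0, 2, size=(32, 1)).astype("float32")
x_val = np.random.rand(8, 50, 50, 7).astype("float32")
y_val = np.random.randint(0, 2, size=(8, 1)).astype("float32")

# Placeholder model standing in for the conv + dense stack described above.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(50, 50, 7)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer=tf.keras.optimizers.SGD(),
              loss=tf.keras.losses.BinaryCrossentropy(),
              metrics=[tf.keras.metrics.BinaryAccuracy()])
history = model.fit(x_train, y_train, batch_size=8, epochs=10,
                    validation_data=(x_val, y_val), verbose=0)
```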

Graph of Loss vs. Epoch number
Graph of loss vs. epoch number for train (grey) and validation (orange)
Graph of Binary Accuracy vs. Epoch number
Graph of binary accuracy vs. epoch number for train (grey) and validation (orange)

Evaluation

The trained model was evaluated on the test data using a batch size of 8.

TODO: baselines for comparison

Results

Accuracy

The test accuracy of the model was found to be $74.69\%$. A model which simply predicted the majority label would have had an accuracy of $72.5\%$. The F1 score is $0.84$ ($[0,1]$, where $1$ is perfect classification) and the MCC is $0.27$ ($[-1,1]$, where $0$ is random and $1$ is perfect classification).
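Both scores follow directly from the test-set confusion matrix; the helper below recomputes them from first principles (the actual counts are not reported here, so the example uses made-up numbers).

```python
import math

def f1_and_mcc(tp, fp, fn, tn):
    """F1 score and Matthews correlation coefficient from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom
    return f1, mcc

# A perfectly classified balanced set gives F1 = 1.0 and MCC = 1.0.
f1, mcc = f1_and_mcc(tp=5, fp=0, fn=0, tn=5)
```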

Runtime Performance

TODO:

Conclusion

The model is able to achieve a marginal improvement over the majority selector, but it is unclear whether it would show similar results on other problems, or whether it would provide meaningful benefit over using a planning-based solver.
