My project aimed to be a three-dimensional generative adversarial network (GAN) for generating voxel models, using a three-dimensional capsule network.
To keep my development environment portable, I used Miniconda. Download and install it, then replicate my environment from `environment.yml`: `conda env create --file environment.yml`
Once the environment is set up, the training dataset has to be created. Just run `python create_dataset.py`: it downloads the dataset, converts it from the MATLAB file format into NumPy arrays, and saves the result as one compressed HDF5 file.
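The last step of that conversion, writing NumPy arrays into one compressed HDF5 file, could look roughly like the sketch below. It is not the code from `create_dataset.py`: the function names, the dataset keys `voxels` and `labels`, and the gzip compression are assumptions made for illustration, and the voxel grids are assumed to have been read from the MATLAB files already (for example with `scipy.io.loadmat`).

```python
import numpy as np
import h5py


def save_voxels_hdf5(path, voxels, labels):
    """Write voxel grids and their class labels into one compressed HDF5 file.

    Hypothetical helper: the keys "voxels" and "labels" are illustrative only.
    """
    voxels = np.asarray(voxels, dtype=np.uint8)  # occupancy grids, shape (N, D, D, D)
    labels = np.asarray(labels, dtype=np.int64)  # one class index per model
    with h5py.File(path, "w") as f:
        # gzip compression keeps the mostly empty occupancy grids small on disk
        f.create_dataset("voxels", data=voxels, compression="gzip")
        f.create_dataset("labels", data=labels, compression="gzip")


def load_voxels_hdf5(path):
    """Read both arrays back into memory as NumPy arrays."""
    with h5py.File(path, "r") as f:
        return f["voxels"][:], f["labels"][:]
```

Keeping everything in a single file means a data loader only has to open one handle, whichever classes it trains on.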
It is highly recommended that you download the release version, because it includes the learned weights. Otherwise, you have to train the network yourself (which is rather time-consuming).
There are two different front-ends: a desktop application and a web service based on Flask.
For either of them to work, `generator.pkl` (the weights for the generator) has to be in the root directory of the repository.
For the desktop application, simply launch `python app_gui.py` after setting up the Miniconda environment as described above. I originally intended to ship this desktop app as an AppImage using https://github.com/linuxdeploy/linuxdeploy-plugin-conda, but that didn't work. The file `create_appimage.sh` is a leftover from that failed attempt: it creates an AppImage, but the result still needs a locally installed Conda environment with all required dependencies.
The easiest way to start the web service is to run `python app_flask.py` after the Miniconda environment is installed. Alternatively, you can build the Docker image and launch the Docker container by executing `create_docker.sh`. In both cases, open localhost:5000 in your browser to access it.
Once the setup is done, just run `python gan.py` to start training the GAN.
To see the trained models, run `jupyter notebook visualise.ipynb`. The notebook uses Matplotlib and Plotly to display training and generated voxel models.
- app_flask.py: web application for generating and displaying voxel models
- app_gui.py: desktop application for generating and displaying voxel models
- capsule.py: implementation of the capsule network (doesn't work unfortunately)
- constants.py: definition of used constants
- create_appimage.sh: builds an AppImage file
- create_dataset.py: downloads the dataset and converts it into a single HDF5 file
- create_docker.sh: builds and launches a Docker container for the Flask web application
- environment.yaml: Miniconda environment
- gan.py: the main file
- old_readme.md: former README.md, laying down the vision for this project
- README.md: this file
- visualise.ipynb: visualises the voxel models
- voxeldata.py: data loader for the training data
The capsule network does not work (for now). As I wrote at the end of exercise 1, I am aware that this was quite a big undertaking, and that I'm doing it more for the journey than for the goal. The code for the capsule network is still a bit "hacky": it runs, but it does not work. On the simple test case I tried it on, there was little to no improvement over 30 epochs.
However, since I do need something to show, I trained a GAN with regular three-dimensional CNNs on voxel data. To train the GAN within a reasonable time, I only used four classes:
- wardrobe
- bed
- chair
- laptop
All generated models can be found at: https://cloud.tugraz.at/index.php/s/F8L9BwiXznP3FrL
Here are a few examples of the generated models:
Initially, the generator only produces random noise. Because of the threshold value of 0.9, only a few voxels are shown.
After 40 epochs, the output resembles a chair, but it is missing some important features.
After 100 more epochs, the generator produces something that looks like a mixture between a chair and a table.
This example looks like the upper part of a chair.
After 350 epochs, the generator is generating chairs again; this one is overall one of the best generated models.
The model generated after many more epochs again looks like a chair, but this one has some quirks.
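The threshold value of 0.9 mentioned for the first example can be illustrated with a small sketch. This is not the code from the notebook: the 32×32×32 resolution, the uniform noise standing in for an untrained generator, the strict `>` comparison, and the helper name `binarize` are all assumptions.

```python
import numpy as np

THRESHOLD = 0.9  # the value quoted in the text above


def binarize(volume, threshold=THRESHOLD):
    """Return a boolean occupancy grid: True where the value exceeds the threshold."""
    return np.asarray(volume) > threshold


# Stand-in for the output of an untrained generator (assumed: uniform noise in [0, 1))
rng = np.random.default_rng(0)
noise = rng.uniform(0.0, 1.0, size=(32, 32, 32))

occupied = binarize(noise)
print(f"{occupied.mean():.1%} of the voxels would be displayed")
```

Under these assumptions, only about a tenth of the voxels pass the threshold, which is why an untrained generator yields a sparse cloud rather than a solid block.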
I am using binary cross-entropy loss (BCELoss), since I want to measure the error of a reconstruction. For the discriminator, I measure the loss in distinguishing real from generated examples and sum up both losses. For the generator, I'm interested in how well it is able to fool the discriminator.
The target for both losses is of course to be as small as possible (<< 1) and to keep getting smaller. Since this is a GAN working with 3D data, this did not work out.
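The two losses described above can be written out in plain Python to make the targets explicit. This is only a sketch of the standard GAN formulation with binary cross entropy, not the training code from `gan.py`: the function names and the example scores are made up for illustration.

```python
import math


def bce(predictions, target, eps=1e-7):
    """Mean binary cross entropy of predictions against one constant target (0 or 1)."""
    total = 0.0
    for p in predictions:
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(target * math.log(p) + (1 - target) * math.log(1 - p))
    return total / len(predictions)


def discriminator_loss(d_real, d_fake):
    # Real examples should be scored 1 and generated ones 0; both parts are summed.
    return bce(d_real, 1) + bce(d_fake, 0)


def generator_loss(d_fake):
    # The generator wants the discriminator to score its output as real (1).
    return bce(d_fake, 1)


# Hypothetical discriminator scores for a batch of real and generated models
d_real = [0.9, 0.8]
d_fake = [0.2, 0.1]
print(discriminator_loss(d_real, d_fake), generator_loss(d_fake))
```

The sketch also shows why both losses cannot keep shrinking together: scores on generated models that lower the discriminator loss (close to 0) raise the generator loss, and vice versa.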
Task | Hours |
---|---|
Getting familiar with the data / used libraries | 10 |
In-depth reading of related publications | 15 |
Coding of solution | 40 |
Creating presentation of results | 8 |
The code for the Voxel-GAN is based on 3DGAN-Pytorch from @rimchang.
The code for the Capsule Network is based on 3D Point Capsule Networks by @yongheng1991.