Skip to content

PyTorch semi-supervised clustering with Convolutional Autoencoders

License

Notifications You must be signed in to change notification settings

michaal94/Semisupervised-Clustering

Repository files navigation

Semisupervised Clustering

This repository contains the code for semi-supervised clustering developed for Master Thesis: "Automatic analysis of images from camera-traps" by Michal Nazarczuk from Imperial College London

The algorithm is inspired with DCEC method (Deep Clustering with Convolutional Autoencoders). The main change adds "labelling" loss (cross-entropy between labelled examples and their predictions) as the loss component.

Prerequisites

The following libraries are required to be installed for the proper code evaluation:

  1. PyTorch
  2. NumPy
  3. scikit-learn
  4. TensorboardX

The code was written and tested on Python 3.4.1

Installation and usage

Installation

Just copy the repository to your local folder:

git clone https://github.com/michaal94/Semisupervised-Clustering

Use of the algortihm

In order to test the basic version of the semi-supervised clustering just run it with your python distribution you installed libraries for (Anaconda, Virtualenv, etc.). In general type:

cd Semisupervised-Clustering
python3 semi_supervised.py

The example will run sample clustering with MNIST-train dataset.

Options

The algorithm offers a plenty of options for adjustments:

  1. Mode choice: full or pretraining only, use: --mode train_full or --mode pretrain

    Fot full training you can specify whether to use pretraining phase --pretrain True or use saved network --pretrain False and --pretrained net ("path" or idx) with path or index (see catalog structure) of the pretrained network

  2. Dataset choice:

    • MNIST - train, test, full
    • Custom dataset - use the following data structure (characteristic for PyTorch):
      -data_directory (clusters must corespond to real clustering only for statistics)
          -cluster_1
              -image_1
              -image_2
              -...
          -cluster_2
              -image_1
              -image_2
              -...
          -...
      -data_directory_l (data used as labelled, use at least one example in each class in the current version of algorithm)
          -cluster_1
              -image_1
              -image_2
              -...
          -cluster_2
              -image_1
              -image_2
              -...
          -...
      

    Use the following: --dataset MNIST-train, --dataset MNIST-test, --dataset MNIST-full or --dataset custom (use the last one with path --dataset_path 'path to your dataset' and the trasformation you want for images --custom_img_size [height, width, depth])

  3. Different network architectures:

    • CAE 3 - convolutional autoencoder used in DCEC --net_architecture CAE_3
    • CAE 3 BN - version with Batch Normalisation layers --net_architecture CAE_3bn
    • CAE 4 (BN) - convolutional autoencoder with 4 convolutional blocks --net_architecture CAE_4 and --net_architecture CAE_4bn
    • CAE 5 (BN) - convolutional autoencoder with 5 convolutional blocks --net_architecture CAE_5 and --net_architecture CAE_5bn (used for 128x128 photos)

    The following opions may be used for model changes:

    • LeakyReLU or ReLU usage: --leaky True/False (True provided better results)
    • Negative slope for Leaky ReLU: --neg_slope value (Values around 0.01 were used)
    • Use of sigmoid and tanh activations at the end of encoder and decoder: --activations True/False (False provided better results)
    • Use of bias in layers: --bias True/False
  4. Optimiser and scheduler settings (Adam optimiser):

    • Learning rate: --rate value (0.001 is reasonable value for Adam)
    • Learning rate for pretraining phase: --rate_pretrain value (0.001 can be used as well)
    • Weight decay: --weight value (0 was used)
    • Weight decay for pretraining phase: --weight_pretrain value
    • Scheduler step (how many iterations till the rate is changed): --sched_step value
    • Scheduler step for pretraining phase: --sched_step_pretrain value
    • Scheduler gamma (multiplier of learning rate): --sched_gamma value
    • Scheduler gamma for pretraining phase: --sched_gamma_pretrain value
  5. Algorithm specific parameters:

    • Clustering loss weight (for reconstruction loss fixed with weight 1): --gamma value (Value of 0.1 provided good results)
    • Labelling loss weight: --gamma_lab value (0.01 provided good results)
    • Update interval for target distribution (in number of batches between updates): update_interval value (Value may be chosen such that distribution is updated each 1000-2000 photos)
    • Label check interval --label_upd_interval value (Suggested to leave each iteration update)
    • Stop criterium tolerance --tol value (Depends on dataset, for small 0.01 was used for bigger e.g. MNIST - 0.001)
    • Target number of clusters --num_clusters value
  6. Other options:

    • Batch size: --batch_size value (Depend on your device, but remember that too much may be bad for convergence)
    • Epochs if stop criterium not met: --epochs value
    • Epochs of pretraining: --epochs_pretrain value (300 epochs were used, 200 with 0.001 lerning rate and 100 with 10 times smaller - --sched_step_pretrain 200, --sched_gamma_pretrain 0.1)
    • Report printing frequency (in batches): --printing_frequency value
    • Tensorboard export: --tensorboard True/False

Catalog structure

The code creates the following catalog structure when reporting the statistics:

-Reports
    -(net_architecture_name)_(index).txt
-Nets (copies of weights
    -(net_architecture_name)_(index).pt
    -(net_architecture_name)_(index)_pretrained.txt
-Runs
    -(net_architecture_name)_(index)  <- directory containing tensorboard event file

The files are indexed automatically for the files not to be accidentally overwritten.

Performance

The code was mainly used to cluster images coming from camera-trap events. However, some additional benchmarks were performed on MNIST datasets. The following table gather some results (for 2% of labelled data):

Set NMI Acc
MNIST-full 95.13 98.22%
MNIST-test 89.59 95.29%

In addition, the t-SNE plots of plain and clustered MNIST full dataset are shown:

Full set before clustering:

MNIST full set t-SNE plot

After clustering:

MNIST full set after clustering t-SNE plot

About

PyTorch semi-supervised clustering with Convolutional Autoencoders

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages