Deep Clustering

(For ECS 271.) Let's compare the KL divergence loss propounded by Xie et al. (2016) to some alternatives, starting with features taken from a pre-trained image classifier instead of the stacked autoencoders (SAEs) used by Xie et al.
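
For reference, the Xie et al. objective computes a soft assignment Q of embeddings to cluster centers with a Student's t kernel, derives a sharpened target distribution P from Q, and minimizes KL(P‖Q). Below is a minimal PyTorch sketch of that loss, not this project's actual implementation (the function names and the `alpha` default are mine):

```python
import torch

def soft_assignments(z, centers, alpha=1.0):
    # Student's t soft assignment q_ij between embedding z_i and center mu_j
    # (Xie et al. 2016, Eq. 1). z: (n, d) embeddings; centers: (k, d).
    dist_sq = torch.cdist(z, centers).pow(2)               # (n, k) squared distances
    q = (1.0 + dist_sq / alpha).pow(-(alpha + 1.0) / 2.0)
    return q / q.sum(dim=1, keepdim=True)

def target_distribution(q):
    # Sharpened target p_ij (Eq. 3): squares q and renormalizes,
    # emphasizing high-confidence assignments.
    weight = q.pow(2) / q.sum(dim=0, keepdim=True)
    return weight / weight.sum(dim=1, keepdim=True)

def kl_clustering_loss(z, centers):
    # KL(P || Q), minimized w.r.t. both the embeddings and the centers.
    q = soft_assignments(z, centers)
    p = target_distribution(q).detach()                    # targets held fixed per step
    return torch.sum(p * (p / q).log())
```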

Main approach

  1. Get a pretrained ResNet-18:

```python
model = primary.net.new()
```

  2. Download the STL-10 dataset:

```sh
python data/stl10_input.py
```

  3. Partition the dataset into train and test sets:

```sh
python -m data.dataset
```

  4. Run the trainer to fine-tune ResNet-18 (a sketch of this training loop follows the list):

```sh
NAME=my_session
python -m primary.train \
	--save_path "saves/$NAME.pth" \
	--log_path "logs/$NAME.log"
```
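
For orientation, here is a hypothetical sketch of what the fine-tuning stage amounts to. The real logic lives in `primary/train.py`; the embedding size, optimizer settings, and random center initialization below are all assumptions, and `kl_clustering_loss` is the sketch from above.

```python
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

net = models.resnet18(pretrained=True)
net.fc = torch.nn.Linear(net.fc.in_features, 64)   # swap the classifier head for an embedding head

centers = torch.randn(10, 64, requires_grad=True)  # 10 STL-10 classes; random init is an assumption
opt = torch.optim.SGD(list(net.parameters()) + [centers], lr=1e-1)

loader = DataLoader(
    datasets.STL10("data", split="train", download=True,
                   transform=transforms.ToTensor()),   # normalization omitted for brevity
    batch_size=256, shuffle=True)

for images, _ in loader:                           # labels are never used
    loss = kl_clustering_loss(net(images), centers)
    opt.zero_grad()
    loss.backward()
    opt.step()
```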

SAE approach

  1. Prepare the dataset:

```sh
python -m sae.dataset
```

  2. Pretrain the SAE (a reconstruction-loss sketch follows this list):

```sh
NAME=sae-pretrain
rm "logs/$NAME.log"
python -m sae.pretrain \
	--ep 5000 \
	--lr 1e-1 \
	--test_every 0 \
	--print_every 0 \
	--save_path "saves/$NAME.pth" \
	--log_path "logs/$NAME.log"
```

  3. Fine-tune the SAE's encoder:

```sh
NAME=sae-finetune
python -m sae.finetune \
	--ep 5000 \
	--lr 1e-1 \
	--test_every 0 \
	--print_every 0 \
	--load_path "saves/sae-pretrain.pth" \
	--save_path "saves/$NAME.pth" \
	--log_path "logs/$NAME.log"
```
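
The SAE stages train on reconstruction error rather than the KL objective. A toy sketch under assumed sizes (the input width 324 stands in for a flattened HOG vector; the project's real architecture is in `sae/`):

```python
import torch
import torch.nn as nn

class SAE(nn.Module):
    # Toy stacked autoencoder; layer widths are assumptions.
    def __init__(self, d_in=324, d_hid=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(d_in, 256), nn.ReLU(),
                                     nn.Linear(256, d_hid))
        self.decoder = nn.Sequential(nn.Linear(d_hid, 256), nn.ReLU(),
                                     nn.Linear(256, d_in))

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = SAE()
opt = torch.optim.SGD(model.parameters(), lr=1e-1)
batches = [torch.randn(32, 324) for _ in range(10)]  # dummy stand-in for HOG feature batches

# Pretraining minimizes reconstruction error; no labels are involved.
for x in batches:
    loss = nn.functional.mse_loss(model(x), x)
    opt.zero_grad()
    loss.backward()
    opt.step()
```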

TODO

  1. Subclass the trainer to provide a `_loss` function that computes SAE reconstruction loss instead of the KL divergence loss used for fine-tuning (sketched below)
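
A hypothetical shape for that subclass; the base class here is a stand-in and its `_loss` signature is an assumption, only the idea of swapping in a reconstruction loss comes from this repo:

```python
import torch.nn.functional as F

class Trainer:                                # stand-in for the project's trainer (assumed)
    def __init__(self, model):
        self.model = model

    def _loss(self, batch):
        raise NotImplementedError             # the base trainer would compute the KL Div loss here

class ReconstructionTrainer(Trainer):
    # Overrides _loss so training minimizes SAE reconstruction error
    # instead of the KL divergence objective.
    def _loss(self, batch):
        return F.mse_loss(self.model(batch), batch)
```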

DONE

  1. Add SAE net and trainer to project
  2. Extract HOG features
  3. Make dataset for SAE training
  4. Show ACC in trainer
  5. Hyperparameter search
  6. Split dataset
  7. Write dataset module
  8. Write ACC computation (see the sketch after this list)
  9. Write trainer to fine-tune the ResNet-18 feature extractor
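
ACC here is the standard unsupervised clustering accuracy: predicted cluster ids are matched one-to-one against ground-truth labels so as to maximize agreement, via the Hungarian algorithm. A sketch of the usual computation (the repo's own implementation may differ):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def cluster_acc(y_true, y_pred):
    # Count co-occurrences of predicted cluster ids and true labels, then find
    # the one-to-one mapping that maximizes total agreement.
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    k = max(y_true.max(), y_pred.max()) + 1
    counts = np.zeros((k, k), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        counts[p, t] += 1
    rows, cols = linear_sum_assignment(counts.max() - counts)  # minimize cost = maximize matches
    return counts[rows, cols].sum() / y_true.size
```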
