Code for the paper "ProxyDR: Deep Hyperspherical Metric Learning with Distance Ratio-Based Formulation"
- Python 3.8 (for conda environment installation, you can follow the commands in conda_installation.txt)
- PyTorch (http://pytorch.org/) (gpytorch 1.4.1)
- NumPy (version 1.19.5)
- Pandas (version 1.0.5)
- Scikit-learn (version 0.24.2)
- SciPy (version 1.5.0)
- Biopython (version 1.79)
- Json5 (version 0.8.5)
- scikit-bio
- ete3
We used CIFAR-100 from torchvision (https://pytorch.org/vision/stable/datasets.html).
Alternatively, one may download the CIFAR-100 dataset (python version) from https://www.cs.toronto.edu/~kriz/cifar.html.
One can download the NABirds dataset from https://dl.allaboutbirds.org/nabirds. You need to change the path names in nabirds_cls.csv, nabirds_cls2.csv, and nabirds_info.csv so that the images are located at the written paths (you only need to change "DATA_init" to the corresponding folder name in each line).
You need to run Prepare_NABirds.ipynb after properly changing the config.json file as explained in the train section.
You can download the plankton datasets Small microplankton (MicroS), Large microplankton (MicroL), and Mesozooplankton (MesoZ). These datasets should be placed inside a folder named "plankton_data" (you need to create this folder). You need to change the path names in MicroS_cls.csv, MicroS_info.csv, MicroL_cls.csv, MicroL_info.csv, MesoZ_cls.csv, and MesoZ_info.csv so that the images are located at the written paths (you only need to change "DATA_init" to the corresponding folder name in each line; for instance, you might use the command `sed -i 's/DATA_init/Data_path_name/g' MicroS_cls.csv`).
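As a self-contained illustration of that substitution (the file path and replacement directory here are hypothetical, not the repository's actual values), `sed -i` rewrites the DATA_init placeholder in place:

```shell
# Illustrative only: create a sample CSV line containing the DATA_init
# placeholder, then substitute a hypothetical data path in place.
printf 'DATA_init/plankton_data/MicroS/img_0001.png\n' > /tmp/sample_cls.csv
# Use '|' as the sed delimiter so the replacement path may contain '/'.
sed -i 's|DATA_init|/home/user|g' /tmp/sample_cls.csv
cat /tmp/sample_cls.csv
```

Note that `sed -i` with no suffix argument is GNU sed syntax; on BSD/macOS sed the equivalent is `sed -i ''`.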
Before training, in the config.json file, you need to specify where the "nabirds" and "plankton_data" folders are located (DATA_init) and where this repository (ProxyDR) is located (FOLDER_init).
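The exact schema of config.json is defined by the repository; as a hedged sketch, the two entries described above might look like this (both paths are placeholders):

```json
{
    "DATA_init": "/home/user/datasets",
    "FOLDER_init": "/home/user/ProxyDR"
}
```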
To train on the CIFAR-100 dataset, run `python train_cifar100.py --GPU [GPU_NUMBER(S)] --method [METHOD_NAME] --distance [DISTANCE] --use_val --seed [SEED_NUMBER] --[TRAINING_OPTION]`.
To train on the NABirds dataset, run `python train_nabirds.py --GPU [GPU_NUMBER(S)] --method [METHOD_NAME] --distance [DISTANCE] --use_val --seed [SEED_NUMBER] --[TRAINING_OPTION]`.
To train on the plankton datasets, run `python train.py --GPU [GPU_NUMBER(S)] --dataset [DATASET_NAME] --method [METHOD_NAME] --distance [DISTANCE] --size_inform --use_val --seed [SEED_NUMBER] --[TRAINING_OPTION]`.
- Methods ([METHOD_NAME]): Softmax: `softmax`, NormFace: `normface`, ProxyDR (default): `DR`, CORR loss: `--method DR --mds_W --CORR`
- Training options and the corresponding [TRAINING_OPTION] names: Standard: default (without any --[TRAINING_OPTION]), EMA: `--ema`, Dynamic (scale factor): `--dynamic`, MDS (multidimensional scaling): `--mds_W`
For example, to train the NormFace model on the MicroS dataset with the standard option (GPU: 0, seed: 1, Euclidean distance, size information, and validation), run `python train.py --GPU 0 --dataset MicroS --method normface --distance euc --size_inform --seed 1 --use_val`.
For example, to train the ProxyDR model on the MicroS dataset with the MDS and dynamic options (GPU: 0, seed: 1, Euclidean distance, size information, and validation), run `python train.py --GPU 0 --dataset MicroS --method DR --distance euc --size_inform --seed 1 --use_val --mds_W --dynamic`.
For example, to train the CORR model (requires MDS) on the MicroS dataset (GPU: 0, seed: 1, Euclidean distance, size information, and validation), run `python train.py --GPU 0 --dataset MicroS --method DR --distance euc --size_inform --seed 1 --use_val --mds_W --CORR`.
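A wrapper script that bundles such commands might be sketched as the following dry run (it only prints the commands it would launch; the method names, seeds, and dataset here are illustrative, not necessarily what the repository's scripts actually iterate over):

```shell
# Hypothetical dry-run sketch of a wrapper like train_MicroS_whole_models.sh:
# print (rather than execute) one training command per method and seed.
for method in softmax normface DR; do
  for seed in 1 2 3; do
    echo python train.py --GPU 0 --dataset MicroS --method "$method" \
      --distance euc --size_inform --use_val --seed "$seed"
  done
done
```

Replacing `echo python` with `python` would actually run the 9 training jobs sequentially.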
If you want to replicate the experiments, instead of typing each training setting, you can run train_CIFAR100_whole_models.sh, train_NABirds_whole_models.sh, train_MicroS_whole_models.sh, train_MicroL_whole_models.sh, and train_MesoZ_whole_models.sh. (You may want to change the GPU number. Values might differ due to randomness.)
To evaluate CIFAR-100 models, run `python eval_cifar100.py --GPU [GPU_NUMBER(S)] --method [METHOD_NAME] --distance [DISTANCE] --use_val --seed [SEED_NUMBER] --[TRAINING_OPTION]`.
To evaluate NABirds models, run `python eval_nabirds.py --GPU [GPU_NUMBER(S)] --method [METHOD_NAME] --distance [DISTANCE] --use_val --seed [SEED_NUMBER] --[TRAINING_OPTION]`.
To evaluate plankton dataset models, run `python eval_.py --GPU [GPU_NUMBER(S)] --dataset [DATASET_NAME] --method [METHOD_NAME] --distance [DISTANCE] --size_inform --use_val --seed [SEED_NUMBER] --[TRAINING_OPTION]`.
- `--last`: evaluate the model from the last training epoch (probably not the best model)
If you want to replicate the experiments, instead of typing each evaluation setting, you can run eval_CIFAR100_whole_models.sh, eval_NABirds_whole_models.sh, eval_MicroS_whole_models.sh, eval_MicroL_whole_models.sh, and eval_MesoZ_whole_models.sh. (You may want to change the GPU number. Values might differ due to randomness.)
The training and evaluation results will be recorded in `./record/`.
The dynamic option implementation is modified from https://github.com/4uiiurz1/pytorch-adacos/blob/master/metrics.py.
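For reference, the AdaCos-style dynamic scale rule from the linked metrics.py can be sketched as follows. This is a hedged NumPy stand-in with illustrative names (`dynamic_scale`, `prev_scale` are not the repository's actual identifiers); the repository's `--dynamic` code adapts this idea and may differ in detail.

```python
import numpy as np

# Hedged sketch of the AdaCos dynamic scale update: the scale is chosen
# so that exp-scaled non-target logits and the median target angle balance.
def dynamic_scale(logits, labels, prev_scale):
    """Update the scale factor from one batch.

    logits: (B, C) cosine similarities; labels: (B,) integer class ids;
    prev_scale: scale used in the previous iteration.
    """
    theta = np.arccos(np.clip(logits, -1.0, 1.0))
    one_hot = np.eye(logits.shape[1])[labels]
    # B_avg: batch mean of the summed exp-scaled non-target logits
    B = np.where(one_hot < 1, np.exp(prev_scale * logits), 0.0)
    B_avg = B.sum(axis=1).mean()
    # median angle to the target class, capped at pi/4
    theta_med = np.median(theta[one_hot == 1])
    return np.log(B_avg) / np.cos(min(np.pi / 4, theta_med))
```

The cap at pi/4 keeps the denominator well away from zero early in training, when target angles are still large.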