This project is a PyTorch implementation of
GLIF: A Unified Gated Leaky Integrate-and-Fire Neuron for Spiking Neural Networks
Xingting Yao, Fanrong Li, Zitao Mo, Jian Cheng
NeurIPS 2022 Poster Presentation
The main requirements of this project are as follows:
- Python 3.8.8
- PyTorch == 1.10.0+cu113
- Torchvision == 0.11.1+cu113
- CUDA 11.3
- SpikingJelly == 0.0.0.0.10
Our trained models can be found in Gated-LIF/trained models. Download and place them in any folder you like, e.g., /home/GLIF_models.
The following are the commands to evaluate the trained models with the script train.py. We recommend using absolute paths for the required directories, and please make sure to change the current working directory to this project first, i.e., $pwd>>.../Gated-LIF.
Note that we use a single GPU for evaluation.
# CIFAR-10
## Resnet-18
CUDA_VISIBLE_DEVICES=[GPU-ID] python -u train.py --modeltag [TRAINED-MODEL-FILENAME] --soft-mode --eval --eval-resume [PATH-TO-TRAINED-MODEL-FOLDER] --stand18 --channel-wise --t [TIMESTEP] --dataset-path [PATH-TO-DATASET] > evaluation.log
## Resnet-19
CUDA_VISIBLE_DEVICES=[GPU-ID] python -u train.py --modeltag [TRAINED-MODEL-FILENAME] --soft-mode --eval --eval-resume [PATH-TO-TRAINED-MODEL-FOLDER] --channel-wise --t [TIMESTEP] --dataset-path [PATH-TO-DATASET] > evaluation.log
# CIFAR-100
## Resnet-18
CUDA_VISIBLE_DEVICES=[GPU-ID] python -u train.py --modeltag [TRAINED-MODEL-FILENAME] --soft-mode --eval --eval-resume [PATH-TO-TRAINED-MODEL-FOLDER] --stand18 --channel-wise --t [TIMESTEP] --dataset-path [PATH-TO-DATASET] --cifar100 > evaluation.log
## Resnet-19
CUDA_VISIBLE_DEVICES=[GPU-ID] python -u train.py --modeltag [TRAINED-MODEL-FILENAME] --soft-mode --eval --eval-resume [PATH-TO-TRAINED-MODEL-FOLDER] --channel-wise --t [TIMESTEP] --dataset-path [PATH-TO-DATASET] --cifar100 > evaluation.log
# ImageNet
## ResNet-18MS
CUDA_VISIBLE_DEVICES=[GPU-ID] python -u train.py --modeltag [TRAINED-MODEL-FILENAME] --soft-mode --eval --eval-resume [PATH-TO-TRAINED-MODEL-FOLDER] --MS18 --channel-wise --t [TIMESTEP] --train-dir [PATH-TO-IMAGENET-TRAININGSET] --val-dir [PATH-TO-IMAGENET-VALIDATIONSET] --imagenet > evaluation.log
Specifically, [TRAINED-MODEL-FILENAME] refers to the filename of the .tar file, e.g., "resCifar18stand-CIFAR10-step6-CW.pth.tar". [PATH-TO-TRAINED-MODEL-FOLDER] refers to the folder that contains the trained models, e.g., /home/GLIF_models. [TIMESTEP] refers to the length of the model's time window, e.g., the timestep of the model "resCifar18stand-CIFAR10-step6-CW.pth.tar" is 6.
Evaluation results are printed in evaluation.log.
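For example, to evaluate the provided ResNet-18 CIFAR-10 model at timestep 6 (the GPU ID and dataset path below are hypothetical placeholders; substitute your own):
CUDA_VISIBLE_DEVICES=0 python -u train.py --modeltag resCifar18stand-CIFAR10-step6-CW.pth.tar --soft-mode --eval --eval-resume /home/GLIF_models --stand18 --channel-wise --t 6 --dataset-path /home/datasets/CIFAR10 > evaluation.log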
The following are the commands to train the models from scratch.
# CIFAR-10
## Resnet-18
CUDA_VISIBLE_DEVICES=[GPU-ID] python -u train.py --epoch 200 --batch-size 64 --learning-rate 0.1 --modeltag [CHECKPOINT-FILENAME] --soft-mode --stand18 --channel-wise --randomgate --tunable-lif --t [TIMESTEP] --dataset-path [PATH-TO-DATASET] > train.log
# CIFAR-100
## Resnet-18
CUDA_VISIBLE_DEVICES=[GPU-ID] python -u train.py --epoch 200 --batch-size 64 --learning-rate 0.1 --modeltag [CHECKPOINT-FILENAME] --soft-mode --stand18 --channel-wise --randomgate --tunable-lif --t [TIMESTEP] --dataset-path [PATH-TO-DATASET] --cifar100 > train.log
# ImageNet (distributed computation on multi-GPUs)
## ResNet-18MS
CUDA_VISIBLE_DEVICES=[GPU-IDs] python -m torch.distributed.run --master_port [PORT-ID] --nproc_per_node [NUMBER-OF-GPUs] train.py --epoch 150 --batch-size 50 --learning-rate 0.1 --modeltag [CHECKPOINT-FILENAME] --soft-mode --MS18 --channel-wise --randomgate --tunable-lif --t [TIMESTEP] --train-dir [PATH-TO-IMAGENET-TRAININGSET] --val-dir [PATH-TO-IMAGENET-VALIDATIONSET] --imagenet > train.log
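As a concrete illustration, a two-GPU ImageNet run might look like the following (the GPU IDs, port, checkpoint name, and dataset paths are hypothetical; substitute your own):
CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.run --master_port 29500 --nproc_per_node 2 train.py --epoch 150 --batch-size 50 --learning-rate 0.1 --modeltag MS18-imagenet-step6 --soft-mode --MS18 --channel-wise --randomgate --tunable-lif --t 6 --train-dir /home/datasets/imagenet/train --val-dir /home/datasets/imagenet/val --imagenet > train.log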
Training details are printed in train.log. Checkpoints are stored in ./raw/models. Model options and training hyperparameters are configurable via command-line arguments; the arguments and their descriptions can be found in .../Gated-LIF/train.py, from line 22 to line 71.
We plug GLIF into an open-source project for CIFAR10-DVS, SEW-PLIF-CIFAR10-DVS. The code, trained models, and training logs for CIFAR10-DVS are saved in the folder .../Gated-LIF/cifar10dvs. The following is the command we use to train a GLIF-based 7B-wideNet:
# CIFAR10-DVS
## 7B-wideNet
CUDA_VISIBLE_DEVICES=6 python ./cifar10dvs/train.py --dsr_da -amp -out_dir ./logs -model SEWResNet_GLIF_dsr -cnf ADD -device cuda:0 -dts_cache /mnt/lustre/GPU8/home/usr/dvs_datasets/DVSCIFAR10/cifar10dvs_cache_SEW -epochs 200 -T_max 64 -T 16 -data_dir /mnt/lustre/GPU8/home/usr/dvs_datasets/DVSCIFAR10 -lr 0.01 -b 32 > True_widePLIF7B_GLIF-T_16-anneal-dsr-epoch_200.log
In .../Gated-LIF/cifar10dvs, we add several models, including GLIF-based ones, to the original cifar10dvs project. Their model names can easily be found from line 128 to line 144 in .../Gated-LIF/cifar10dvs/train.py.
Model | TimeStep | CIFAR10 Top-1(%) | CIFAR100 Top-1(%) |
---|---|---|---|
ResNet-18 | 2 | 94.19 | 74.77 |
ResNet-18 | 4 | 94.75 | 76.50 |
ResNet-18 | 6 | 95.09 | 77.49 |
ResNet-19 | 2 | 94.56 | 75.60 |
ResNet-19 | 4 | 94.95 | 77.22 |
ResNet-19 | 6 | 95.14 | 77.42 |
Model | Dataset | TimeStep | Top-1(%) |
---|---|---|---|
7B-wideNet | CIFAR10-DVS | 16 | 78.10 |
ResNet-18MS | ImageNet | 6 | 68.10 |
P.S. the CIFAR10-DVS result is 1.3% higher than reported in the OpenReview discussion, because we fixed a minor bug in .../Gated-LIF/cifar10dvs/smodels. The fixed script, new training logs, and up-to-date trained models were updated or added by 2022/11/3; they should work well and match the results in the table above. The paper on OpenReview has already been corrected accordingly.
- In the script .../Gated-LIF/train.py, we retain some useful commands to reproduce our ablation studies. Anyone who reads the parser descriptions from line 22 to 71 should easily understand how to use them.
- Furthermore, we retain some code to support experiments on unstudied GLIF-based variants and some tricks. For example, making all the gating factors learnable but keeping them binary, referred to as 'hard mode', still improves performance compared to some LIF-based SNNs. Unlike the proposed GLIF method in the paper, 'hard mode' should require the same computation overhead as normal LIF neurons while still increasing the heterogeneity of the SNN. (Experimental results of 'hard mode' GLIF will be presented as extended studies in our in-progress work.) A minimal sketch of the gating idea follows this list.
- The distributions of the learned parameters are very interesting, as visualized in the paper: the initially identical parameters evolve into different bell-shaped distributions in each layer. This may shed light on interesting connections between DNNs and the hierarchical structure of the brain.
- Since GLIF offers more tunable parameters than LIF, extending it to ANN2SNN conversion frameworks should be interesting: a recent trend in ANN2SNN is to find better parameter mappings from ANNs to SNNs to improve the performance of the converted SNNs. If a parameter mapping from ANNs to GLIF-based SNNs can be found, this could pave a new path in that field.
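To make the gating idea concrete, the following is a minimal, illustrative PyTorch sketch of a channel-wise gated LIF cell. It is not the implementation in this repository: the gate names, the particular pairs of behaviours being blended (leaky vs. non-leaky decay, gated integration, soft vs. hard reset), and the hyperparameter values are assumptions for illustration only, and a surrogate gradient (omitted here) would be required to train through the spike function.

```python
import torch
import torch.nn as nn


class GatedLIFSketch(nn.Module):
    """Illustrative channel-wise gated LIF cell (not the repo's exact code).

    Three sigmoid gates blend pairs of neuronal behaviours, in the spirit
    of GLIF. In 'soft mode' the gates stay continuous; 'hard mode' would
    binarize them (e.g., round the sigmoid outputs).
    """

    def __init__(self, channels: int, v_th: float = 0.5, tau: float = 0.25):
        super().__init__()
        self.v_th, self.tau = v_th, tau
        # One learnable gating factor per channel, as with --channel-wise.
        self.alpha = nn.Parameter(torch.zeros(channels))  # leakage gate
        self.beta = nn.Parameter(torch.zeros(channels))   # integration gate
        self.gamma = nn.Parameter(torch.zeros(channels))  # reset gate

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (T, B, C, H, W) -- input currents over T timesteps.
        a = torch.sigmoid(self.alpha).view(1, -1, 1, 1)
        b = torch.sigmoid(self.beta).view(1, -1, 1, 1)
        g = torch.sigmoid(self.gamma).view(1, -1, 1, 1)
        u = torch.zeros_like(x[0])  # membrane potential
        spikes = []
        for t in range(x.shape[0]):
            decay = a * self.tau + (1.0 - a)  # blend leaky / non-leaky decay
            u = decay * u + b * x[t]          # gated input integration
            s = (u >= self.v_th).float()      # Heaviside spike; real training
                                              # needs a surrogate gradient here
            # Blend soft reset (subtract threshold) and hard reset (to zero).
            u = u - s * (g * self.v_th + (1.0 - g) * u)
            spikes.append(s)
        return torch.stack(spikes)            # (T, B, C, H, W) spike trains


# Hypothetical usage: 16 timesteps, batch 4, 64 channels, 8x8 feature maps.
if __name__ == "__main__":
    cell = GatedLIFSketch(channels=64)
    out = cell(torch.rand(16, 4, 64, 8, 8))
    print(out.shape)  # torch.Size([16, 4, 64, 8, 8])
```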
Please cite this paper using the following BibTeX entry if you find this work useful for your research.
@inproceedings{
yao2022glif,
title={{GLIF}: A Unified Gated Leaky Integrate-and-Fire Neuron for Spiking Neural Networks},
author={Xingting Yao and Fanrong Li and Zitao Mo and Jian Cheng},
booktitle={Thirty-Sixth Conference on Neural Information Processing Systems},
year={2022},
url={https://openreview.net/forum?id=UmFSx2c4ubT}
}
Please feel free to contact us if you need any further information.