Deep Reorganization (DERO): Retaining Residuals in TinyML


DERO is a simple yet systematic approach that exploits the characteristics and memory-allocation behavior of operations to reorganize the residual connections in a network model. DERO keeps the inference peak-memory requirement at the level of a plain-style model while preserving the accuracy and training efficiency of the original model with residuals.
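
To see why residual connections raise inference peak memory, consider a toy per-block activation accounting (the tensor sizes below are hypothetical and chosen only for illustration, not taken from the paper): while the two convolutions of a residual branch execute, the skip tensor must stay alive in addition to the branch's own input and output, whereas a plain block only ever needs the current operator's input and output.

```python
# Toy activation-memory accounting for one block (hypothetical int8 tensor
# sizes; an illustration of the idea only, not the DERO tool's own analysis).

def tensor_kb(c, h, w, bytes_per_elem=1):
    """Size of a C x H x W activation tensor in KB (int8 by default)."""
    return c * h * w * bytes_per_elem / 1024

x   = tensor_kb(64, 56, 56)   # block input; also the skip tensor
mid = tensor_kb(64, 56, 56)   # output of the first conv in the branch
out = tensor_kb(64, 56, 56)   # output of the second conv

# Plain block: only the current operator's input and output are alive.
peak_plain = max(x + mid, mid + out)

# Residual block: the skip copy of x stays alive until the final addition,
# so it is resident together with the branch activations.
peak_residual = max(x + mid, x + mid + out, x + out)

print(f"plain block peak   : {peak_plain:.0f} KB")     # 392 KB
print(f"residual block peak: {peak_residual:.0f} KB")  # 588 KB
```

DERO's reorganization aims to remove this extra resident skip tensor from the peak working set without giving up the residual information.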

This repository contains the DERO tool and the training scripts used to evaluate the reorganized models. Kindly follow the steps below to reproduce our results.
All models were trained on eight RTX 2080 Ti GPUs.

Requirements

  • Python>=3.7.0

  • PyTorch>=1.7.1

Initial steps

Clone the repo and install requirements.txt in a Python>=3.7.0 environment.

pip install -r requirements.txt

DERO usage

Run the tool to reorganize the residuals of a model:

python dero.py --model resnet34 --output-dir <PATH_TO_DERO_OUTPUT> --input-size 224
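
As a rough, independent sanity check on activation footprints (this is not the dero.py tool's own accounting, and it is only a per-operator lower bound, since it ignores tensors that stay alive across operators, which is exactly what skip connections do), one can hook every leaf module and record the largest input-plus-output size seen during a single forward pass. torchvision's resnet34 is used below purely as a stand-in model.

```python
# Per-operator activation footprint probe (a sketch; float32 activations,
# batch size 1). Reports the largest input+output byte count of any single
# leaf module during one forward pass.
import torch
import torch.nn as nn
from torchvision.models import resnet34  # stand-in model for this example

def peak_op_activation_kb(model: nn.Module, input_size: int = 224) -> float:
    peak = 0.0
    handles = []

    def hook(_module, inputs, output):
        nonlocal peak
        tensors = [t for t in inputs if isinstance(t, torch.Tensor)]
        if isinstance(output, torch.Tensor):
            tensors.append(output)
        kb = sum(t.numel() * t.element_size() for t in tensors) / 1024
        peak = max(peak, kb)

    # Hook leaf modules only, so container modules are not double counted.
    for m in model.modules():
        if len(list(m.children())) == 0:
            handles.append(m.register_forward_hook(hook))

    model.eval()
    with torch.no_grad():
        model(torch.randn(1, 3, input_size, input_size))

    for h in handles:
        h.remove()
    return peak

print(f"largest per-op activation footprint: {peak_op_activation_kb(resnet34()):.1f} KB")
```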

Training

Training for baseline models:

torchrun --nproc_per_node=8 train.py --model resnet34 --data-path <PATH_TO_DATASET> --amp --output-dir <PATH_TO_MODEL_OUTPUT> -b 64 --wd 0.00004 --random-erase 0.1 --label-smoothing 0.1 --mixup-alpha 0.2 --cutmix-alpha 1.0

Replace resnet34 with resnet50, mcunet_v4, or densenet121 to train the other baseline models.

Training for DERO models:

torchrun --nproc_per_node=8 train.py --model resnet34dero --data-path <PATH_TO_DATASET> --amp --output-dir <PATH_TO_MODEL_OUTPUT> -b 64 --wd 0.00004 --random-erase 0.1 --label-smoothing 0.1 --mixup-alpha 0.2 --cutmix-alpha 1.0

Replace resnet34dero with resnet50dero, mcunet_dero_v4, or densenet121_dero to train the other DERO models.

Evaluate

To test a trained model:

python train.py --model <MODEL_NAME> --data-path <PATH_TO_DATASET> -b 64 --test-only --weights <PATH_TO_MODEL>

Comparison

Qualitative comparison of YOLOV5 (Plain) and YOLOV5 (DERO): ground truth vs. predicted results (images omitted).

DERO model details

| Model | Accuracy | Parameters (M) | Training time | Latency (s) | Peak memory (KB) | Architecture |
| --- | --- | --- | --- | --- | --- | --- |
| ResNet34 (DERO) | 72.32% | 20.64 | 24:23:29 | 167.0 | 294.0 | Orig./DERO |
| ResNet50 (DERO) | 75.56% | 21.78 | 25:53:13 | 169.9 | 294.0 | Orig./DERO |
| MCUNet (DERO) | 55.59% | 0.72 | 17:29:27 | 6.4 | 302.5 | Orig./DERO |
| DenseNet (DERO) | 71.55% | 7.58 | 32:52:39 | 73.3 | 266.4 | Orig./DERO |
| YOLOV5n (DERO) | 25.90% (mAP) | 1.73 | 43:22:35 | 52.1 | 253.5 | Orig./DERO |
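
The parameter counts in the table can be cross-checked for any loaded model with a one-liner; torchvision's resnet50 is used below as a stand-in (the DERO variants themselves are produced by dero.py).

```python
# Parameter count in millions; resnet50 is a stand-in model for this check.
from torchvision.models import resnet50

model = resnet50()
print(sum(p.numel() for p in model.parameters()) / 1e6)  # ~25.6 M for the original ResNet50
```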