Nano: ResNet Demo for InferenceOptimizer #5580
Merged
10 commits
006c124 add resnet demo (rnwang04)
5a900d2 add basic readme (rnwang04)
d46a656 add inference result (rnwang04)
0c0b51c update based on comment (rnwang04)
34b48f5 update based on comment (rnwang04)
8406b23 fix style (rnwang04)
2f66ea0 fix typos and update based on comment (rnwang04)
75890da update numpy version (rnwang04)
c1d558b update for faster demo (rnwang04)
2a0d0f0 modify some number (rnwang04)
77 changes: 77 additions & 0 deletions
python/nano/example/pytorch/inference_pipeline/resnet/README.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,77 @@ | ||
# BigDL-Nano InferenceOptimizer example on Cat vs. Dog dataset | ||
|
||
This example illustrates how to apply InferenceOptimizer to a trained model to quickly find the acceleration method with the lowest inference latency, either with or without user-specified restrictions. | ||
For the sake of this example, we first train the proposed network (by default, a ResNet18) on the [cats and dogs dataset](https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip). Training consists of both [frozen and unfrozen stages](https://github.com/PyTorchLightning/pytorch-lightning/blob/495812878dfe2e31ec2143c071127990afbb082b/pl_examples/domain_templates/computer_vision_fine_tuning.py#L21-L35). Then, by calling `optimize()`, we obtain all available acceleration combinations provided by BigDL-Nano for inference. By calling `get_best_model()`, we get an accelerated model whose inference is about 7.5x faster than the original. | ||
|
||
|
||
## Prepare the environment | ||
We recommend using [Anaconda](https://www.anaconda.com/distribution/#linux) to prepare the environment. | ||
**Note**: during installation, you may see some warnings or errors about package versions; you can ignore them. | ||
``` | ||
conda create -n nano python=3.7 # "nano" is conda environment name, you can use any name you like. | ||
conda activate nano | ||
pip install jsonargparse[signatures] | ||
pip install --pre --upgrade bigdl-nano[pytorch] | ||
|
||
# bf16 is available only on torch1.12 | ||
pip install torch==1.12.0 torchvision --extra-index-url https://download.pytorch.org/whl/cpu | ||
# Necessary packages for inference acceleration | ||
pip install --upgrade intel-extension-for-pytorch | ||
pip install onnx==1.12.0 onnxruntime==1.12.1 onnxruntime-extensions | ||
pip install openvino-dev | ||
pip install neural-compressor==1.12 | ||
pip install --upgrade numpy==1.21.6 | ||
``` | ||
Initialize environment variables with the `bigdl-nano-init` script, which is installed with bigdl-nano. | ||
``` | ||
source bigdl-nano-init | ||
``` | ||
You may see output like the following as the environment variables are set: | ||
``` | ||
Setting OMP_NUM_THREADS... | ||
Setting OMP_NUM_THREADS specified for pytorch... | ||
Setting KMP_AFFINITY... | ||
Setting KMP_BLOCKTIME... | ||
Setting MALLOC_CONF... | ||
+++++ Env Variables +++++ | ||
LD_PRELOAD=./../lib/libjemalloc.so | ||
MALLOC_CONF=oversize_threshold:1,background_thread:true,metadata_thp:auto,dirty_decay_ms:-1,muzzy_decay_ms:-1 | ||
OMP_NUM_THREADS=112 | ||
KMP_AFFINITY=granularity=fine,compact,1,0 | ||
KMP_BLOCKTIME=1 | ||
TF_ENABLE_ONEDNN_OPTS= | ||
+++++++++++++++++++++++++ | ||
Complete. | ||
``` | ||
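To confirm that the variables were applied in the current shell, a quick check from Python can help. The sketch below is illustrative and not part of the example: the helper name `report_env` is hypothetical, and the variable names are simply copied from the output above.

```python
import os

# Variable names copied from the bigdl-nano-init output above.
EXPECTED_VARS = (
    "LD_PRELOAD",
    "MALLOC_CONF",
    "OMP_NUM_THREADS",
    "KMP_AFFINITY",
    "KMP_BLOCKTIME",
)


def report_env(environ=os.environ, names=EXPECTED_VARS):
    """Return {name: value or None} for each expected variable (hypothetical helper)."""
    return {name: environ.get(name) for name in names}


# Print one line per variable, flagging anything that is missing.
for name, value in report_env().items():
    print(f"{name}={value if value is not None else '<not set>'}")
```

Run it in the same shell where `source bigdl-nano-init` was executed, since the variables only exist in that session.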
|
||
## Prepare Dataset | ||
By default, the dataset is downloaded automatically. | ||
You can also download the [cats and dogs dataset](https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip) directly to view the whole dataset. | ||
|
||
## Run example | ||
You can run this example from the command line: | ||
|
||
```bash | ||
python inference_pipeline.py | ||
``` | ||
|
||
## Results | ||
|
||
Inference optimization takes about 2 minutes to run. You should then see inference results like the following: | ||
``` | ||
accleration option: original, latency: 54.2669ms, accuracy: 0.9937 | ||
accleration option: fp32_ipex, latency: 40.3075ms, accuracy: 0.9937 | ||
accleration option: bf16_ipex, latency: 115.6182ms, accuracy: 0.9937 | ||
accleration option: int8, latency: 14.4857ms, accuracy: 0.4750 | ||
accleration option: jit_fp32, latency: 39.3361ms, accuracy: 0.9937 | ||
accleration option: jit_fp32_ipex, latency: 39.2949ms, accuracy: 0.9937 | ||
accleration option: jit_fp32_ipex_clast, latency: 24.5715ms, accuracy: 0.9937 | ||
accleration option: openvino_fp32, latency: 14.5771ms, accuracy: 0.9937 | ||
accleration option: openvino_int8, latency: 7.2186ms, accuracy: 0.9937 | ||
accleration option: onnxruntime_fp32, latency: 44.3872ms, accuracy: 0.9937 | ||
accleration option: onnxruntime_int8_qlinear, latency: 10.1866ms, accuracy: 0.9937 | ||
accleration option: onnxruntime_int8_integer, latency: 18.8731ms, accuracy: 0.9875 | ||
When accelerator is onnxruntime, the model with minimal latency is: inc + onnxruntime + qlinear | ||
When accuracy drop less than 5%, the model with minimal latency is: openvino + pot | ||
The model with minimal latency is: openvino + pot | ||
``` |
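The three summary lines at the end of that output can be reproduced from the per-option lines alone. The sketch below is not BigDL-Nano code: it parses the log text shown above and picks the lowest-latency option under an optional name prefix and accuracy-drop limit. The helper names are hypothetical, and it assumes that "accuracy drop" is measured relative to the `original` model and that `openvino_int8` and `onnxruntime_int8_qlinear` are the options the summary calls "openvino + pot" and "inc + onnxruntime + qlinear".

```python
import re

# Per-option lines copied verbatim from the results above (including the log's spelling).
LOG = """\
accleration option: original, latency: 54.2669ms, accuracy: 0.9937
accleration option: fp32_ipex, latency: 40.3075ms, accuracy: 0.9937
accleration option: bf16_ipex, latency: 115.6182ms, accuracy: 0.9937
accleration option: int8, latency: 14.4857ms, accuracy: 0.4750
accleration option: jit_fp32, latency: 39.3361ms, accuracy: 0.9937
accleration option: jit_fp32_ipex, latency: 39.2949ms, accuracy: 0.9937
accleration option: jit_fp32_ipex_clast, latency: 24.5715ms, accuracy: 0.9937
accleration option: openvino_fp32, latency: 14.5771ms, accuracy: 0.9937
accleration option: openvino_int8, latency: 7.2186ms, accuracy: 0.9937
accleration option: onnxruntime_fp32, latency: 44.3872ms, accuracy: 0.9937
accleration option: onnxruntime_int8_qlinear, latency: 10.1866ms, accuracy: 0.9937
accleration option: onnxruntime_int8_integer, latency: 18.8731ms, accuracy: 0.9875
"""

PATTERN = re.compile(r"option: (\w+), latency: ([\d.]+)ms, accuracy: ([\d.]+)")


def parse_results(log):
    """Map option name -> (latency in ms, accuracy)."""
    return {name: (float(lat), float(acc)) for name, lat, acc in PATTERN.findall(log)}


def best_option(results, prefix="", max_accuracy_drop=None, baseline="original"):
    """Return the lowest-latency option name that passes the filters."""
    base_acc = results[baseline][1]
    candidates = {}
    for name, (latency, accuracy) in results.items():
        if not name.startswith(prefix):
            continue
        # Assumption: the drop is relative to the baseline model's accuracy.
        if max_accuracy_drop is not None:
            if (base_acc - accuracy) / base_acc >= max_accuracy_drop:
                continue
        candidates[name] = latency
    return min(candidates, key=candidates.get)


results = parse_results(LOG)
fastest = best_option(results)
speedup = results["original"][0] / results[fastest][0]
print(fastest, f"{speedup:.1f}x")
print(best_option(results, prefix="onnxruntime"))
print(best_option(results, max_accuracy_drop=0.05))
```

Note that the plain `int8` option is fast but loses about half of its accuracy in this run, which is why an accuracy restriction matters when choosing a model.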
288 changes: 288 additions & 0 deletions
python/nano/example/pytorch/inference_pipeline/resnet/_finetune.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,288 @@ | ||
# | ||
# Copyright 2016 The BigDL Authors. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
# This file is adapted from PyTorch Lightning. | ||
# https://github.com/Lightning-AI/lightning/blob/master/examples/ | ||
# pl_domain_templates/computer_vision_fine_tuning.py | ||
# Copyright The PyTorch Lightning team. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
"""Computer vision example on Transfer Learning. This computer vision example illustrates how one could fine-tune a | ||
pre-trained network (by default, a ResNet18 is used) using pytorch-lightning. For the sake of this example, the | ||
'cats and dogs dataset' (~60MB, see `DATA_URL` below) and the proposed network (denoted by `TransferLearningModel`, | ||
see below) is trained for 15 epochs. | ||
|
||
The training consists of three stages. | ||
|
||
From epoch 0 to 4, the feature extractor (the pre-trained network) is frozen except | ||
maybe for the BatchNorm layers (depending on whether `train_bn = True`). The BatchNorm | ||
layers (if `train_bn = True`) and the parameters of the classifier are trained as a | ||
single parameter group with lr = 1e-2. | ||
|
||
From epoch 5 to 9, the last two layer groups of the pre-trained network are unfrozen | ||
and added to the optimizer as a new parameter group with lr = 1e-4 (while lr = 1e-3 | ||
for the first parameter group in the optimizer). | ||
|
||
Eventually, from epoch 10, all the remaining layer groups of the pre-trained network | ||
are unfrozen and added to the optimizer as a third parameter group. From epoch 10, | ||
the parameters of the pre-trained network are trained with lr = 1e-5 while those of | ||
the classifier are trained with lr = 1e-4. | ||
|
||
Note: | ||
See: https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html | ||
""" | ||
|
||
|
||
import logging | ||
from pathlib import Path | ||
from typing import Union | ||
import numpy as np | ||
|
||
import torch | ||
import torch.nn.functional as F | ||
from torch import nn, optim | ||
from torch.optim.lr_scheduler import MultiStepLR | ||
from torch.optim.optimizer import Optimizer | ||
from torch.utils.data import DataLoader, Subset | ||
from torchmetrics import Accuracy | ||
from torchvision import models, transforms | ||
from torchvision.datasets import ImageFolder | ||
from torchvision.datasets.utils import download_and_extract_archive | ||
|
||
from pytorch_lightning import LightningDataModule, LightningModule | ||
from pytorch_lightning.callbacks.finetuning import BaseFinetuning | ||
from pytorch_lightning.utilities.rank_zero import rank_zero_info | ||
|
||
|
||
log = logging.getLogger(__name__) | ||
DATA_URL = "https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip" | ||
|
||
|
||
class TransferLearningModel(LightningModule): | ||
def __init__( | ||
self, | ||
backbone: str = "resnet18", | ||
milestones: tuple = (5, 10), | ||
lr: float = 1e-3, | ||
lr_scheduler_gamma: float = 1e-1, | ||
num_workers: int = 6, | ||
**kwargs, | ||
) -> None: | ||
"""TransferLearningModel. | ||
|
||
Args: | ||
backbone: Name (as in ``torchvision.models``) of the feature extractor | ||
milestones: Tuple of two epoch milestones | ||
lr: Initial learning rate | ||
lr_scheduler_gamma: Factor by which the learning rate is reduced at each milestone | ||
num_workers: Number of workers, stored as ``self.num_workers`` | ||
""" | ||
super().__init__() | ||
self.backbone = backbone | ||
self.milestones = milestones | ||
self.lr = lr | ||
self.lr_scheduler_gamma = lr_scheduler_gamma | ||
self.num_workers = num_workers | ||
|
||
self.__build_model() | ||
|
||
self.train_acc = Accuracy() | ||
self.valid_acc = Accuracy() | ||
self.save_hyperparameters() | ||
|
||
def __build_model(self): | ||
"""Define model layers & loss.""" | ||
|
||
# 1. Load pre-trained network: | ||
model_func = getattr(models, self.backbone) | ||
backbone = model_func(pretrained=True) | ||
|
||
_layers = list(backbone.children())[:-1] | ||
self.feature_extractor = nn.Sequential(*_layers) | ||
|
||
# 2. Classifier: | ||
_fc_layers = [nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 32), nn.Linear(32, 1)] | ||
self.fc = nn.Sequential(*_fc_layers) | ||
|
||
# 3. Loss: | ||
self.loss_func = F.binary_cross_entropy_with_logits | ||
|
||
def forward(self, x): | ||
"""Forward pass. | ||
|
||
Returns logits. | ||
""" | ||
|
||
# 1. Feature extraction: | ||
x = self.feature_extractor(x) | ||
x = x.squeeze(-1).squeeze(-1) | ||
|
||
# 2. Classifier (returns logits): | ||
x = self.fc(x) | ||
|
||
return x | ||
|
||
def loss(self, logits, labels): | ||
return self.loss_func(input=logits, target=labels) | ||
|
||
def training_step(self, batch, batch_idx): | ||
# 1. Forward pass: | ||
x, y = batch | ||
y_logits = self.forward(x) | ||
y_scores = torch.sigmoid(y_logits) | ||
y_true = y.view((-1, 1)).type_as(x) | ||
|
||
# 2. Compute loss | ||
train_loss = self.loss(y_logits, y_true) | ||
|
||
# 3. Compute accuracy: | ||
self.log("train_acc", self.train_acc(y_scores, y_true.int()), prog_bar=True) | ||
|
||
return train_loss | ||
|
||
def validation_step(self, batch, batch_idx): | ||
# 1. Forward pass: | ||
x, y = batch | ||
y_logits = self.forward(x) | ||
y_scores = torch.sigmoid(y_logits) | ||
y_true = y.view((-1, 1)).type_as(x) | ||
|
||
# 2. Compute loss | ||
self.log("val_loss", self.loss(y_logits, y_true), prog_bar=True) | ||
|
||
# 3. Compute accuracy: | ||
self.log("val_acc", self.valid_acc(y_scores, y_true.int()), prog_bar=True) | ||
|
||
def configure_optimizers(self): | ||
parameters = list(self.parameters()) | ||
trainable_parameters = list(filter(lambda p: p.requires_grad, parameters)) | ||
rank_zero_info( | ||
f"The model will start training with only {len(trainable_parameters)} " | ||
f"trainable parameters out of {len(parameters)}." | ||
) | ||
optimizer = optim.Adam(trainable_parameters, lr=self.lr) | ||
scheduler = MultiStepLR(optimizer, milestones=self.milestones, gamma=self.lr_scheduler_gamma) | ||
return [optimizer], [scheduler] | ||
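To see what `MultiStepLR` does with the defaults above (`lr=1e-3`, `milestones=(5, 10)`, `lr_scheduler_gamma=1e-1`), the sketch below recomputes the schedule in plain Python. It is an illustration only: `lr_at_epoch` is a hypothetical helper, and it covers just the parameter group the optimizer starts with, not the groups that the fine-tuning callback adds later.

```python
def lr_at_epoch(epoch, base_lr=1e-3, milestones=(5, 10), gamma=1e-1):
    """Closed form of MultiStepLR: base_lr * gamma ** (number of milestones reached)."""
    reached = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** reached


# The rate stays constant between milestones and drops by `gamma` at each one.
for epoch in (0, 4, 5, 9, 10, 14):
    print(f"epoch {epoch:2d}: lr = {lr_at_epoch(epoch):.0e}")
```

With these defaults the initial group trains at 1e-3 before epoch 5, at 1e-4 from epoch 5, and at 1e-5 from epoch 10.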
|
||
|
||
class CatDogImageDataModule(LightningDataModule): | ||
def __init__(self, dl_path: Union[str, Path] = "data", num_workers: int = 0, batch_size: int = 8): | ||
"""CatDogImageDataModule. | ||
|
||
Args: | ||
dl_path: root directory where to download the data | ||
num_workers: number of CPU workers | ||
batch_size: number of samples in a batch | ||
""" | ||
super().__init__() | ||
|
||
self._dl_path = dl_path | ||
self._num_workers = num_workers | ||
self._batch_size = batch_size | ||
|
||
def prepare_data(self): | ||
"""Download images and prepare images datasets.""" | ||
download_and_extract_archive(url=DATA_URL, download_root=self._dl_path, | ||
remove_finished=True) | ||
|
||
@property | ||
def data_path(self): | ||
return Path(self._dl_path).joinpath("cats_and_dogs_filtered") | ||
|
||
@property | ||
def normalize_transform(self): | ||
return transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]) | ||
|
||
@property | ||
def train_transform(self): | ||
return transforms.Compose( | ||
[ | ||
transforms.Resize((224, 224)), | ||
transforms.RandomHorizontalFlip(), | ||
transforms.ToTensor(), | ||
self.normalize_transform, | ||
] | ||
) | ||
|
||
@property | ||
def valid_transform(self): | ||
return transforms.Compose([transforms.Resize((224, 224)), | ||
transforms.ToTensor(), self.normalize_transform]) | ||
|
||
def create_dataset(self, root, transform): | ||
return ImageFolder(root=root, transform=transform) | ||
|
||
def __dataloader(self, train: bool, batch_size=None, limit_num_samples=None): | ||
"""Train/validation loaders.""" | ||
if batch_size is None: | ||
batch_size = self._batch_size | ||
if train: | ||
dataset = self.create_dataset(self.data_path.joinpath("train"), | ||
self.train_transform) | ||
return DataLoader(dataset=dataset, batch_size=batch_size, | ||
num_workers=self._num_workers, shuffle=True) | ||
else: | ||
dataset = self.create_dataset(self.data_path.joinpath("validation"), | ||
self.valid_transform) | ||
if limit_num_samples is not None: | ||
indices = np.random.permutation(len(dataset))[:limit_num_samples] | ||
dataset = Subset(dataset, indices) | ||
return DataLoader(dataset=dataset, batch_size=batch_size, | ||
num_workers=self._num_workers, shuffle=False) | ||
|
||
def train_dataloader(self, batch_size=None): | ||
log.info("Training data loaded.") | ||
return self.__dataloader(train=True, batch_size=batch_size) | ||
|
||
def val_dataloader(self, batch_size=None, limit_num_samples=None): | ||
log.info("Validation data loaded.") | ||
return self.__dataloader(train=False, batch_size=batch_size, | ||
limit_num_samples=limit_num_samples) | ||
|
||
|
||
class MilestonesFinetuning(BaseFinetuning): | ||
def __init__(self, milestones: tuple = (5, 10), train_bn: bool = False): | ||
super().__init__() | ||
self.milestones = milestones | ||
self.train_bn = train_bn | ||
|
||
def freeze_before_training(self, pl_module: LightningModule): | ||
self.freeze(modules=pl_module.feature_extractor, train_bn=self.train_bn) | ||
|
||
def finetune_function(self, pl_module: LightningModule, epoch: int, | ||
optimizer: Optimizer, opt_idx: int): | ||
if epoch == self.milestones[0]: | ||
# unfreeze 5 last layers | ||
self.unfreeze_and_add_param_group( | ||
modules=pl_module.feature_extractor[-5:], # type: ignore | ||
optimizer=optimizer, train_bn=self.train_bn | ||
) | ||
|
||
elif epoch == self.milestones[1]: | ||
# unfreeze remaining layers | ||
self.unfreeze_and_add_param_group( | ||
modules=pl_module.feature_extractor[:-5], # type: ignore | ||
optimizer=optimizer, train_bn=self.train_bn | ||
) |
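The slices `feature_extractor[-5:]` and `feature_extractor[:-5]` are easier to follow with names attached. The sketch below uses plain strings in place of modules, so it only illustrates the indexing: the names are the top-level children of a torchvision ResNet in definition order, and `__build_model` drops the last one (`fc`) with `[:-1]`.

```python
# Top-level children of a torchvision ResNet, in definition order.
RESNET_CHILDREN = [
    "conv1", "bn1", "relu", "maxpool",
    "layer1", "layer2", "layer3", "layer4",
    "avgpool", "fc",
]

# __build_model keeps everything except the final fully connected layer.
feature_extractor = RESNET_CHILDREN[:-1]

# Unfrozen at milestones[0]: the last five children.
first_unfrozen = feature_extractor[-5:]
# Unfrozen at milestones[1]: everything before them.
second_unfrozen = feature_extractor[:-5]

print("first milestone: ", first_unfrozen)
print("second milestone:", second_unfrozen)
```

Together the two slices cover the whole feature extractor without overlap. Children such as `relu`, `maxpool`, and `avgpool` have no parameters, so they contribute nothing to the new optimizer groups.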
Maybe we should tell our users to ignore the warnings that might pop up here (about the numpy version)?
done