Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nano : ResNet Demo for InferenceOptmizer #5580

Merged
merged 10 commits into from
Sep 7, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
77 changes: 77 additions & 0 deletions python/nano/example/pytorch/inference_pipeline/resnet/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,77 @@
# Bigdl-nano InferenceOptimizer example on Cat vs. Dog dataset

This example illustrates how to apply InferenceOptimizer to quickly find acceleration method with the minimum inference latency under specific restrictions or without restrictions for a trained model.
For the sake of this example, we first train the proposed network(by default, a ResNet18 is used) on the [cats and dogs dataset](https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip), which consists both [frozen and unfrozen stages](https://github.com/PyTorchLightning/pytorch-lightning/blob/495812878dfe2e31ec2143c071127990afbb082b/pl_examples/domain_templates/computer_vision_fine_tuning.py#L21-L35). Then, by calling `optimize()`, we can obtain all available accelaration combinations provided by BigDL-Nano for inference. By calling `get_best_mdoel()` , we could get an accelerated model whose inference is 7.5x times faster.


## Prepare the environment
We recommend you to use [Anaconda](https://www.anaconda.com/distribution/#linux) to prepare the environment.
**Note**: during your installation, there may be some warnings or errors about version, just ignore them.
```
conda create -n nano python=3.7 # "nano" is conda environment name, you can use any name you like.
conda activate nano
pip install jsonargparse[signatures]
pip install --pre --upgrade bigdl-nano[pytorch]

# bf16 is available only on torch1.12
pip install torch==1.12.0 torchvision --extra-index-url https://download.pytorch.org/whl/cpu
# Necessary packages for inference accelaration
pip install --upgrade intel-extension-for-pytorch
pip install onnx==1.12.0 onnxruntime==1.12.1 onnxruntime-extensions
pip install openvino-dev
pip install neural-compressor==1.12
pip install --upgrade numpy==1.21.6
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may be we should tell our users to ignore the warnings might pop out here? (about numpy version)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Initialize environment variables with script `bigdl-nano-init` installed with bigdl-nano.
```
source bigdl-nano-init
```
You may find environment variables set like follows:
```
Setting OMP_NUM_THREADS...
Setting OMP_NUM_THREADS specified for pytorch...
Setting KMP_AFFINITY...
Setting KMP_BLOCKTIME...
Setting MALLOC_CONF...
+++++ Env Variables +++++
LD_PRELOAD=./../lib/libjemalloc.so
MALLOC_CONF=oversize_threshold:1,background_thread:true,metadata_thp:auto,dirty_decay_ms:-1,muzzy_decay_ms:-1
OMP_NUM_THREADS=112
KMP_AFFINITY=granularity=fine,compact,1,0
KMP_BLOCKTIME=1
TF_ENABLE_ONEDNN_OPTS=
+++++++++++++++++++++++++
Complete.
```

## Prepare Dataset
By default the dataset will be auto-downloaded.
You could access [cats and dogs dataset](https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip) for a view of the whole dataset.

## Run example
You can run this example with command line:

```bash
python inference_pipeline.py
```

## Results

It will take about 2 minutes to run inference optimization. Then you may find the result for inference as follows:
```
accleration option: original, latency: 54.2669ms, accuracy: 0.9937
accleration option: fp32_ipex, latency: 40.3075ms, accuracy: 0.9937
accleration option: bf16_ipex, latency: 115.6182ms, accuracy: 0.9937
accleration option: int8, latency: 14.4857ms, accuracy: 0.4750
accleration option: jit_fp32, latency: 39.3361ms, accuracy: 0.9937
accleration option: jit_fp32_ipex, latency: 39.2949ms, accuracy: 0.9937
accleration option: jit_fp32_ipex_clast, latency: 24.5715ms, accuracy: 0.9937
accleration option: openvino_fp32, latency: 14.5771ms, accuracy: 0.9937
accleration option: openvino_int8, latency: 7.2186ms, accuracy: 0.9937
accleration option: onnxruntime_fp32, latency: 44.3872ms, accuracy: 0.9937
accleration option: onnxruntime_int8_qlinear, latency: 10.1866ms, accuracy: 0.9937
accleration option: onnxruntime_int8_integer, latency: 18.8731ms, accuracy: 0.9875
When accelerator is onnxruntime, the model with minimal latency is: inc + onnxruntime + qlinear
When accuracy drop less than 5%, the model with minimal latency is: openvino + pot
The model with minimal latency is: openvino + pot
```
288 changes: 288 additions & 0 deletions python/nano/example/pytorch/inference_pipeline/resnet/_finetune.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,288 @@
#
# Copyright 2016 The BigDL Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# This file is adapted from PyTorch Lightning.
# https://github.com/Lightning-AI/lightning/blob/master/examples/
# pl_domain_templates/computer_vision_fine_tuning.py
# Copyright The PyTorch Lightning team.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Computer vision example on Transfer Learning. This computer vision example illustrates how one could fine-tune a
pre-trained network (by default, a ResNet50 is used) using pytorch-lightning. For the sake of this example, the
'cats and dogs dataset' (~60MB, see `DATA_URL` below) and the proposed network (denoted by `TransferLearningModel`,
see below) is trained for 15 epochs.

The training consists of three stages.

From epoch 0 to 4, the feature extractor (the pre-trained network) is frozen except
maybe for the BatchNorm layers (depending on whether `train_bn = True`). The BatchNorm
layers (if `train_bn = True`) and the parameters of the classifier are trained as a
single parameters group with lr = 1e-2.

From epoch 5 to 9, the last two layer groups of the pre-trained network are unfrozen
and added to the optimizer as a new parameter group with lr = 1e-4 (while lr = 1e-3
for the first parameter group in the optimizer).

Eventually, from epoch 10, all the remaining layer groups of the pre-trained network
are unfrozen and added to the optimizer as a third parameter group. From epoch 10,
the parameters of the pre-trained network are trained with lr = 1e-5 while those of
the classifier is trained with lr = 1e-4.

Note:
See: https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html
"""


import logging
from pathlib import Path
from typing import Union
import numpy as np

import torch
import torch.nn.functional as F
from torch import nn, optim
from torch.optim.lr_scheduler import MultiStepLR
from torch.optim.optimizer import Optimizer
from torch.utils.data import DataLoader, Subset
from torchmetrics import Accuracy
from torchvision import models, transforms
from torchvision.datasets import ImageFolder
from torchvision.datasets.utils import download_and_extract_archive

from pytorch_lightning import LightningDataModule, LightningModule
from pytorch_lightning.callbacks.finetuning import BaseFinetuning
from pytorch_lightning.utilities.rank_zero import rank_zero_info


log = logging.getLogger(__name__)
DATA_URL = "https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip"


class TransferLearningModel(LightningModule):
def __init__(
self,
backbone: str = "resnet18",
milestones: tuple = (5, 10),
lr: float = 1e-3,
lr_scheduler_gamma: float = 1e-1,
num_workers: int = 6,
**kwargs,
) -> None:
"""TransferLearningModel.

Args:
backbone: Name (as in ``torchvision.models``) of the feature extractor
train_bn: Whether the BatchNorm layers should be trainable
milestones: List of two epochs milestones
lr: Initial learning rate
lr_scheduler_gamma: Factor by which the learning rate is reduced at each milestone
"""
super().__init__()
self.backbone = backbone
self.milestones = milestones
self.lr = lr
self.lr_scheduler_gamma = lr_scheduler_gamma
self.num_workers = num_workers

self.__build_model()

self.train_acc = Accuracy()
self.valid_acc = Accuracy()
self.save_hyperparameters()

def __build_model(self):
"""Define model layers & loss."""

# 1. Load pre-trained network:
model_func = getattr(models, self.backbone)
backbone = model_func(pretrained=True)

_layers = list(backbone.children())[:-1]
self.feature_extractor = nn.Sequential(*_layers)

# 2. Classifier:
_fc_layers = [nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 32), nn.Linear(32, 1)]
self.fc = nn.Sequential(*_fc_layers)

# 3. Loss:
self.loss_func = F.binary_cross_entropy_with_logits

def forward(self, x):
"""Forward pass.

Returns logits.
"""

# 1. Feature extraction:
x = self.feature_extractor(x)
x = x.squeeze(-1).squeeze(-1)

# 2. Classifier (returns logits):
x = self.fc(x)

return x

def loss(self, logits, labels):
return self.loss_func(input=logits, target=labels)

def training_step(self, batch, batch_idx):
# 1. Forward pass:
x, y = batch
y_logits = self.forward(x)
y_scores = torch.sigmoid(y_logits)
y_true = y.view((-1, 1)).type_as(x)

# 2. Compute loss
train_loss = self.loss(y_logits, y_true)

# 3. Compute accuracy:
self.log("train_acc", self.train_acc(y_scores, y_true.int()), prog_bar=True)

return train_loss

def validation_step(self, batch, batch_idx):
# 1. Forward pass:
x, y = batch
y_logits = self.forward(x)
y_scores = torch.sigmoid(y_logits)
y_true = y.view((-1, 1)).type_as(x)

# 2. Compute loss
self.log("val_loss", self.loss(y_logits, y_true), prog_bar=True)

# 3. Compute accuracy:
self.log("val_acc", self.valid_acc(y_scores, y_true.int()), prog_bar=True)

def configure_optimizers(self):
parameters = list(self.parameters())
trainable_parameters = list(filter(lambda p: p.requires_grad, parameters))
rank_zero_info(
f"The model will start training with only {len(trainable_parameters)} "
f"trainable parameters out of {len(parameters)}."
)
optimizer = optim.Adam(trainable_parameters, lr=self.lr)
scheduler = MultiStepLR(optimizer, milestones=self.milestones, gamma=self.lr_scheduler_gamma)
return [optimizer], [scheduler]


class CatDogImageDataModule(LightningDataModule):
def __init__(self, dl_path: Union[str, Path] = "data", num_workers: int = 0, batch_size: int = 8):
"""CatDogImageDataModule.

Args:
dl_path: root directory where to download the data
num_workers: number of CPU workers
batch_size: number of sample in a batch
"""
super().__init__()

self._dl_path = dl_path
self._num_workers = num_workers
self._batch_size = batch_size

def prepare_data(self):
"""Download images and prepare images datasets."""
download_and_extract_archive(url=DATA_URL, download_root=self._dl_path,
remove_finished=True)

@property
def data_path(self):
return Path(self._dl_path).joinpath("cats_and_dogs_filtered")

@property
def normalize_transform(self):
return transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])

@property
def train_transform(self):
return transforms.Compose(
[
transforms.Resize((224, 224)),
transforms.RandomHorizontalFlip(),
transforms.ToTensor(),
self.normalize_transform,
]
)

@property
def valid_transform(self):
return transforms.Compose([transforms.Resize((224, 224)),
transforms.ToTensor(), self.normalize_transform])

def create_dataset(self, root, transform):
return ImageFolder(root=root, transform=transform)

def __dataloader(self, train: bool, batch_size=None, limit_num_samples=None):
"""Train/validation loaders."""
if batch_size is None:
batch_size = self._batch_size
if train:
dataset = self.create_dataset(self.data_path.joinpath("train"),
self.train_transform)
return DataLoader(dataset=dataset, batch_size=batch_size,
num_workers=self._num_workers, shuffle=True)
else:
dataset = self.create_dataset(self.data_path.joinpath("validation"),
self.valid_transform)
if limit_num_samples is not None:
indices = np.random.permutation(len(dataset))[:limit_num_samples]
dataset = Subset(dataset, indices)
return DataLoader(dataset=dataset, batch_size=batch_size,
num_workers=self._num_workers, shuffle=False)

def train_dataloader(self, batch_size=None):
log.info("Training data loaded.")
return self.__dataloader(train=True, batch_size=batch_size)

def val_dataloader(self, batch_size=None, limit_num_samples=None):
log.info("Validation data loaded.")
return self.__dataloader(train=False, batch_size=batch_size,
limit_num_samples=limit_num_samples)


class MilestonesFinetuning(BaseFinetuning):
def __init__(self, milestones: tuple = (5, 10), train_bn: bool = False):
super().__init__()
self.milestones = milestones
self.train_bn = train_bn

def freeze_before_training(self, pl_module: LightningModule):
self.freeze(modules=pl_module.feature_extractor, train_bn=self.train_bn)

def finetune_function(self, pl_module: LightningModule, epoch: int,
optimizer: Optimizer, opt_idx: int):
if epoch == self.milestones[0]:
# unfreeze 5 last layers
self.unfreeze_and_add_param_group(
modules=pl_module.feature_extractor[-5:], # type: ignore
optimizer=optimizer, train_bn=self.train_bn
)

elif epoch == self.milestones[1]:
# unfreeze remaining layers
self.unfreeze_and_add_param_group(
modules=pl_module.feature_extractor[:-5], # type: ignore
optimizer=optimizer, train_bn=self.train_bn
)
Loading