This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

[Proposal] demo compressor #1402

Merged: 33 commits merged on Aug 28, 2019
Changes shown from 24 commits (33 commits in total)

Commits
a3ea2cb
fix negative time number in local mode when trial time is short
LeonardoWang Jun 20, 2019
01c60bd
fix bug of duration<0
Jun 21, 2019
ace5132
fix windows version and readme
LeonardoWang Jun 21, 2019
48d3e4d
change tab
LeonardoWang Jun 21, 2019
61dbdca
change line
Jun 21, 2019
38e683a
Merge branch 'fix-bug-ui' of github.com:LeonardoWang/nni
Jun 21, 2019
f2e6d16
add compressor
LeonardoWang Aug 1, 2019
c6e6b75
add __init__
LeonardoWang Aug 1, 2019
d9897c2
Merge branch 'master' of github.com:microsoft/nni
Aug 2, 2019
2979de3
compressor
Aug 2, 2019
365585f
new framework
Aug 13, 2019
1f92408
change import
LeonardoWang Aug 13, 2019
f925075
change import error
Aug 13, 2019
a1e36c3
Merge branch 'master' of https://github.com/microsoft/nni
Aug 14, 2019
7672887
add doc and change files
Aug 14, 2019
17a20fc
add test function in pynni/test
Aug 15, 2019
9ab12f3
fix bug F
Aug 15, 2019
491ef33
test
Aug 15, 2019
02877ed
add requirment
Aug 15, 2019
fe46e9f
change setup
Aug 15, 2019
2b5114b
change dependencies to zaure-piplines.yml
Aug 15, 2019
a68d911
change
Aug 15, 2019
555cbb8
add mac
Aug 15, 2019
6badf9e
add macos
Aug 15, 2019
6f3f1ff
change compressor with __call__(), add frame method
Aug 19, 2019
41751de
add configure without doc
Aug 20, 2019
baffb8d
update doc and test unit
Aug 21, 2019
2213dfa
add configure parser change doc and test
Aug 23, 2019
0538d30
test commit
Aug 23, 2019
3b02ef6
move example
Aug 26, 2019
3a94357
add user hint
Aug 26, 2019
8eff37e
add doc and change example name
Aug 27, 2019
b8c674d
change for PR
Aug 27, 2019
10 changes: 10 additions & 0 deletions azure-pipelines.yml
@@ -10,6 +10,11 @@ jobs:
steps:
- script: python3 -m pip install --upgrade pip setuptools --user
displayName: 'Install python tools'
- script: |
python3 -m pip install torch==0.4.1 --user
python3 -m pip install torchvision==0.2.1 --user
python3 -m pip install tensorflow==1.12.0 --user
displayName: 'Install dependencies for integration'
- script: |
source install.sh
displayName: 'Install nni toolkit via source code'
@@ -50,6 +55,11 @@ jobs:
steps:
- script: python3 -m pip install --upgrade pip setuptools
displayName: 'Install python tools'
- script: |
python3 -m pip install torch==0.4.1 --user
python3 -m pip install torchvision==0.2.1 --user
python3 -m pip install tensorflow --user
displayName: 'Install dependencies for integration'
- script: |
source install.sh
displayName: 'Install nni toolkit via source code'
39 changes: 39 additions & 0 deletions docs/en_US/Compressor/Overview.md
@@ -0,0 +1,39 @@
# Compressor
NNI provides an easy-to-use toolkit to help users design and apply model compression algorithms.

## Framework
We use an instrumentation method to insert a node or function after the corresponding position in the model.
<br>
When a compression algorithm designer implements a pruning algorithm, they only need to care about how the mask is generated, not about how the mask is applied to the graph.
Review comment (Contributor): typo: garph -> graph

Review comment (@chicm-ms, Aug 21, 2019): 'he' : the designer maybe female.
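
To make this separation of concerns concrete, below is a minimal, framework-agnostic sketch of the two roles. This is illustrative only, not the NNI API: the algorithm designer produces a mask, and the compressor applies that mask to the weights in the graph.

```python
import numpy as np

def calc_mask(weight, sparsity):
    # Algorithm designer's job: decide which entries to keep.
    # This simple rule keeps the largest-magnitude weights; real pruners
    # may use any policy (schedules, per-layer sensitivity, ...).
    k = int(weight.size * sparsity)          # number of entries to zero out
    if k == 0:
        return np.ones_like(weight)
    threshold = np.sort(np.abs(weight), axis=None)[k]
    return (np.abs(weight) >= threshold).astype(weight.dtype)

def apply_mask(weight, mask):
    # Compressor's job: instrument the graph so the mask is applied
    # to the weight before it is used in the forward pass.
    return weight * mask

w = np.random.randn(4, 4)
masked_w = apply_mask(w, calc_mask(w, sparsity=0.8))
```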

## Algorithm
Review comment (Contributor): Algorithm

We now provide a naive compression algorithm and four popular compression algorithms for users: two pruning algorithms and two quantization algorithms.
Below is a list of the model compression algorithms supported in our compressor.

Review comment (Contributor): this table cannot be correctly rendered

| Name | Paper |
| ---------- | ----------|
| AGPruner | [To prune, or not to prune: exploring the efficacy of pruning for model compression](https://arxiv.org/abs/1710.01878)|
| SensitivityPruner |[Learning both Weights and Connections for Efficient Neural Networks](https://arxiv.org/abs/1506.02626)|
| QATquantizer |[Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference](http://openaccess.thecvf.com/content_cvpr_2018/papers/Jacob_Quantization_and_Training_CVPR_2018_paper.pdf)|
| DoReFaQuantizer |[DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients](https://arxiv.org/abs/1606.06160)|

## Usage

Take the naive level pruner as an example.

If you want to prune all weights to 80% sparsity, add the code below before your training code.

TensorFlow code
```
nni.compressors.tfCompressor.LevelPruner(sparsity=0.8).compress(model_graph)
```

Review comment (Contributor): format (camel case) of package names 'tfCompressor' and 'torchCompressor' are inconsistent with current nni package names.

PyTorch code
```
nni.compressors.torchCompressor.LevelPruner(sparsity=0.8).compress(model)
```

Our compressor will automatically insert masks into your model, and you can train your model with the masks without changing your training code. You will get a compressed model when training finishes.

You can find more information in the algorithm details.


Review comment (Contributor): add another section to show how to customize a new pruning or quantize algorithm


73 changes: 73 additions & 0 deletions docs/en_US/Compressor/Pruner.md
@@ -0,0 +1,73 @@
Pruner on NNI Compressor
===

## AGPruner
In [To prune, or not to prune: exploring the efficacy of pruning for model compression](https://arxiv.org/abs/1710.01878), authors Michael Zhu and Suyog Gupta provide an algorithm to prune the weights gradually.

>We introduce a new automated gradual pruning algorithm in which the sparsity is increased from an initial sparsity value si (usually 0) to a final sparsity value sf over a span of n pruning steps, starting at training step t0 and with pruning frequency ∆t:
![](../../img/AGPruner.PNG)
>The binary weight masks are updated every ∆t steps as the network is trained to gradually increase the sparsity of the network while allowing the network training steps to recover from any pruning-induced loss in accuracy. In our experience, varying the pruning frequency ∆t between 100 and 1000 training steps had a negligible impact on the final model quality. Once the model achieves the target sparsity sf , the weight masks are no longer updated. The intuition behind this sparsity function in equation
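
For reference, the gradual sparsity schedule quoted above (the formula rendered in the image) can be written as a small Python function; the parameter names mirror the quotation, not the NNI API.

```python
def agp_sparsity(t, s_i=0.0, s_f=0.8, t0=0, n=10, delta_t=1):
    # Sparsity at pruning step t:
    #   s_t = s_f + (s_i - s_f) * (1 - (t - t0) / (n * delta_t)) ** 3
    # clamped so it starts at s_i at step t0 and stays at s_f after t0 + n * delta_t.
    t = min(max(t, t0), t0 + n * delta_t)
    return s_f + (s_i - s_f) * (1.0 - (t - t0) / (n * delta_t)) ** 3

# the schedule prunes aggressively at first, then levels off at the final sparsity
print([round(agp_sparsity(t), 3) for t in range(0, 11)])
```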

### Usage
You can prune all weights from 0% to 80% sparsity over 10 epochs with the code below.

First, import the pruner and add masks to the model.

TensorFlow code
```
from nni.compressors.tfCompressor import AGPruner
pruner = AGPruner(initial_sparsity=0, final_sparsity=0.8, start_epoch=1, end_epoch=10, frequency=1)
pruner.compress(tf.get_default_graph())
```
PyTorch code
```
from nni.compressors.torchCompressor import AGPruner
pruner = AGPruner(initial_sparsity=0, final_sparsity=0.8, start_epoch=1, end_epoch=10, frequency=1)
pruner.compress(model)
```

Second, add the code below to update the epoch number at the end of each epoch in your training loop.

TensorFlow code
```
pruner.update_epoch(epoch, sess)
```
PyTorch code
```
pruner.update_epoch(epoch)
```
You can view the example for more information.
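
Putting the two steps together, here is a minimal PyTorch-style sketch. The toy Linear model and random tensors are stand-ins for your real model and data; the pruner calls follow the snippets above.

```python
import torch
import torch.nn.functional as F
from nni.compressors.torchCompressor import AGPruner

# toy stand-ins so the sketch is self-contained; replace with your real model and data
model = torch.nn.Linear(784, 10)
data = torch.randn(64, 784)
target = torch.randint(0, 10, (64,))

pruner = AGPruner(initial_sparsity=0, final_sparsity=0.8, start_epoch=1, end_epoch=10, frequency=1)
pruner.compress(model)   # instrument the model with masks

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
for epoch in range(1, 11):
    optimizer.zero_grad()
    loss = F.cross_entropy(model(data), target)
    loss.backward()
    optimizer.step()
    pruner.update_epoch(epoch)   # advance the sparsity schedule once per epoch
```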
***

## SensitivityPruner
In [Learning both Weights and Connections for Efficient Neural Networks](https://arxiv.org/abs/1506.02626), authors Song Han et al. provide an algorithm to find the sensitivity of each layer and set a per-layer pruning threshold.

>We used the sensitivity results to find each layer’s threshold: for example, the smallest threshold was applied to the most sensitive layer, which is the first convolutional layer... The pruning threshold is chosen as a quality parameter multiplied by the standard deviation of a layer’s weights
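
The quoted rule is straightforward to express in code. Below is a small sketch of the idea, not the NNI implementation: the threshold is a quality parameter times the standard deviation of the layer's weights, and weights below it are masked out. The quality parameter value here is purely illustrative.

```python
import numpy as np

def sensitivity_mask(weight, quality_parameter=0.5):
    # threshold = quality parameter * std of this layer's weights;
    # weights whose magnitude falls below the threshold are pruned
    threshold = quality_parameter * np.std(weight)
    return (np.abs(weight) > threshold).astype(weight.dtype)

w = np.random.randn(64, 64)
mask = sensitivity_mask(w)
print('kept fraction:', mask.mean())
```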

### Usage
You can prune weights step by step and reach a target sparsity with SensitivityPruner using the code below.

TensorFlow code
```
from nni.compressors.tfCompressor import SensitivityPruner

pruner = SensitivityPruner(sparsity = 0.8)
pruner.compress(tf.get_default_graph())
```
PyTorch code
```
from nni.compressors.torchCompressor import SensitivityPruner

pruner = SensitivityPruner(sparsity = 0.8)
pruner.compress(model)
```
As with AGPruner, you should update the mask information every epoch by adding the code below.

TensorFlow code
```
pruner.update_epoch(epoch, sess)
```
PyTorch code
```
pruner.update_epoch(epoch)
```
You can view the example for more information.
***
46 changes: 46 additions & 0 deletions docs/en_US/Compressor/Quantizer.md
@@ -0,0 +1,46 @@
Quantizer on NNI Compressor
===
## QATquantizer
In [Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference](http://openaccess.thecvf.com/content_cvpr_2018/papers/Jacob_Quantization_and_Training_CVPR_2018_paper.pdf), authors Benoit Jacob and Skirmantas Kligys provide an algorithm to quantize the model during training.

>We propose an approach that simulates quantization effects in the forward pass of training. Backpropagation still happens as usual, and all weights and biases are stored in floating point so that they can be easily nudged by small amounts. The forward propagation pass however simulates quantized inference as it will happen in the inference engine, by implementing in floating-point arithmetic the rounding behavior of the quantization scheme
>* Weights are quantized before they are convolved with the input. If batch normalization (see [17]) is used for the layer, the batch normalization parameters are “folded into” the weights before quantization.
>* Activations are quantized at points where they would be during inference, e.g. after the activation function is applied to a convolutional or fully connected layer’s output, or after a bypass connection adds or concatenates the outputs of several layers together such as in ResNets.
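
The "simulated quantization" idea in the quotation can be sketched as a quantize-then-dequantize step carried out in floating point. The affine rounding scheme below is a generic illustration, not necessarily the exact scheme NNI implements.

```python
import numpy as np

def fake_quantize(x, q_bits=8):
    # round x onto a uniform grid with 2**q_bits levels spanning [x.min(), x.max()],
    # then map back to floats so training can keep using float arithmetic
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / (2 ** q_bits - 1) or 1.0   # avoid zero scale for constant tensors
    q = np.round((x - lo) / scale)
    return q * scale + lo

w = np.random.randn(3, 3).astype(np.float32)
print(fake_quantize(w, q_bits=8))
```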



### Usage
You can quantize your model to 8 bits by adding the code below before your training code.

TensorFlow code
```
from nni.compressors.tfCompressor import QATquantizer
QATquantizer(q_bits = 8).compress(tf.get_default_graph())
```
PyTorch code
```
from nni.compressors.torchCompressor import QATquantizer
QATquantizer(q_bits = 8).compress(model)
```

You can view the example for more information.

***
## DoReFaQuantizer
In [DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients](https://arxiv.org/abs/1606.06160), authors Shuchang Zhou and Yuxin Wu provide an algorithm named DoReFa to quantize the weights, activations, and gradients during training.
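
For reference, the paper's k-bit weight quantization can be sketched as below (a reading of the DoReFa formula, not the NNI code): weights are squashed with tanh, scaled into [0, 1], rounded onto a uniform grid, and mapped back to [-1, 1].

```python
import numpy as np

def quantize_k(x, k):
    # round x in [0, 1] onto a uniform grid with 2**k levels
    n = 2 ** k - 1
    return np.round(x * n) / n

def dorefa_quantize_weight(w, k=8):
    w = np.tanh(w)
    w = w / (2 * np.max(np.abs(w))) + 0.5   # map into [0, 1]
    return 2 * quantize_k(w, k) - 1         # map back to [-1, 1]

print(dorefa_quantize_weight(np.random.randn(3, 3), k=8))
```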

### Usage
To use DoReFaQuantizer, add the code below before your training code.

TensorFlow code
```
from nni.compressors.tfCompressor import DoReFaQuantizer
DoReFaQuantizer(q_bits = 8).compress(tf.get_default_graph())
```
PyTorch code
```
from nni.compressors.torchCompressor import DoReFaQuantizer
DoReFaQuantizer(q_bits = 8).compress(model)
```

You can view the example for more information.
Binary file added docs/img/AGPruner.PNG
2 changes: 1 addition & 1 deletion setup.py
@@ -27,7 +27,7 @@ def read(fname):

setup(
name = 'nni',
version = '999.0.0-developing',
version = 'v0.8-263-g1f92408',
Review comment (Contributor): this is a temporary change by NNI installation script. Please remove this commit.

Review comment (@chicm-ms, Aug 21, 2019): It seems this change is made by makefile, do not check in this change.

author = 'Microsoft NNI Team',
author_email = '[email protected]',
description = 'Neural Network Intelligence project',
Empty file.
Empty file.
115 changes: 115 additions & 0 deletions src/sdk/pynni/nni/compressors/example/main_tf_pruner.py
@@ -0,0 +1,115 @@
from nni.compressors.tfCompressor import AGPruner
Review comment (Contributor): where's nni.get_next_parameter() and nni.report_final_results()?

Review comment (Contributor): suggest change 'main' in file names to 'mnist'

Review reply (Contributor, author): it maybe confusing with the class Mnist
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data


def weight_variable(shape):
return tf.Variable(tf.truncated_normal(shape, stddev = 0.1))

def bias_variable(shape):
return tf.Variable(tf.constant(0.1, shape = shape))

def conv2d(x_input, w_matrix):
return tf.nn.conv2d(x_input, w_matrix, strides = [ 1, 1, 1, 1 ], padding = 'SAME')

def max_pool(x_input, pool_size):
size = [ 1, pool_size, pool_size, 1 ]
return tf.nn.max_pool(x_input, ksize = size, strides = size, padding = 'SAME')


class Mnist:
def __init__(self):
images = tf.placeholder(tf.float32, [ None, 784 ], name = 'input_x')
labels = tf.placeholder(tf.float32, [ None, 10 ], name = 'input_y')
keep_prob = tf.placeholder(tf.float32, name='keep_prob')

self.images = images
self.labels = labels
self.keep_prob = keep_prob

self.train_step = None
self.accuracy = None

self.w1 = None
self.b1 = None
self.fcw1 = None
self.cross = None
with tf.name_scope('reshape'):
x_image = tf.reshape(images, [ -1, 28, 28, 1 ])
with tf.name_scope('conv1'):
w_conv1 = weight_variable([ 5, 5, 1, 32 ])
self.w1 = w_conv1
b_conv1 = bias_variable([ 32 ])
self.b1 = b_conv1
h_conv1 = tf.nn.relu(conv2d(x_image, w_conv1) + b_conv1)
with tf.name_scope('pool1'):
h_pool1 = max_pool(h_conv1, 2)
with tf.name_scope('conv2'):
w_conv2 = weight_variable([ 5, 5, 32, 64 ])
b_conv2 = bias_variable([ 64 ])
h_conv2 = tf.nn.relu(conv2d(h_pool1, w_conv2) + b_conv2)
with tf.name_scope('pool2'):
h_pool2 = max_pool(h_conv2, 2)
with tf.name_scope('fc1'):
w_fc1 = weight_variable([ 7 * 7 * 64, 1024 ])
self.fcw1 = w_fc1
b_fc1 = bias_variable([ 1024 ])
h_pool2_flat = tf.reshape(h_pool2, [ -1, 7 * 7 * 64 ])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, w_fc1) + b_fc1)
with tf.name_scope('dropout'):
h_fc1_drop = tf.nn.dropout(h_fc1, 0.5)
with tf.name_scope('fc2'):
w_fc2 = weight_variable([ 1024, 10 ])
b_fc2 = bias_variable([ 10 ])
y_conv = tf.matmul(h_fc1_drop, w_fc2) + b_fc2
with tf.name_scope('loss'):
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels = labels, logits = y_conv))
self.cross = cross_entropy
with tf.name_scope('adam_optimizer'):
self.train_step = tf.train.AdamOptimizer(0.0001).minimize(cross_entropy)
with tf.name_scope('accuracy'):
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(labels, 1))
self.accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))


def main():
tf.set_random_seed(0)

data = input_data.read_data_sets('data', one_hot = True)

model = Mnist()

'''You can switch to SensitivityPruner instead (remember to import it as well):
pruner = SensitivityPruner(sparsity = 0.8)
'''
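# Build the pruner and instrument the default graph: masks are inserted after the
# corresponding weights, so the training loop below stays unchanged.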
pruner = AGPruner(initial_sparsity=0, final_sparsity=0.8, start_epoch=1, end_epoch=10, frequency=1)
pruner.compress(tf.get_default_graph())


with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
for batch_idx in range(2000):
batch = data.train.next_batch(2000)
model.train_step.run(feed_dict = {
model.images: batch[0],
model.labels: batch[1],
model.keep_prob: 0.5
})
if batch_idx % 10 == 0:
test_acc = model.accuracy.eval(feed_dict = {
model.images: data.test.images,
model.labels: data.test.labels,
model.keep_prob: 1.0
})
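# Advance the pruner's epoch counter so the pruning schedule moves forward
# (this example treats every 10 batches as one 'epoch').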
pruner.update_epoch(batch_idx // 10, sess)
print('test accuracy', test_acc)


test_acc = model.accuracy.eval(feed_dict = {
model.images: data.test.images,
model.labels: data.test.labels,
model.keep_prob: 1.0
})
print('final result is', test_acc)

main()