This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[v1.x] Reduce after quantization memory usage #20925

Merged
Merged 2 commits into apache:v1.x on Mar 10, 2022

Conversation

bgawrych
Contributor

@bgawrych bgawrych commented Mar 2, 2022

Description

Port of #20894

Script:

import mxnet as mx
from mxnet.gluon.model_zoo import vision
import psutil
import os

def get_process_memory():
    """Return the resident set size (RSS) of the current process in MB."""
    process = psutil.Process(os.getpid())
    mem_info = process.memory_info()
    return mem_info.rss * 1e-6


batch_shape = (1, 3, 224, 224)
data = mx.nd.random.normal(shape=batch_shape)

print("memory before loading model: ", get_process_memory())
net = vision.resnet50_v1(pretrained=True)
print("memory after loading model: ", get_process_memory()) 
out = net(data)
out.wait_to_read()
print("memory after fp32 forward pass: ", get_process_memory())

indata = {'data':data}
label = {'label':mx.nd.zeros(shape=(1,))}
dataiter = mx.io.NDArrayIter(indata, label, 3, True, last_batch_handle='discard')
net_quantized = mx.contrib.quant.quantize_net(net, quantized_dtype='auto',
                                                quantize_mode="smart",
                                                calib_mode='naive',
                                                calib_data=dataiter,
                                                num_calib_examples=1,
                                                ctx=mx.current_context())

print("memory after quantization: ", get_process_memory())

outputs = net_quantized(data)
outputs.wait_to_read()
print("memory after int8 forward pass: ", get_process_memory())

Output before:

memory before loading model:  201.936896
memory after loading model:  433.41004799999996
memory after fp32 forward pass:  523.698176
memory after quantization:  1308.803072
memory after int8 forward pass:  1313.349632

Output after:

memory before loading model:  202.502144
memory after loading model:  434.184192
memory after fp32 forward pass:  520.986624
memory after quantization:  1136.570368
memory after int8 forward pass:  1141.485568
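
The figures above can be compared directly. The following sketch (using the "memory after quantization" values reported in this PR description) computes the reduction this change delivers:

```python
# Peak "memory after quantization" in MB, taken from the before/after
# outputs reported in the PR description above.
before_mb = 1308.803072
after_mb = 1136.570368

saved_mb = before_mb - after_mb
saved_pct = 100 * saved_mb / before_mb
print(f"saved {saved_mb:.1f} MB ({saved_pct:.1f}%)")
# → saved 172.2 MB (13.2%)
```

So for resnet50_v1, quantization-time memory overhead drops by roughly 13%, while the fp32 and int8 forward-pass numbers stay essentially unchanged.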

@bgawrych bgawrych requested a review from szha as a code owner March 2, 2022 10:23
@mxnet-bot

Hey @bgawrych, thanks for submitting the PR.
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

  • To trigger all jobs: @mxnet-bot run ci [all]
  • To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [sanity, website, windows-gpu, clang, centos-cpu, windows-cpu, unix-gpu, unix-cpu, edge, miscellaneous, centos-gpu]


Note:
Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.

@mseth10 added the pr-awaiting-testing and pr-work-in-progress labels and removed the pr-awaiting-testing label on Mar 2, 2022
@mseth10 added the pr-awaiting-testing and pr-work-in-progress labels and removed the pr-work-in-progress and pr-awaiting-testing labels on Mar 3, 2022
@bgawrych
Contributor Author

bgawrych commented Mar 3, 2022

@mxnet-bot run ci [windows-gpu]

@mxnet-bot

Jenkins CI successfully triggered : [windows-gpu]

@mseth10 added the pr-awaiting-testing and pr-awaiting-review labels and removed the pr-work-in-progress and pr-awaiting-testing labels on Mar 3, 2022
@bgawrych bgawrych merged commit 06e5c73 into apache:v1.x Mar 10, 2022
Labels
pr-awaiting-review PR is waiting for code review
4 participants