This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[v1.x] Reduce after quantization memory usage #20925

Merged
Merged 2 commits into apache:v1.x on Mar 10, 2022

Conversation

bgawrych
Contributor

@bgawrych bgawrych commented Mar 2, 2022

Description

Port of #20894

Script:

import mxnet as mx
from mxnet.gluon.model_zoo import vision
import psutil
import os

def get_process_memory():
    """Return the resident set size (RSS) of the current process in MB."""
    process = psutil.Process(os.getpid())
    mem_info = process.memory_info()
    return mem_info.rss * 1e-6


batch_shape = (1, 3, 224, 224)
data = mx.nd.random.normal(shape=batch_shape)

print("memory before loading model: ", get_process_memory())
net = vision.resnet50_v1(pretrained=True)
print("memory after loading model: ", get_process_memory()) 
out = net(data)
out.wait_to_read()
print("memory after fp32 forward pass: ", get_process_memory())

indata = {'data':data}
label = {'label':mx.nd.zeros(shape=(1,))}
dataiter = mx.io.NDArrayIter(indata, label, 3, True, last_batch_handle='discard')
net_quantized = mx.contrib.quant.quantize_net(net, quantized_dtype='auto',
                                                quantize_mode="smart",
                                                calib_mode='naive',
                                                calib_data=dataiter,
                                                num_calib_examples=1,
                                                ctx=mx.current_context())

print("memory after quantization: ", get_process_memory())

outputs = net_quantized(data)
outputs.wait_to_read()
print("memory after int8 forward pass: ", get_process_memory())

Output before:

memory before loading model:  201.936896
memory after loading model:  433.41004799999996
memory after fp32 forward pass:  523.698176
memory after quantization:  1308.803072
memory after int8 forward pass:  1313.349632

Output after:

memory before loading model:  202.502144
memory after loading model:  434.184192
memory after fp32 forward pass:  520.986624
memory after quantization:  1136.570368
memory after int8 forward pass:  1141.485568
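
The figures above can be compared directly. The following sketch (using the "memory after quantization" values reported in this PR description) computes the reduction this change delivers:

```python
# Peak "memory after quantization" in MB, taken from the before/after
# outputs reported in the PR description above.
before_mb = 1308.803072
after_mb = 1136.570368

saved_mb = before_mb - after_mb
saved_pct = 100 * saved_mb / before_mb
print(f"saved {saved_mb:.1f} MB ({saved_pct:.1f}%)")
# → saved 172.2 MB (13.2%)
```

So for resnet50_v1, quantization-time memory overhead drops by roughly 13%, while the fp32 and int8 forward-pass numbers stay essentially unchanged.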

@bgawrych bgawrych requested a review from szha as a code owner March 2, 2022 10:23
@mxnet-bot

Hey @bgawrych, thanks for submitting the PR.
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

  • To trigger all jobs: @mxnet-bot run ci [all]
  • To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [sanity, website, windows-gpu, clang, centos-cpu, windows-cpu, unix-gpu, unix-cpu, edge, miscellaneous, centos-gpu]


Note:
Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.

@mseth10 added the pr-awaiting-testing and pr-work-in-progress labels and removed the pr-awaiting-testing label on Mar 2, 2022
@mseth10 added the pr-awaiting-testing and pr-work-in-progress labels and removed the pr-work-in-progress and pr-awaiting-testing labels on Mar 3, 2022
@bgawrych
Contributor Author

bgawrych commented Mar 3, 2022

@mxnet-bot run ci [windows-gpu]

@mxnet-bot

Jenkins CI successfully triggered : [windows-gpu]

@mseth10 added the pr-awaiting-testing and pr-awaiting-review labels and removed the pr-work-in-progress and pr-awaiting-testing labels on Mar 3, 2022
@bgawrych bgawrych merged commit 06e5c73 into apache:v1.x Mar 10, 2022
Labels
pr-awaiting-review PR is waiting for code review
4 participants