This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Reduce after quantization memory usage #20894

Merged
merged 1 commit into from
Feb 21, 2022

Conversation

bgawrych
Contributor

Description

This change prevents MXNet from allocating additional memory for gradients in a quantized model, since a quantized model cannot use gradients anyway.
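
For readers unfamiliar with how Gluon decides whether to allocate gradient buffers, here is a minimal sketch. It is not the diff from this PR. It only illustrates the public `grad_req` attribute on Gluon parameters, and `disable_gradients` is a hypothetical helper name chosen for illustration. The assumption is that `net` is any object exposing `collect_params()`, as Gluon blocks do.

```python
def disable_gradients(net):
    """Mark every parameter of `net` as not needing a gradient.

    Hypothetical helper, not part of MXNet. In Gluon, a parameter whose
    grad_req is 'null' gets no gradient buffer, which is the kind of
    allocation a quantized, inference-only model does not need.
    """
    for param in net.collect_params().values():
        param.grad_req = "null"
    return net


# Usage with a Gluon model (requires MXNet, shown here only as a comment):
#   net = vision.resnet50_v1(pretrained=True)
#   disable_gradients(net)
```

The helper is duck-typed on purpose, so it can be exercised without MXNet installed. Whether the PR achieves its saving this way or inside the quantization pass itself is not stated in the description.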

Memory measurement script:

import mxnet as mx
from mxnet.gluon.model_zoo import vision
import psutil
import os

def get_process_memory():
    """Return the resident set size (RSS) of the current process in MB."""
    process = psutil.Process(os.getpid())
    mem_info = process.memory_info()
    return mem_info.rss * 1e-6  # bytes -> MB


batch_shape = (1, 3, 224, 224)
data = mx.np.random.normal(size=batch_shape)

print("memory before loading model: ", get_process_memory())
net = vision.resnet50_v1(pretrained=True)
print("memory after loading model: ", get_process_memory())
out = net(data)
out.wait_to_read()
print("memory after fp32 forward pass", get_process_memory())

# Reuse the single random batch as calibration data for quantization.
dataset = mx.gluon.data.ArrayDataset(data)
data_loader = mx.gluon.data.DataLoader(dataset, batch_size=1)
net_quantized = mx.contrib.quant.quantize_net(net, quantized_dtype='int8',
                                                quantize_mode="smart",
                                                calib_mode='naive',
                                                calib_data=data_loader,
                                                num_calib_batches=1,
                                                ctx=mx.current_context())

print("memory after quantization: ", get_process_memory())

outputs = net_quantized(data)
outputs.wait_to_read()
print("memory after int8 forward pass: ", get_process_memory())

Output before:

memory before loading model:  213.430272
[15:14:11] ../src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU
memory after loading model:  530.702336
memory after fp32 forward pass 611.241984
/home/bg/work/MXNet/python/mxnet/gluon/block.py:1918: UserWarning: Cannot decide type for the following arguments. Consider providing them as input:
        data: None
  input_sym_arg_type = in_param.infer_type()[0]
/home/bg/work/MXNet/python/mxnet/gluon/block.py:1251: UserWarning: register_op_hook is experimental when static_alloc=True / static_shape=True  and may not work correctly
  warnings.warn("register_op_hook is experimental when static_alloc=True / static_shape=True "
memory after quantization:  1064.57088
memory after int8 forward pass:  1071.005696

Output after:

memory before loading model:  214.28633599999998
[15:13:17] ../src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU
memory after loading model:  531.2593919999999
memory after fp32 forward pass 609.513472
/home/bg/work/MXNet/python/mxnet/gluon/block.py:1918: UserWarning: Cannot decide type for the following arguments. Consider providing them as input:
        data: None
  input_sym_arg_type = in_param.infer_type()[0]
/home/bg/work/MXNet/python/mxnet/gluon/block.py:1251: UserWarning: register_op_hook is experimental when static_alloc=True / static_shape=True  and may not work correctly
  warnings.warn("register_op_hook is experimental when static_alloc=True / static_shape=True "
memory after quantization:  890.273792
memory after int8 forward pass:  895.2258559999999

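Memory usage after quantization drops from about 1065 MB to about 890 MB, a reduction of roughly 174 MB (about 16%). The int8 forward pass shows a similar reduction, from about 1071 MB to about 895 MB.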
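
The size of the reduction can be worked out from the two logs above. The sketch below is an illustration added for this purpose and is not part of the PR. The helper names `parse_memory_log` and `compare` are hypothetical, and the log text is copied from the outputs above with the warning lines omitted.

```python
import re

# Matches lines such as "memory after quantization:  890.273792".
# The colon is optional because one label in the script omits it.
LINE_RE = re.compile(r"^(memory .+?):?\s+(\d+(?:\.\d+)?)$")


def parse_memory_log(text):
    """Return {stage label: RSS in MB} for every 'memory ...' line in text."""
    readings = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            readings[match.group(1)] = float(match.group(2))
    return readings


def compare(before, after):
    """Return (stage, saved MB, saved percent) for stages in both logs."""
    rows = []
    for stage, old in before.items():
        if stage in after:
            saved = old - after[stage]
            rows.append((stage, saved, 100.0 * saved / old))
    return rows


BEFORE = """\
memory before loading model:  213.430272
[15:14:11] ../src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU
memory after loading model:  530.702336
memory after fp32 forward pass 611.241984
memory after quantization:  1064.57088
memory after int8 forward pass:  1071.005696
"""

AFTER = """\
memory before loading model:  214.28633599999998
[15:13:17] ../src/storage/storage.cc:202: Using Pooled (Naive) StorageManager for CPU
memory after loading model:  531.2593919999999
memory after fp32 forward pass 609.513472
memory after quantization:  890.273792
memory after int8 forward pass:  895.2258559999999
"""

for stage, saved, pct in compare(parse_memory_log(BEFORE), parse_memory_log(AFTER)):
    print(f"{stage}: saved {saved:.1f} MB ({pct:.1f}%)")
```

The first three stages differ by only a megabyte or two between runs, which is ordinary run-to-run noise. The saving appears only from the quantization step onward.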

@bgawrych requested a review from szha as a code owner on February 14, 2022, 14:16
@mxnet-bot

Hey @bgawrych , Thanks for submitting the PR
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

  • To trigger all jobs: @mxnet-bot run ci [all]
  • To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [edge, miscellaneous, windows-gpu, centos-cpu, unix-cpu, unix-gpu, windows-cpu, website, sanity, clang, centos-gpu]


Note:
Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.

@mseth10 added the pr-awaiting-testing and pr-awaiting-review labels and removed the pr-awaiting-testing label on Feb 14, 2022
Contributor

@agrabows left a comment

LGTM

@bgawrych merged commit f6266f0 into apache:master on Feb 21, 2022
Labels
pr-awaiting-review PR is waiting for code review
5 participants