This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

row_sparse numpy Parameter and row_sparse gradient in npx.embedding? #20391

Open
fhieber opened this issue Jun 27, 2021 · 3 comments

Comments

@fhieber
Contributor

fhieber commented Jun 27, 2021

Description

While migrating to the numpy namespaces in MXNet 2.0, I observed an error when trying to create a row_sparse parameter (see the example below). The example mirrors our current pattern in MXNet 1.x (which uses NDArrays/symbols).

Does the new numpy interface not yet support row_sparse parameters/gradients?

Error Message

[12:07:12] ../src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU
Traceback (most recent call last):
  File "sparse.py", line 14, in <module>
    b.initialize()
  File "/Users/fhieber/anaconda3/lib/python3.7/site-packages/mxnet/gluon/block.py", line 574, in initialize
    v.initialize(None, ctx, init, force_reinit=force_reinit)
  File "/Users/fhieber/anaconda3/lib/python3.7/site-packages/mxnet/gluon/parameter.py", line 485, in initialize
    self._finish_deferred_init()
  File "/Users/fhieber/anaconda3/lib/python3.7/site-packages/mxnet/gluon/parameter.py", line 364, in _finish_deferred_init
    self._init_impl(data, ctx)
  File "/Users/fhieber/anaconda3/lib/python3.7/site-packages/mxnet/gluon/parameter.py", line 377, in _init_impl
    self._init_grad()
  File "/Users/fhieber/anaconda3/lib/python3.7/site-packages/mxnet/gluon/parameter.py", line 388, in _init_grad
    .format(self._grad_stype))
ValueError: mxnet.numpy.zeros does not support stype = row_sparse

To Reproduce

from mxnet import np, npx, gluon

class Block(gluon.Block):
  def __init__(self):
    super().__init__()
    self.weight = gluon.Parameter('weight', shape=(32, 32), grad_stype='row_sparse')
  def forward(self, x):
    return npx.embedding(x, weight=self.weight.data(), input_dim=32, output_dim=32, sparse_grad=True)

b = Block()
b.initialize()

x = np.ones((32, 32))
r = b(x)
print(r)

Environment

----------Python Info----------
Version      : 3.7.5
Compiler     : Clang 4.0.1 (tags/RELEASE_401/final)
Build        : ('default', 'Oct 25 2019 10:52:18')
Arch         : ('64bit', '')
------------Pip Info-----------
Version      : 21.1.2
Directory    : /Users/fhieber/anaconda3/lib/python3.7/site-packages/pip
----------MXNet Info-----------
Version      : 2.0.0
Directory    : /Users/fhieber/anaconda3/lib/python3.7/site-packages/mxnet
Commit Hash   : dc69b04070c55f33c1ac2dc83be42be9c1a8c56f
Library      : ['/Users/fhieber/anaconda3/lib/python3.7/site-packages/mxnet/libmxnet.dylib']
@barry-jin
Contributor

@fhieber Currently, in numpy-mode Gluon 2.0, the sparse feature is not supported.

@fhieber
Contributor Author

fhieber commented Jun 30, 2021

I see, thanks. Are there plans to re-add this? Sparse gradient updates for embedding matrices have provided noticeable improvements in training throughput in the past.
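For context on the throughput point above, here is a minimal pure-NumPy sketch (not the MXNet API; the vocabulary size, learning rate, and gradient values are made up for illustration). A batch of token ids touches only a handful of embedding rows, so a row-sparse gradient can store and update just those rows instead of the full table:

```python
import numpy as np

# Hypothetical sizes: a large vocabulary, but each batch touches few rows.
vocab_size, dim = 50_000, 32
weight = np.zeros((vocab_size, dim))
token_ids = np.array([3, 17, 3, 42])            # tokens seen in this batch
upstream_grad = np.ones((len(token_ids), dim))  # dL/d(embedding output)

# Dense update: materialize a full (vocab_size, dim) gradient, touch every row.
dense_grad = np.zeros_like(weight)
np.add.at(dense_grad, token_ids, upstream_grad)  # scatter-add duplicate ids
weight_dense = weight - 0.1 * dense_grad

# Row-sparse update: keep only the unique rows and their accumulated
# gradients, then update just those rows, O(batch) instead of O(vocab).
rows, inverse = np.unique(token_ids, return_inverse=True)
row_grads = np.zeros((len(rows), dim))
np.add.at(row_grads, inverse, upstream_grad)
weight_sparse = weight.copy()
weight_sparse[rows] -= 0.1 * row_grads

assert np.allclose(weight_dense, weight_sparse)
```

Both paths produce the same updated weights; the row-sparse path simply never allocates or traverses the untouched vocabulary rows, which is where the training-throughput win comes from.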

@barry-jin
Contributor

MXNet 2.0 NumPy arrays will need to follow the Python array API standard, so we will probably not add the sparse feature for NumPy arrays. But I'm working on a workaround to help users fall back to legacy sparse gradients when a sparse grad is required in parameters and some operators.
