
[Numpy] Backward error in mixed int64 + float32 #18084

Closed
sxjscience opened this issue Apr 16, 2020 · 3 comments

Comments

@sxjscience
Member

sxjscience commented Apr 16, 2020

This is related to #18022.

Reproducible example:

import mxnet as mx
from mxnet.gluon import HybridBlock
mx.npx.set_np()

class Foo(HybridBlock):
    def hybrid_forward(self, F, query):
        # shape_array returns an int64 ndarray, so the division below
        # mixes an int64 divisor with the float32 input.
        query_shape = F.npx.shape_array(query)
        return query / F.np.sqrt(query_shape[-1])

foo = Foo()
foo.hybridize()
a = mx.np.ones((5, 5, 5))
out = foo(a)
print(out)

a.attach_grad()
with mx.autograd.record():
    out = foo(a)
    out.backward()
print(a.grad)

Error message:

MXNetError: Traceback (most recent call last):
  File "include/mxnet/./tensor_blob.h", line 256
MXNetError: Check failed: mshadow::DataType<DType>::kFlag == type_flag_: TBlob.get_with_shape: data type do not match specified type.Expected: long long v.s. given float

As a workaround, I currently have to write query / F.np.sqrt(query_shape[-1].astype(mx.np.float32)).
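A more general alternative (a sketch of mine, not from this issue; ScaledFoo and its units argument are hypothetical): when the size of the last axis is known at construction time, dividing by a plain Python float avoids the mixed-dtype op entirely, and should also work for float16 inputs:

import math
import mxnet as mx
from mxnet.gluon import HybridBlock
mx.npx.set_np()

class ScaledFoo(HybridBlock):
    # Hypothetical variant: the scale is computed once in Python, so no
    # int64/float32 division is ever recorded in the graph.
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self._scale = 1.0 / math.sqrt(units)  # plain Python float

    def hybrid_forward(self, F, query):
        return query * self._scale  # scalar multiply keeps query's dtype

foo = ScaledFoo(units=5)
foo.hybridize()
print(foo(mx.np.ones((5, 5, 5))))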

@sxjscience
Member Author

Another failure case, this time with a float16 input:

import mxnet as mx
from mxnet.gluon import HybridBlock
mx.npx.set_np()

class Foo(HybridBlock):
    def hybrid_forward(self, F, query):
        # Cast the int64 shape scalar to float32 before the sqrt; the
        # division then mixes float32 with the float16 input.
        query_shape = F.npx.shape_array(query)
        return query / F.np.sqrt(query_shape[-1].astype(mx.np.float32))

foo = Foo()
foo.hybridize()
a = mx.np.ones((5, 5, 5), dtype=mx.np.float16)
out = foo(a)
print(out)

a.attach_grad()
with mx.autograd.record():
    out = foo(a)
    out.backward()
print(a.grad)

Error:

[10:10:29] src/operator/numpy/./np_true_divide-inl.h:244: not implemented yet...
[[[ 0.0000000e+00 -0.0000000e+00 -2.0039270e-21 -3.6902478e+19
    1.0000038e+00]
  [ 1.0000000e+00  1.0000000e+00  1.0000000e+00  1.0000000e+00
    1.0000000e+00]
  ...
  (all remaining entries are 1.0000000e+00)
  ...]]]
[10:10:29] src/operator/numpy/./np_true_divide-inl.h:244: not implemented yet...
---------------------------------------------------------------------------
MXNetError                                Traceback (most recent call last)
<ipython-input-5-cc974fcd01c7> in <module>
     18     out = foo(a)
     19     out.backward()
---> 20 print(a.grad)

~/miniconda3/lib/python3.7/site-packages/mxnet/numpy/multiarray.py in __str__(self)
   1178     def __str__(self):
   1179         """Returns a string representation of the array."""
-> 1180         array_str = self.asnumpy().__str__()
   1181         context = self.ctx
   1182         if context.device_type == 'cpu' or self.ndim == 0:

~/miniconda3/lib/python3.7/site-packages/mxnet/ndarray/ndarray.py in asnumpy(self)
   2564             self.handle,
   2565             data.ctypes.data_as(ctypes.c_void_p),
-> 2566             ctypes.c_size_t(data.size)))
   2567         return data
   2568 

~/miniconda3/lib/python3.7/site-packages/mxnet/base.py in check_call(ret)
    244     """
    245     if ret != 0:
--> 246         raise get_last_ffi_error()
    247 
    248 

MXNetError: Traceback (most recent call last):
  File "include/mxnet/./tensor_blob.h", line 256
MXNetError: Check failed: mshadow::DataType<DType>::kFlag == type_flag_: TBlob.get_with_shape: data type do not match specified type.Expected: float v.s. given half

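For the float16 case, a hedged sketch of a possible workaround (my own, untested against this build; the name Foo16 is hypothetical): do the sqrt in float32, then cast the result to the input's dtype so the recorded division is float16 / float16:

import mxnet as mx
from mxnet.gluon import HybridBlock
mx.npx.set_np()

class Foo16(HybridBlock):
    def hybrid_forward(self, F, query):
        query_shape = F.npx.shape_array(query)
        # sqrt in float32, then cast back so both operands of the
        # division share the float16 input dtype.
        scale = F.np.sqrt(query_shape[-1].astype(mx.np.float32))
        return query / scale.astype(mx.np.float16)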
@yzhliu yzhliu added the v2.0 label Apr 29, 2020
@yzhliu
Member

yzhliu commented Apr 29, 2020

Assignee: @BenjaminCHEN2016

@yzhliu
Member

yzhliu commented May 9, 2020

Fixed by #18250.

@yzhliu yzhliu closed this as completed May 9, 2020