You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
{{ message }}
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.
Theres a difference in the flow between get_backend_symbol and optimize_for when a single backend has multiple subgraph properties that results in the indexed graph not being updated between subgraph properties.
Error Message
Traceback (most recent call last):
File "test.py", line 22, in <module>
sym_block.optimize_for(mx.nd.zeros((64, 4, 10, 10)), backend='MKLDNN_QUANTIZE')
File "/home/ubuntu/v1.7.x/python/mxnet/gluon/block.py", line 1089, in optimize_for
self._build_cache(x, *args)
File "/home/ubuntu/v1.7.x/python/mxnet/gluon/block.py", line 979, in _build_cache
out = out.optimize_for(self._backend, arg_array, aux_array, ctx, **self._backend_opts)
File "/home/ubuntu/v1.7.x/python/mxnet/symbol/symbol.py", line 1531, in optimize_for
ctypes.byref(new_aux_names)))
File "/home/ubuntu/v1.7.x/python/mxnet/base.py", line 246, in check_call
raise get_last_ffi_error()
mxnet.base.MXNetError: Traceback (most recent call last):
File "src/operator/subgraph/build_subgraph.cc", line 80
MXNetError: Check failed: input_nid < simple_nodes->size() (6 vs. 6) :
So after subsequent subgraph properties the graph is not updated, causing the error in this issue. Since the indexed_graph_ is private, theres no way to reset it.
Description
Theres a difference in the flow between
get_backend_symbol
andoptimize_for
when a single backend has multiple subgraph properties that results in the indexed graph not being updated between subgraph properties.Error Message
To Reproduce
Steps to reproduce
Analysis
The error is occurring at the
CHECK_LT
on line 80 where theres a mismatch in the Graph definition (the input of a node is pointing to another node ID that hasnt been seen yet, so this is breaking topological ordering):https://github.com/apache/incubator-mxnet/blob/93b7ff787f56ab0e386df0fc177ae1da26dcd47e/src/operator/subgraph/build_subgraph.cc#L70-L80
Some debugging finds that the indexed graph inside of an nnvm::Graph object is only built once:
So after subsequent subgraph properties the graph is not updated, causing the error in this issue. Since the
indexed_graph_
is private, theres no way to reset it.Comparing the
get_backend_symbol
flow the Graph is created every iteration for each subgraph property and stored in the Symbol between iterations:https://github.com/apache/incubator-mxnet/blob/16280ad72aaa35f089ca4d4db680a5c5c7ff1676/src/c_api/c_api_symbolic.cc#L1302-L1308
Where as in the
optimize_for
flow the Graph is created once:https://github.com/apache/incubator-mxnet/blob/16280ad72aaa35f089ca4d4db680a5c5c7ff1676/src/c_api/c_api_symbolic.cc#L1375
and then re-used for each subgraph propery:
https://github.com/apache/incubator-mxnet/blob/16280ad72aaa35f089ca4d4db680a5c5c7ff1676/src/c_api/c_api_symbolic.cc#L1498-L1502
So we need to create a new Graph object for each iteration in order to fix this.
FYI @mseth10 @Kh4L @ptrendx
The text was updated successfully, but these errors were encountered: