
TVMError: src/runtime/cuda/cuda_module.cc:93: CUDAError: cuModuleLoadData(&(module_[device_id]), data_.c_str()) failed with error: CUDA_ERROR_INVALID_PTX #1027

Closed
arassadin opened this issue Mar 20, 2018 · 14 comments

@arassadin

Hi everyone.

I got the following error while reproducing the toy example from nnvm, but with my own model. Calling

m.run()

I get an error similar to #315 (comment):

TVMError: [09:11:33] src/runtime/cuda/cuda_module.cc:93: CUDAError: cuModuleLoadData(&(module_[device_id]), data_.c_str()) failed with error: CUDA_ERROR_INVALID_PTX

Can you clarify what might be going wrong here?

Thanks in advance!


BTW, I'm a bit confused by the tvm.gpu() docstring 😃:

Construct a CPU device

@arassadin
Author

@tqchen already commented in #315 (comment):

it is likely the gpu schedule for nchw did not work for your specific shape of conv2d and the nvcc compiler failed to compile

But it's still not really clear to me where to look further.

@eqy
Contributor

eqy commented Mar 21, 2018

Are you using a custom schedule for your model? Usually this is caused by a schedule not being able to handle a specific input shape, e.g., the input shape causes too much local memory to be used or too many threads per block to be allocated.
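As a rough sanity check of what a generated schedule has to fit into, the device limits can be queried from the TVM context. A minimal sketch, assuming a CUDA-enabled build and a TVM version whose context object exposes these attribute properties:

import tvm

ctx = tvm.gpu(0)
if ctx.exist:
    # limits that a generated CUDA kernel must respect
    print("max threads per block:", ctx.max_threads_per_block)
    print("warp size:", ctx.warp_size)

If a schedule tries to launch more threads per block than the device allows, or allocates too much local memory, the compiled module fails to load at runtime with errors like the one above.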

@arassadin
Author

Hi,

My code is exactly the same as here, with the only difference being that the model is my own. Its input is 288x512x3, which, I suppose, is not too much for a GTX 1070 with 8 GB.

@eqy
Contributor

eqy commented Mar 21, 2018

OK, then the issue is likely due to an operator not handling one or more of the shapes in your model correctly. One way to verify this is to temporarily try out more common shapes, e.g., those in ResNet-18 such as (224x224x3), and see whether that works.
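A minimal sketch of that experiment, reusing the build/run calls from the tutorial and assuming sym and params come from the same frontend converter as in the tutorial:

import numpy as np
import nnvm.compiler
import tvm
from tvm.contrib import graph_runtime

# sym and params are assumed to come from the tutorial's frontend converter
shape = {'data': (1, 3, 224, 224)}  # common ResNet-style input
with nnvm.compiler.build_config(opt_level=2):
    graph, lib, params = nnvm.compiler.build(sym, 'cuda', shape, params=params)

ctx = tvm.gpu(0)
m = graph_runtime.create(graph, lib, ctx)
m.set_input('data', tvm.nd.array(np.random.uniform(size=shape['data']).astype('float32')))
m.set_input(**params)
m.run()  # if this succeeds, the failure is specific to your model's shapes

If the common shape runs fine, the next step is to narrow down which layer shape in your own model triggers the bad schedule.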

@arassadin
Author

OK, thanks for the tip, I'll try it.

@arassadin
Author

Actually, following the original example with the Keras ResNet-50 model, I get an error even earlier, at

with nnvm.compiler.build_config(opt_level=2):
    graph, lib, params = nnvm.compiler.build(sym, 'cuda', {'data': (1, 3, 224, 224)}, params=params)
---------------------------------------------------------------------------
NNVMError                                 Traceback (most recent call last)
<ipython-input-10-3c6e3b9f11a9> in <module>()
      1 with nnvm.compiler.build_config(opt_level=2):
----> 2     graph, lib, params = nnvm.compiler.build(sym, 'cuda', {'data': (1, 3, 224, 224)}, params=params)

/usr/local/lib/python2.7/dist-packages/nnvm-0.8.0-py2.7.egg/nnvm/compiler/build_module.pyc in build(graph, target, shape, dtype, params, target_host)
    235     # Precompute prune
    236     if params and cfg.pass_enabled("PrecomputePrune"):
--> 237         graph, params = precompute_prune(graph, params)
    238         shape, dtype = _update_shape_dtype(shape, dtype, params)
    239     # Operator Fusion and generation

/usr/local/lib/python2.7/dist-packages/nnvm-0.8.0-py2.7.egg/nnvm/compiler/build_module.pyc in precompute_prune(graph, params)
    328         return graph, params
    329     with tvm.build_config(auto_unroll_max_step=0):
--> 330         out_arrs = _run_graph(pre_graph, params)
    331     return graph, dict(zip(out_names, out_arrs))

/usr/local/lib/python2.7/dist-packages/nnvm-0.8.0-py2.7.egg/nnvm/compiler/build_module.pyc in _run_graph(graph, params)
    277     _, oshape = graph_util.infer_shape(graph, **shape)
    278     _, odtype = graph_util.infer_dtype(graph, **dtype)
--> 279     graph, libmod, _ = build(graph, target, shape, dtype)
    280     m = graph_runtime.create(graph, libmod, ctx)
    281     set_input, run, get_output = m["set_input"], m["run"], m["get_output"]

/usr/local/lib/python2.7/dist-packages/nnvm-0.8.0-py2.7.egg/nnvm/compiler/build_module.pyc in build(graph, target, shape, dtype, params, target_host)
    249     graph = graph.apply("InferShape").apply("InferType")
    250     with target:
--> 251         graph = graph.apply("GraphFusePartition").apply("GraphFuseCompile")
    252     libmod = graph_attr._move_out_module(graph, "module")
    253     return graph, libmod, params

/usr/local/lib/python2.7/dist-packages/nnvm-0.8.0-py2.7.egg/nnvm/graph.pyc in apply(self, passes)
    232         ghandle = GraphHandle()
    233         npass = nn_uint(len(passes))
--> 234         check_call(_LIB.NNGraphApplyPasses(self.handle, npass, cpass, ctypes.byref(ghandle)))
    235         return Graph(ghandle)
    236 

/usr/local/lib/python2.7/dist-packages/nnvm-0.8.0-py2.7.egg/nnvm/_base.pyc in check_call(ret)
     70     """
     71     if ret != 0:
---> 72         raise NNVMError(py_str(_LIB.NNGetLastError()))
     73 
     74 def c_str(string):

NNVMError: TVMCall CFunc Error:
Traceback (most recent call last):
  File "tvm/_ffi/_cython/./function.pxi", line 39, in core.tvm_callback
  File "/usr/local/lib/python2.7/dist-packages/nnvm-0.8.0-py2.7.egg/nnvm/compiler/build_module.py", line 119, in _build
    return tvm.build(funcs, target=target, target_host=target_host)
  File "/usr/local/lib/python2.7/dist-packages/tvm-0.2.0-py2.7-linux-x86_64.egg/tvm/build_module.py", line 471, in build
    mhost = codegen.build_module(fhost, str(target_host))
  File "/usr/local/lib/python2.7/dist-packages/tvm-0.2.0-py2.7-linux-x86_64.egg/tvm/codegen.py", line 20, in build_module
    return _Build(lowered_func, target)
  File "/usr/local/lib/python2.7/dist-packages/tvm-0.2.0-py2.7-linux-x86_64.egg/tvm/_ffi/function.py", line 280, in my_api_func
    return flocal(*args)
  File "tvm/_ffi/_cython/./function.pxi", line 264, in core.FunctionBase.__call__
  File "tvm/_ffi/_cython/./function.pxi", line 213, in core.FuncCall
  File "tvm/_ffi/_cython/./function.pxi", line 205, in core.FuncCall3
  File "tvm/_ffi/_cython/./base.pxi", line 131, in core.CALL
TVMError: [12:04:22] src/codegen/codegen.cc:27: Check failed: bf != nullptr Target llvm is not enabled

Stack trace returned 10 entries:
[bt] (0) /usr/local/lib/python2.7/dist-packages/tvm-0.2.0-py2.7-linux-x86_64.egg/tvm/libtvm.so(dmlc::StackTrace[abi:cxx11]()+0x5a) [0x7fbbd930701a]
[bt] (1) /usr/local/lib/python2.7/dist-packages/tvm-0.2.0-py2.7-linux-x86_64.egg/tvm/libtvm.so(tvm::codegen::Build(tvm::Array<tvm::LoweredFunc, void> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xdac) [0x7fbbd94e4e9c]
[bt] (2) /usr/local/lib/python2.7/dist-packages/tvm-0.2.0-py2.7-linux-x86_64.egg/tvm/libtvm.so(+0x341449) [0x7fbbd9491449]
[bt] (3) /usr/local/lib/python2.7/dist-packages/tvm-0.2.0-py2.7-linux-x86_64.egg/tvm/libtvm.so(TVMFuncCall+0x5e) [0x7fbbd96ab8fe]
[bt] (4) /usr/local/lib/python2.7/dist-packages/tvm-0.2.0-py2.7-linux-x86_64.egg/tvm/_ffi/_cy2/core.so(+0x18be7) [0x7fbbc59d8be7]
[bt] (5) /usr/bin/python2(PyObject_Call+0x3e) [0x4a577e]
[bt] (6) /usr/bin/python2(PyEval_EvalFrameEx+0x2f0d) [0x4bed3d]
[bt] (7) /usr/bin/python2(PyEval_EvalCodeEx+0x306) [0x4b9ab6]
[bt] (8) /usr/bin/python2(PyEval_EvalFrameEx+0x603f) [0x4c1e6f]
[bt] (9) /usr/bin/python2(PyEval_EvalFrameEx+0x553f) [0x4c136f]

but this is probably more about nnvm. What do you think? Would it be correct to answer "no, a common shape doesn't really work either"?

@eqy
Contributor

eqy commented Mar 22, 2018

This is confusing because the error is complaining about llvm not being enabled, though llvm should not be a requirement for CUDA codegen (https://github.com/dmlc/tvm/blob/master/docs/how_to/install.md).

Can you verify that the target is the CUDA GPU?
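A quick way to check both, as a minimal sketch assuming a TVM build from around that time where tvm.module.enabled is available:

import tvm

print("cuda enabled in libtvm:", tvm.module.enabled("cuda"))
print("llvm enabled in libtvm:", tvm.module.enabled("llvm"))
print("gpu(0) present:", tvm.gpu(0).exist)

If llvm comes back disabled, that would explain the "Target llvm is not enabled" check failure above; one common workaround is rebuilding TVM with LLVM support (LLVM_CONFIG in config.mk, or USE_LLVM in config.cmake), as described in the install docs.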

@arassadin
Author

The built-in example fails before the ctx = tvm.gpu(0)...

@arassadin
Author

BTW, with a CPU context and my own model, I got the following trace:

---------------------------------------------------------------------------
TVMError                                  Traceback (most recent call last)
<ipython-input-17-02f7defa23aa> in <module>()
----> 1 m.run()

/usr/local/lib/python2.7/dist-packages/tvm-0.2.0-py2.7-linux-x86_64.egg/tvm/contrib/graph_runtime.pyc in run(self, **input_dict)
    111         if input_dict:
    112             self.set_input(**input_dict)
--> 113         self._run()
    114 
    115     def get_input(self, index, out):

tvm/_ffi/_cython/./function.pxi in core.FunctionBase.__call__()

tvm/_ffi/_cython/./function.pxi in core.FuncCall()

tvm/_ffi/_cython/./function.pxi in core.FuncCall3()

tvm/_ffi/_cython/./base.pxi in core.CALL()

TVMError: [22:40:15] src/codegen/stack_vm/stack_vm.cc:287: Check failed: stack[sp].v_int64 device_type need to be 2

Stack trace returned 10 entries:
[bt] (0) /usr/local/lib/python2.7/dist-packages/tvm-0.2.0-py2.7-linux-x86_64.egg/tvm/libtvm.so(dmlc::StackTrace[abi:cxx11]()+0x5a) [0x7f469448703a]
[bt] (1) /usr/local/lib/python2.7/dist-packages/tvm-0.2.0-py2.7-linux-x86_64.egg/tvm/libtvm.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x28) [0x7f4694487c28]
[bt] (2) /usr/local/lib/python2.7/dist-packages/tvm-0.2.0-py2.7-linux-x86_64.egg/tvm/libtvm.so(tvm::codegen::StackVM::Run(tvm::codegen::StackVM::State*) const+0x2109) [0x7f46946e4b69]
[bt] (3) /usr/local/lib/python2.7/dist-packages/tvm-0.2.0-py2.7-linux-x86_64.egg/tvm/libtvm.so(std::_Function_handler<void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*), tvm::codegen::StackVMModuleNode::GetFunction(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::shared_ptr<tvm::runtime::ModuleNode> const&)::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#1}>::_M_invoke(std::_Any_data const&, tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&)+0x38) [0x7f46946e5f58]
[bt] (4) /usr/local/lib/python2.7/dist-packages/tvm-0.2.0-py2.7-linux-x86_64.egg/tvm/libtvm.so(+0x5abac7) [0x7f469487bac7]
[bt] (5) /usr/local/lib/python2.7/dist-packages/tvm-0.2.0-py2.7-linux-x86_64.egg/tvm/libtvm.so(+0x5aa267) [0x7f469487a267]
[bt] (6) /usr/local/lib/python2.7/dist-packages/tvm-0.2.0-py2.7-linux-x86_64.egg/tvm/libtvm.so(TVMFuncCall+0x5e) [0x7f469482b91e]
[bt] (7) /usr/local/lib/python2.7/dist-packages/tvm-0.2.0-py2.7-linux-x86_64.egg/tvm/_ffi/_cy2/core.so(+0x18be7) [0x7f468108abe7]
[bt] (8) /usr/bin/python2(PyEval_EvalFrameEx+0x578f) [0x4c15bf]
[bt] (9) /usr/bin/python2(PyEval_EvalCodeEx+0x306) [0x4b9ab6]

@tqchen
Member

tqchen commented Mar 25, 2018

The error for passing in a CPU context is correct, because we expect a GPU. Try the latest version in master; it might throw an error that tells you which graph causes the problem.
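For context, device_type 2 is the CUDA device type (kDLGPU) in the DLPack convention TVM follows, so that check fails when a module compiled for 'cuda' is handed a CPU context. A minimal sketch of keeping the build target and the runtime context consistent, where graph and lib are assumed to come from nnvm.compiler.build:

import tvm
from tvm.contrib import graph_runtime

target = 'cuda'  # must match the target passed to nnvm.compiler.build
ctx = tvm.gpu(0) if target == 'cuda' else tvm.cpu(0)
m = graph_runtime.create(graph, lib, ctx)  # graph, lib assumed from nnvm.compiler.build(sym, target, ...)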

@arassadin
Author

Hi,

Following your suggestion, I rebuilt the latest nnvm (c8832cc1c57fc35d8f1e8042c258ac32c0309ebc) with the latest tvm (567a10bb0947180b067f39a97c76d7fe7a3ca1f2), and the traceback hasn't changed:

src/runtime/cuda/cuda_module.cc:93: CUDAError: cuModuleLoadData(&(module_[device_id]), data_.c_str()) failed with error: CUDA_ERROR_INVALID_PTX

for my own model and

src/codegen/codegen.cc:27: Check failed: bf != nullptr Target llvm is not enabled

for Keras ResNet-50 example.

@tqchen
Member

tqchen commented May 19, 2018

Closing since we are not able to act further on this; discussions have moved to https://discuss.tvmlang.org/

@tqchen closed this as completed May 19, 2018
@expectopatronm

I get the exact same issue.

jetson@jetson:~/fast-depth/deploy$ python3 tx2_run_tvm.py --input-fp data/rgb.npy --output-fp data/pred.npy --model-dir ../results/tvm_compiled/tx2_gpu_mobilenet_nnconv5dw_skipadd_pruned/ --cuda True
=> [TVM on TX2] using model files in ../results/tvm_compiled/tx2_gpu_mobilenet_nnconv5dw_skipadd_pruned/
=> [TVM on TX2] loading model lib and ptx
=> [TVM on TX2] loading model graph and params
=> [TVM on TX2] creating TVM runtime module
=> [TVM on TX2] feeding inputs and params into TVM module
=> [TVM on TX2] running TVM module, saving output
Traceback (most recent call last):

File "tx2_run_tvm.py", line 91, in
main()

File "tx2_run_tvm.py", line 88, in main
run_model(args.model_dir, args.input_fp, args.output_fp, args.warmup, args.run, args.cuda, try_randin=args.randin)

File "tx2_run_tvm.py", line 36, in run_model
run() # not gmodule.run()

File "/home/jetson/tvm/python/tvm/_ffi/_ctypes/function.py", line 207, in call
raise get_last_ffi_error()

tvm._ffi.base.TVMError: Traceback (most recent call last):
[bt] (3) /home/jetson/tvm/build/libtvm.so(TVMFuncCall+0x70) [0x7fad7ccec0]
[bt] (2) /home/jetson/tvm/build/libtvm.so(std::_Function_handler<void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*), tvm::runtime::detail::PackFuncVoidAddr<4, tvm::runtime::CUDAWrappedFunc>(tvm::runtime::CUDAWrappedFunc, std::vector<tvm::runtime::detail::ArgConvertCode, std::allocator<tvm::runtime::detail::ArgConvertCode> > const&)::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#1}>::_M_invoke(std::_Any_data const&, tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&)+0xe8) [0x7fad850b08]
[bt] (1) /home/jetson/tvm/build/libtvm.so(tvm::runtime::CUDAWrappedFunc::operator()(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*, void**) const+0x6cc) [0x7fad85093c]
[bt] (0) /home/jetson/tvm/build/libtvm.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x4c) [0x7facfdebac]
File "/home/jetson/tvm/src/runtime/cuda/cuda_module.cc", line 110
File "/home/jetson/tvm/src/runtime/library_module.cc", line 91
CUDAError: Check failed: ret == 0 (-1 vs. 0) : cuModuleLoadData(&(module_[device_id]), data_.c_str()) failed with error: CUDA_ERROR_INVALID_PTX

Still haven't found a solution to it. I am running it on a Jetson Nano. Please help.
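One common cause of CUDA_ERROR_INVALID_PTX on Jetson boards is a compute-capability mismatch: the model directory here is a tx2_gpu_* build, and PTX generated for a TX2 (sm_62) will not load on a Nano (sm_53). A minimal sketch of checking the device and pinning the arch when recompiling the model, assuming a TVM version that exposes compute_version and autotvm's set_cuda_target_arch:

import tvm
from tvm.autotvm.measure.measure_methods import set_cuda_target_arch

print("compute capability:", tvm.gpu(0).compute_version)  # e.g. "5.3" on a Nano, "6.2" on a TX2

# on the machine doing the compilation, pin the CUDA arch to the deployment device
set_cuda_target_arch('sm_53')  # Jetson Nano

Recompiling the model for the Nano's architecture, rather than reusing the TX2 artifacts, should avoid the invalid-PTX load error.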

@tiandiao123
Contributor

> I get the exact same issue. [...] Still haven't found a solution to it. I am running it on a Jetson Nano. Please help.

Did you find a solution? I have the exact same issue. I don't know how to fix it; could you help me?
