[autotvm] runtime errors for simple matmul example with double matrix size #3823

Closed
jdomke opened this issue Aug 23, 2019 · 1 comment

jdomke commented Aug 23, 2019

Steps to reproduce (master branch, commit ebda258):

download tune_simple_template.py from
https://docs.tvm.ai/tutorials/autotvm/tune_simple_template.html

change matrix size to 1k*1k:
N, L, M = 1024, 1024, 1024 # 512, 512, 512
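
For reference, here is a minimal sketch of how this change affects the size of the search space. It assumes that a split knob with `policy=all` and `num_outputs=2` simply enumerates every ordered factor pair of the axis length; the helper name `two_way_splits` is made up for illustration and is not part of AutoTVM.

```python
def two_way_splits(n):
    """Return every ordered pair [a, b] of positive integers with a * b == n."""
    return [[a, n // a] for a in range(1, n + 1) if n % a == 0]


# Compare the tutorial's original size with the enlarged one.
for size in (512, 1024):
    per_axis = len(two_way_splits(size))
    # tile_y and tile_x are independent knobs, so the total space is the product.
    print(size, per_axis, per_axis * per_axis)
```

Under that assumption, 1024 gives 11 candidates per axis and 11 * 11 = 121 configurations in total, which agrees with the `len=11` and `len=121` values reported by the tuner for this run.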

output when running with python3.6:
ConfigSpace (len=121, space_map=
0 tile_y: Split(policy=all, product=1024, num_outputs=2) len=11
1 tile_x: Split(policy=all, product=1024, num_outputs=2) len=11
)
Get devices for measurement successfully!
No: 1 GFLOPS: 5.62/5.62 result: MeasureResult(costs=(0.3818782868,), error_no=0, all_cost=6.303944110870361, timestamp=1566519646.8153532) [('tile_y', [1, 1024]), ('tile_x', [16, 64])],,None,76
No: 2 GFLOPS: 0.00/5.62 result: MeasureResult(costs=(RuntimeError('Traceback (most recent call last):\n [bt] (3) /scr0/jens/tvm/build/libtvm.so(TVMFuncCall+0x46) [0x7f6f28375d26]\n [bt] (2) /scr0/jens/tvm/build/libtvm.so(std::_Function_handler<void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*), tvm::runtime::RPCModuleNode::WrapRemote(void*)::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#1}>::_M_invoke(std::_Any_data const&, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)+0x49) [0x7f6f283b6bc9]\n [bt] (1) /scr0/jens/tvm/build/libtvm.so(tvm::runtime::RPCSession::CallFunc(void*, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*, tvm::runtime::PackedFunc const*)+0x121) [0x7f6f283beca1]\n [bt] (0) /scr0/jens/tvm/build/libtvm.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x22) [0x7f6f27cc1e82]\n File "/scr0/jens/tvm/src/runtime/rpc/rpc_session.cc", line 962\nTVMError: Check failed: code == RPCCode: :kReturn: code=4',),), error_no=4, all_cost=10.16835331916809, timestamp=1566519657.0405073) [('tile_y', [512, 2]), ('tile_x', [256, 4])],,None,23
No: 3 GFLOPS: 5.24/5.62 result: MeasureResult(costs=(0.4094461226,), error_no=0, all_cost=6.739230632781982, timestamp=1566519663.857611) [('tile_y', [1024, 1]), ('tile_x', [32, 32])],,None,55
No: 4 GFLOPS: 0.00/5.62 result: MeasureResult(costs=(RuntimeError('Traceback (most recent call last):\n [bt] (3) /scr0/jens/tvm/build/libtvm.so(TVMFuncCall+0x46) [0x7f6f28375d26]\n [bt] (2) /scr0/jens/tvm/build/libtvm.so(std::_Function_handler<void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*), tvm::runtime::RPCModuleNode::WrapRemote(void*)::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#1}>::_M_invoke(std::_Any_data const&, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)+0x49) [0x7f6f283b6bc9]\n [bt] (1) /scr0/jens/tvm/build/libtvm.so(tvm::runtime::RPCSession::CallFunc(void*, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*, tvm::runtime::PackedFunc const*)+0x121) [0x7f6f283beca1]\n [bt] (0) /scr0/jens/tvm/build/libtvm.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x22) [0x7f6f27cc1e82]\n File "/scr0/jens/tvm/src/runtime/rpc/rpc_session.cc", line 962\nTVMError: Check failed: code == RPCCode: :kReturn: code=4',),), error_no=4, all_cost=10.166363716125488, timestamp=1566519674.0822878) [('tile_y', [32, 32]), ('tile_x', [1024, 1])],,None,5
No: 5 GFLOPS: 0.00/5.62 result: MeasureResult(costs=(RuntimeError('Traceback (most recent call last):\n [bt] (3) /scr0/jens/tvm/build/libtvm.so(TVMFuncCall+0x46) [0x7f6f28375d26]\n [bt] (2) /scr0/jens/tvm/build/libtvm.so(std::_Function_handler<void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*), tvm::runtime::RPCModuleNode::WrapRemote(void*)::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#1}>::_M_invoke(std::_Any_data const&, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)+0x49) [0x7f6f283b6bc9]\n [bt] (1) /scr0/jens/tvm/build/libtvm.so(tvm::runtime::RPCSession::CallFunc(void*, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*, tvm::runtime::PackedFunc const*)+0x121) [0x7f6f283beca1]\n [bt] (0) /scr0/jens/tvm/build/libtvm.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x22) [0x7f6f27cc1e82]\n File "/scr0/jens/tvm/src/runtime/rpc/rpc_session.cc", line 962\nTVMError: Check failed: code == RPCCode: :kReturn: code=4',),), error_no=4, all_cost=10.1862473487854, timestamp=1566519684.325056) [('tile_y', [4, 256]), ('tile_x', [256, 4])],,None,30
No: 6 GFLOPS: 5.12/5.62 result: MeasureResult(costs=(0.41928264740000004,), error_no=0, all_cost=6.90018105506897, timestamp=1566519691.3031137) [('tile_y', [8, 128]), ('tile_x', [32, 32])],,None,62
No: 7 GFLOPS: 0.00/5.62 result: MeasureResult(costs=(RuntimeError('Traceback (most recent call last):\n [bt] (3) /scr0/jens/tvm/build/libtvm.so(TVMFuncCall+0x46) [0x7f6f28375d26]\n [bt] (2) /scr0/jens/tvm/build/libtvm.so(std::_Function_handler<void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*), tvm::runtime::RPCModuleNode::WrapRemote(void*)::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#1}>::_M_invoke(std::_Any_data const&, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)+0x49) [0x7f6f283b6bc9]\n [bt] (1) /scr0/jens/tvm/build/libtvm.so(tvm::runtime::RPCSession::CallFunc(void*, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*, tvm::runtime::PackedFunc const*)+0x121) [0x7f6f283beca1]\n [bt] (0) /scr0/jens/tvm/build/libtvm.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x22) [0x7f6f27cc1e82]\n File "/scr0/jens/tvm/src/runtime/rpc/rpc_session.cc", line 962\nTVMError: Check failed: code == RPCCode: :kReturn: code=4',),), error_no=4, all_cost=10.168351650238037, timestamp=1566519701.527077) [('tile_y', [4, 256]), ('tile_x', [128, 8])],,None,41
No: 8 GFLOPS: 6.87/6.87 result: MeasureResult(costs=(0.3125918934,), error_no=0, all_cost=5.210693359375, timestamp=1566519706.7963977) [('tile_y', [4, 256]), ('tile_x', [4, 256])],,None,96
No: 9 GFLOPS: 8.50/8.50 result: MeasureResult(costs=(0.2527610564,), error_no=0, all_cost=4.379783868789673, timestamp=1566519711.0908685) [('tile_y', [128, 8]), ('tile_x', [8, 128])],,None,80
No: 10 GFLOPS: 7.41/8.50 result: MeasureResult(costs=(0.2897818942,), error_no=0, all_cost=4.822477102279663, timestamp=1566519715.9805415) [('tile_y', [1, 1024]), ('tile_x', [1, 1024])],,None,120
Finish loading 60 records
execution time for kernel: 0.4322652816772461
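
As a sanity check on the successful trials, the GFLOPS column can be reproduced from the measured cost. This is only a sketch: it assumes the tuner counts one multiply and one add per inner-product step, i.e. 2 * N * L * M floating point operations in total, and the function name `gflops` is chosen here for illustration.

```python
N, L, M = 1024, 1024, 1024


def gflops(mean_cost_seconds, n=N, l=L, m=M):
    """Convert a measured matmul run time in seconds to GFLOPS.

    Assumes 2 * n * l * m floating point operations (multiply + add).
    """
    flop = 2 * n * l * m
    return flop / mean_cost_seconds / 1e9


# Costs taken from trials No. 1 and No. 9 in the log above.
print(round(gflops(0.3818782868), 2))
print(round(gflops(0.2527610564), 2))
```

With that operation count, the two costs land within rounding of the 5.62 and 8.50 GFLOPS printed for those trials, so the non-failing measurements look internally consistent.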

tqchen (Member) commented Aug 29, 2019

Please open a new thread on https://discuss.tvm.ai

AutoTVM will indeed try some kernels that lead to failures. The idea is that the runtime should be able to catch the error and recover when this happens. Depending on how robust the driver is, this may not hold for all drivers.
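
To illustrate what recovering from failed candidates means for the final result, here is a small sketch that keeps only the trials with `error_no == 0` and picks the fastest one. The `MeasureResult` namedtuple is a hypothetical stand-in that mirrors the fields printed in the log; it is not the class from `tvm.autotvm`, and the `RuntimeError` messages are abbreviated placeholders for the full tracebacks.

```python
from collections import namedtuple

# Hypothetical stand-in with the same field names as the log output.
MeasureResult = namedtuple(
    "MeasureResult", ["costs", "error_no", "all_cost", "timestamp"]
)

# A few (config, result) pairs copied from the log in this issue.
records = [
    ({"tile_y": [1, 1024], "tile_x": [16, 64]},
     MeasureResult((0.3818782868,), 0, 6.303944110870361, 1566519646.8153532)),
    ({"tile_y": [512, 2], "tile_x": [256, 4]},
     MeasureResult((RuntimeError("code=4"),), 4, 10.16835331916809, 1566519657.0405073)),
    ({"tile_y": [128, 8], "tile_x": [8, 128]},
     MeasureResult((0.2527610564,), 0, 4.379783868789673, 1566519711.0908685)),
    ({"tile_y": [4, 256], "tile_x": [128, 8]},
     MeasureResult((RuntimeError("code=4"),), 4, 10.168351650238037, 1566519701.527077)),
]


def best_valid(records):
    """Return the (config, result) pair with the lowest mean cost among
    trials that finished without error, or None if every trial failed."""
    valid = [(cfg, res) for cfg, res in records if res.error_no == 0]
    if not valid:
        return None
    return min(valid, key=lambda pair: sum(pair[1].costs) / len(pair[1].costs))


best_config, best_result = best_valid(records)
print(best_config, best_result.costs)
```

In this sketch a failed trial costs tuning time but does not influence which configuration is selected, which is the behaviour described above when the driver recovers cleanly.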

tqchen closed this as completed Aug 29, 2019