When I followed your instructions and reached this step:
" # Train the SSD-ResNet-101 321x321
python examples/ssd/ssd_pascal_resnet_321.py "
the following error appeared:
I0420 22:48:18.898177 7233 sgd_solver.cpp:138] Iteration 2520, lr = 0.001
F0421 09:02:45.161278 7233 math_functions.cu:52] Check failed: status == CUBLAS_STATUS_SUCCESS (13 vs. 0) CUBLAS_STATUS_EXECUTION_FAILED
*** Check failure stack trace: ***
@ 0x7f2f6e971daa (unknown)
@ 0x7f2f6e971ce4 (unknown)
@ 0x7f2f6e9716e6 (unknown)
@ 0x7f2f6e974687 (unknown)
@ 0x7f2f6f1d36a5 caffe::caffe_gpu_gemv<>()
@ 0x7f2f6f17a51a caffe::BiasLayer<>::Backward_gpu()
@ 0x7f2f6f193a47 caffe::ScaleLayer<>::Backward_gpu()
@ 0x7f2f6f15c817 caffe::Net<>::BackwardFromTo()
@ 0x7f2f6f15c981 caffe::Net<>::Backward()
@ 0x7f2f6f0b4c8b caffe::Solver<>::Step()
@ 0x7f2f6f0b538e caffe::Solver<>::Solve()
@ 0x40b568 train()
@ 0x40899c main
@ 0x7f2f6d0f1f45 (unknown)
@ 0x4092a3 (unknown)
@ (nil) (unknown)
Aborted (core dumped)
I tried several times, but the same core dump error keeps occurring at seemingly random iteration counts (the most recent failure was at "Iteration 9920"). The in-progress caffemodel is not saved automatically when training fails, so the iteration count starts from 0 again. Time wasted!
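To limit the lost time while the crash is being investigated, one option is to snapshot more often and resume from the newest snapshot. Caffe writes `<snapshot_prefix>_iter_<N>.solverstate` files at the interval given by the solver's `snapshot` field, and `caffe train` accepts `--snapshot=<file>.solverstate` to resume. Below is a minimal sketch of a helper that picks the newest solverstate and builds the resume command. The function names, the default `caffe_bin` path, and all directories and prefixes are hypothetical, not taken from the SSD scripts.

```python
import os
import re


def latest_solverstate(snapshot_dir, prefix):
    """Return (iteration, path) of the newest '<prefix>_iter_<N>.solverstate',
    or None if the directory holds no matching file."""
    pattern = re.compile(r"^" + re.escape(prefix) + r"_iter_(\d+)\.solverstate$")
    best = None
    for name in os.listdir(snapshot_dir):
        match = pattern.match(name)
        if match:
            iteration = int(match.group(1))
            if best is None or iteration > best[0]:
                best = (iteration, os.path.join(snapshot_dir, name))
    return best


def resume_command(solver_file, snapshot_dir, prefix,
                   caffe_bin="./build/tools/caffe"):  # hypothetical default path
    """Build the argument list for 'caffe train', resuming if a snapshot exists."""
    cmd = [caffe_bin, "train", "--solver=" + solver_file]
    found = latest_solverstate(snapshot_dir, prefix)
    if found is not None:
        cmd.append("--snapshot=" + found[1])
    return cmd
```

The returned list can be passed to `subprocess.call`. This only helps if the solver's `snapshot` interval is small enough that a snapshot exists before the crash, so lowering that value is the assumption this sketch relies on.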
I found the error code in the CUDA Toolkit documentation. It seems to be a cuBLAS library or driver issue.
2.2.2. cublasStatus_t CUBLAS_STATUS_EXECUTION_FAILED
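For reference, the "(13 vs. 0)" in the check failure is the numeric `cublasStatus_t` value compared against `CUBLAS_STATUS_SUCCESS`. The sketch below maps the numeric codes to their names as I understand the `cublasStatus_t` enum in the cuBLAS headers, and pulls the code out of a Caffe check-failure line. The helper names are hypothetical, and the regular expression assumes the log format shown in the trace above.

```python
import re

# Numeric values of cublasStatus_t (as defined in the cuBLAS headers).
CUBLAS_STATUS = {
    0: "CUBLAS_STATUS_SUCCESS",
    1: "CUBLAS_STATUS_NOT_INITIALIZED",
    3: "CUBLAS_STATUS_ALLOC_FAILED",
    7: "CUBLAS_STATUS_INVALID_VALUE",
    8: "CUBLAS_STATUS_ARCH_MISMATCH",
    11: "CUBLAS_STATUS_MAPPING_ERROR",
    13: "CUBLAS_STATUS_EXECUTION_FAILED",
    14: "CUBLAS_STATUS_INTERNAL_ERROR",
    15: "CUBLAS_STATUS_NOT_SUPPORTED",
    16: "CUBLAS_STATUS_LICENSE_ERROR",
}


def cublas_status_name(code):
    """Return the enum name for a numeric status, or a placeholder if unknown."""
    return CUBLAS_STATUS.get(code, "UNKNOWN_CUBLAS_STATUS_%d" % code)


def status_from_check_failure(line):
    """Extract the failing status name from a glog 'Check failed' line,
    or return None if the line does not contain an '(X vs. Y)' pair."""
    match = re.search(r"\((\d+) vs\. (\d+)\)", line)
    if match is None:
        return None
    return cublas_status_name(int(match.group(1)))
```

Applied to the failing line in the trace, this yields `CUBLAS_STATUS_EXECUTION_FAILED`, which the cuBLAS documentation describes as the GPU program failing to execute, so it does not by itself distinguish a driver problem from a hardware or kernel-launch problem.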
The weird thing is that I trained an SSD caffemodel (weiliu89's version, voc0712, vgg16, iter=12000) successfully on this computer several days ago. Waiting for your reply.
I am using tx1; the CUDA version is the latest from NVIDIA: cuda_8.0.61_375.26_linux.run
Here is my computer information:
Fri Apr 21 10:49:22 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  TITAN X (Pascal)    Off  | 0000:01:00.0      On |                  N/A |
| 23%   33C    P8    15W / 250W |    186MiB / 12188MiB |      4%      Default |
+-------------------------------+----------------------+----------------------+
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61