Trying to train GpuIndexIVFPQ and add vectors to the index. Seg fault.
Platform
OS: Ubuntu 16.04
Faiss version:
Faiss compilation options:
Running on :
GPU Titan X 12GB
Reproduction instructions
Briefly, I am trying to search over 50 million 128-D vectors using GpuIndexIVFPQ (PQ8) on a GTX Titan X with 12 GB of memory. The program crashes while I add the vectors to the index. However, it works fine when I add only 40 million vectors. I checked nvidia-smi and there appears to be enough free memory.
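For reference, the sizes involved can be worked out with a few lines of C++. This is only a sketch of the arithmetic, assuming float32 vectors and the feat_num=50099900 and feat_dim=128 values from the backtrace; the helper names are made up for illustration, and whether any of these numbers is what actually triggers the crash is not established here.

```cpp
#include <cstdint>
#include <limits>

// Number of float elements in an n x dim matrix, computed in 64 bits
// so the product itself cannot overflow for the sizes discussed here.
std::int64_t element_count(std::int64_t n, std::int64_t dim) {
    return n * dim;
}

// Raw size in bytes of an n x dim float32 matrix (before any PQ encoding).
std::int64_t raw_bytes(std::int64_t n, std::int64_t dim) {
    return element_count(n, dim) * static_cast<std::int64_t>(sizeof(float));
}

// True if the element count is representable as a 32-bit signed int,
// the index type that appears in Tensor<float, 2, true, int, ...>.
bool fits_in_int32(std::int64_t n, std::int64_t dim) {
    return element_count(n, dim) <= std::numeric_limits<std::int32_t>::max();
}
```

With n = 50099900 and dim = 128 this gives 6,412,787,200 elements and 25,651,148,800 bytes of raw float data, which is more than the 12 GB on the card and more than a 32-bit int can count. The 40 million case (5,120,000,000 elements) is also over both limits, so these figures alone do not explain why one size works and the other does not.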
Here is the bt of gdb:
Program received signal SIGSEGV, Segmentation fault.
0x00007fffdc1e9b9c in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
(gdb) bt
#0 0x00007fffdc1e9b9c in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#1 0x00007fffdc29610e in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#2 0x00007fffdc35edf9 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#3 0x00007fffdc35f9c5 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#4 0x00007fffdc297620 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#5 0x00007fffdc1b93f8 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#6 0x00007fffdc1ba910 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#7 0x00007fffdc2fa8b2 in cuMemcpyHtoDAsync_v2 ()
from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#8 0x00007ffff416e8cc in ?? ()
from /usr/local/cuda-8.0/lib64/libcudart.so.8.0
#9 0x00007ffff414ab5b in ?? ()
from /usr/local/cuda-8.0/lib64/libcudart.so.8.0
#10 0x00007ffff4184b08 in cudaMemcpyAsync ()
from /usr/local/cuda-8.0/lib64/libcudart.so.8.0
#11 0x0000000000435609 in faiss::gpu::Tensor<float, 2, true, int, faiss::gpu::traits::DefaultPtrTraits>::copyFrom(faiss::gpu::Tensor<float, 2, true, int, faiss::gpu::traits::DefaultPtrTraits>&, CUstream_st*) ()
#12 0x0000000000433a69 in faiss::gpu::DeviceTensor<float, 2, true, int, faiss::gpu::traits::DefaultPtrTraits> faiss::gpu::toDevice<float, 2>(faiss::gpu::GpuResources*, int, float*, CUstream_st*, std::initializer_list) ()
#13 0x000000000043de15 in faiss::gpu::GpuIndexIVFPQ::addImpl_(long, float const*, long const*) ()
#14 0x000000000043a9e2 in faiss::gpu::GpuIndex::addInternal_(long, float const*, long const*) ()
#15 0x000000000043a744 in faiss::gpu::GpuIndex::add_with_ids(long, float const*, long const*) ()
#16 0x000000000040c90b in CwAnnTopkImpl::add_with_batch_gpu (
this=0x7fffffffc080, vec_feats=0x7ff9d3128010, feat_num=50099900,
ids=0x7ff9b52e0010) at CwAnnTopkImpl.cpp:219
#17 0x000000000040daed in CwAnnTopkImpl::add_with_ids_cwfeat_gpu (
this=0x7fffffffc080, vec_feats=0x7ff9d3128010, feat_num=50099900,
feat_dim=128, ids=0x7ff9b52e0010) at CwAnnTopkImpl.cpp:560
#18 0x0000000000409b00 in main (argc=1, argv=0x7fffffffe118)
at test/test_cwimpl_testhitrate.cpp:143
Appreciate any help.
@ZhuoranLyu can you use gdb to print out the locals and arguments of stack frame 11 above, the one in faiss::gpu::Tensor<float, 2, true, int, faiss::gpu::traits::DefaultPtrTraits>::copyFrom(faiss::gpu::Tensor<float, 2, true, int, faiss::gpu::traits::DefaultPtrTraits>&, CUstream_st*)?
@wickedfoo I tried info args to get the arguments of that stack frame, but gdb always says no symbol table info is available. Is there any other way to print out the locals?
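While the frame 11 arguments are unavailable, one way to narrow the problem down is to avoid the single add_with_ids call with feat_num=50099900 and feed the index in smaller chunks instead. The following is a minimal sketch, not a confirmed fix: add_in_batches is a hypothetical helper, and it only assumes that the index type exposes add_with_ids(n, x, ids) with the argument order seen in the backtrace.

```cpp
#include <algorithm>
#include <cstdint>
#include <stdexcept>

// Hypothetical helper: adds n vectors of dimension dim to `index` in
// chunks of at most batch_size vectors. IndexT is any type with an
// add_with_ids(n, x, ids) member, as GpuIndex has in the backtrace.
// IdT is the id type the index expects (long in the backtrace).
template <typename IndexT, typename IdT>
void add_in_batches(IndexT& index,
                    std::int64_t n,
                    std::int64_t dim,
                    const float* x,
                    const IdT* ids,
                    std::int64_t batch_size) {
    if (batch_size <= 0) {
        throw std::invalid_argument("batch_size must be positive");
    }
    for (std::int64_t start = 0; start < n; start += batch_size) {
        // The last chunk may be shorter than batch_size.
        const std::int64_t count = std::min(batch_size, n - start);
        // Offsets are computed in 64 bits: start * dim floats into x,
        // start entries into ids.
        index.add_with_ids(count, x + start * dim, ids + start);
    }
}
```

In the code from the backtrace this would be called roughly as add_in_batches(gpu_index, feat_num, feat_dim, vec_feats, ids, 1000000), where gpu_index and the batch size of one million are assumptions for illustration. At 128 dimensions, one million float32 vectors is 512,000,000 bytes per call, and if the crash disappears with smaller chunks, that points at the size of the single host-to-device copy.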