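Hello,
When reading data with imagenet.py, I immediately get an "an illegal memory access was encountered" error. What could be causing this? My GPUs are 2 × V100, so running out of GPU memory should not be the problem. I have not changed anything in the source code other than the dataset location.
The error log is below: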
root@test-6gwz28fvc:/data1/test# python imagenet.py
DALI "gpu" variant
read 1281167 files from 1000 directories
140020509374208 Exception in thread: CUDA runtime API error cudaErrorIllegalAddress (77):
an illegal memory access was encountered
Traceback (most recent call last):
File "imagenet.py", line 105, in <module>
num_threads=4, crop=224, device_id=0, num_gpus=1)
File "imagenet.py", line 67, in get_imagenet_iter_dali
dali_iter_train = DALIClassificationIterator(pip_train, size=pip_train.epoch_size("Reader") // world_size)
File "/usr/local/miniconda3/lib/python3.6/site-packages/nvidia/dali/plugin/pytorch.py", line 338, in __init__
last_batch_padded = last_batch_padded)
File "/usr/local/miniconda3/lib/python3.6/site-packages/nvidia/dali/plugin/pytorch.py", line 148, in __init__
self._first_batch = self.next()
File "/usr/local/miniconda3/lib/python3.6/site-packages/nvidia/dali/plugin/pytorch.py", line 245, in next
return self.__next__()
File "/usr/local/miniconda3/lib/python3.6/site-packages/nvidia/dali/plugin/pytorch.py", line 163, in __next__
outputs.append(p.share_outputs())
File "/usr/local/miniconda3/lib/python3.6/site-packages/nvidia/dali/pipeline.py", line 409, in share_outputs
return self._pipe.ShareOutputs()
RuntimeError: Critical error in pipeline: Error in thread 0: CUDA runtime API error cudaErrorIllegalAddress (77):
an illegal memory access was encountered
Current pipeline object is no longer valid.
terminate called after throwing an instance of 'dali::CUDAError'
what(): CUDA runtime API error cudaErrorIllegalAddress (77):
an illegal memory access was encountered
Aborted (core dumped)
Could you please take a look? Thanks.