
🐛 [Bug] Encountered CUDA 710 error when applying Torch-TensorRT to BERT #1418

Closed · Opened Oct 25, 2022 by Mansterteddy · 2 comments · Fixed by #1424
Labels: bug: triaged [verified] (We can replicate the bug), bug (Something isn't working)
Mansterteddy commented Oct 25, 2022

Bug Description

I wanted to use Torch-TensorRT to speed up BERT model inference, but encountered the following errors:

../aten/src/ATen/native/cuda/Indexing.cu:975: indexSelectLargeIndex: block: [239,0,0], thread: [32,0,0] Assertion srcIndex < srcSelectDimSize failed.
(the same assertion repeats for threads [33,0,0] through [41,0,0])

CUDA initialization failure with error: 710. Please check your CUDA installation: http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html
Segmentation fault (core dumped)
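For context, the `indexSelectLargeIndex` assertion fires when an index passed to an embedding/index-select lookup falls outside the source tensor's selected dimension (e.g. a token id greater than or equal to the vocabulary size); the subsequent error 710 is CUDA's device-side-assert error, not an installation problem. A minimal CPU-side sketch of the same bounds check (all names here are illustrative, not the actual ATen code):

```python
def index_select(table, indices):
    # Mirrors the check behind the CUDA assertion above:
    # every index must satisfy srcIndex < srcSelectDimSize.
    src_select_dim_size = len(table)
    for src_index in indices:
        if not 0 <= src_index < src_select_dim_size:
            raise IndexError(
                f"srcIndex {src_index} >= srcSelectDimSize {src_select_dim_size}"
            )
    return [table[i] for i in indices]

vocab = [[0.0] * 4 for _ in range(100)]  # toy embedding table with 100 "tokens"
rows = index_select(vocab, [0, 5, 99])   # all ids in range: succeeds
assert len(rows) == 3

try:
    index_select(vocab, [100])           # out-of-range "token id"
except IndexError:
    pass
else:
    raise AssertionError("expected IndexError")
```

This suggests the compiled engine is feeding out-of-range indices into an embedding lookup somewhere, which is consistent with the fix that followed.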

To Reproduce

from transformers import BertModel, BertTokenizer, BertConfig
import numpy as np
import torch
import torch_tensorrt
import time

print("VERSION:", torch_tensorrt.__version__)

# Creating a dummy input
test_batchsz = 128
tokens_tensor = torch.ones((test_batchsz, 20)).to(torch.int32).cuda()
segments_tensors = torch.zeros((test_batchsz, 20)).to(torch.int32).cuda()
mask_tensors = torch.ones((test_batchsz, 20)).to(torch.int32).cuda()

model = BertModel.from_pretrained("bert-base-chinese", torchscript=True)
torch_script_module = torch.jit.trace(model.eval().cuda(), (tokens_tensor, mask_tensors, segments_tensors))

trt_ts_module = torch_tensorrt.compile(
    torch_script_module.float(),
    inputs=[
        torch_tensorrt.Input(shape=[test_batchsz, 20], dtype=torch.int32),
        torch_tensorrt.Input(shape=[test_batchsz, 20], dtype=torch.int32),
        torch_tensorrt.Input(shape=[test_batchsz, 20], dtype=torch.int32),
    ],
    enabled_precisions={torch.float},
    workspace_size=2000000000,
    truncate_long_and_double=True,
)

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

  • Torch-TensorRT Version (e.g. 1.0.0): 1.2.0
  • PyTorch Version (e.g. 1.0): 1.12.1
  • CPU Architecture:
  • OS (e.g., Linux): Linux
  • How you installed PyTorch (conda, pip, libtorch, source): pip
  • Build command you used (if compiling from source):
  • Are you using local sources or building from archives:
  • Python version: 3.8
  • CUDA version: 11.6
  • GPU models and configuration:
  • Any other relevant information:

Additional context

@Mansterteddy added the "bug (Something isn't working)" label and changed the issue title to mention "cuda 710" on Oct 25, 2022.
gs-olive (Collaborator) commented Oct 26, 2022

I can confirm the issue is still occurring on the latest commit (ce29cc), built with PyTorch 1.13, and I am investigating the cause. When only using the first two arguments (tokens_tensor and segments_tensors), the model is currently succeeding (compilation + inference) with my configuration, so it seems passing 3+ tensor arguments to the model is causing the error.

@Mansterteddy (Author) replied, quoting the above:

Yes, same for me.

@narendasan narendasan added the bug: triaged [verified] We can replicate the bug label Oct 28, 2022
gs-olive added a commit to gs-olive/TensorRT that referenced this issue on Nov 3, 2022:
- Issue arises when compiling BERT models with 3+ inputs
- Added a temporary fix by narrowing the range of values the random number generator produces for input tensors to [0, 2), instead of [0, 5)
- Used random float inputs in the range [0, 2) instead of ints, then cast them to the desired type; for bug pytorch#1418, random floats in [0, 2) cast to Int effectively restrict the allowed ints to {0, 1}, as the model requires
- A more robust fix is to follow
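The workaround described in that commit can be sketched as follows; `make_random_int_input` is a hypothetical stand-in for the partitioning code's input generator, not the actual function name:

```python
import random

def make_random_int_input(numel, high=2):
    # Draw random floats in [0, high), then cast to int.
    # With high=2, the resulting values are restricted to {0, 1},
    # keeping the generated token/segment ids within the range BERT accepts.
    return [int(random.random() * high) for _ in range(numel)]

vals = make_random_int_input(10000)
assert set(vals) <= {0, 1}
```

The design point is that random calibration inputs for an embedding-consuming model must respect the embedding's index range, otherwise the device-side assert above fires during shape analysis or inference.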