Bug Description
When compiling the T5-base network (https://huggingface.co/t5-base), the following error is encountered:
To Reproduce
Steps to reproduce the behavior:
1. Run torch_tensorrt.compile with the t5-base model as input, using fp32 precision.
2. Choose two fixed-size inputs of shape [1, 128] and [1, 128], and enable truncate_long_and_double with a 12 GB workspace.
3. Pass in model keyword args to disable attention and hidden state outputs.
4. Run inference using the compiled model on two sample inputs (see the sketch after this list).
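A minimal sketch of these steps, assuming the Hugging Face T5Model class and randomly generated token IDs (the exact reproduction script is not shown in the issue):

```python
import torch
import torch_tensorrt
from transformers import T5Model

# Load t5-base with attention and hidden-state outputs disabled, per step 3.
model = T5Model.from_pretrained(
    "t5-base",
    return_dict=False,
    output_attentions=False,
    output_hidden_states=False,
).eval()

# Two fixed-size [1, 128] inputs: encoder and decoder token IDs (step 2).
input_ids = torch.randint(0, model.config.vocab_size, (1, 128), dtype=torch.int32)
decoder_input_ids = torch.randint(0, model.config.vocab_size, (1, 128), dtype=torch.int32)

# Compile in fp32 with truncate_long_and_double and a 12 GB workspace (steps 1-2).
compiled = torch_tensorrt.compile(
    model,
    inputs=[
        torch_tensorrt.Input(shape=(1, 128), dtype=torch.int32),
        torch_tensorrt.Input(shape=(1, 128), dtype=torch.int32),
    ],
    enabled_precisions={torch.float32},
    truncate_long_and_double=True,
    workspace_size=12 << 30,
)

# Run inference on the two sample inputs (step 4).
with torch.no_grad():
    outputs = compiled(input_ids.cuda(), decoder_input_ids.cuda())
```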
Expected behavior
Model should successfully compile with Torch-TRT. Specifically, internal device mismatch issues should either be addressed with a warning at compile time, or should otherwise not cause errors.
Environment
Torch-TensorRT Version: 1.4.0.dev0+f43be5b6
PyTorch Version: 1.14.0.dev20221114+cu116
CPU Architecture: Intel Xeon CPU
OS: Ubuntu 20.04
How you installed PyTorch: pip
Build command you used: python setup.py develop
Are you using local sources or building from archives: local
Python version: 3.8.13
CUDA version: 11.6
Additional context
The problem seems related to #1416, which was intended to address device mismatch issues of this sort. Since this case is not caught by that PR, it likely arises in a different area, for example as a result of an internal computation in a Torch block.
The root cause is that various model-internal auxiliary tensors are initialized on the CPU. Running model.cuda() and putting both input tensors on the GPU resolves the compilation issue.
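A minimal sketch of that workaround, reusing the hypothetical setup from the reproduction sketch above:

```python
import torch
from transformers import T5Model

# Workaround: initialize the model on the GPU so its internal auxiliary
# tensors live on the same device as the inputs, then compile as before.
model = T5Model.from_pretrained("t5-base", return_dict=False).cuda().eval()
input_ids = torch.randint(0, model.config.vocab_size, (1, 128), dtype=torch.int32).cuda()
decoder_input_ids = torch.randint(0, model.config.vocab_size, (1, 128), dtype=torch.int32).cuda()
# ...then call torch_tensorrt.compile exactly as in the reproduction sketch.
```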
This model is one operator away from full TensorRT support (requiring only aten::full_like). However, full compilation is not currently functional because the model outputs are in Tuple form, which Torch-TensorRT does not yet support; that could warrant a new feature, as in #629.
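For illustration, one hypothetical way to sidestep the Tuple-output limitation until such a feature lands (an assumption, not a workaround prescribed in this issue) is to wrap the model so its forward returns a single Tensor:

```python
import torch
from transformers import T5Model

class SingleOutputT5(torch.nn.Module):
    """Hypothetical wrapper: returns only the primary tensor so the compiled
    graph has a single Tensor output instead of a Tuple."""

    def __init__(self, model: torch.nn.Module):
        super().__init__()
        self.model = model

    def forward(self, input_ids: torch.Tensor, decoder_input_ids: torch.Tensor) -> torch.Tensor:
        outputs = self.model(input_ids=input_ids, decoder_input_ids=decoder_input_ids)
        return outputs[0]  # last_hidden_state only

wrapped = SingleOutputT5(T5Model.from_pretrained("t5-base", return_dict=False)).cuda().eval()
```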