whisper.transcribe() Fails on Multi-GPU Systems: Incorrect Tensor Transfer to CUDA Device #2382

jaimebs2 · 2024-10-10T10:50:00Z

I’ve encountered a bug in the dtw_cuda() function in timing.py: the cost tensor is moved with .cuda() without regard to the device that holds the input tensor x.

Called with no argument, .cuda() moves the tensor to the current CUDA device, which is "cuda:0" unless it has been changed. If x is on a different device, such as "cuda:1", this results in a device mismatch. The cost tensor should instead be moved to the device where x is located (i.e., x.device).

With x on "cuda:1", the call raises the following error:

ValueError: Pointer argument (at 2) cannot be accessed from Triton (cpu tensor?)

Solution:
In dtw_cuda(), replace the cost.cuda() call with:

cost = cost.to(device=x.device)
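For illustration, here is a minimal sketch of the same idea in isolation. The helper name allocate_cost_like is hypothetical and not part of Whisper; the cost-matrix shape and initialization are assumed from timing.py and should be checked against the installed version. It creates the tensor directly on x.device, so no transfer to the default device takes place.

```python
import torch


def allocate_cost_like(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: build a DTW cost matrix on the same device as x.

    Assumes x has shape (M, N) and that the cost matrix has shape
    (N + M + 2, M + 2), filled with +inf except for a zero at [0, 0].
    """
    M, N = x.shape
    # Allocate on x.device rather than relying on the current CUDA device.
    cost = torch.full((N + M + 2, M + 2), float("inf"), device=x.device)
    cost[0, 0] = 0
    return cost


if __name__ == "__main__":
    # Use a second GPU when one exists; otherwise fall back to the CPU so
    # the sketch still runs on single-GPU or CPU-only machines.
    device = torch.device("cuda:1" if torch.cuda.device_count() > 1 else "cpu")
    x = torch.zeros(3, 5, device=device)
    cost = allocate_cost_like(x)
    print(cost.device == x.device)
```

Allocating on the target device also avoids building the tensor on the CPU first and copying it over, though the one-line change above is the smaller edit to the existing function.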

lukebelbina · 2024-11-04T23:09:49Z

I am also running into this issue.
