[ROCm][Hardware][AMD] Adding Navi21 to fallback to naive attention if Triton is not used (vllm-project#4658)
alexeykondrat authored and robertgshaw2-redhat committed May 19, 2024
1 parent b1a73b5 commit 3bbe65e
Showing 1 changed file with 3 additions and 2 deletions.
5 changes: 3 additions & 2 deletions vllm/attention/backends/rocm_flash_attn.py
@@ -231,8 +231,9 @@ def __init__(
             self.attn_func = triton_attention
             logger.debug("Using Triton FA in ROCmBackend")
         else:
-            # if not using triton, navi3x not use flash-attn either
-            if torch.cuda.get_device_capability()[0] == 11:
+            # if not using triton, navi3x/navi21/navi10 do not use flash-attn
+            # either
+            if torch.cuda.get_device_capability()[0] != 9:
                 self.use_naive_attn = True
             else:
                 try:
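For context, a minimal standalone sketch of the decision the diff arrives at (not part of the commit; the helper name is hypothetical). It assumes ROCm builds of PyTorch report the GPU's gfx major version through torch.cuda.get_device_capability(), with MI-series CDNA parts reporting 9 and Navi parts reporting 10 or 11.

    import torch

    # Hypothetical helper, for illustration only: mirrors the fallback
    # logic the diff above ends up with.
    def should_use_naive_attn(use_triton_flash_attn: bool) -> bool:
        # The fallback only matters when the Triton flash-attention path
        # is not selected.
        if use_triton_flash_attn:
            return False
        # Under ROCm builds of PyTorch, get_device_capability() reflects
        # the gfx major version: MI-series CDNA GPUs report 9, while
        # Navi10/Navi21 report 10 and Navi3x reports 11, so "major != 9"
        # covers every Navi generation without listing each one.
        major, _minor = torch.cuda.get_device_capability()
        return major != 9

Compared with the previous check, which only matched a major version of 11 (Navi3x), the inverted "!= 9" check also routes Navi21 and Navi10 to the naive attention path when Triton is not used.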
