fix qwen2 attention_mask slice (#12307)
MeouSker77 authored Oct 31, 2024
1 parent 3df6195 · commit b9853f9
Showing 1 changed file with 3 additions and 0 deletions.
python/llm/src/ipex_llm/transformers/models/qwen2.py (3 additions, 0 deletions)
@@ -560,6 +560,9 @@ def qwen2_attention_forward(
     if past_key_value is not None:
         kv_seq_len += past_key_value.get_usable_length(kv_seq_len, self.layer_idx)
 
+    if attention_mask is not None:
+        attention_mask = attention_mask[:, :, :, :kv_seq_len]
+
     if should_use_fuse_rope(hidden_states, position_ids, self.training):
         import xe_addons
         xe_addons.rotary_half_inplaced(self.rotary_emb.inv_freq, position_ids,
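The fix trims the last dimension of the 4D attention mask to the usable KV length before the mask is applied, since the mask prepared upstream can cover more positions than the key/value tensors actually hold. A minimal sketch of the shape mismatch the slice avoids; the concrete shapes below are hypothetical, assuming a mask padded beyond kv_seq_len:

```python
import torch

# Hypothetical shapes: the mask is padded to 32 positions,
# but the KV cache only covers 17 of them.
batch, q_len, kv_seq_len, padded_len = 1, 1, 17, 32

# 4D additive mask of shape (batch, 1, q_len, padded_len).
attention_mask = torch.zeros(batch, 1, q_len, padded_len)
# Attention scores of shape (batch, heads, q_len, kv_seq_len).
attn_weights = torch.randn(batch, 1, q_len, kv_seq_len)

# Without the slice, adding the mask fails to broadcast: 32 != 17.
# attn_weights + attention_mask  ->  RuntimeError

# The committed fix: trim the mask to the usable KV length.
attention_mask = attention_mask[:, :, :, :kv_seq_len]
out = attn_weights + attention_mask
print(out.shape)  # torch.Size([1, 1, 1, 17])
```

Slicing the mask rather than padding the scores keeps the change local to the attention forward and leaves the rest of the kernel untouched.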
