Pull requests: HabanaAI/vllm-fork (forked from vllm-project/vllm)
The habana label marks issues or PRs submitted by Habana Labs.
#525: [HPU] Add mark_step configurable for the decoder layer (opened Nov 20, 2024 by jiminha)
#523: [BUG FIX] [SPEC DECODE] 0.6.4 rebase causes incorrectness in spec decode; fixed in this PR (opened Nov 19, 2024 by xuechendi)
#503: Resolved ALIBI bias regression due to porting flat PA (opened Nov 15, 2024 by tannervoas742)
#470: [DO NOT MERGE] Upstream codebase diff (draft; opened Nov 6, 2024 by kzawora-intel; label: habana)
#440: Add models-tiny CI step with Llama3.2-1B (draft; opened Oct 28, 2024 by kzawora-intel; label: habana)
#430: Add HPU information to collect_env script (opened Oct 25, 2024 by michalkuligowski; label: habana)
#407: [PoC] Add max padding ratio to padding aware scheduler (draft; opened Oct 18, 2024 by kzawora-intel; label: habana)