Pull requests: mit-han-lab/llm-awq
Replace FasterTransformers like KV cache layout and kernel with flash attention for better support for longer sequence
#239 opened Nov 16, 2024 by JerryGJX
Suggest: Add Bayesian optimization support for ratio search
#104 opened Oct 26, 2023 by trotsky1997