This repository has been archived by the owner on Oct 11, 2024. It is now read-only.
Upstream sync 2024-03-14 #127
Merged
robertgshaw2-neuralmagic merged 114 commits into main from upstream-sync-2024-03-14 on Mar 15, 2024
+3,781 −1,422
Commits
Commits on Feb 22, 2024
Commits on Feb 23, 2024
Commits on Feb 25, 2024
Commits on Feb 26, 2024
Commits on Feb 27, 2024
Commits on Feb 28, 2024
Commits on Feb 29, 2024
Commits on Mar 1, 2024
Commits on Mar 2, 2024
Commits on Mar 4, 2024
Commits on Mar 5, 2024
Commits on Mar 6, 2024
Commits on Mar 7, 2024
Commits on Mar 8, 2024
Commits on Mar 11, 2024
Commits on Mar 12, 2024
Commits on Mar 13, 2024
- Add missing kernel for CodeLlama-34B on A/H100 (no tensor parallelism) when using Multi-LoRA. (vllm-project#3350)
Commits on Mar 14, 2024
- [Kernel] change benchmark script so that result can be directly used; tune moe kernel in A100/H100 with tp=2,4,8 (vllm-project#3389)
Commits on Mar 15, 2024