[ Bugfix ] Enabling Loading Models With Fused QKV/MLP on Disk with FP8 #5921

Merged
robertgshaw2-redhat merged 4 commits into vllm-project:main from neuralmagic:fp8-phi on Jun 28, 2024

Commits

Commits on Jun 27, 2024

Commits on Jun 28, 2024