This repository has been archived by the owner on Oct 11, 2024. It is now read-only.
Upstream sync 2024-03-14 #127
Merged
robertgshaw2-neuralmagic merged 114 commits into main from upstream-sync-2024-03-14 on Mar 15, 2024
+3,781 −1,422
Commits
Commits on Feb 22, 2024
Commits on Feb 23, 2024
Commits on Feb 25, 2024
Commits on Feb 26, 2024
Commits on Feb 27, 2024
Commits on Feb 28, 2024
Commits on Feb 29, 2024
Commits on Mar 1, 2024
Commits on Mar 2, 2024
Commits on Mar 4, 2024
Commits on Mar 5, 2024
Commits on Mar 6, 2024
Commits on Mar 7, 2024
Commits on Mar 8, 2024
Commits on Mar 11, 2024
Commits on Mar 12, 2024
Commits on Mar 13, 2024
- Add missing kernel for CodeLlama-34B on A/H100 (no tensor parallelism) when using Multi-LoRA. (vllm-project#3350)
Commits on Mar 14, 2024
- [Kernel] change benchmark script so that result can be directly used; tune moe kernel in A100/H100 with tp=2,4,8 (vllm-project#3389)
Commits on Mar 15, 2024