Add multi-lora support for Triton vLLM backend #23

Merged

oandreeva-nv merged 28 commits into triton-inference-server:main from l1cacheDell:main

Apr 18, 2024

+762 -9

Commits on Nov 28, 2023

add lora support for backend
l1cacheDell
committed

Commits on Nov 29, 2023

Commits on Nov 30, 2023

bug fix
l1cacheDell
committed

Commits on Dec 11, 2023

Merge branch 'triton-inference-server:main' into main
l1cacheDell
authored

Commits on Dec 28, 2023

CodeReview: remove comment and update docs
l1cacheDell
committed

Commits on Dec 29, 2023

bug fix: non-graceful terminate
l1cacheDell
committed

Commits on Dec 30, 2023

update docs to specify container version
l1cacheDell
committed

Commits on Jan 30, 2024

Commits on Jan 31, 2024

Commits on Mar 2, 2024

Commits on Mar 3, 2024

update client_lora.py for main branch recent commits
l1cacheDell
committed

Commits on Mar 5, 2024

Commits on Mar 13, 2024

fix client_lora process_stream
l1cacheDell
committed

Commits on Mar 15, 2024

add ci test for multi-lora
l1cacheDell
committed

Commits on Apr 9, 2024

Commits on Apr 10, 2024

Commits on Apr 11, 2024

Merge branch 'triton-inference-server:main' into main
l1cacheDell
authored

Commits on Apr 13, 2024