[v0.3.1] Release Tracker #2859
Comments
@simon-mo Please feel free to add more!
I would really like #2804, but it seems to be blocked by FlashInfer or other libraries.
@simon-mo I think the main concern here is AMD, because the ROCm xformers patch uses …
I see. We might be able to distribute different versions with varying version pins...
Need support for miqu-1-70b-sf-gptq. Thanks a lot!
I would like to see a fix for #2795. Two other users and I have been unable to use the latest version of vLLM with Ray; it works perfectly well after downgrading to the previous version.
#2761 brings back support for quantized MoE models like Mixtral/Deepseek and also delivers a significant speedup (2-3x). Could it be included in the next release so that quantized MoE models are not broken?
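For reference, a rough sketch of what loading a quantized Mixtral checkpoint could look like once #2761 restores quantized MoE support; the checkpoint name and the `quantization` argument value are illustrative assumptions, not taken from this thread:

```python
# Sketch: loading a GPTQ-quantized MoE model once #2761 restores support.
# The checkpoint name below is a hypothetical example; any GPTQ/AWQ export
# of Mixtral or Deepseek MoE would be the intended use case.
from vllm import LLM, SamplingParams

llm = LLM(
    model="TheBloke/Mixtral-8x7B-Instruct-v0.1-GPTQ",  # hypothetical checkpoint
    quantization="gptq",  # assumes GPTQ is the chosen quantization backend
)

out = llm.generate(["Hello, my name is"], SamplingParams(max_tokens=16))
print(out[0].outputs[0].text)
```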
@umarbutler In this release, we will disable the custom all-reduce, which should address #2795.
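For context, a minimal sketch of how the custom all-reduce kernel could be turned off when constructing the engine, assuming a `disable_custom_all_reduce` engine argument is available (the exact flag name is an assumption, not confirmed in this thread):

```python
# Sketch: disable the custom all-reduce path so tensor-parallel runs fall
# back to NCCL, which is the behavior expected to avoid #2795.
# `disable_custom_all_reduce` is assumed to be exposed as an engine argument.
from vllm import LLM

llm = LLM(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",  # any tensor-parallel model
    tensor_parallel_size=2,
    disable_custom_all_reduce=True,
)
```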
We will need to include #2875 in the release as well.
@pcmoritz Added. Thanks! |
@WoosukKwon Sorry for the delay. Will address the comments tonight. |
ETA: Feb 14-16th
Major changes
TBD
PRs to be merged before the release
- Support per-request seed #2514
- Fix the GC bug triggered when the LLM class is deleted ([BugFix] Fix GC bug for LLM class #2882)
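To illustrate the per-request seed feature tracked in #2514, a minimal sketch assuming the seed ends up exposed as a `seed` field on `SamplingParams`:

```python
# Sketch of per-request seeding (#2514): each request carries its own seed,
# so sampling is reproducible per request rather than only per engine.
# The `seed` field on SamplingParams is assumed from the PR title.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")

params_a = SamplingParams(temperature=0.8, max_tokens=32, seed=42)
params_b = SamplingParams(temperature=0.8, max_tokens=32, seed=42)

# Two requests with the same prompt and the same seed should produce
# identical samples, even though temperature > 0.
out_a = llm.generate(["The capital of France is"], params_a)
out_b = llm.generate(["The capital of France is"], params_b)
print(out_a[0].outputs[0].text == out_b[0].outputs[0].text)
```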