[Feature]: multi-lora support older nvidia gpus. #6123
Comments
I noticed that when building the vendored punica kernels, the errors were all related to bf16 arithmetic operations not being defined in CUDA 12.1. Building against a newer CUDA version (12.4), whose headers define these operations, fixed the problems. Note that I'm not sure whether building the kernels against CUDA 12.4 is desirable/good engineering practice if we still want to support CUDA 12.1. If that's the case, we can probably vendor the relevant code from CUDA (though I don't have a sense of how complicated that would be).
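For illustration, here is roughly the kind of fallback that could be vendored. This is a minimal sketch, not code from vLLM, punica, or the CUDA headers, and the helper names (`bf16_mul`, `bf16_add`, `bf16_axpy`) are made up. It emulates `__nv_bfloat16` arithmetic with an fp32 round-trip, which is essentially what newer headers fall back to when no native bf16 instructions are available:

```cuda
// Hypothetical shim, not vLLM/punica code: emulate bf16 arithmetic in fp32
// so kernels compile on sm_70/sm_75 with toolkits whose headers only define
// the __nv_bfloat16 operators for __CUDA_ARCH__ >= 800.
#include <cuda_bf16.h>

__device__ __forceinline__ __nv_bfloat16 bf16_mul(__nv_bfloat16 a, __nv_bfloat16 b) {
    // Convert to fp32, multiply, convert back (round-to-nearest-even).
    return __float2bfloat16(__bfloat162float(a) * __bfloat162float(b));
}

__device__ __forceinline__ __nv_bfloat16 bf16_add(__nv_bfloat16 a, __nv_bfloat16 b) {
    return __float2bfloat16(__bfloat162float(a) + __bfloat162float(b));
}

// Toy kernel using the helpers: out[i] = alpha * x[i] + y[i].
__global__ void bf16_axpy(const __nv_bfloat16* x, const __nv_bfloat16* y,
                          __nv_bfloat16* out, __nv_bfloat16 alpha, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = bf16_add(bf16_mul(alpha, x[i]), y[i]);
}
```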
#5036 is working on addressing the issue you mentioned.
What should I do? Running on a V100 keeps producing errors.
Are you testing #5036?
vLLM installed from the #5036 source code worked before, but 0.5.4 reports an error.
This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!
🚀 The feature, motivation and pitch
Currently vLLM only supports LoRA adapters on NVIDIA GPUs with compute capability >= 8.0. This request is to support >= 7.5.
The limitation here is that vLLM relies on https://github.com/punica-ai/punica for efficient LoRA, and the upstream doesn't support older GPUs.
Personally I've mainly run into this problem on Kaggle, which requires you to run on T4s or older. Others seem to have run into this problem in other environments: Colab (#5199), V100s (#3826).
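For context, a minimal sketch of the gate this request asks to relax; this is illustrative host code, not vLLM's actual check, and the thresholds simply restate the numbers above (>= 8.0 supported today, >= 7.5 requested):

```cuda
// Illustrative only: report where the current device falls relative to the
// compute-capability gate discussed in this issue.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    cudaDeviceProp prop{};
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        std::fprintf(stderr, "no CUDA device found\n");
        return 1;
    }
    int cc = prop.major * 10 + prop.minor;  // e.g. 80 = A100, 75 = T4, 70 = V100
    if (cc >= 80) {
        std::printf("sm_%d: covered by the existing multi-LoRA (punica) kernels\n", cc);
    } else if (cc >= 75) {
        std::printf("sm_%d: the target of this feature request\n", cc);
    } else {
        std::printf("sm_%d: below the >= 7.5 scope requested here (see the V100 reports)\n", cc);
    }
    return 0;
}
```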
Alternatives
In some but not all cases this can be mitigated by using a newer GPU, or by merging the LoRA into the base model and swapping models instead.
Additional context
I'm willing to contribute this. I've prototyped it and verified that it's possible to do this efficiently by changing the step of vLLM's wheel build that builds the vendored punica kernels.
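To make this concrete (an assumption on my part, not a description of the actual prototype): the build change presumably amounts to adding compute capability 7.5 to the architectures the vendored punica extension is compiled for, e.g. an extra `-gencode arch=compute_75,code=sm_75` entry (or the equivalent `TORCH_CUDA_ARCH_LIST` value), together with making the bf16 arithmetic compile there as sketched earlier. A compile-time guard like the hypothetical one below can then make an unsupported target fail loudly rather than with cryptic operator errors:

```cuda
// Hypothetical guard, not from vLLM or punica: document the supported
// architecture floor explicitly when the kernels are built per-arch.
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ < 750
#error "These LoRA kernels are only built for compute capability >= 7.5"
#endif
```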