NCCL error #1726
Hi @maxmelichov, in my experience, this error happens when using an old version of PyTorch. Please make sure to use an up-to-date version.
Same issue in CUDA 11.8, torch 2.1.0+cu118.
Same issue in CUDA 12.1, torch 2.1.1+cu121. Did you solve it?
Running into the same problem.
Same error.
Same issue for me after I updated to 0.2.2 with PyTorch 2.1.1+cu121.
Same issue.
pip list | grep nccl
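For anyone who can't use a shell pipe (e.g. on Windows), the same check can be done programmatically. A minimal sketch using only the standard library; `find_nccl_packages` is a hypothetical helper name, not part of any of the projects discussed here:

```python
# Equivalent of `pip list | grep nccl`: list installed packages mentioning "nccl".
from importlib.metadata import distributions

def find_nccl_packages():
    """Return {name: version} for every installed distribution whose name contains 'nccl'."""
    return {
        dist.metadata["Name"]: dist.version
        for dist in distributions()
        if "nccl" in (dist.metadata["Name"] or "").lower()
    }

if __name__ == "__main__":
    for name, version in find_nccl_packages().items():
        print(f"{name}=={version}")
```

On a typical vLLM install this should list packages such as nvidia-nccl-cu12; an empty result means no NCCL wheel is installed at all.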
Thanks, solved.
This solved it for me as well. Thanks!
It works for me, too. Thanks!
Solved the issue for me. torch 2.1.2.
Thanks, I love you!
Same error in my environment. Could you please give some advice?
@songkq I got the same issue. Have you solved it?
Not yet. @WoosukKwon Could you please give some advice on this issue?
@jinfengfeng Solved by upgrading to vllm==0.3.2.
@songkq Maybe you should try reinstalling NCCL: https://developer.nvidia.com/nccl/nccl-legacy-downloads
Thanks for the suggestion, @ywglf.
Only nvidia-nccl is necessary.
What's vllm-nccl-cu12 for?
NCCL 2.19 (which became the new default with PyTorch 2.2) was using much more memory than NCCL 2.18, so we pinned NCCL and proceeded with the PyTorch 2.2 upgrade. A newer workaround has since been found; you can read:
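When debugging a version mismatch like this, it can help to confirm which NCCL build the installed PyTorch is actually linked against, independent of what pip lists. A hedged sketch; `torch_nccl_version` is a hypothetical helper name, and `torch.cuda.nccl.version()` returns a tuple such as (2, 18, 1) on recent CUDA builds of PyTorch:

```python
# Report the NCCL version PyTorch was built against, if PyTorch is available.
def torch_nccl_version():
    """Return PyTorch's bundled NCCL version as a tuple, or None if unavailable."""
    try:
        import torch
        # Raises on CPU-only builds or missing torch; both fall through to None.
        return torch.cuda.nccl.version()
    except Exception:
        return None

if __name__ == "__main__":
    print("PyTorch NCCL version:", torch_nccl_version())
```

Comparing this tuple against the version shown by `pip list | grep nccl` makes it obvious whether PyTorch and the separately installed NCCL wheel have drifted apart.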
I'm trying to load a model with LLM(model="meta-llama/Llama-2-7b-chat-hf") and I'm getting the error below.