CUDA Error 222 - provided PTX was compiled with an unsupported toolchain #401

Closed · 4 tasks done
randombk opened this issue Jun 19, 2023 · 13 comments
Labels: build, hardware (Hardware specific issue)

Comments

@randombk

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

When using a Kaggle notebook with 2xT4 GPU, llama-cpp-python should work as expected.

Current Behavior

$ python -m 'llama_cpp'

ggml_init_cublas: found 2 CUDA devices:
  Device 0: Tesla T4
  Device 1: Tesla T4
CUDA error 222 at /tmp/pip-install-2ecmu5o2/llama-cpp-python_284b4b67e8bf4aecb8c75b3d2715bc08/vendor/llama.cpp/ggml-cuda.cu:1501: the provided PTX was compiled with an unsupported toolchain.

Running llama.cpp directly works as expected.

Environment and Context

Free Kaggle notebook running with the 'T4 x2' GPU accelerator.

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.161.03   Driver Version: 470.161.03   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   44C    P8     9W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla T4            Off  | 00000000:00:05.0 Off |                    0 |
| N/A   45C    P8     9W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
$ python3 --version => Python 3.10.10
$ make --version => GNU Make 4.3
$ g++ --version => g++ (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0

Failure Information (for bugs)

Steps to Reproduce

I published a repro at https://www.kaggle.com/randombk/bug-llama-cpp-python-cuda-222-repro

@gjmulder added the build and hardware (Hardware specific issue) labels and removed the hardware label Jun 20, 2023
@gjmulder changed the title from "CUDA Error 222 when running in Kaggle notebook; Raw llama.cpp works without issue" to "CUDA Error 222 - provided PTX was compiled with an unsupported toolchain" Jun 20, 2023
@gjmulder
Contributor

This looks like an nvcc (NVIDIA compiler) issue. PTX is the intermediate representation CUDA code is compiled to; the driver JIT-compiles it at load time, which is where the "unsupported toolchain" error is raised.
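
Given the numbers in the report (the driver advertises CUDA 11.4 while nvcc is release 11.8), a plausible reading is that the 470 driver's JIT cannot consume PTX emitted by the newer toolkit. One way to check, sketched here and untested on the Kaggle image, is to list the PTX embedded in the library pip built and compare versions:

# List the PTX entries embedded in the shared library pip compiled
cuobjdump --list-ptx /opt/conda/lib/python3.10/site-packages/llama_cpp/libllama.so

# The driver can only JIT PTX from toolkits no newer than its own CUDA version
nvidia-smi | grep "CUDA Version"   # reports 11.4 here
nvcc --version | grep release      # reports 11.8 here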

@gjmulder added the hardware (Hardware specific issue) label Jun 23, 2023
@gjmulder
Contributor

Closing. Please reopen if the problem is reproducible with the latest llama-cpp-python, which includes an updated llama.cpp.

@gjmulder closed this as not planned Jul 10, 2023
@randombk
Author

The issue still repros on 0.1.77. It only appears on Kaggle, so I suspect it's something to do with their notebook/conda setup.

For anyone else blocked by this, a hacky workaround is to manually compile llama.cpp as a library, then copy the resulting file into llama-cpp-python:

cd ~
git clone --recursive https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
make LLAMA_CUBLAS=1 -j libllama.so

# HACK: Use custom compiled libllama.so
cp ~/llama.cpp/libllama.so /opt/conda/lib/python3.10/site-packages/llama_cpp/libllama.so
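
To confirm the swap took effect, re-importing the bindings should print the ggml_init_cublas device list without the error (a quick check, not part of the original workaround):

python -c "import llama_cpp"   # should find both T4s without raising CUDA error 222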

@di-rse

di-rse commented Aug 2, 2023

I'm having this issue, and the workaround from @randombk doesn't work for me; it just gives a new error:
LLAMA_ASSERT: llama.cpp:1800: !!kv_self.ctx

The output of nvcc --version is:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Feb__7_19:32:13_PST_2023
Cuda compilation tools, release 12.1, V12.1.66
Build cuda_12.1.r12.1/compiler.32415258_0

Any suggestions would be greatly appreciated!

@jiapei100

@di-rse
Same here...
Please refer to #586

@wu375

wu375 commented Aug 11, 2023

@randombk
Would you mind sharing the version of the CUDA toolkit you used to manually compile llama.cpp? I assume you compiled it on a different machine, not with the nvcc 11.8 you posted, right?

Thank you so much!

@desilinguist

desilinguist commented Aug 11, 2023

FYI, I was running into this same issue, but once I installed the CUDA toolkit whose version matches the "CUDA Version" shown at the top right of the nvidia-smi output, the build worked fine.
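
For reference, a sketch of the rebuild step after aligning the toolkit, using the cuBLAS source-install command from the project README (it assumes the matching toolkit's nvcc is first on PATH):

# Rebuild the wheel from source against the now-matching toolkit
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir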

@jiapei100

@wu375 @desilinguist

FYI:

~ nvidia-smi
Fri Aug 11 12:53:05 2023       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.54.03              Driver Version: 535.54.03    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 2080 Ti     Off | 00000000:21:00.0 Off |                  N/A |
|  0%   46C    P8              17W / 300W |      8MiB / 11264MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce RTX 3090        Off | 00000000:49:00.0  On |                  N/A |
|  0%   53C    P8              28W / 350W |   1635MiB / 24576MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      6244      G   /usr/lib/xorg/Xorg                            4MiB |
|    1   N/A  N/A      6244      G   /usr/lib/xorg/Xorg                          380MiB |
|    1   N/A  N/A      6497      G   /usr/bin/gnome-shell                         88MiB |
|    1   N/A  N/A     10763      G   ...6044373,14595055559140153217,262144      199MiB |
|    1   N/A  N/A   1302513      C   python                                      884MiB |
|    1   N/A  N/A   2873402      G   ...sion,SpareRendererForSitePerProcess       61MiB |
+---------------------------------------------------------------------------------------+

~ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Jun_13_19:16:58_PDT_2023
Cuda compilation tools, release 12.2, V12.2.91
Build cuda_12.2.r12.2/compiler.32965470_0

@desilinguist

It wasn't just nvcc, though. I used conda to install all CUDA packages of that specific version, like so:

conda create -c https://conda.anaconda.org/nvidia/label/cuda-12.0.1 -n cudaenv cuda
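
The rebuild then presumably has to run inside that environment so its pinned toolkit is the one picked up (a sketch, assuming the cudaenv name above):

conda activate cudaenv
nvcc --version   # should now report the pinned 12.0 toolchain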

@jiapei100

@desilinguist

I personally don't use conda any more, but it shouldn't make any difference whether or not I use a virtual environment...

@YerongLi

YerongLi commented Sep 2, 2023

Building libllama.so works for me.

@csaben

csaben commented Sep 23, 2023

This works for me in Kaggle:

!git clone --recursive https://github.com/ggerganov/llama.cpp.git
# instead of: !LLAMA_CUBLAS=1 CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1  pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir
import os
os.chdir('llama.cpp')
!make LLAMA_CUBLAS=1 -j libllama.so

# HACK: use the custom-compiled libllama.so
!cp libllama.so /opt/conda/lib/python3.10/site-packages/llama_cpp/libllama.so
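
To verify the GPUs are actually used, loading any model with layers offloaded should initialize cuBLAS on both T4s (the model path below is a hypothetical placeholder; n_gpu_layers is the llama-cpp-python parameter controlling offload):

!python -c "from llama_cpp import Llama; Llama(model_path='/kaggle/input/your-model/model.bin', n_gpu_layers=35)"  # hypothetical model path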

@RachelShalom

Same issue here.
