CUDA Error 222 - provided PTX was compiled with an unsupported toolchain #401
Comments
This looks like a …
Closing. Please reopen if the problem is reproducible with the latest version.
The issue still repros on the latest version. For anyone else blocked by this, a hacky workaround is to manually compile llama.cpp as a library, then copy the resulting file into llama-cpp-python:

cd ~
git clone --recursive https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
make LLAMA_CUBLAS=1 -j libllama.so
# HACK: Use custom compiled libllama.so
cp ~/llama.cpp/libllama.so /opt/conda/lib/python3.10/site-packages/llama_cpp/libllama.so
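To verify the swapped-in library, loading a model with GPU offload should no longer trip the PTX error. A minimal check (the model path and layer count below are placeholders, not taken from the original report):

import llama_cpp
from llama_cpp import Llama

print(llama_cpp.__version__)  # the Python binding still comes from pip; only libllama.so was replaced
llm = Llama(model_path="/path/to/model.bin", n_gpu_layers=35)  # previously failed with CUDA error 222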
I'm having this issue, and the @randombk workaround doesn't work for me; it just gives a new error. The output of …
Any suggestions would be greatly appreciated!
@randombk Thank you so much!
FYI, I was running into this same issue, but it went away once I installed the actual CUDA toolkit version matching the version indicated at the top right of the nvidia-smi output.
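For reference, a quick way to compare the driver-reported CUDA version with the installed toolkit (standard NVIDIA tools; exact output varies by system):

nvidia-smi | head -n 4         # the "CUDA Version: X.Y" in the header is the newest version the driver supports
nvcc --version | grep release  # the toolkit that actually compiles the PTX
# If nvcc's release is newer than the driver's CUDA version, locally built PTX can trigger error 222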
FYI: ➜ ~ nvidia-smi
Fri Aug 11 12:53:05 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.54.03 Driver Version: 535.54.03 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 2080 Ti Off | 00000000:21:00.0 Off | N/A |
| 0% 46C P8 17W / 300W | 8MiB / 11264MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 1 NVIDIA GeForce RTX 3090 Off | 00000000:49:00.0 On | N/A |
| 0% 53C P8 28W / 350W | 1635MiB / 24576MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| 0 N/A N/A 6244 G /usr/lib/xorg/Xorg 4MiB |
| 1 N/A N/A 6244 G /usr/lib/xorg/Xorg 380MiB |
| 1 N/A N/A 6497 G /usr/bin/gnome-shell 88MiB |
| 1 N/A N/A 10763 G ...6044373,14595055559140153217,262144 199MiB |
| 1 N/A N/A 1302513 C python 884MiB |
| 1 N/A N/A 2873402 G ...sion,SpareRendererForSitePerProcess 61MiB |
+---------------------------------------------------------------------------------------+
➜ ~ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Jun_13_19:16:58_PDT_2023
Cuda compilation tools, release 12.2, V12.2.91
Build cuda_12.2.r12.2/compiler.32965470_0
It wasn't just nvcc, though. I used …
I personally don't use conda any more... |
Building libllama.so works for me. |
This works for me in Kaggle:

!git clone --recursive https://github.com/ggerganov/llama.cpp.git
# instead of: !LLAMA_CUBLAS=1 CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir
import os
os.chdir('llama.cpp')
!make LLAMA_CUBLAS=1 -j libllama.so
# HACK: Use custom compiled libllama.so
!cp libllama.so /opt/conda/lib/python3.10/site-packages/llama_cpp/libllama.so
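If the site-packages path differs from the one hard-coded above, the install location can be looked up programmatically; a small sketch, nothing Kaggle-specific:

import os
import llama_cpp

# Directory containing the bundled libllama.so that the cp above overwrites
print(os.path.dirname(llama_cpp.__file__))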
same issue here |
Prerequisites
Please answer the following questions for yourself before submitting an issue.
Expected Behavior
When using a Kaggle notebook with a 2x T4 GPU, llama-cpp-python should work as expected.
Current Behavior
llama-cpp-python fails with CUDA error 222 (provided PTX was compiled with an unsupported toolchain). Running llama.cpp directly works as expected.
Environment and Context
Free Kaggle notebook running with the 'T4 x2' GPU accelerator.
Failure Information (for bugs)
Steps to Reproduce
I published a repro at https://www.kaggle.com/randombk/bug-llama-cpp-python-cuda-222-repro
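For context, a minimal sequence that exercises the failure path in such a notebook (the model path and layer count are illustrative, not taken from the linked repro):

!LLAMA_CUBLAS=1 CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir
from llama_cpp import Llama
# Raises "CUDA error 222: provided PTX was compiled with an unsupported toolchain"
# when the CUDA code was built with a newer toolkit than the installed driver supports
llm = Llama(model_path="/kaggle/input/some-model/model.bin", n_gpu_layers=20)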