Titan X and Titan Xp support #473

Closed
kylekam opened this issue May 16, 2024 · 4 comments

kylekam commented May 16, 2024

I'm trying to fine-tune Llama 3 with the code sample provided in the notebook, after installing with the instructions in #73. Everything went smoothly until I ran into this error:

LLVM ERROR: Cannot select: intrinsic %llvm.nvvm.shfl.sync.bfly.i32

Here's the full log:

(base) kyle@lab:~/DDS$  cd /home/kyle/DDS ; /usr/bin/env /home/kyle/anaconda3/envs/llm_finetune/bin/python /home/kyle/.vscode-server/extensions/ms-python.python-2024.4.1/python_files/lib/python/debugpy/adapter/../../debugpy/launcher 53243 -- /home/kyle/DDS/llm_fine_tune.py --hoi_path /datasets/video --max_epochs 1 --split train --is_training 
Creating VidOR dataloader for train split
/home/kyle/anaconda3/envs/llm_finetune/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
==((====))==  Unsloth: Fast Llama patching release 2024.5
   \\   /|    GPU: NVIDIA TITAN Xp. Max memory: 11.91 GB. Platform = Linux.
O^O/ \_/ \    Pytorch: 2.3.0. CUDA = 6.1. CUDA Toolkit = 11.8.
\        /    Bfloat16 = TRUE. Xformers = 0.0.26.post1. FA = False.
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
Unused kwargs: ['_load_in_4bit', '_load_in_8bit', 'quant_method']. These kwargs are not used in <class 'transformers.utils.quantization_config.BitsAndBytesConfig'>.
Unsloth: unsloth/llama-3-8b-bnb-4bit has no tokenizer.model file.
Just informing you about this - this is not a critical error.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Unsloth 2024.5 patched 32 layers with 32 QKV layers, 32 O layers and 32 MLP layers.
max_steps is given, it will override any value given in num_train_epochs
==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 2,009 | Num Epochs = 1
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 4
\        /    Total batch size = 8 | Total steps = 60
 "-____-"     Number of trainable parameters = 41,943,040
  0%|                                                                                                                                               | 0/60 [00:00<?, ?it/s]
LLVM ERROR: Cannot select: intrinsic %llvm.nvvm.shfl.sync.bfly.i32

I tested this on both a Titan X and a Titan Xp and got the same error. The CUDA version displayed also looks a little odd to me, because "nvcc --version" on the system I'm working on reports 11.5.

kylekam commented May 16, 2024

This may be the same issue as #309.

kylekam commented May 17, 2024

Titan X and Xp came out in 2015 and 2017, respectively, which means they're too old.
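
A note on the banner in the log above: the "CUDA = 6.1" field appears to be the GPU's compute capability (the Titan Xp is a compute capability 6.1 card) rather than a toolkit version, which would explain why it differs from what nvcc reports. The sketch below pulls that value out of the banner line and compares it with a minimum of 7.0. That threshold is an assumption, based on Triton documenting support for compute capability 7.0 and newer around this time; nothing in this thread confirms it. `parse_capability` and `meets_minimum` are hypothetical helper names used only for illustration.

```python
import re

# Assumption: the kernels need compute capability 7.0 or newer.
# This threshold is not confirmed anywhere in this thread.
MIN_CAPABILITY = (7, 0)


def parse_capability(banner_line: str) -> tuple[int, int]:
    """Extract (major, minor) from the 'CUDA = X.Y' field of the banner."""
    # 'CUDA Toolkit = 11.8' does not match because of the word 'Toolkit'.
    match = re.search(r"CUDA = (\d+)\.(\d+)", banner_line)
    if match is None:
        raise ValueError("no 'CUDA = X.Y' field found in banner line")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(capability: tuple[int, int],
                  minimum: tuple[int, int] = MIN_CAPABILITY) -> bool:
    """Tuples compare element by element, so (6, 1) < (7, 0)."""
    return capability >= minimum


# Banner text copied from the log in this issue.
banner = "Pytorch: 2.3.0. CUDA = 6.1. CUDA Toolkit = 11.8."
cap = parse_capability(banner)
print(cap, meets_minimum(cap))  # (6, 1) False
```

On a live machine, `torch.cuda.get_device_capability()` returns the same (major, minor) tuple directly, so the parsing step is only needed when working from a saved log.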

kylekam closed this as completed May 17, 2024

danielhanchen (Contributor) commented

Hmm, I'm unsure actually. It's possible that an older Unsloth version might work, so you could try that.

danielhanchen (Contributor) commented

But I'm unsure. I remember some people managed to make it work.
