
How to fix install with cuda 12.1, python 3.9, flash-attn 2.3.2 #598

Closed
batman-do opened this issue Oct 11, 2023 · 17 comments

Comments

@batman-do

batman-do commented Oct 11, 2023

(screenshot of the installation error)

Can you suggest a solution to fix this error? Thank you.

batman-do reopened this Oct 11, 2023
@tridao
Contributor

tridao commented Oct 11, 2023

Can you check if you can download that wheel manually (e.g. with wget)?
I haven't seen the error "invalid cross-device link". Do you have write permission to /tmp?
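
For reference, a minimal way to check both points from Python (using urllib here in place of wget; WHEEL_URL below is a placeholder for whatever URL the pip/setup.py output reports, not a real link):

```python
import os
import tempfile
import urllib.request

# Check that the temp directory pip/setup.py will use is writable.
tmp_dir = os.environ.get("TMPDIR", tempfile.gettempdir())
print(tmp_dir, "writable:", os.access(tmp_dir, os.W_OK))

# Try fetching the wheel manually. Replace WHEEL_URL with the URL
# printed in the pip/setup.py output before running this.
WHEEL_URL = "https://github.com/Dao-AILab/flash-attention/releases/download/<version>/<wheel>.whl"
dest = os.path.join(tmp_dir, "flash_attn_test_download.whl")
urllib.request.urlretrieve(WHEEL_URL, dest)
print("downloaded to", dest)
```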

@batman-do
Author

Can you check if you can download that wheel manually (e.g. with wget)? I haven't seen the error "invalid cross-device link". Do you have write permission to /tmp?

I exported ~/tmp as the temp directory but got this error:

(screenshot of the new error)

How can I fix that?

@tridao
Contributor

tridao commented Oct 12, 2023

Do you have write permission to /home/dodx/tmp?
I haven't seen this error but that's what I'm guessing. The setup script downloads the wheel and copies it to $TMP, and it's running into a problem at the copy step.
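
For context, "invalid cross-device link" (errno EXDEV) is what os.rename() raises when the source and destination sit on different filesystems; a quick sketch to check whether that applies here, assuming the build runs from the current directory:

```python
import os
import tempfile

# os.rename() raises OSError "Invalid cross-device link" (EXDEV) when the
# source and destination live on different filesystems. Compare the device
# IDs of the build directory and the temp directory setup.py copies into.
build_dir = os.getcwd()
tmp_dir = os.environ.get("TMPDIR", tempfile.gettempdir())
same_fs = os.stat(build_dir).st_dev == os.stat(tmp_dir).st_dev
print(build_dir, "and", tmp_dir, "on the same filesystem:", same_fs)
```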

@batman-do
Author

Do you have write permission to /home/dodx/tmp? I haven't seen this error but that's what I'm guessing. The setup script downloads the wheel and copies it to $TMP, and it's running into a problem at the copy step.

Yes, I've just fixed it and it runs now, but I have a question:

Warning: import flash_attn rms_norm fail, please install FlashAttention layer_norm to get higher efficiency https://github.com/Dao-AILab/flash-attention/tree/main/csrc/layer_norm

Since I'm using an RTX 3090, I shouldn't use layer_norm, right?

(screenshot)

@tridao
Contributor

tridao commented Oct 12, 2023

You can try the layer_norm, I think it should work but I haven't tested extensively on 3080.

@batman-do
Author

You can try the layer_norm, I think it should work but I haven't tested extensively on 3080.

Thank you so much for replying :)

@YuehChuan

@batman-do see this
#595

python3.10
https://www.python.org/downloads/release/python-3100/
win11

python -m venv venv

cd venv/Scripts
activate
-----------------------

git clone https://github.com/Dao-AILab/flash-attention
cd flash-attention

pip install packaging 
pip install wheel

set MAX_JOBS=4
python setup.py install
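
Before running setup.py, a quick sanity check of the environment can save a long failed compile; a small sketch (adjust to your own setup):

```python
import importlib.util
import os

# Confirm the build prerequisites are importable in the active environment
# and that MAX_JOBS was actually picked up before running setup.py.
for mod in ("packaging", "wheel", "ninja", "torch"):
    print(mod, "installed:", importlib.util.find_spec(mod) is not None)
print("MAX_JOBS =", os.environ.get("MAX_JOBS", "<not set>"))
```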

@batman-do
Author

batman-do commented Oct 14, 2023

@batman-do see this #595

python3.10
https://www.python.org/downloads/release/python-3100/
win11

python -m venv venv

cd venv/Scripts
activate
-----------------------

git clone https://github.com/Dao-AILab/flash-attention
cd flash-attention

pip install packaging 
pip install wheel

set MAX_JOBS=4
python setup.py install

I got this error:
running bdist_egg
running egg_info
writing flash_attn.egg-info/PKG-INFO
writing dependency_links to flash_attn.egg-info/dependency_links.txt
writing requirements to flash_attn.egg-info/requires.txt
writing top-level names to flash_attn.egg-info/top_level.txt
reading manifest file 'flash_attn.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no files found matching '*.cu' under directory 'flash_attn'
warning: no files found matching '*.h' under directory 'flash_attn'
warning: no files found matching '*.cuh' under directory 'flash_attn'
warning: no files found matching '*.cpp' under directory 'flash_attn'
warning: no files found matching '*.hpp' under directory 'flash_attn'
adding license file 'LICENSE'
adding license file 'AUTHORS'
writing manifest file 'flash_attn.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
running build_ext
/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/utils/cpp_extension.py:424: UserWarning: There are no g++ version bounds defined for CUDA version 12.1
warnings.warn(f'There are no {compiler_name} version bounds defined for CUDA version {cuda_str_version}')
building 'flash_attn_2_cuda' extension
Emitting ninja build file /data/dodx/GenerateAI/test_LLM_local/flash-attention/build/temp.linux-x86_64-cpython-310/build.ninja...
Compiling objects...
Using envvar MAX_JOBS (4) as the number of workers...
[1/49] /usr/local/cuda/bin/nvcc -I/data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn -I/data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn/src -I/data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/cutlass/include -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/TH -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/data/dodx/anaconda3/envs/flash_attention/include/python3.10 -c -c /data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn/src/flash_bwd_hdim128_bf16_sm80.cu -o /data/dodx/GenerateAI/test_LLM_local/flash-attention/build/temp.linux-x86_64-cpython-310/csrc/flash_attn/src/flash_bwd_hdim128_bf16_sm80.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -std=c++17 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math -gencode arch=compute_80,code=sm_80 -gencode arch=compute_90,code=sm_90 --threads 4 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="gcc"' '-DPYBIND11_STDLIB="libstdcpp"' '-DPYBIND11_BUILD_ABI="cxxabi1011"' -DTORCH_EXTENSION_NAME=flash_attn_2_cuda -D_GLIBCXX_USE_CXX11_ABI=0
FAILED: /data/dodx/GenerateAI/test_LLM_local/flash-attention/build/temp.linux-x86_64-cpython-310/csrc/flash_attn/src/flash_bwd_hdim128_bf16_sm80.o
/usr/local/cuda/bin/nvcc -I/data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn -I/data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn/src -I/data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/cutlass/include -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/TH -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/data/dodx/anaconda3/envs/flash_attention/include/python3.10 -c -c /data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn/src/flash_bwd_hdim128_bf16_sm80.cu -o /data/dodx/GenerateAI/test_LLM_local/flash-attention/build/temp.linux-x86_64-cpython-310/csrc/flash_attn/src/flash_bwd_hdim128_bf16_sm80.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -std=c++17 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math -gencode arch=compute_80,code=sm_80 -gencode arch=compute_90,code=sm_90 --threads 4 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=flash_attn_2_cuda -D_GLIBCXX_USE_CXX11_ABI=0
/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/ATen/cuda/Exceptions.h(56): error: identifier "cusparseStatus_t" is undefined
const char *cusparseGetErrorString(cusparseStatus_t status);
^

/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/ATen/cuda/CUDAContext.h(76): error: identifier "cusparseHandle_t" is undefined
__attribute__((visibility("default"))) cusparseHandle_t getCurrentCUDASparseHandle();
^

2 errors detected in the compilation of "/data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn/src/flash_bwd_hdim128_bf16_sm80.cu".
[2/49] /usr/local/cuda/bin/nvcc -I/data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn -I/data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn/src -I/data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/cutlass/include -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/TH -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/data/dodx/anaconda3/envs/flash_attention/include/python3.10 -c -c /data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn/src/flash_bwd_hdim128_fp16_sm80.cu -o /data/dodx/GenerateAI/test_LLM_local/flash-attention/build/temp.linux-x86_64-cpython-310/csrc/flash_attn/src/flash_bwd_hdim128_fp16_sm80.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -std=c++17 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math -gencode arch=compute_80,code=sm_80 -gencode arch=compute_90,code=sm_90 --threads 4 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="gcc"' '-DPYBIND11_STDLIB="libstdcpp"' '-DPYBIND11_BUILD_ABI="cxxabi1011"' -DTORCH_EXTENSION_NAME=flash_attn_2_cuda -D_GLIBCXX_USE_CXX11_ABI=0
FAILED: /data/dodx/GenerateAI/test_LLM_local/flash-attention/build/temp.linux-x86_64-cpython-310/csrc/flash_attn/src/flash_bwd_hdim128_fp16_sm80.o
/usr/local/cuda/bin/nvcc -I/data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn -I/data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn/src -I/data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/cutlass/include -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/TH -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/data/dodx/anaconda3/envs/flash_attention/include/python3.10 -c -c /data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn/src/flash_bwd_hdim128_fp16_sm80.cu -o /data/dodx/GenerateAI/test_LLM_local/flash-attention/build/temp.linux-x86_64-cpython-310/csrc/flash_attn/src/flash_bwd_hdim128_fp16_sm80.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -std=c++17 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math -gencode arch=compute_80,code=sm_80 -gencode arch=compute_90,code=sm_90 --threads 4 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=flash_attn_2_cuda -D_GLIBCXX_USE_CXX11_ABI=0
/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/ATen/cuda/Exceptions.h(56): error: identifier "cusparseStatus_t" is undefined
const char *cusparseGetErrorString(cusparseStatus_t status);
^

/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/ATen/cuda/CUDAContext.h(76): error: identifier "cusparseHandle_t" is undefined
__attribute__((visibility("default"))) cusparseHandle_t getCurrentCUDASparseHandle();
^

2 errors detected in the compilation of "/data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn/src/flash_bwd_hdim128_fp16_sm80.cu".
[3/49] /usr/local/cuda/bin/nvcc -I/data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn -I/data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn/src -I/data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/cutlass/include -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/TH -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/data/dodx/anaconda3/envs/flash_attention/include/python3.10 -c -c /data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn/src/flash_bwd_hdim160_bf16_sm80.cu -o /data/dodx/GenerateAI/test_LLM_local/flash-attention/build/temp.linux-x86_64-cpython-310/csrc/flash_attn/src/flash_bwd_hdim160_bf16_sm80.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -std=c++17 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math -gencode arch=compute_80,code=sm_80 -gencode arch=compute_90,code=sm_90 --threads 4 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="gcc"' '-DPYBIND11_STDLIB="libstdcpp"' '-DPYBIND11_BUILD_ABI="cxxabi1011"' -DTORCH_EXTENSION_NAME=flash_attn_2_cuda -D_GLIBCXX_USE_CXX11_ABI=0
FAILED: /data/dodx/GenerateAI/test_LLM_local/flash-attention/build/temp.linux-x86_64-cpython-310/csrc/flash_attn/src/flash_bwd_hdim160_bf16_sm80.o
/usr/local/cuda/bin/nvcc -I/data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn -I/data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn/src -I/data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/cutlass/include -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/TH -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/data/dodx/anaconda3/envs/flash_attention/include/python3.10 -c -c /data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn/src/flash_bwd_hdim160_bf16_sm80.cu -o /data/dodx/GenerateAI/test_LLM_local/flash-attention/build/temp.linux-x86_64-cpython-310/csrc/flash_attn/src/flash_bwd_hdim160_bf16_sm80.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -std=c++17 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math -gencode arch=compute_80,code=sm_80 -gencode arch=compute_90,code=sm_90 --threads 4 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=flash_attn_2_cuda -D_GLIBCXX_USE_CXX11_ABI=0
/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/ATen/cuda/Exceptions.h(56): error: identifier "cusparseStatus_t" is undefined
const char *cusparseGetErrorString(cusparseStatus_t status);
^

/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/ATen/cuda/CUDAContext.h(76): error: identifier "cusparseHandle_t" is undefined
__attribute__((visibility("default"))) cusparseHandle_t getCurrentCUDASparseHandle();
^

2 errors detected in the compilation of "/data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn/src/flash_bwd_hdim160_bf16_sm80.cu".
[4/49] c++ -MMD -MF /data/dodx/GenerateAI/test_LLM_local/flash-attention/build/temp.linux-x86_64-cpython-310/csrc/flash_attn/flash_api.o.d -pthread -B /data/dodx/anaconda3/envs/flash_attention/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /data/dodx/anaconda3/envs/flash_attention/include -fPIC -O2 -isystem /data/dodx/anaconda3/envs/flash_attention/include -fPIC -I/data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn -I/data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn/src -I/data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/cutlass/include -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/TH -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/data/dodx/anaconda3/envs/flash_attention/include/python3.10 -c -c /data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn/flash_api.cpp -o /data/dodx/GenerateAI/test_LLM_local/flash-attention/build/temp.linux-x86_64-cpython-310/csrc/flash_attn/flash_api.o -O3 -std=c++17 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=flash_attn_2_cuda -D_GLIBCXX_USE_CXX11_ABI=0
FAILED: /data/dodx/GenerateAI/test_LLM_local/flash-attention/build/temp.linux-x86_64-cpython-310/csrc/flash_attn/flash_api.o
c++ -MMD -MF /data/dodx/GenerateAI/test_LLM_local/flash-attention/build/temp.linux-x86_64-cpython-310/csrc/flash_attn/flash_api.o.d -pthread -B /data/dodx/anaconda3/envs/flash_attention/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /data/dodx/anaconda3/envs/flash_attention/include -fPIC -O2 -isystem /data/dodx/anaconda3/envs/flash_attention/include -fPIC -I/data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn -I/data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn/src -I/data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/cutlass/include -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/TH -I/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/data/dodx/anaconda3/envs/flash_attention/include/python3.10 -c -c /data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn/flash_api.cpp -o /data/dodx/GenerateAI/test_LLM_local/flash-attention/build/temp.linux-x86_64-cpython-310/csrc/flash_attn/flash_api.o -O3 -std=c++17 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=flash_attn_2_cuda -D_GLIBCXX_USE_CXX11_ABI=0
In file included from /data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/ATen/cuda/CUDAContext.h:22,
from /data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn/flash_api.cpp:8:
/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/ATen/cuda/Exceptions.h:56:36: error: ‘cusparseStatus_t’ was not declared in this scope; did you mean ‘cublasStatus_t’?
56 | const char *cusparseGetErrorString(cusparseStatus_t status);
| ^~~~~~~~~~~~~~~~
| cublasStatus_t
In file included from /data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn/flash_api.cpp:8:
/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/ATen/cuda/CUDAContext.h:76:20: error: ‘cusparseHandle_t’ does not name a type; did you mean ‘cublasHandle_t’?
76 | TORCH_CUDA_CPP_API cusparseHandle_t getCurrentCUDASparseHandle();
| ^~~~~~~~~~~~~~~~
| cublasHandle_t
/data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn/flash_api.cpp: In function ‘void set_params_fprop(Flash_fwd_params&, size_t, size_t, size_t, size_t, size_t, size_t, size_t, size_t, size_t, at::Tensor, at::Tensor, at::Tensor, at::Tensor, void*, void*, void*, void*, float, float, int, int)’:
/data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn/flash_api.cpp:47:38: warning: ‘void* memset(void*, int, size_t)’ clearing an object of non-trivial type ‘struct Flash_fwd_params’; use assignment or value-initialization instead [-Wclass-memaccess]
47 | memset(&params, 0, sizeof(params));
| ^
In file included from /data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn/flash_api.cpp:13:
/data/dodx/GenerateAI/test_LLM_local/flash-attention/csrc/flash_attn/src/flash.h:51:8: note: ‘struct Flash_fwd_params’ declared here
51 | struct Flash_fwd_params : public Qkv_params {
| ^~~~~~~~~~~~~~~~
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2100, in _run_ninja_build
subprocess.run(
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/subprocess.py", line 526, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v', '-j', '4']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/data/dodx/GenerateAI/test_LLM_local/flash-attention/setup.py", line 288, in
setup(
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/setuptools/init.py", line 107, in setup
return distutils.core.setup(**attrs)
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 185, in setup
return run_commands(dist)
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
dist.run_commands()
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
self.run_command(cmd)
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/setuptools/dist.py", line 1234, in run_command
super().run_command(command)
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/setuptools/command/install.py", line 80, in run
self.do_egg_install()
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/setuptools/command/install.py", line 129, in do_egg_install
self.run_command('bdist_egg')
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
self.distribution.run_command(command)
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/setuptools/dist.py", line 1234, in run_command
super().run_command(command)
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/setuptools/command/bdist_egg.py", line 164, in run
cmd = self.call_command('install_lib', warn_dir=0)
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/setuptools/command/bdist_egg.py", line 150, in call_command
self.run_command(cmdname)
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
self.distribution.run_command(command)
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/setuptools/dist.py", line 1234, in run_command
super().run_command(command)
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/setuptools/command/install_lib.py", line 11, in run
self.build()
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/setuptools/_distutils/command/install_lib.py", line 111, in build
self.run_command('build_ext')
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
self.distribution.run_command(command)
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/setuptools/dist.py", line 1234, in run_command
super().run_command(command)
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 84, in run
_build_ext.run(self)
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 345, in run
self.build_extensions()
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 873, in build_extensions
build_ext.build_extensions(self)
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 467, in build_extensions
self._build_extensions_serial()
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 493, in _build_extensions_serial
self.build_extension(ext)
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 246, in build_extension
_build_ext.build_extension(self, ext)
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 548, in build_extension
objects = self.compiler.compile(
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 686, in unix_wrap_ninja_compile
_write_ninja_file_and_compile_objects(
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1774, in _write_ninja_file_and_compile_objects
_run_ninja_build(
File "/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2116, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension

@YuehChuan

@batman-do
Author

I used MAX_JOBS=4 pip install flash-attn --no-build-isolation as an alternative to building from source.

@batman-do
Author

Hi @tridao, when I install layer_norm, why doesn't it respond? It stops like this:

(screenshot of the stalled layer_norm build)

@tridao
Contributor

tridao commented Oct 14, 2023

It probably takes a very long time if you don't have ninja or lots of CPU cores to compile. You don't have to use that extension.
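
A quick way to see whether a slow serial compile is the likely explanation; a small sketch:

```python
import multiprocessing
import shutil

# Without ninja the extension is typically compiled serially, and with few
# cores each CUDA file can take minutes, so the build can look like it hangs.
print("ninja on PATH:", shutil.which("ninja") is not None)
print("CPU cores:", multiprocessing.cpu_count())
```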

@batman-do
Author

It probably takes a very long time if you don't have ninja or lots of CPU cores to compile. You don't have to use that extension.

Thanks @tridao, I might try it later.

@Batwho

Batwho commented Oct 20, 2023

@batman-do Hi, I got the exact same bug when trying pip install flash-attn==2.0.4 --no-build-isolation. How did you solve your problem eventually?

@YuehChuan

@batman-do
According to
/data/dodx/anaconda3/envs/flash_attention/lib/python3.10/site-packages/torch/include/ATen/cuda/CUDAContext.h(76): error: identifier "cusparseHandle_t" is undefined

It seems that the CUDA library providing cusparseHandle_t is not being located properly.

I am using a venv virtual environment, not Anaconda.

And do make sure PyTorch 2.2.0 with CUDA 12 is installed in your environment:
pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu121

Also, layer_norm is deprecated in FlashAttention 2, so there is no need to install it.
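
As a first diagnostic for the cusparse errors above, a small sketch comparing the CUDA version PyTorch was built with against the local nvcc (nothing here is specific to this repo):

```python
import subprocess
import torch

# The "cusparseStatus_t is undefined" errors during the build suggest the
# compiler is not seeing the cuSPARSE headers it expects; comparing the CUDA
# version PyTorch was built with against the local nvcc is a first check.
print("torch", torch.__version__, "built with CUDA", torch.version.cuda)
print(subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout)
```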

@CliuGeek9229

Use shutil.move(wheel_filename, wheel_path) instead of os.rename(src, dst) in setup.py.
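
For reference, a minimal sketch of that change; the move_wheel helper below is illustrative rather than the actual code in setup.py, and only the wheel_filename/wheel_path names come from the suggestion above:

```python
import os
import shutil

# os.rename() fails with OSError "Invalid cross-device link" (EXDEV) when the
# downloaded wheel and its destination sit on different filesystems.
# shutil.move() falls back to copy-then-delete, so it works across filesystems.

def move_wheel(wheel_filename: str, wheel_path: str) -> None:
    # Illustrative helper, not the repo's actual function.
    os.makedirs(os.path.dirname(wheel_path) or ".", exist_ok=True)
    # Before: os.rename(wheel_filename, wheel_path)
    shutil.move(wheel_filename, wheel_path)
```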

@SingL3

SingL3 commented Nov 9, 2023

Use shutil.move(wheel_filename, wheel_path) instead of os.rename(src, dst) in setup.py.

Thanks! It works for me.

@drzraf

drzraf commented Aug 30, 2024

This keeps affecting users and shouldn't have been closed. I guess it happens when /tmp/ or other pip cache directories are on different filesystems or on tmpfs. shutil.move() does the trick; this should be changed.
