TensorFlow 2.7 does not detect CUDA installed through conda #52988
Comments
I'm not installing tensorflow from conda, just cuda/cudnn. Tensorflow is being installed from
Also note that nothing has changed on the conda side of things; we're still using the exact same environment with the same cuda/cudnn libraries, but it works in TF 2.6 and fails in TF 2.7. So I don't think the issue is on the conda side; something has changed in TensorFlow that has made this stop working. |
Open the terminal and type
At the end of the file, add the following two lines
Ensure there are no spaces on either side of the '=' sign. If it still does not work, try adding for version 11.0
|
As mentioned, CUDA is being installed through conda, so |
Conda installs are not officially supported by Google |
I installed Tensorflow 2.7 on Windows with CUDA 11.2 and cuDNN 8.1 (no conda involved). I received the same |
Thank you, this also works with cuda-11.4. But how would you fix this issue in a Jupyter notebook, for the pretty niche use case where you need tf=2.7.0 features? When I start a Jupyter server within an env that has these paths exported, it only shows the CPU. Exporting the paths in the notebook doesn't work either. |
This seems to solve the issue:
conda activate ENVNAME
cd $CONDA_PREFIX
mkdir -p ./etc/conda/activate.d
mkdir -p ./etc/conda/deactivate.d
touch ./etc/conda/activate.d/env_vars.sh
touch ./etc/conda/deactivate.d/env_vars.sh
Edit ./etc/conda/activate.d/env_vars.sh as follows:
#!/bin/sh
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib
Edit ./etc/conda/deactivate.d/env_vars.sh as follows:
#!/bin/sh
unset LD_LIBRARY_PATH
Source: https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#macos-and-linux |
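One caveat with the deactivate hook above is that `unset LD_LIBRARY_PATH` also discards any value the variable had before the environment was activated. A minimal sketch of a variant that saves and restores the previous value is shown below. The function name `write_cuda_hooks` and the helper variable `_OLD_LD_LIBRARY_PATH` are made up for illustration and are not part of conda; the hook directories are the ones described in the conda documentation linked above.

```shell
#!/bin/sh
# Hypothetical helper (not part of conda): writes activate/deactivate hooks
# into the conda environment whose prefix is passed as $1.
write_cuda_hooks() {
    prefix="$1"
    mkdir -p "$prefix/etc/conda/activate.d" "$prefix/etc/conda/deactivate.d"

    # Activation hook: remember the old value, then append the env's lib dir.
    # The quoted 'EOF' keeps the variables unexpanded until the hook runs.
    cat > "$prefix/etc/conda/activate.d/env_vars.sh" <<'EOF'
#!/bin/sh
export _OLD_LD_LIBRARY_PATH="${LD_LIBRARY_PATH:-}"
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$CONDA_PREFIX/lib"
EOF

    # Deactivation hook: restore the old value, or unset if there was none.
    cat > "$prefix/etc/conda/deactivate.d/env_vars.sh" <<'EOF'
#!/bin/sh
if [ -n "${_OLD_LD_LIBRARY_PATH:-}" ]; then
    export LD_LIBRARY_PATH="$_OLD_LD_LIBRARY_PATH"
else
    unset LD_LIBRARY_PATH
fi
unset _OLD_LD_LIBRARY_PATH
EOF
}
```

With the target environment active, calling `write_cuda_hooks "$CONDA_PREFIX"` and then re-activating the environment should apply the hooks. This still has to be repeated for each environment.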
I don't want to be dismissive here, but there is a lack of understanding of the problem specifically introduced by TF 2.7:
This problem is not just a techie point; it has deep implications for businesses that build real products. For my suffering peers: if you don't have root access, you can use this small, poorly documented feature in your environment.yml:
Upon conda activate, the env variables will be set for you, and unset on deactivation. |
Really appreciate the file you provided! |
@holongate's env is a good workaround and solves the problem for me. I'm quite astonished by how little thought was given to the issue - which is clearly a problem with TF 2.7 itself, and not with conda - and by how much time you waste on commenting that conda installs are not supported by Google. |
For anyone looking for a one-liner solution, you can do
(with the environment you want to modify activated). This has a similar effect as @jesusdpa1's solution here #52988 (comment), it'll set You still need to repeat that for every new conda environment though. It would be better if TensorFlow just detected the conda installed libraries, as it did in TensorFlow<=2.6. |
The official documentation suggests manually doing This is discussed above, but I'll reiterate the main points here for anyone coming across this thread:
I'll reiterate again, that all of these solutions are a downgrade in the user experience from TensorFlow < 2.7, when TensorFlow just correctly detected the conda-installed CUDA libraries without any fiddling required from the user. |
A kind-of semi-automated snippet for solving cudatoolkit PATH problem in conda environment that I am using:
This snippet automatically sets and unsets the necessary environment variables when you activate or deactivate the conda environment. It could be useful not only for TF users, but also for other libraries that need CUDA dependencies to be built manually from source. |
Hi @drasmuss ,
With the above two lines of code it is not required to use the command I hope this shall address the issue. Please confirm if anything is still missing here. Thanks! |
Hi @SuryanarayanaY, See #52988 (comment) for a summary of the discussion in this thread. The short answer is that no, that solution doesn't address the issue. Longer answer: The solution you describe from the docs is basically a worse version of idea 2 from that summary above. Worse in that it's more complicated, and it won't unset And, to reiterate again, all of these "solutions" are downgrades from the behaviour prior to TensorFlow 2.7, where TensorFlow just correctly detected the CUDA libraries without requiring any manual intervention from users. |
@drasmuss , I'm just curious if you have observed the same behavior in 2.11 version. |
Yes, the behaviour is the same in 2.11 and 2.12.0rc1 (I wouldn't expect it to change between rc1 and the full 2.12 release). Note that in 2.12 the error message has changed, so it displays
instead of the old "Could not load dynamic library..." errors, but it's the same issue. |
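To tell this situation apart from a genuinely missing CUDA install, it can help to check whether the conda environment contains the CUDA runtime library at all, and whether its lib directory is on `LD_LIBRARY_PATH`. The sketch below assumes a Linux environment where conda places the libraries under `$CONDA_PREFIX/lib` (as the workaround in this thread does). The function names `dir_on_ld_path` and `check_conda_cuda` are hypothetical, and the path comparison is a plain string match, so entries written with a trailing slash or through a symlink will not be recognised.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if directory $1 is an entry of the
# colon-separated list $2 (defaults to the current LD_LIBRARY_PATH).
dir_on_ld_path() {
    case ":${2-${LD_LIBRARY_PATH:-}}:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: checks the conda env at prefix $1.
# Returns 1 if no libcudart is present in its lib directory,
# 2 if it is present but the directory is not on LD_LIBRARY_PATH,
# 0 if it is present and the directory is on LD_LIBRARY_PATH.
check_conda_cuda() {
    libdir="$1/lib"
    if ! ls "$libdir"/libcudart.so* >/dev/null 2>&1; then
        echo "no libcudart found in $libdir"
        return 1
    fi
    if dir_on_ld_path "$libdir"; then
        echo "$libdir is on LD_LIBRARY_PATH"
    else
        echo "$libdir has libcudart but is not on LD_LIBRARY_PATH"
        return 2
    fi
}
```

Running `check_conda_cuda "$CONDA_PREFIX"` in the activated environment and getting the "not on LD_LIBRARY_PATH" message would be consistent with the behaviour reported in this issue.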
did you guys solve this problem? |
Hi, Thank you for opening this issue. Since this issue has been open for a long time, the code/debug information for this issue may not be relevant with the current state of the code base. The Tensorflow team is constantly improving the framework by fixing bugs and adding new features. We suggest you try the latest TensorFlow version with the latest compatible hardware configuration which could potentially resolve the issue. If you are still facing the issue, please create a new GitHub issue with your latest findings, with all the debugging information which could help us investigate. Please follow the release notes to stay up to date with the latest developments which are happening in the Tensorflow space. |
This issue is stale because it has been open for 7 days with no activity. It will be closed if no further activity occurs. Thank you. |
This issue was closed because it has been inactive for 7 days since being marked as stale. Please reopen if you'd like to work on this further. |
System information
Describe the current behavior
After installing cuda/cudnn through conda (conda install cudatoolkit=11.2 cudnn=8.1), TensorFlow 2.7 reports that it cannot find the cuda libraries.
Installing TensorFlow 2.6 (or earlier) in the same environment, with the same cuda/cudnn installation, doesn't show any problem, it detects the libraries and GPU support works as expected.
The problem can be worked around by manually adding the conda lib directory to LD_LIBRARY_PATH (export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib). However, obviously this is not ideal, as it needs to be repeated/adjusted for every new conda environment. It would be better if TensorFlow just detected the conda installed libraries, as it did in TensorFlow < 2.7.
Describe the expected behavior
TensorFlow should detect cuda/cudnn libraries installed through conda, as it did in TensorFlow<2.7.
Contributing
Standalone code to reproduce the issue