Failed to build TensorFlow-2.11.0-foss-2022a-CUDA-11.7.0.eb #17892
Comments
@xuagu37 Which EasyBuild version are you using here? |
IIRC this specific issue should be fixed by Check that this patch is in the EC you are using and you are using the latest easyblock as mentioned by @boegel above. |
Thanks for the replies!
Any other potential issues you can think of? |
First, I'd like to verify that this is the symlink issue. For that go to the machine where you tried to build TF and load the
|
The line we are interested in is:
I don't have "readlink", so I used "realpath" to generate the outputs for 2 and 3. |
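The exact commands requested in this thread were not preserved, so the following is only a rough Python sketch of the same kind of check: it looks up a compiler on the search path and compares that location with its fully resolved path. The function name `describe_tool` and the `via_symlink` key are hypothetical names chosen for illustration; only `shutil.which` and `os.path.realpath` from the standard library are used.

```python
import os
import shutil


def describe_tool(name, search_path=None):
    """Return where `name` is found on the search path and where it really lives.

    `search_path` defaults to the PATH environment variable (shutil.which behaviour).
    Returns None when the tool cannot be found.
    """
    found = shutil.which(name, path=search_path)
    if found is None:
        return None
    found = os.path.abspath(found)
    resolved = os.path.realpath(found)
    return {
        "found": found,
        "resolved": resolved,
        # True when any component of the path (not only the last one) is a symlink.
        "via_symlink": found != resolved,
    }


if __name__ == "__main__":
    # Assumption: gcc and g++ are the compilers of interest in the loaded toolchain.
    for tool in ("gcc", "g++"):
        print(tool, describe_tool(tool))
```

If `via_symlink` is true for the compiler after loading the toolchain module, the installation is being reached through a symlinked path, which is the configuration discussed in this issue.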
That is exactly why I asked: This configuration is what causes the issue (a bug in Bazel/TensorFlow) and the patch should fix that. I don't know why it doesn't for you. So I need a bit more information:
|
Please see the attached. |
Hm, what I see is
I have an idea for a solution: Can you modify the file
@boegel It might make sense to add that to the framework, as I see no downsides but potentially faster runtimes due to fewer metadata operations. As an alternative the patch |
I decided to pause my project of EasyBuild for now. Thanks for all the help! Feel free to close the issue. |
@boegel However, the issue is valid and I verified this: It happens with compiler wrappers such as
Both presented approaches (modifying EB or adding the patch) solve the rpath-wrapper issue. It might be better to re-add the patch to newer TF ECs, though, because that would also fix the ccache use case. The upstream patch that fixed the compiler-on-symlink issue doesn't fix the compiler-symlink-with-compiler-wrapper issue. I added our patch as a TF PR (again): tensorflow/tensorflow#60668
It should be enough to simply add that patch to all TF ECs >= 2.1 and do a test build only up until the patch step. |
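To see whether a wrapper sits in front of the real compiler, one option is to list every match for the compiler name along the search path, similar to `which -a`. The sketch below is an illustration only: `find_all` is a hypothetical helper, and it assumes that a wrapper (for example a ccache or rpath wrapper) is exposed as an executable of the same name in a directory earlier on `PATH`. It says nothing about how Bazel or TensorFlow resolve the compiler internally.

```python
import os


def find_all(name, search_path=None):
    """List every executable called `name` on the search path, in lookup order.

    Each entry is a (path_as_found, resolved_path) tuple. The first entry is the
    one a shell would run; later entries are shadowed by it.
    """
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    matches = []
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            matches.append((candidate, os.path.realpath(candidate)))
    return matches


if __name__ == "__main__":
    # Assumption: gcc is the compiler being wrapped.
    for found, resolved in find_all("gcc"):
        print(found, "->", resolved)
```

More than one match, with the first living outside the compiler's installation directory, suggests a wrapper is in use; a resolved path that differs from the found path indicates a symlink somewhere along the way.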
Add the TensorFlow-2.1.0_fix-cuda-build.patch to the TensorFlow-CUDA ECs to fix failure when compilers are on symlinked paths and e.g. ccache or rpath wrappers are used. Fixes #17892
Dear EasyBuild community,
I tried to build TensorFlow by:
It failed with the following error message:
Any help will be really appreciated.
Best Regards