
Rebuild for CUDA 12 #1079

Merged

Conversation

conda-forge-admin
Contributor

@conda-forge-admin conda-forge-admin commented Jun 4, 2023

Hi! This is the friendly automated conda-forge-webservice.

I've started rerendering the recipe as instructed in #1078.

If I find any needed changes to the recipe, I'll push them to this PR shortly. Thank you for waiting!

Here's a checklist to do before merging.

Fixes #1080
Fixes #1078
Fixes #974
Fixes #942

@conda-forge-webservices

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@jakirkham
Member

@conda-forge-admin, please re-render

@github-actions
Contributor

github-actions bot commented Jun 4, 2023

Hi! This is the friendly automated conda-forge-webservice.

I tried to rerender for you, but it looks like there was nothing to do.

This message was generated by GitHub actions workflow run https://github.com/conda-forge/arrow-cpp-feedstock/actions/runs/5171072276.

@jakirkham
Member

@conda-forge-admin, please re-render

@github-actions
Contributor

github-actions bot commented Jun 4, 2023

Hi! This is the friendly automated conda-forge-webservice.

I tried to rerender for you, but it looks like there was nothing to do.

This message was generated by GitHub actions workflow run https://github.com/conda-forge/arrow-cpp-feedstock/actions/runs/5171125185.

@jakirkham
Member

jakirkham commented Jun 4, 2023

This logic may also need to be limited to CUDA 11 or earlier

```shell
# Enable CUDA support
if [[ ! -z "${cuda_compiler_version+x}" && "${cuda_compiler_version}" != "None" ]]
then
    if [[ -z "${CUDA_HOME+x}" ]]
    then
        echo "cuda_compiler_version=${cuda_compiler_version} CUDA_HOME=$CUDA_HOME"
        CUDA_GDB_EXECUTABLE=$(which cuda-gdb || exit 0)
        if [[ -n "$CUDA_GDB_EXECUTABLE" ]]
        then
            CUDA_HOME=$(dirname $(dirname $CUDA_GDB_EXECUTABLE))
        else
            echo "Cannot determine CUDA_HOME: cuda-gdb not in PATH"
            return 1
        fi
    fi
    EXTRA_CMAKE_ARGS=" ${EXTRA_CMAKE_ARGS} -DARROW_CUDA=ON -DCUDA_TOOLKIT_ROOT_DIR=${CUDA_HOME} -DCMAKE_LIBRARY_PATH=${CONDA_BUILD_SYSROOT}/lib"
else
    EXTRA_CMAKE_ARGS=" ${EXTRA_CMAKE_ARGS} -DARROW_CUDA=OFF"
fi
```
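A minimal sketch of how such a CUDA-11-or-earlier gate might look (illustrative only, not the recipe's actual code; the demo default value for `cuda_compiler_version` is an assumption, since conda-build normally provides it):

```shell
# Sketch: gate the CUDA_HOME fallback on the CUDA major version, since
# the CUDA 12 images do not ship cuda-gdb by default.

# cuda_compiler_version is normally provided by conda-build; a demo
# default keeps this snippet self-contained.
cuda_compiler_version="${cuda_compiler_version:-11.8}"

if [[ "${cuda_compiler_version}" != "None" ]]
then
    # Strip everything after the first dot: "11.8" -> "11".
    cuda_major="${cuda_compiler_version%%.*}"
    if (( cuda_major <= 11 ))
    then
        echo "CUDA ${cuda_compiler_version}: applying CUDA_HOME fallback"
    else
        echo "CUDA ${cuda_compiler_version}: relying on the nvcc wrapper for CUDA_HOME"
    fi
fi
```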

Here's how another maintainer approached their case

Edit: It's also possible this workaround is no longer needed and so can simply be dropped

@jakirkham jakirkham changed the title MNT: rerender Rebuild for CUDA 12 Jun 4, 2023
@h-vetinari
Member

This logic may also need to be limited to CUDA 11 or earlier

Feel free to update this. AFAIU, we only need to build with one cuda version anyway, so we can just remove those workarounds.

Member

@h-vetinari left a comment

Thanks for the PR! Pleasantly surprised to see this passing¹ right off the bat - great work!

Aside from removing the workarounds (which would be great), I just have a comment about the run-constraints.

Footnotes

  1. The aarch+CUDA failure is #1081 (Skip some more crashy tests on aarch+CUDA), and the Windows failure is infra flakiness

recipe/meta.yaml: 3 outdated review comments (resolved)
@jakirkham
Member

@conda-forge-admin, please re-render

@jakirkham
Member

Hmm...it looks like the CUDA builds are being skipped on CI altogether?


Skipped: apache-arrow from /home/conda/recipe_root defines build/skip for this configuration ({'cuda_compiler_version': '10.2', 'target_platform': 'linux-64', 'channel_targets': 'conda-forge main'}).

Also re-rendering removed cuda_compiler_version_min for some reason

jakirkham and others added 6 commits June 8, 2023 15:32
Makes a few changes that will generalize better over CUDA 11 / 12, like
using `cuda-version` and referring to the `compiler("cuda")` package
directly (since there are differences in what is in its `run_exports`).
Only the CUDA driver version matters for Arrow CUDA components.

Co-authored-by: h-vetinari <[email protected]>
Co-authored-by: Keith Kraus <[email protected]>
This looks for `$CUDA_HOME` when the `nvcc` wrapper script is installed
by searching for `cuda-gdb`, which is in the CUDA 11 Docker images.
However this isn't available on CUDA 12 (unless `cuda-gdb` is
installed).

That said, this logic was added to the `nvcc` wrapper script a while
ago. So this can just be dropped in favor of relying on the `nvcc`
wrapper script to set this correctly.
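The `cuda-version` approach described in the commits above might look roughly like the following meta.yaml fragment. This is a hypothetical sketch of the pattern, not the recipe's exact contents; the selectors and the `cuda_compiler_version_min` variable are assumptions for illustration:

```yaml
requirements:
  build:
    # Refer to the compiler package directly; its run_exports differ
    # between CUDA 11 and CUDA 12.
    - {{ compiler("cuda") }}                      # [cuda_compiler_version != "None"]
  host:
    # cuda-version constrains the CUDA toolkit consistently across
    # CUDA 11 / 12.
    - cuda-version {{ cuda_compiler_version }}    # [cuda_compiler_version != "None"]
  run_constrained:
    # Only the CUDA driver version matters at runtime for Arrow's CUDA
    # components; __cuda is conda's virtual package for the driver.
    - __cuda >={{ cuda_compiler_version_min }}    # [cuda_compiler_version != "None"]
```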
@h-vetinari
Member

Also please feel free to push changes directly

I took you at your word on this, hope that's OK. :)

Rather than asking you to do the rebasing I wanted (to clean up the back-and-forth here, which messes up the git history), I did it myself. Let me know if you want to do it differently.

(sorry for the force-pushes)

Member

@h-vetinari left a comment

LGTM from my POV now!

@jakirkham
Member

Yep that's ok. Thanks for taking the lead! 🙏

Would just suggest wrapping up the discussion in this thread ( #1079 (comment) ). Perhaps that is just acknowledging there that users need new-enough Conda & Mamba versions for this to work.

@jakirkham
Member

jakirkham commented Jun 8, 2023

@kkraus14 , anything else or is this good to go?

@jakirkham jakirkham requested review from kkraus14 and removed request for kkraus14 June 8, 2023 17:26
@kkraus14
Contributor

kkraus14 commented Jun 8, 2023

I think we only want to build a CUDA variant with the minimum cuda compiler version that we support. It gives compatibility for newer versions and reduces the CI matrix a bit.

EDIT: This is already happening, apologies.

@jakirkham
Member

All good. It is important that we get this right. So thanks for looking more closely and confirming 😄

@jakirkham jakirkham added the automerge Merge the PR when CI passes label Jun 8, 2023
@jakirkham
Member

Asking the bot to merge when it is done, since it seems we are good with the changes here

Do we want to start backporting this now or are there other changes we want to combine this with on the older versions?

@h-vetinari
Member

Asking the bot to merge when it is done, since it seems we are good with the changes here

Aarch+cuda is inexplicably broken on some tests. I'm debugging this in #1081

@h-vetinari h-vetinari merged commit f18400b into conda-forge:main Jun 8, 2023
@jakirkham
Member

Ah ok. No worries. Had tried restarting, but I guess the same issues came back

@jakirkham
Member

Do we want to start backporting this now or are there other changes we want to combine this with on the older versions?

Any thoughts on this question?

@h-vetinari
Member

Yes, we want to backport. I usually do this for the maintenance branches whenever some relevant changes accumulate (or the next migrator drops by), but feel free to open PRs already if you want!

@jakirkham
Member

Submitted PRs to backport these changes to the respective version branches that appear to still be supported:

@h-vetinari
Member

Submitted PRs to backport these changes to the respective version branches

Thanks! It's a bit weird to me that the render commit is ordered before adding the migration, but whatever...

that appear to still be supported

It's not just appearance. Policy is: support the last 4 released Arrow versions.

@jakirkham
Member

Ah, I was trying to have a re-render at the beginning that refreshed things before applying changes. Re-rendering at the end was a no-op locally

Makes sense. Was just noting that in case I missed something

@h-vetinari
Member

While highly appreciated, there's no point in restarting the merge of this PR on main. There hasn't been a passing run on aarch+CUDA for ~20 runs across various PRs & merges, but I seem to finally have cracked which tests are responsible in #1081

Labels
automerge Merge the PR when CI passes
4 participants