
Add CuPy #9959

Merged
merged 4 commits into from
Oct 25, 2019
Conversation

jakirkham
Member

@jakirkham jakirkham commented Oct 23, 2019

Closes #8638

Based off PR ( #8638 ), this adds the cupy package. Tidied the PR a bit. Also dropped some changes to staged-recipes that are not needed.

Checklist

  • License file is packaged (see here for an example)
  • Source is from official source
  • Package does not vendor other packages. (If a package uses the source of another package, they should be separate packages or the licenses of all packages need to be packaged)
  • If static libraries are linked in, the license of the static library is packaged.
  • Build number is 0
  • A tarball (url) rather than a repo (e.g. git_url) is used in your recipe (see here for more details)
  • GitHub users listed in the maintainer section have posted a comment confirming they are willing to be listed there

@conda-forge-linter

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipes/cupy) and found it was in an excellent condition.

@jakirkham jakirkham mentioned this pull request Oct 23, 2019
@jakirkham
Member Author

@conda-forge/core, this is ready for review! 😄

@leofang
Member

leofang commented Oct 23, 2019

@jakirkham this is awesome! Thanks for working on it. Could you point me to where - {{ compiler("cuda") }} is documented? I think it's pretty new, and I wasn't aware how the EULA issue with Nvidia was resolved. Having this working would solve a lot of headaches for packaging CUDA-dependent projects.

@jakirkham
Member Author

@jakirkham this is awesome! Thanks for working on it.

Of course! Happy to help 😄

Could you point me to where - {{ compiler("cuda") }} is documented?

It needs to be. 😅 This is on my todo list. ( conda-forge/conda-forge.github.io#901 )

I think it's pretty new and I wasn't aware how the EULA issue with Nvidia was resolved.

Yeah it is pretty new. We plan to advertise it a bit more. Stay tuned.

The short answer is we are using NVIDIA-based Docker images that already have the toolchain to do the build.

Having this working would solve a lot of headaches for packaging CUDA-dependent projects.

Agreed!

On a different note, would you be interested in helping maintain the CuPy package? 😉

@isuruf
Member

isuruf commented Oct 23, 2019

Can you change the docker image and make the tests pass? We can revert the docker image change before merging. (This is what we did for openmm.)

@jakirkham
Member Author

Sure, I've pushed a commit to build the package on CUDA 9.2.

@@ -7,7 +7,7 @@ jobs:
   matrix:
     linux:
       CONFIG: azure-linux-64-comp7
-      IMAGE_NAME: condaforge/linux-anvil-comp7
+      IMAGE_NAME: condaforge/linux-anvil-cuda:9.2
Member

It would be good to not have to change this file in this PR

Member

Yes, we'll have to use docker-in-docker to do that. That would also help with testing the CentOS 7 packages that you wanted to add.

Member Author

This is just so we can test it here. We will revert it when we are done.

That said, we did start discussing how to do GPU tests in issue ( #9773 ). Have a rough start of a potential solution in PR ( #9804 ), but it could use help if anyone has time.

Member

I have a question: would it imply that all CUDA packages built this way for conda-forge, and the - {{ compiler("cuda") }} flag, will be pinned at CUDA 9.2?

Member Author

Right, this is why we need docs. 😅

So currently conda-forge builds against 3 CUDA versions. As some recipes have a CPU-only option, we support 1 CPU version + 3 CUDA versions in total. These can be seen here (None is the CPU version). These versions map one-to-one to these Docker images, where the build occurs. Each package then gains a dependency on the cudatoolkit package with the same major.minor version.

Now, what does this mean for CuPy? It will be built against all of the CUDA versions, but no CPU version. Each build of CuPy will pin to a major.minor version of CUDA (so 9.2, 10.0, and 10.1). Users can then constrain which build they get by selecting the version of cudatoolkit they want.
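To make the scheme concrete, here is a small illustrative sketch (not conda-build internals) of the variant matrix described above: one CPU variant plus three CUDA variants, where each CUDA build carries a run dependency pinned to the matching major.minor cudatoolkit. The function name and list are hypothetical.

```python
# Illustrative sketch of the variant/pinning scheme described above.
# None represents the CPU-only variant; each CUDA variant gains a run
# dependency on cudatoolkit pinned at the same major.minor version.

CUDA_VARIANTS = [None, "9.2", "10.0", "10.1"]

def run_requirements(cuda_compiler_version):
    """Extra run dependencies a build of this variant would carry."""
    if cuda_compiler_version is None:
        return []  # CPU-only build: no cudatoolkit dependency
    return [f"cudatoolkit {cuda_compiler_version}.*"]

for variant in CUDA_VARIANTS:
    print(variant, run_requirements(variant))
```

So a user asking for `cudatoolkit=10.0` would resolve to the CuPy build carrying the `cudatoolkit 10.0.*` pin.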

Member

@leofang leofang Oct 25, 2019

Thanks a lot for the detailed explanation and the links, @jakirkham! These are very useful to know for building CUDA packages, not just CuPy.

Yes, it'd be great to have a central place documenting all of this. I've been trying to follow the progress of cudatoolkit for a while, but I completely missed these new developments ☹️

Do you know which CUDA 10.1 exactly is in the Docker image? You probably have noticed this new issue: cupy/cupy#2569. CuPy doesn't support 10.1 Update 2 currently.

btw, Nvidia is awful in their versioning -- why not just call it 10.1.2, or even simpler 10.3 (there's already an Update 1, which should have been 10.2)... These versioning semantics might need to be taken care of in conda-forge's CUDA stack...

Member Author

Glad that helps. Thanks for the feedback. At some point it would be good if you can share your thoughts on what has been confusing and useful as well as what would be valuable to have in the docs. Have been collecting feedback from others in this doc issue ( conda-forge/conda-forge.github.io#901 ). 🙂

Sorry we have not done a great job in publicizing it. Have been working through rough edges with individual contributors to try and smooth out the overall experience. That said, it would be useful to write up (even an incomplete) blogpost about the state of things from a high-level. Thanks for the nudge. 😉

FWICT the image is using CUDA 10.1 Update 2 (nvcc output below). Thanks for letting me know. IIUC that just affects master and the latest RC, right? (Asked on the issue too.) In either case, will follow that issue closely.

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243

Good point. Have also been thinking about this. Could you please raise an issue about more detailed versioning on the nvcc feedstock?
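Since the `release 10.1` line alone doesn't distinguish 10.1 from its updates, a small (hypothetical) helper can pull the full `V10.1.243`-style version out of `nvcc --version` output:

```python
import re

# Hypothetical helper: extract the full version string from
# `nvcc --version` output. The patch component (e.g. 243) is what
# distinguishes 10.1 Update 2 from the other 10.1 releases.

def parse_nvcc_version(output):
    match = re.search(r"V(\d+\.\d+\.\d+)", output)
    return match.group(1) if match else None

sample = """nvcc: NVIDIA (R) Cuda compiler driver
Cuda compilation tools, release 10.1, V10.1.243"""

print(parse_nvcc_version(sample))  # -> 10.1.243
```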

Member

OK will do. Also listing version strings here for future ref:
10.1: 10.1.105
10.1 Update 1: 10.1.168
10.1 Update 2: 10.1.243
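For future reference, the mapping above can be captured in a small lookup (a sketch; the names are just the marketing labels for each patch level):

```python
# CUDA 10.1 releases share a major.minor version; only the patch level
# tells the updates apart.

CUDA_10_1_RELEASES = {
    "10.1.105": "10.1",
    "10.1.168": "10.1 Update 1",
    "10.1.243": "10.1 Update 2",
}

def cuda_10_1_name(version):
    """Return the marketing name for a CUDA 10.1 version string."""
    return CUDA_10_1_RELEASES.get(version, "unknown")

print(cuda_10_1_name("10.1.243"))  # -> 10.1 Update 2
```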

@jakirkham jakirkham force-pushed the add_cupy_final branch 4 times, most recently from b8a3061 to a10b883 Compare October 23, 2019 22:03
run:
- python
- setuptools
- {{ pin_compatible("cudnn") }}
Member

Maybe I misunderstood: why is just cudnn pinned but not nccl? There are incompatibilities even within the v2 series, and I am not sure the built cupy.cuda.nccl module would be happy with a different nccl version discovered at runtime.

Member

FYI, the CuPy team bundles a specific NCCL version in the CuPy wheel.

Member

Yeah, not pinning NCCL is dangerous. The NCCL version is determined at build time:
https://github.com/cupy/cupy/blob/b9db374f4d0ccee358a998d7b283248f77cef351/cupy/cuda/cupy_nccl.h#L131-L133

Member

I think the plan is to pin that globally conda-forge/conda-forge-pinning-feedstock#309 (comment)

Member

@leofang leofang Oct 24, 2019

Thanks, @CJ-Wright. I wasn't aware of that discussion.

But still, don't we need a {{ pin_compatible("nccl") }} here? Once a CuPy package is built, it is supposed to be pinned to the precise NCCL version it was built against. I feel the global pin might be a bit fragile when a newer NCCL comes out.

Member Author

Well, the compiler might warn that the symbol is unused, but I don't think it will just get rid of the symbol altogether. There may be flags to enable this kind of elimination, but I'm pretty sure it is opt-in. I could be wrong about this though.

Member Author

Anyways we should probably move this discussion to CuPy issue ( cupy/cupy#2568 ).

Member Author

Well, currently we are using the latest major.minor version of NCCL with CuPy. Now that we are aware of this issue, I think we just need to stay on top of it.

Member

Until that issue is resolved, shouldn't you pin exactly?

Member Author

@jakirkham jakirkham Oct 25, 2019

The NCCL issue? We are using the latest major.minor version of NCCL, and the current release of CuPy does too. So they are not adding any symbols at the moment that will clash with NCCL. Also, they've acknowledged the issue upstream and seem willing to work on it. We certainly need to keep an eye on things as new releases come out and adjust our strategy accordingly. That said, I think we are doing all we can with this issue ATM.

@jakirkham
Member Author

@conda-forge/core, would it be possible to get another review of this? I think all comments from the last round were addressed. 🙂

@leofang
Member

leofang commented Oct 24, 2019

See #9959 (comment).

@jakirkham
Member Author

(responding in that thread :)

@leofang
Member

leofang commented Oct 24, 2019

Oh, almost forgot:

On a different note, would you be interested in helping maintain the CuPy package? 😉

I am more than happy to help if needed. I think I know CuPy's internals well enough.

But I am a bit hesitant to be listed as a maintainer. Last time I worked on a package with non-Python dependencies (conda-forge/pyqt-feedstock#61), it was super painful. I explored in the dark without understanding how conda-build really works (lots of things are either undocumented or have special treatments that are impossible for a newcomer to know), nor was I able to successfully debug issues myself (lack of resources and knowledge) when seeing failures in CI. @isuruf eventually took over and helped finish it (thanks again btw). The time investment is a bit daunting from that experience.

@jakirkham
Member Author

I think the Qt stack is one of the hardest pieces of our ecosystem. So props for diving into the deep end. Most of the tricky bits that I found from building CuPy were not specific to CuPy, but were more general GPU building things that have been upstreamed into the tooling in conda-forge. At this point, I would argue that building CuPy is no harder than building your average Python C extension (not PyQt 😉).

Should add I'm happy to help here too. Am using CuPy regularly in my day-to-day and expect that to continue. So will feel some pressure to keep things working here.

Though I think it would be good if some people that are not me become familiar with the CuPy recipe and updating it. Am happy to handhold as we move in that direction. In the end, I think you and other CuPy developers know way more about how CuPy works than I do. So it will be good to transition to having you and other CuPy developers directing how the package looks going forward. We've found this model successful with several packages. 🙂

@leofang
Member

leofang commented Oct 24, 2019

FYI, the CuPy team pins at NCCL 2.4.2.1 for CUDA 9.2:
https://github.com/cupy/cupy-release-tools/blob/d542e5436cf1380913924babfb3306357ef5dac7/download_nccl.sh#L27-L28
Would it be nice to coordinate with / follow them?

@CJ-Wright
Member

Would it be nice to coordinate with / follow them?

It would be good to coordinate, although we also need to think of other downstream consumers of nccl.

@jakirkham
Member Author

We don't currently have a build of NCCL 2.4.2.1. It could be added though.

@jakirkham
Member Author

Do you know why they chose NCCL 2.4.2.1?

@leofang
Member

leofang commented Oct 24, 2019

although we also need to think of other downstream consumers of nccl.

This sounds against a global pin that was planned...

Do you know why they chose NCCL 2.4.2.1?

It wasn't explained here: cupy/cupy-release-tools#29. Probably it was the latest version at that time?

@leofang
Member

leofang commented Oct 24, 2019

@kmaehashi would you have time to comment / review on the NCCL pinning issue and the overall CuPy recipe here? Would be great to have an extra pair of eyes, especially someone from the CuPy/Chainer team!

@CJ-Wright
Member

This sounds against a global pin that was planned...

This is the reason to have a global pin, so that all consumers of nccl can have the same version. That way the downstream consumers don't cause errors for each other.

@leofang
Member

leofang commented Oct 24, 2019

This sounds against a global pin that was planned...

This is the reason to have a global pin, so that all consumers of nccl can have the same version. That way the downstream consumers don't cause errors for each other.

Oh I see. So keep the pain at the coordination stage.

@CJ-Wright
Member

CJ-Wright commented Oct 24, 2019

Yes, this is also why we're so interested in the exact level of pinning needed. If the pin is very strict then we will need to rebuild all consumers every time a new bug fix release is produced, which could be a lot of overhead for maintainers.
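The tradeoff here can be made concrete with a simplified sketch of how pin strictness translates into a version range (this mimics conda-build's pin expressions in spirit; the function and its exact bounds are illustrative, not the real implementation). The stricter the pin, the narrower the range, and the more downstream rebuilds each new nccl release forces.

```python
# Simplified sketch: turn a version plus a max_pin expression into a
# version range. More "x" components means a stricter pin and thus more
# rebuilds of downstream packages per upstream release.

def pin(version, max_pin):
    parts = version.split(".")
    n = max_pin.count("x")          # how many components to hold fixed
    upper = parts[:n]
    upper[-1] = str(int(upper[-1]) + 1)  # bump the last fixed component
    return f">={version},<{'.'.join(upper)}"

print(pin("2.4.2.1", "x"))      # -> >=2.4.2.1,<3
print(pin("2.4.2.1", "x.x"))    # -> >=2.4.2.1,<2.5
print(pin("2.4.2.1", "x.x.x"))  # -> >=2.4.2.1,<2.4.3
```

With an `x.x.x`-level pin, every NCCL bug-fix release would invalidate existing CuPy builds, which is the overhead being weighed above.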

Changes the image to use the CUDA 9.2 Docker image and constrain the
recipe to only build for CUDA 9.2.
@leofang
Member

leofang commented Oct 25, 2019

Note this cupy/cupy#2568 (comment):

Thanks for pointing this issue out. The same thing can be said of cuDNN, as we're exposing symbols whose names are the same as cuDNN's depending on the build-time cuDNN version, and cudnn is linked as libcudnn.so.7 (not libcudnn.so.7.5).

@jakirkham
Member Author

@conda-forge/core, I think things are basically ready to go. Can revert the CI hack once we are ready to merge. What do people think?

@jakirkham
Member Author

Note this cupy/cupy#2568 (comment)

Oh sorry, I thought I posted that comment, but I guess that didn't happen. Thanks for raising it, @leofang. Currently we are just using the latest version of cuDNN at all times. So this shouldn't cause the same problem.

@jakirkham
Member Author

Can revert the CI hack once we are ready to merge.

Have gone ahead and done this. Can re-add if further iteration is needed.

@scopatz
Member

scopatz commented Oct 25, 2019

Seems like this is failing on Linux.

@jakirkham
Member Author

Right, that's because we don't have GPU builds on staged-recipes, as discussed earlier in the thread. Also, as noted, I've removed the hack that enabled one GPU build. If you are interested in the CI status of the recipe, please look one commit prior. 🙂

@leofang
Member

leofang commented Oct 25, 2019

@jakirkham add me to the maintainers. I'd feel less nervous if I could work with someone knowledgeable like you 🙂

@leofang
Member

leofang commented Oct 25, 2019

@jakirkham btw see my new question in #9959 (comment):

Do you know which CUDA 10.1 exactly is in the Docker image? You probably have noticed this new issue: cupy/cupy#2569. CuPy doesn't support 10.1 Update 2 currently.

@jakirkham
Member Author

@jakirkham add me to the maintainers. I'd feel less nervous if I could work with someone knowledgeable like you 🙂

Thanks Leo! Would be great to have you. Added you in the latest commit. 🙂

@jakirkham btw see my new question in #9959 (comment):

Ah sorry! Missed that one before. Have responded in ( #9959 (comment) ).

@isuruf isuruf merged commit b71cb49 into conda-forge:master Oct 25, 2019
@jakirkham jakirkham deleted the add_cupy_final branch October 25, 2019 21:51
@jakirkham
Member Author

Thanks all for the reviews and feedback here! Am really happy with how much nicer this recipe turned out as a result. 🎉 😄


build:
number: 0
skip: True # [not (linux64 and cuda_compiler_version != "None")]
Member

Using De Morgan's Laws, this skip statement could be made easier to read:

[(not linux64) or cuda_compiler_version == "None"]
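The equivalence can be checked exhaustively (here `a` stands for `linux64` and `b` for `cuda_compiler_version != "None"`):

```python
from itertools import product

# Verify De Morgan's law behind the selector rewrite:
# not (a and b)  ==  (not a) or (not b)  for every truth assignment.

for a, b in product([True, False], repeat=2):
    assert (not (a and b)) == ((not a) or (not b))

print("equivalent for all cases")
```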

Member Author

A PR to the feedstock would be welcome. 🙂

ref: https://github.com/conda-forge/cupy-feedstock

Member

😎
