
PyTorch with CUDA enabled #9

Closed
akshayka opened this issue Apr 4, 2021 · 11 comments

akshayka (Contributor) commented Apr 4, 2021

Installing pymde with

conda install -c conda-forge pymde

pulls in the CPU version of PyTorch, on Python 3.7/3.8/3.9 (tested on Linux). This may be because

conda install -c conda-forge pytorch

pulls in the CPU version by default.

We should prefer the CUDA version by default. One way to do this is to install PyTorch from the official PyTorch channel (though some of its dependencies need to come from conda-forge); I imagine that is what we should do. (See the conda installation command at https://pytorch.org/.)

I am not sure how to modify the feedstock to pull from a specific channel.

mfansler (Member) commented Apr 4, 2021

There's nothing we can do on the Conda Forge build to enforce this. Conda does not allow a package to specify the channel from which a given dependency should come. It's really up to users to prioritize their channels correctly.

However, you could recommend that users install with the command:

conda install -c pytorch -c conda-forge pymde

That would prioritize the pytorch channel (if they have channel_priority: strict in their Conda config).
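To make the masking behavior concrete, here is a toy model of strict channel priority (a sketch only; the channel contents and version strings are hypothetical, and real conda resolution is far more involved): under `channel_priority: strict`, the first listed channel that carries a package masks all builds of that package in lower-priority channels.

```python
# Toy model of conda's strict channel priority.
# Channels are listed highest-priority first; contents are hypothetical.
channels = [
    ("pytorch",     {"pytorch": ["1.8.1=cuda111"]}),
    ("conda-forge", {"pytorch": ["1.8.0=cpu"], "pymde": ["0.1.5"]}),
]

def resolve(name):
    # First channel that carries the package wins; lower channels are masked.
    for channel, pkgs in channels:
        if pkgs.get(name):
            return channel, pkgs[name]
    return None

print(resolve("pytorch"))  # ('pytorch', ['1.8.1=cuda111'])
print(resolve("pymde"))    # ('conda-forge', ['0.1.5'])
```

With `-c pytorch -c conda-forge`, pytorch itself comes from the pytorch channel while pymde (absent there) still resolves from conda-forge.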

Additionally, we could make a point of testing for compatibility with PyTorch channel packages. This is related to the changes I was proposing to support a Windows build (#7). It is possible to specify the channels used when testing, and we could prioritize pytorch over conda-forge. This might be frowned upon, but I think it should be defensible since the pytorch channel is authoritative.

akshayka (Contributor, Author) commented Apr 4, 2021

Thanks for clarifying! I've updated the installation documentation to match your suggestion.

Testing with the official PyTorch channel packages sounds good to me.

mfansler added a commit to mfansler/pymde-feedstock that referenced this issue Apr 5, 2021
mfansler mentioned this issue Apr 5, 2021
mfansler (Member) commented Apr 5, 2021

FYI, I ran into some very strange behavior when trying to prioritize the pytorch channel. The Conda solver chokes on it on all platforms, and it's not clear why. First, it resolved torchvision to very old versions. I tried to correct for that by adding a minimum that coordinates with pytorch >=1.7.1, namely torchvision >=0.8.2. That caused the solve to fail completely.

I tested locally using mamba instead of conda, and it solved without issue (and the examples worked fine). So something is off with the Conda solver such that it isn't finding any solution. It might be a bit before I get back to this.

akshayka (Contributor, Author) commented Apr 5, 2021

That is indeed strange! For what it's worth,

conda install -c pytorch -c conda-forge pymde

worked for me in a freshly created environment (Ubuntu 20.10).

mfansler (Member) commented Apr 5, 2021

@akshayka I'd be interested to see what versions it resolved (conda env export). Also, was channel priority set to strict (conda config --show channel_priority)?

akshayka (Contributor, Author) commented Apr 6, 2021

I just tried it again in a fresh conda environment, with Python 3.7. In fact, I'm now seeing what you saw: a very old version of torchvision (0.5.0) is pulled in.

The channel priority is set to flexible.

It sounds like you've got a handle on a solution, though (from what I saw in PR #11).

mfansler (Member) commented Apr 7, 2021

I noticed today that the latest build didn't push to Anaconda Cloud. Inspecting the pipelines, I found:

pytorch channel not allowed

Maybe it's a licensing thing? Technically, we're only using their channel after building, but this is all turning out to be more problematic than I anticipated. 😬

akshayka (Contributor, Author) commented Apr 7, 2021

Thanks for looking into this! Perhaps Conda Forge will add a torchvision build for Windows, now that this issue is on their radar.

h-vetinari (Member) commented

Hey, I just stumbled over this feedstock.

Generally, no packages built by conda-forge are allowed to depend on any other channels (with the exception of the anaconda main channels), so having a recipe that depends on the pytorch channel is not possible - which is why you got the warning.

This principle has a lot of benefits, but it runs into trouble with packages like pytorch & tensorflow that are very hard to build (at the very least in terms of computation time, which exceeds the 6h Azure limit), and are therefore not yet available with the same coverage & timeliness as the rest of the ecosystem. But the people at conda-forge are trying to get these things resolved as well, so that there are timely, high-quality builds for such complicated packages too.

For torchvision specifically, there's an issue open already, but we're running into the same limitation: there are no Windows builds for pytorch in conda-forge yet, and it's a massive effort to debug the recipe locally & incrementally without CI & much collaboration (the medium-term goal being to find a way to run this in CI after all, in a queue with more resources).

PS. You can easily choose the pytorch build by selecting on the build-string of the package.

conda install -c conda-forge pytorch=*=cuda*   # install the GPU version
conda install -c conda-forge pytorch=*=cpu*    # install the CPU version

The first asterisk could be replaced with a version number. Recent versions of conda should already prefer the GPU version, but perhaps the pytorch recipe still needs some modification to correctly "weigh down" the CPU version (so it gets deprioritized).
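The `pkg=version=build` form matches the build string with a glob pattern. As a minimal illustration of how `cuda*` vs `cpu*` partitions the available builds (using Python's stdlib `fnmatch` as a stand-in for conda's matching, with made-up build strings):

```python
from fnmatch import fnmatch

# Hypothetical build strings, as might appear in `conda search pytorch -c conda-forge`
builds = ["cpu_py38h123_0", "cuda112py38h456_0", "cuda102py37h789_1"]

# pytorch=*=cuda* keeps only the CUDA builds; pytorch=*=cpu* keeps the CPU builds
cuda_builds = [b for b in builds if fnmatch(b, "cuda*")]
cpu_builds = [b for b in builds if fnmatch(b, "cpu*")]
print(cuda_builds)  # ['cuda112py38h456_0', 'cuda102py37h789_1']
print(cpu_builds)   # ['cpu_py38h123_0']
```

The middle `*` leaves the version unconstrained, which is why it can be replaced with a concrete version number without changing the build selection.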

mfansler (Member) commented

@h-vetinari I really appreciate the additional info and the work you all are doing in getting those feedstocks built! 🙏

Good to know about the CPU/GPU build strings.

@akshayka I'm closing this particular issue, since it doesn't appear that we need to do anything special on the feedstock side for CUDA use. Please reopen if you think otherwise!

akshayka (Contributor, Author) commented

Thanks so much, @h-vetinari. Selecting on the build string is such a simple solution; I should have found it myself. That will work just fine for us.

Thanks again for all your work, & for providing help to so many feedstocks.
