
Update tests that mock cuda availability and extend it for mps #14012

Closed
awaelchli opened this issue Aug 3, 2022 · 3 comments · Fixed by #14708

awaelchli commented Aug 3, 2022

Proposed refactor

We have tests that fail locally on machines with MPS support because their mocking of CUDA does not extend to MPS. Now that #13947 has landed, the MPS availability check actually works, so several tests need to be updated. These are mainly the tests that:

  • use accelerator="gpu" (which includes both CUDA and MPS), and
  • mock functions such as torch.cuda.is_available to return True or torch.cuda.device_count to return a value > 0.

An example of such a test:

https://github.com/Lightning-AI/lightning/blob/e6a8283e9cd9df53fb661c64bbf2037e1391a16d/tests/tests_pytorch/trainer/connectors/test_accelerator_connector.py#L245-L262
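
For illustration, here is a minimal sketch of the pattern described above (it is not the linked test verbatim; the mock targets, the devices=2 setting, and the num_devices assertion are assumptions for the sake of the example). On an Apple Silicon machine nothing pins down MPS, so accelerator="gpu" can resolve to MPS and the CUDA-only mocks no longer cover the availability check:

```python
from unittest import mock

from pytorch_lightning import Trainer


# Only CUDA is mocked; on a machine with MPS, accelerator="gpu" may resolve
# to MPS instead of CUDA, so assertions written with CUDA in mind can fail.
@mock.patch("torch.cuda.is_available", return_value=True)
@mock.patch("torch.cuda.device_count", return_value=2)
def test_accelerator_gpu_with_mocked_cuda(*_):
    trainer = Trainer(accelerator="gpu", devices=2)
    assert trainer.num_devices == 2
```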

Motivation

Let MPS users (like me) run the tests locally.

Pitch

Extend the mocking to MPS where applicable. In some tests it may make sense to test only CUDA, in which case we should change accelerator="gpu" to accelerator="cuda". Other tests may need to be skipped entirely on MPS. A sketch of one possible direction follows below.
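
As a rough sketch of what "extend the mocking to MPS" could look like (the actual fix may patch Lightning's accelerator classes or use shared fixtures instead; the torch.backends.mps target assumes torch >= 1.12, where that module exists), pinning the MPS check makes the test deterministic on any host:

```python
from unittest import mock

from pytorch_lightning import Trainer


@mock.patch("torch.cuda.is_available", return_value=True)
@mock.patch("torch.cuda.device_count", return_value=2)
# Assumption: torch >= 1.12, so torch.backends.mps.is_available exists to patch.
@mock.patch("torch.backends.mps.is_available", return_value=False)
def test_accelerator_gpu_resolves_to_cuda(*_):
    # With MPS pinned to unavailable, "gpu" resolves to CUDA everywhere,
    # including on Apple Silicon machines.
    trainer = Trainer(accelerator="gpu", devices=2)
    assert trainer.num_devices == 2
```

The mirror-image variant (pinning MPS to available in order to exercise the MPS branch on non-Apple hardware) would likely need more than a single patch, since the MPS availability check can also inspect the platform.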

Note that simply slapping a RunIf(min_cuda=x) on top of these tests is not an option: for most of them we want to run with mocks on the CPU.

Additional context

I'd rather do this sooner than later. Right now, I can't tell whether tests are failing because of bugs in my branch or because of this issue.


cc @Borda @akihironitta @justusschock

@awaelchli awaelchli added the needs triage, tests, and accelerator: mps labels and removed the needs triage label Aug 3, 2022
@awaelchli awaelchli added this to the pl:1.8 milestone Aug 3, 2022
@awaelchli awaelchli added the help wanted label Aug 3, 2022

jxtngx commented Aug 9, 2022

I'd like to help with this; I'm on an M1-series Mac and have some experience writing tests (for Flash).

@awaelchli

Thanks @JustinGoheen. Let me know if you have any questions.

@jxtngx jxtngx removed their assignment Aug 16, 2022

jxtngx commented Aug 16, 2022

Apologies, I'm no longer able to work on this.
