
PyTorch without distributed support not supported #1153

Closed
adamjstewart opened this issue Apr 16, 2023 · 2 comments
@adamjstewart
Contributor

If PyTorch isn't built with distributed support (commonly the case on macOS), many features of lightly crash with an error. For example:

lib/python3.10/site-packages/lightly/loss/ntx_ent_loss.py:164: in forward
    labels = labels + dist.rank() * batch_size

    def rank() -> int:
        """Returns the rank of the current process."""
>       return dist.get_rank() if dist.is_initialized() else 0
E       AttributeError: module 'torch.distributed' has no attribute 'is_initialized'
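For context, a minimal way to see what such a build looks like (assuming a PyTorch build without distributed support, e.g. the default macOS wheels):

    import torch.distributed as dist

    # On builds without distributed support, is_available() always exists and
    # returns False, but most other attributes (is_initialized, get_rank, ...)
    # are never defined, so touching them raises AttributeError.
    print(dist.is_available())              # False
    print(hasattr(dist, "is_initialized"))  # False -- hence the error above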

Any time torch.distributed is used, it should be wrapped in an availability check such as:

if torch.distributed.is_available():
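For example, lightly's rank() helper from the traceback above could be guarded like this (a minimal sketch, not necessarily the fix that was eventually merged):

    import torch.distributed as dist

    def rank() -> int:
        """Rank of the current process, or 0 if distributed is unavailable/uninitialized."""
        # is_available() is safe on every PyTorch build; is_initialized() and
        # get_rank() only exist when PyTorch was compiled with distributed support.
        if dist.is_available() and dist.is_initialized():
            return dist.get_rank()
        return 0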

Lightning themselves have introduced and fixed this same bug several times.

@guarin
Contributor

guarin commented Apr 17, 2023

Ah, thanks for bringing this up! We use dist in multiple places and should definitely check for it.

@guarin
Contributor

guarin commented May 8, 2023

This should be fixed with #1180.

@guarin guarin closed this as completed May 8, 2023