
PyTorch without distributed support not supported #1153

Closed
adamjstewart opened this issue Apr 16, 2023 · 2 comments
@adamjstewart
Contributor

If PyTorch isn't built with distributed support (commonly the case on macOS), many features of lightly crash with an error. For example:

lib/python3.10/site-packages/lightly/loss/ntx_ent_loss.py:164: in forward
    labels = labels + dist.rank() * batch_size

    def rank() -> int:
        """Returns the rank of the current process."""
>       return dist.get_rank() if dist.is_initialized() else 0
E       AttributeError: module 'torch.distributed' has no attribute 'is_initialized'
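For context, a minimal way to see what such a build looks like (assuming a PyTorch build without distributed support, e.g. the default macOS wheels):

    import torch.distributed as dist

    # On builds without distributed support, is_available() always exists and
    # returns False, but most other attributes (is_initialized, get_rank, ...)
    # are never defined, so touching them raises AttributeError.
    print(dist.is_available())              # False
    print(hasattr(dist, "is_initialized"))  # False -- hence the error above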

Any time torch.distributed is used, it should be wrapped in an availability check such as:

if torch.distributed.is_available():
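For example, lightly's rank() helper from the traceback above could be guarded like this (a minimal sketch, not necessarily the fix that was eventually merged):

    import torch.distributed as dist

    def rank() -> int:
        """Rank of the current process, or 0 if distributed is unavailable/uninitialized."""
        # is_available() is safe on every PyTorch build; is_initialized() and
        # get_rank() only exist when PyTorch was compiled with distributed support.
        if dist.is_available() and dist.is_initialized():
            return dist.get_rank()
        return 0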

Lightning themselves have introduced and fixed this same bug several times.

@guarin
Contributor

guarin commented Apr 17, 2023

Ah, thanks for bringing this up! We use dist in multiple places and should definitely check for it.

@guarin
Contributor

guarin commented May 8, 2023

This should be fixed with #1180.

@guarin guarin closed this as completed May 8, 2023