Throw error when torch iterable was not split by rank or split by worker #107
Conversation
Looks good in general, left some minor comments
        yield from self.source

    def _check_for_rank_split(self, source: Composable) -> bool:
Here we should check whether we are actually in a multi-rank environment; if not, this check should be skipped. It would be good to add a test for this case too: we are not in a multi-rank / multi-worker environment and to_torch_iterable is called. A sketch of such a test follows.
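A minimal sketch of such a test, assuming pytest's monkeypatch fixture; the pipeline method names are taken from this PR, while SomeSource and the test data are hypothetical:

import torch.distributed as dist


def test_to_torch_iterable_outside_multi_env(monkeypatch):
    # Simulate a single-process run: torch.distributed is not initialized.
    monkeypatch.setattr(dist, "is_initialized", lambda: False)

    it = (
        SomeSource([1, 2, 3])  # hypothetical source composable
        .to_torch_iterable()   # no split_by_rank_pytorch / split_by_worker_pytorch
    )
    # Outside a multi-rank / multi-worker environment the checks should be
    # skipped, so iteration succeeds and yields every item.
    assert list(it) == [1, 2, 3]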
        else:
            return self._check_for_rank_split(source.source)

    def _check_for_worker_split(self, source: Composable) -> bool:
Same as above: this check should also be skipped when we are not in a multi-worker environment. A sketch of the full recursive pattern follows.
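From the fragments above, the check appears to walk the chain of wrapped sources recursively. A sketch of the full pattern; only the recursive else branch is visible in the diff, so the SplitByRankPytorch class name and the hasattr base case are assumptions:

    def _check_for_rank_split(self, source: Composable) -> bool:
        # Walk the chain of wrapped sources until the rank-splitting step
        # is found or there are no more wrappers to unwrap.
        if isinstance(source, SplitByRankPytorch):  # hypothetical step class
            return True
        elif not hasattr(source, "source"):
            return False
        else:
            return self._check_for_rank_split(source.source)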
@@ -66,8 +66,36 @@ def __init__(self) -> None:

    def __iter__(self) -> Iterator:
        """Method to iterate over the source"""
        if not self._check_for_rank_split(self.source):
            raise ValueError(
We can have a custom exception for this, e.g. PytorchSplittingError; a sketch follows.
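A minimal sketch of such an exception; the docstring wording is an assumption:

class PytorchSplittingError(Exception):
    """Raised when a torch iterable is consumed without the required
    split_by_rank_pytorch / split_by_worker_pytorch step."""

The raise ValueError(...) above would then become raise PytorchSplittingError(...) with the same message.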
    .split_by_worker_pytorch()
    .to_torch_iterable()
)
next(iter(it))
Please check the length and the items that we get, not just that iteration starts.
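For example, the test could collect everything and assert on it; the expected values below are placeholders, not from the PR:

items = list(it)
assert len(items) == expected_length  # placeholder: number of items the pipeline should yield
assert items == expected_items        # placeholder: the expected item values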
Force-pushed from 48462e9 to c028ecc.
LG, thanks a lot, left a few minor comments. Please don't forget to bump the version.
@@ -66,8 +68,40 @@ def __init__(self) -> None:

    def __iter__(self) -> Iterator:
        """Method to iterate over the source"""
        if _in_multi_rank_env():
            warnings.warn("In multi rank environment")
We don't need a warning here.
"Add a 'split_by_rank_pytorch' call to your composable to avoid this error. " | ||
) | ||
if _in_multi_worker_env(): | ||
warnings.warn("In mulit worker environment") |
We don't need a warning here either; raising the error is enough. A sketch combining both comments follows.
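Putting the two comments together, a sketch of __iter__ with the warnings removed; the method and helper names are from the diff, while the worker-side message is an assumption mirroring the rank-side one:

    def __iter__(self) -> Iterator:
        """Method to iterate over the source."""
        if _in_multi_rank_env() and not self._check_for_rank_split(self.source):
            raise PytorchSplittingError(
                "Add a 'split_by_rank_pytorch' call to your composable to avoid this error."
            )
        if _in_multi_worker_env() and not self._check_for_worker_split(self.source):
            raise PytorchSplittingError(
                "Add a 'split_by_worker_pytorch' call to your composable to avoid this error."
            )
        yield from self.source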
    if torch.distributed.is_available() and torch.distributed.is_initialized():
        group = torch.distributed.group.WORLD
        size = torch.distributed.get_world_size(group=group)
    return True if torch.distributed.is_available() and size > 1 else False
torch.distributed.is_available() is already checked on line 139, so the second check in the return is redundant.
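Applying the comment, a sketch of the helper without the duplicated check; the function name _in_multi_rank_env is taken from the diff above, and folding the return into the if branch (which also avoids size being unbound when torch.distributed is not initialized) is an assumption:

import torch.distributed


def _in_multi_rank_env() -> bool:
    # Availability and initialization are checked once up front;
    # get_world_size is only called when a process group exists.
    if torch.distributed.is_available() and torch.distributed.is_initialized():
        group = torch.distributed.group.WORLD
        return torch.distributed.get_world_size(group=group) > 1
    return False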
Force-pushed from 3936284 to ccfcdc8.
LGTM 👍
Please don't forget to rebase onto main and bump the version.