make the split_by_worker and split_by_rank optional #140

Merged
merged 1 commit into from
Oct 23, 2023

Conversation

AlirezaSohofi
Contributor

@AlirezaSohofi AlirezaSohofi commented Aug 16, 2023

Description

Requirement: merge multiple streams in a single driver while ensuring that data splitting happens at the shard-key level. In this case, the user is responsible for applying the correct splitting logic to each stream; the driver that includes these streams cannot verify that this has been done. Providing a way to bypass this safety mechanism lets power users build more powerful pipelines.
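To illustrate the situation, here is a minimal, self-contained sketch. It does not use the library's real classes: `ToyStream`, `merged_stream`, and the flags `check_rank_split` / `check_worker_split` are hypothetical names chosen for this example only. It shows two sub-streams that each split their shard keys, a merged outer stream that has no record of those splits, and a safety check that has to be bypassed explicitly.

```python
from typing import FrozenSet, Iterator, List, Sequence


def split_keys(keys: Sequence[str], index: int, count: int) -> List[str]:
    """Keep every count-th shard key, starting at position index."""
    return [key for i, key in enumerate(keys) if i % count == index]


class ToyStream:
    """Hypothetical stand-in for a stream; not the library's real class."""

    def __init__(self, keys: Sequence[str], applied: FrozenSet[str] = frozenset()):
        self.keys = list(keys)
        self.applied = applied  # which split steps this stream has recorded

    def split_by_rank(self, rank: int, world_size: int) -> "ToyStream":
        return ToyStream(split_keys(self.keys, rank, world_size), self.applied | {"rank"})

    def split_by_worker(self, worker: int, num_workers: int) -> "ToyStream":
        return ToyStream(split_keys(self.keys, worker, num_workers), self.applied | {"worker"})

    def to_iterable(
        self, check_rank_split: bool = True, check_worker_split: bool = True
    ) -> Iterator[str]:
        # Safety check: refuse to iterate unless both splits were recorded,
        # or the caller explicitly opts out (flag names are assumptions).
        if check_rank_split and "rank" not in self.applied:
            raise ValueError("split_by_rank was not applied to this stream")
        if check_worker_split and "worker" not in self.applied:
            raise ValueError("split_by_worker was not applied to this stream")
        return iter(self.keys)


def merged_stream(rank: int, world_size: int, worker: int, num_workers: int) -> ToyStream:
    """Merge two sub-streams that were each split at the shard-key level."""
    a = ToyStream(["a0", "a1", "a2", "a3"])
    b = ToyStream(["b0", "b1", "b2", "b3"])
    a = a.split_by_rank(rank, world_size).split_by_worker(worker, num_workers)
    b = b.split_by_rank(rank, world_size).split_by_worker(worker, num_workers)
    # The outer stream is new, so it carries no record of the inner splits.
    return ToyStream(a.keys + b.keys)


merged = merged_stream(rank=0, world_size=2, worker=0, num_workers=2)
try:
    merged.to_iterable()  # default checks reject the merged stream
except ValueError as err:
    print("rejected:", err)

# Opting out of both checks lets the correctly split data through.
print(list(merged.to_iterable(check_rank_split=False, check_worker_split=False)))
```

In this sketch the outer stream cannot tell that each inner stream was already split, which is why the check needs to be optional rather than removed: the default stays safe, and only callers who handle splitting themselves turn it off.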

Fixes # issue

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring including code style reformatting
  • Other (please describe):

Checklist:

  • I have read the contributing guideline doc (external contributors only)
  • Lint and unit tests pass locally with my changes
  • I have kept the PR small so that it can be easily reviewed
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • All dependency changes have been reflected in the pip requirement files.

pzdkn
pzdkn previously approved these changes Aug 29, 2023
Contributor

@pzdkn pzdkn left a comment

Looks solid, but I am not exactly sure which problem this is solving. Could you include an example of combining sources in a driver in the docs and mention why this configuration is necessary?

pzdkn
pzdkn previously approved these changes Sep 15, 2023
Contributor

@pzdkn pzdkn left a comment

Thanks Ali, Looks good!

pzdkn
pzdkn previously approved these changes Sep 15, 2023
Contributor

@pzdkn pzdkn left a comment

lgtm

@github-actions

This PR is marked as stale because it has been inactive for 30 days. It will be closed in 7 days.

Contributor

@pzdkn pzdkn left a comment

thks, lgtm!

@AlirezaSohofi AlirezaSohofi merged commit 5fa094d into main Oct 23, 2023
4 checks passed
@AlirezaSohofi AlirezaSohofi deleted the optional_pytorch_split_checks branch October 23, 2023 13:26
@github-actions github-actions bot locked and limited conversation to collaborators Oct 23, 2023
2 participants