Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding Prefetcher into the PrototypeRS and fixing prefetcher bug #826

Closed

Conversation

VitalyFedyunin
Copy link
Contributor

@VitalyFedyunin VitalyFedyunin commented Oct 12, 2022

Fixes: #815

Stack from ghstack (oldest at bottom):

Fixes #825

Differential Revision: D40308358

VitalyFedyunin added a commit that referenced this pull request Oct 12, 2022
ghstack-source-id: b80672b44f488b02a5cf8625e2229c5ef9f25a5a
Pull Request resolved: #826
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Oct 12, 2022
@VitalyFedyunin
Copy link
Contributor Author

@VitalyFedyunin has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Copy link
Contributor

@ejguan ejguan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The culprit was the same as I was thinking. But, I do have a few comments below

Comment on lines -105 to -107

def reset_iterator(self):
self.reset()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I found the same issue as reset will be called automatically at the beginning of __iter__.

self.combined_datapipes.reset_epoch()
if self.prefetch_mainloop > 0:
# Stop prefetching first
self.combined_datapipes.reset()
Copy link
Contributor

@ejguan ejguan Oct 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since reset will be called in the begining of the __iter__ as well, I think it's better to add a different API to stop prefetching and called here.

And, reset can be the last resort to stop thread if that API is not invoked.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, we should have a graph function to stop prefetcher from the end of the pipeline to the beginning especially when multiple prefetchers are presented.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also found them annoyingly duplicating and ideally we need something like GraphStop operation to freeze all parallel work. It would be extremely useful for snapshotting. But it is out of scope for this PR (Anyway I will post separate issue to capture demand)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM! There are a few special DataPipes need to be handled specially with this GraphStop:

  • FullSync
  • Prefetcher

VitalyFedyunin added a commit that referenced this pull request Oct 18, 2022
ghstack-source-id: bab358a21b83d018a81ea3027ba41d0d568671cd
Pull Request resolved: #826
@VitalyFedyunin
Copy link
Contributor Author

@VitalyFedyunin has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@ejguan ejguan mentioned this pull request Oct 18, 2022
ejguan pushed a commit to ejguan/data that referenced this pull request Oct 21, 2022
…orch#826)

Summary: Pull Request resolved: pytorch#826

Test Plan: Imported from OSS

Reviewed By: NivekT, msaroufim

Differential Revision: D40308358

Pulled By: VitalyFedyunin

fbshipit-source-id: f4b706c29ceffc52c376e9cddfc56a75819ff1ee
ejguan pushed a commit that referenced this pull request Oct 21, 2022
Summary: Pull Request resolved: #826

Test Plan: Imported from OSS

Reviewed By: NivekT, msaroufim

Differential Revision: D40308358

Pulled By: VitalyFedyunin

fbshipit-source-id: f4b706c29ceffc52c376e9cddfc56a75819ff1ee
@facebook-github-bot facebook-github-bot deleted the gh/VitalyFedyunin/28/head branch October 22, 2022 14:20
ejguan pushed a commit to ejguan/data that referenced this pull request Oct 23, 2022
…orch#826)

Summary: Pull Request resolved: pytorch#826

Test Plan: Imported from OSS

Reviewed By: NivekT, msaroufim

Differential Revision: D40308358

Pulled By: VitalyFedyunin

fbshipit-source-id: f4b706c29ceffc52c376e9cddfc56a75819ff1ee
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants