Single node GPU training example #333
I'd like to propose renaming this section to "MNIST Classification" with a PyTorch section underneath it, "MNIST Classification with PyTorch and W&B".
As a reminder, tutorials are use-case-centric, not technology-centric; e.g., we could also write tutorials for training an MNIST classifier using any number of other ML packages.
Agree with @cosmicBboy
#342 should fix the docs build issue
df867ea to 5ca26b2
Signed-off-by: Ketan Umare <[email protected]>
Signed-off-by: Jinserk Baik <[email protected]>
Signed-off-by: Ketan Umare <[email protected]>
Signed-off-by: Ketan Umare <[email protected]>
Signed-off-by: Samhita Alla <[email protected]>
Signed-off-by: Samhita Alla <[email protected]>
Signed-off-by: Samhita Alla <[email protected]>
Signed-off-by: Samhita Alla <[email protected]>
Signed-off-by: Samhita Alla <[email protected]>
Signed-off-by: cosmicBboy <[email protected]>
Signed-off-by: cosmicBboy <[email protected]>
902dfa1 to 16b3948
@cosmicBboy, we haven't yet tested this on We may need to test this when you're working on the distributed training scenario.
@samhita-alla thanks for the heads up! @wild-endeavor is working on getting GPUs on demo.nuclyde; I can help test out the example as well.
Sure, thank you!
* Single node GPU training example Signed-off-by: Ketan Umare <[email protected]> * Minor fix related to tensorboard in PyTorch (#334) Signed-off-by: Jinserk Baik <[email protected]> * updated pytorch training example Signed-off-by: Ketan Umare <[email protected]> * updated Signed-off-by: Ketan Umare <[email protected]> * wandb integration, code lint, content Signed-off-by: Samhita Alla <[email protected]> * remove misplaced text Signed-off-by: Samhita Alla <[email protected]> * add pytorch in tests' manifest Signed-off-by: Samhita Alla <[email protected]> * changed pytorch to mnist Signed-off-by: Samhita Alla <[email protected]> * dockerfile Signed-off-by: Samhita Alla <[email protected]> * update link Signed-off-by: cosmicBboy <[email protected]> * update deps Signed-off-by: cosmicBboy <[email protected]> Co-authored-by: Jinserk Baik <[email protected]> Co-authored-by: Samhita Alla <[email protected]> Co-authored-by: cosmicBboy <[email protected]> add pytorch multi-gpu tutorial Signed-off-by: cosmicBboy <[email protected]> update pytorch tutorials Signed-off-by: cosmicBboy <[email protected]> update multi gpu example Signed-off-by: cosmicBboy <[email protected]> update multi-gpu Signed-off-by: cosmicBboy <[email protected]> multi-gpu WIP Signed-off-by: cosmicBboy <[email protected]> multi-gpu WIP Signed-off-by: cosmicBboy <[email protected]> multi-gpu WIP Signed-off-by: cosmicBboy <[email protected]> multi-gpu WIP Signed-off-by: cosmicBboy <[email protected]> multi-gpu WIP Signed-off-by: cosmicBboy <[email protected]> multi-gpu WIP Signed-off-by: cosmicBboy <[email protected]> multi-gpu WIP Signed-off-by: cosmicBboy <[email protected]> multi-gpu WIP Signed-off-by: cosmicBboy <[email protected]> multi-gpu WIP Signed-off-by: cosmicBboy <[email protected]> multi-gpu WIP Signed-off-by: cosmicBboy <[email protected]> multi-gpu WIP Signed-off-by: cosmicBboy <[email protected]> multi-gpu WIP Signed-off-by: cosmicBboy <[email protected]> multi-gpu WIP Signed-off-by: 
cosmicBboy <[email protected]> update flytekit version Signed-off-by: cosmicBboy <[email protected]> multi-gpu WIP Signed-off-by: cosmicBboy <[email protected]> multi-gpu WIP Signed-off-by: cosmicBboy <[email protected]> multi-gpu WIP Signed-off-by: Niels Bantilan <[email protected]> multi-gpu WIP Signed-off-by: Niels Bantilan <[email protected]> multi-gpu WIP Signed-off-by: Niels Bantilan <[email protected]> multi-gpu WIP Signed-off-by: Niels Bantilan <[email protected]> multi-gpu WIP Signed-off-by: Niels Bantilan <[email protected]> multi-gpu WIP Signed-off-by: Niels Bantilan <[email protected]> multi-gpu WIP Signed-off-by: Niels Bantilan <[email protected]> multi-gpu WIP Signed-off-by: Niels Bantilan <[email protected]> multi-gpu WIP Signed-off-by: Niels Bantilan <[email protected]> multi-gpu WIP Signed-off-by: Niels Bantilan <[email protected]> multi-gpu WIP Signed-off-by: Niels Bantilan <[email protected]> multi-gpu WIP Signed-off-by: Niels Bantilan <[email protected]> multi-gpu WIP Signed-off-by: Niels Bantilan <[email protected]> multi-gpu WIP Signed-off-by: Niels Bantilan <[email protected]> multi-gpu WIP Signed-off-by: Niels Bantilan <[email protected]> multi-gpu WIP Signed-off-by: Niels Bantilan <[email protected]> multi-gpu WIP Signed-off-by: Niels Bantilan <[email protected]> multi-gpu WIP Signed-off-by: Niels Bantilan <[email protected]> multi-gpu WIP Signed-off-by: Niels Bantilan <[email protected]> multi-gpu WIP Signed-off-by: Niels Bantilan <[email protected]> multi-gpu WIP Signed-off-by: Niels Bantilan <[email protected]> multi-gpu WIP Signed-off-by: Niels Bantilan <[email protected]> multi-gpu WIP Signed-off-by: Niels Bantilan <[email protected]> multi-gpu WIP Signed-off-by: Niels Bantilan <[email protected]> multi-gpu WIP Signed-off-by: Niels Bantilan <[email protected]> multi-gpu WIP Signed-off-by: Niels Bantilan <[email protected]> multi-gpu WIP Signed-off-by: Niels Bantilan <[email protected]> multi-gpu WIP Signed-off-by: Niels Bantilan 
<[email protected]> multi-gpu WIP Signed-off-by: Niels Bantilan <[email protected]> multi-gpu WIP Signed-off-by: Niels Bantilan <[email protected]>
Signed-off-by: Samhita Alla <[email protected]> protobf -> becomes -> protobuf (flyteorg#329) Signed-off-by: Bruce Arctor <[email protected]> fix dolt docs (flyteorg#327) Signed-off-by: Samhita Alla <[email protected]> reorganize sqlite3 user guide example (flyteorg#300) * reorganize sqlite3 user guide example move from extending_flyte to integrations/flytekit_plugins Signed-off-by: cosmicBboy <[email protected]> * update title Signed-off-by: cosmicBboy <[email protected]> * update dolt card text Signed-off-by: cosmicBboy <[email protected]> * add sql-alchemy Signed-off-by: Samhita Alla <[email protected]> * readme Signed-off-by: Samhita Alla <[email protected]> * lint code Signed-off-by: Samhita Alla <[email protected]> * modify sqlalchemy Signed-off-by: Samhita Alla <[email protected]> * update content Signed-off-by: Samhita Alla <[email protected]> * update example Signed-off-by: Samhita Alla <[email protected]> * sqlalchemy remote example Signed-off-by: Samhita Alla <[email protected]> * code updates Signed-off-by: Samhita Alla <[email protected]> Co-authored-by: Samhita Alla <[email protected]> Bump urllib3 in /cookbook/integrations/flytekit_plugins/dolt (flyteorg#325) Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.25.11 to 1.26.5. - [Release notes](https://github.com/urllib3/urllib3/releases) - [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst) - [Commits](urllib3/urllib3@1.25.11...1.26.5) --- updated-dependencies: - dependency-name: urllib3 dependency-type: indirect ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Example for writing queries for Athena (flyteorg#319) - flyte allows writing queries directly that are executed by the backend. 
This shows such an example Signed-off-by: Ketan Umare <[email protected]> Link athena docs from aws integrations page overview (flyteorg#338) rename control_plane section to remote_access (flyteorg#302) * rename control_plane section to remote_access - add stub pages for remote access user guide examples - clean up of named tuple outputs example - clean-up of sagemaker distributed pytorch training Signed-off-by: cosmicBboy <[email protected]> * [PR Into 302] Added documentation for running task, launchplans ,inspecting and debgging them (flyteorg#316) * Added documentation for running task, launchplans ,inspecting and debugging them Signed-off-by: Prafulla Mahindrakar <[email protected]> * Incorporated the feedback Signed-off-by: pmahindrakar-oss <[email protected]> Signed-off-by: cosmicBboy <[email protected]> * add links, formatting Signed-off-by: cosmicBboy <[email protected]> Co-authored-by: pmahindrakar-oss <[email protected]> fix sandbox start command in the cookbook (flyteorg#339) Signed-off-by: Pianist038801 <[email protected]> Co-authored-by: steven <[email protected]> Bump urllib3 from 1.25.11 to 1.26.5 in /cookbook/integrations/aws/athena (flyteorg#337) Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.25.11 to 1.26.5. - [Release notes](https://github.com/urllib3/urllib3/releases) - [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst) - [Commits](urllib3/urllib3@1.25.11...1.26.5) --- updated-dependencies: - dependency-name: urllib3 dependency-type: indirect ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Bump urllib3 in /cookbook/integrations/external_services/hive (flyteorg#336) Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.25.11 to 1.26.5. 
- [Release notes](https://github.com/urllib3/urllib3/releases) - [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst) - [Commits](urllib3/urllib3@1.25.11...1.26.5) --- updated-dependencies: - dependency-name: urllib3 dependency-type: indirect ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> update doc requirements with sphinx v4 (flyteorg#341) Signed-off-by: cosmicBboy <[email protected]> update dev requirements (flyteorg#342) * update dev requirements flyteorg#341 only updated the docs requirement, which resulted in a docs build issue https://readthedocs.org/projects/flytecookbook/builds/14327792/ Signed-off-by: cosmicBboy <[email protected]> * docs build installing deps in ci matches rtd Signed-off-by: cosmicBboy <[email protected]> Single node GPU training example (flyteorg#333) * Single node GPU training example Signed-off-by: Ketan Umare <[email protected]> * Minor fix related to tensorboard in PyTorch (flyteorg#334) Signed-off-by: Jinserk Baik <[email protected]> * updated pytorch training example Signed-off-by: Ketan Umare <[email protected]> * updated Signed-off-by: Ketan Umare <[email protected]> * wandb integration, code lint, content Signed-off-by: Samhita Alla <[email protected]> * remove misplaced text Signed-off-by: Samhita Alla <[email protected]> * add pytorch in tests' manifest Signed-off-by: Samhita Alla <[email protected]> * changed pytorch to mnist Signed-off-by: Samhita Alla <[email protected]> * dockerfile Signed-off-by: Samhita Alla <[email protected]> * update link Signed-off-by: cosmicBboy <[email protected]> * update deps Signed-off-by: cosmicBboy <[email protected]> Co-authored-by: Jinserk Baik <[email protected]> Co-authored-by: Samhita Alla <[email protected]> Co-authored-by: cosmicBboy <[email protected]> Indent list items under "Contribute to examples" section (flyteorg#346) Signed-off-by: eduardo apolinario <[email protected]> 
Papermill Tutorial (flyteorg#345) * Papermill tutorial Signed-off-by: Samhita Alla <[email protected]> * github action, tests Signed-off-by: Samhita Alla <[email protected]> * docs-related change Signed-off-by: Samhita Alla <[email protected]> * dataclass Signed-off-by: Samhita Alla <[email protected]>
* update pytorch multi-gpu example, incorporate comments @samhita-alla @kumare3
  Signed-off-by: Niels Bantilan <[email protected]>
* Apply suggestions from code review
  Co-authored-by: Samhita Alla <[email protected]>
  Signed-off-by: Niels Bantilan <[email protected]>
Co-authored-by: Samhita Alla <[email protected]>
Signed-off-by: Ketan Umare [email protected]