adds a dask_on_ray tutorial #225
@@ -0,0 +1,115 @@
{
At least for me, trying to recreate this workflow, this does not produce anything. Maybe Kostya can say if the same happens for him?
I see 127.0.0.1:8265
It also doesn't seem to produce anything in the readthedocs build
@@ -0,0 +1,115 @@
{
I believe it would be valuable to add a sentence or two here about the use of the explicit option `use_map=False`.
Would adding a comment inline be sufficient? We have a section on this in the docs here: https://tape.readthedocs.io/en/latest/tutorials/scaling_to_large_data.html#Data-Partitioning-and-Parallelization
An inline comment and/or a mention that it is elaborated on elsewhere would be enough.
Inline comment was added
@@ -0,0 +1,115 @@
{
Since this tutorial is short, there is room to add a comparison between Dask and Ray. Something like this:

One cell:

    %%time
    ens = Ensemble()
    ens.from_dataset("s82_qso")
    ens._source = ens._source.repartition(npartitions=10)
    ens.batch(calc_sf2, use_map=False)

Next cell:

    %%time
    ens = Ensemble(client=False)
    ens.from_dataset("s82_qso")
    ens._source = ens._source.repartition(npartitions=10)
    ens.batch(calc_sf2, use_map=False)
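
Outside a notebook, where the `%%time` cell magic is not available, a similar comparison could be sketched with `time.perf_counter` from the standard library. This is only an illustration under stated assumptions: `time_call`, `run_with_default_client`, and `run_without_client` are hypothetical names, and the two workloads are plain-Python placeholders that do not call TAPE, Dask, or Ray. Note also that `%%time` times a single run, whereas this sketch reports the best wall-clock time over a few repeats.

```python
import time


def time_call(label, func, repeats=3):
    """Call func `repeats` times and return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        best = min(best, time.perf_counter() - start)
    print(f"{label}: {best:.4f} s")
    return best


# Hypothetical stand-ins for the two notebook cells above. In the tutorial,
# each body would build an Ensemble, load the dataset, repartition, and batch.
def run_with_default_client():
    sum(i * i for i in range(200_000))


def run_without_client():
    sum(i * i for i in range(100_000))


timings = {
    "Ensemble()": time_call("Ensemble()", run_with_default_client),
    "Ensemble(client=False)": time_call("Ensemble(client=False)", run_without_client),
}
```

Replacing the placeholder bodies with the actual cell contents would give two directly comparable numbers, although a single small dataset is unlikely to be a representative benchmark.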
It is great, thank you!
Codecov Report
Patch and project coverage have no change.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #225   +/-   ##
=======================================
  Coverage   92.57%   92.57%
=======================================
  Files          22       22
  Lines        1132     1132
=======================================
  Hits         1048     1048
  Misses         84       84

☔ View full report in Codecov by Sentry.
Adds a tutorial on using dask_on_ray with the Ensemble. No code changes are needed to TAPE itself to work with this.