Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor Pytorch dataset #202

Merged

Conversation

jhamman
Copy link
Contributor

@jhamman jhamman commented Feb 1, 2024

Description of proposed changes

This PR introduces a number of changes to xbatcher's pytorch MapDataset class. These changes came out of the learnings while preparing my AMS talk (slides and blog post coming soon). The changes here allow for more control over how w

  • includes concurrent loading of x/y batches with dask.compute()
  • transforms now are expected to take an xarray object and return a torch.Tensor

transforms

  - includes concurrent loading of x/y batches with dask.compute()
  - transforms now are expected to return a torch.Tensor
Copy link

codecov bot commented Feb 1, 2024

Codecov Report

Attention: Patch coverage is 89.65517% with 3 lines in your changes missing coverage. Please review.

Project coverage is 97.44%. Comparing base (43c9135) to head (99b1fab).
Report is 26 commits behind head on main.

Files with missing lines Patch % Lines
xbatcher/loaders/torch.py 89.65% 1 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #202      +/-   ##
==========================================
+ Coverage   96.25%   97.44%   +1.19%     
==========================================
  Files           6        6              
  Lines         347      392      +45     
  Branches       82       91       +9     
==========================================
+ Hits          334      382      +48     
+ Misses          8        5       -3     
  Partials        5        5              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@maxrjones
Copy link
Member

maxrjones commented Feb 5, 2024

Thanks for contributing these improvements @jhamman! It would be good to hit the test coverage target, so do you mind if I push additional tests as part of the review?

@maxrjones
Copy link
Member

Sorry that I did not have time to review this yet! I am about to head out of office but would be glad to work on this with you after returning on March 5, or would welcome other reviewers in the meantime.

@andersy005
Copy link
Member

pre-commit.ci autofix

@andersy005
Copy link
Member

@jhamman, i pushed additional tests/fixes to your branch.. i hope you don';t mind :)

@maxrjones, curious to hear your feedback on this before I merge it into main when you get a chance

@maxrjones
Copy link
Member

@weiji14 would you have any interest in reviewing this PR? I can if you don't have time or interest, I just expect that your expertise would be particularly beneficial on the PyTorch dataset revisions. It'd also be helpful to know if we need to expect downstream implications for zen3geo.

@weiji14
Copy link
Member

weiji14 commented Aug 16, 2024

I'll be travelling most of the next two weeks (about to head to the airport actually), so probably won't get to this until September. If you can do the review @maxrjones, that would be great!

@maxrjones
Copy link
Member

I'll be travelling most of the next two weeks (about to head to the airport actually), so probably won't get to this until September. If you can do the review @maxrjones, that would be great!

Sounds good, safe travels! @andersy005 I'll block some time to look at this on Friday.

@maxrjones
Copy link
Member

@andersy005 I'll block some time to look at this on Friday.

I've had some other work come up today so unfortunately won't get a chance to review. I'll send an updated timeline next week, but feel welcome to proceed without me if you prefer.

@andersy005
Copy link
Member

this can definitely wait. so, no rush/worries @maxrjones

@andersy005
Copy link
Member

i spoke with @maxrjones earlier this week and it's my understanding that we can merge this as-is and cut a new release, making the updated feature available to users. we can revisit any potential issues or missing elements if needed.

@jhamman, thank you for putting this together!

@andersy005 andersy005 merged commit dcd09a0 into xarray-contrib:main Sep 13, 2024
9 of 10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants