Improve parallelisation, add clip_models functionality #25

Merged
merged 7 commits into from
Nov 3, 2024

robbibt (Member) commented Oct 31, 2024

New features

  • Added a new eo_tides.utils.clip_models function for clipping tide models to a smaller spatial extent. This can substantially improve performance, sometimes producing more than a 10x speedup. The function identifies all NetCDF-format tide models in a given input directory, including "ATLAS-netcdf" (e.g. TPXO9-atlas-nc), "FES-netcdf" (e.g. FES2022, EOT20), and "GOT-netcdf" (e.g. GOT5.5) format files. Files for each model are then clipped to the extent of the provided bounding box, handling model-specific file structures. After each model is clipped, the result is exported to the output directory and verified with pyTMD to ensure the clipped data is suitable for tide modelling.
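
To illustrate the core idea of clipping gridded data to a bounding box, here is a minimal standard-library sketch. It is not the eo_tides implementation: the names clip_axis and clip_grid are hypothetical, the (left, bottom, right, top) bounding box order is an assumption, and the sketch assumes ascending coordinate axes. It does not deal with model-specific file structures or differing longitude conventions, which the description above says the real function handles.

```python
def clip_axis(coords, lower, upper):
    """Return (start, stop) indices of the coords that fall within [lower, upper].

    Assumes `coords` is sorted in ascending order (an assumption of this sketch).
    """
    inside = [i for i, c in enumerate(coords) if lower <= c <= upper]
    if not inside:
        raise ValueError("bounding box does not overlap this axis")
    return inside[0], inside[-1] + 1


def clip_grid(lons, lats, values, bbox):
    """Clip a 2D grid (values[y][x]) to a hypothetical (left, bottom, right, top) bbox."""
    left, bottom, right, top = bbox
    x0, x1 = clip_axis(lons, left, right)
    y0, y1 = clip_axis(lats, bottom, top)
    # Keep only the rows and columns that fall inside the bounding box
    clipped = [row[x0:x1] for row in values[y0:y1]]
    return lons[x0:x1], lats[y0:y1], clipped


if __name__ == "__main__":
    lons = [float(x) for x in range(10)]
    lats = [float(y) for y in range(-5, 1)]
    values = [[y * 10 + x for x in range(len(lons))] for y in range(len(lats))]
    print(clip_grid(lons, lats, values, bbox=(2, -3, 4, -1)))
```

Because the clipped grid contains far fewer cells than the original, any later step that reads or interpolates the data has less work to do, which is the intuition behind the speedup reported above.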

Major changes

  • The parallel_splits parameter, which controls the number of chunks the data is broken into for parallel analysis, has been refactored to use a new default of "auto". This attempts to automatically determine a sensible value based on the available CPUs, the number of points, and the number of models being run. All CPUs will be used where possible, unless this would produce splits with fewer than 1000 points each (which would increase overhead). Parallel splits are reduced if multiple models are requested, as models are also run in parallel and would compete for the same resources.
  • Changed the default interpolation method from "spline" to "linear". This appears to produce the same results, but runs considerably faster.
  • Updates to enable correct cropping, pending upcoming fixes in pyTMD v2.1.8.
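
The "auto" behaviour for parallel_splits described above could be sketched roughly as follows. This is an illustrative heuristic only, not the code from this pull request: the function name estimate_parallel_splits and its parameters are hypothetical, and the even sharing of CPUs between models is an assumption.

```python
import os


def estimate_parallel_splits(n_points, n_models, cpu_count=None, min_points_per_split=1000):
    """Hypothetical heuristic for choosing a number of parallel splits."""
    if cpu_count is None:
        # os.cpu_count() can return None, so fall back to a single CPU
        cpu_count = os.cpu_count() or 1

    # Models are run in parallel too, so share the CPUs between them
    cpus_per_model = max(1, cpu_count // max(1, n_models))

    # Avoid splits with fewer than the minimum number of points each
    max_useful_splits = max(1, n_points // min_points_per_split)

    return min(cpus_per_model, max_useful_splits)


if __name__ == "__main__":
    print(estimate_parallel_splits(100_000, 1, cpu_count=8))  # 8
    print(estimate_parallel_splits(2_500, 1, cpu_count=8))    # 2
    print(estimate_parallel_splits(100_000, 4, cpu_count=8))  # 2
```

With 8 CPUs, a large single-model run uses all 8 splits, a 2,500-point run is limited to 2 splits so each keeps at least 1000 points, and a four-model run drops to 2 splits per model.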

Breaking changes

  • The list_models function has moved from eo_tides.model to eo_tides.utils. Existing imports from eo_tides.model need to be updated.

codecov-commenter commented Oct 31, 2024

Codecov Report

Attention: Patch coverage is 59.89305% with 75 lines in your changes missing coverage. Please review.

Project coverage is 76.2%. Comparing base (7cfeac2) to head (9c90a0b).

Files with missing lines   Patch %   Lines
eo_tides/utils.py          49.6%     61 Missing and 3 partials ⚠️
eo_tides/model.py          79.6%     6 Missing and 5 partials ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##            main     #25     +/-   ##
=======================================
- Coverage   83.7%   76.2%   -7.5%     
=======================================
  Files          6       6             
  Lines        540     622     +82     
  Branches      91     107     +16     
=======================================
+ Hits         452     474     +22     
- Misses        49     106     +57     
- Partials      39      42      +3     

robbibt merged commit 0b69bfd into main on Nov 3, 2024
11 checks passed
robbibt changed the title from "Improve parallelisation" to "Improve parallelisation, add clip_models functionality" on Nov 3, 2024
robbibt deleted the improve_parallel branch on November 14, 2024 23:54