Improve parallelisation, add clip_models
functionality
#25
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New features
eo_tides.utils.clip_models
function for clipping tide models to a smaller spatial extent. This can have a major positive impact on performance, sometimes producing more than a 10 x speedup. This function identifies all NetCDF-format tide models in a given input directory, including "ATLAS-netcdf" (e.g.TPXO9-atlas-nc
), "FES-netcdf" (e.g.FES2022
,EOT20
), and "GOT-netcdf" (e.g.GOT5.5
) format files. Files for each model are then clipped to the extent of the provided bounding box, handling model-specific file structures. After each model is clipped, the result is exported to the output directory and verified withpyTMD
to ensure the clipped data is suitable for tide modelling.Major changes
parallel_splits
parameter that controls the number of chunks data is broken into for parallel analysis has been refactored to use a new default of "auto". This now attempts to automatically determine a sensible value based on available CPU, number of points, and number of models being run. All CPUs will be used where possible, unless this will produce splits with less than 1000 points in each (which would increase overhead). Parallel splits will be reduced if multiple models are requested, as these are run in parallel too and will compete for the same resources.method
from "spline" to "linear". This appears to produce the same results, but works considerably faster.Breaking changes
list_models
function has been relocated toeo_tides.utils
(fromeo_tides.model
)