Reusable SpatioTemporal functions #40
Merged
Conversation
Found a better name than geo.py - spatiotemporal.py! Because time matters. Also renamed the BBox class to Region, as the double capital letters in a row didn't look Pythonic. The Region.subset function has been revamped to be much more user-friendly, returning the actual subsetted xr.Dataset instead of just a boolean array.
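The revamped subset could look roughly like this minimal sketch. Note that the Region fields and the `x_dim`/`y_dim` parameter names here are assumptions for illustration, not the actual deepicedrain API:

```python
import xarray as xr
from dataclasses import dataclass


@dataclass
class Region:
    """A rectangular bounding region (renamed from BBox)."""

    name: str
    xmin: float
    xmax: float
    ymin: float
    ymax: float

    def subset(self, ds: xr.Dataset, x_dim: str = "x", y_dim: str = "y") -> xr.Dataset:
        """Return the points of ds falling inside this region as an
        xr.Dataset, rather than just the boolean mask."""
        cond = (
            (ds[x_dim] >= self.xmin)
            & (ds[x_dim] <= self.xmax)
            & (ds[y_dim] >= self.ymin)
            & (ds[y_dim] <= self.ymax)
        )
        # drop=True removes the points outside the region entirely
        return ds.where(cond=cond, drop=True)
```

Returning the subsetted Dataset (instead of the mask) means callers can chain straight into further xarray operations.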
Enable importing of the ATLAS intake catalog straight from deepicedrain! This functions almost like a test fixture, letting us load ICESat-2 data easily in our scripts, i.e. keeping things DRY. Managed to get rid of the pytest fixture in test_calculate_delta.py which previously did the sample data loading from the catalog. Renamed the very generic catalog.yaml to the slightly less generic atlas_catalog.yaml. Added some description metadata to that catalog file, and included the nested atl11_test_case entry. Also ignoring .h5 data files now.
So that calling `deepicedrain.catalog` will actually work when people `pip install deepicedrain` without cloning the git repository (otherwise a FileNotFoundError is raised). Added a plugin to pyproject.toml so that the ATLAS catalog can also be loaded via intake through `intake.cat.atlas_cat`! Done by moving atlas_catalog.yaml and tests/ into the deepicedrain folder. This relies on a bit of magic (the good kind), using the Python 3.7+ importlib.resources module to locate the atlas_catalog.yaml file via a relative path where the package is installed (in site-packages). Basically following a modified version of https://github.com/intake/intake-examples/tree/04bbe1880f2a4d2c74c6ea9c54385c380c1b9a1e/data_package. Had to use {{ CATALOG_DIR }} to link to the test_catalog.yaml file too. A side effect is that we're bundling all our tests into the `deepicedrain` python package, which might be bad for file size but good for finding test examples, I guess.
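The importlib.resources lookup boils down to something like the sketch below. The helper name is hypothetical, and it is demonstrated against a stdlib package, since deepicedrain's atlas_catalog.yaml only exists where that package is installed:

```python
import importlib.resources


def packaged_data_path(package: str, resource: str) -> str:
    """Locate a data file bundled inside an installed package (Python 3.7+).

    The path resolves relative to wherever the package lives (e.g.
    site-packages), so it works without a cloned git repository.
    """
    # importlib.resources.path yields a context-managed pathlib.Path
    with importlib.resources.path(package, resource) as p:
        return str(p)
```

Once the YAML file is located this way, intake can expand `{{ CATALOG_DIR }}` entries relative to it, which is how the nested test_catalog.yaml reference keeps working after installation.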
Ensure that the ATLAS intake catalog is able to be loaded, and secretly document its usage a little bit. Make the GitHub Actions test pass by busting the poetry cache through bumping json5 from 0.9.4 to 0.9.5.
weiji14 force-pushed the geo_to_spatiotemporal branch from ea8e8c6 to 3c3d021 on May 29, 2020 01:37
Turn the ICESat-2 delta_time to utc_time conversion code in our jupyter notebook into a well tested function! The cool bit is that we can pass in either a dask- or numpy-backed xarray.DataArray and get the equivalent output, with dimensions and coordinates preserved! Gotta love [NEP18](https://numpy.org/neps/nep-0018-array-function-protocol.html). Added a chunks statement to test_catalog.yaml, and ensured the file is cached in a relative path. Had to make sure to read the test atl11_dataset using dask and close it properly after each test, or subsequent tests would fail, seeing a numpy.array instead of a dask.array. Should do proper setup/teardown next time. Also bumping cftime from 1.1.1.2 to 1.1.3 and certifi from 2019.11.28 to 2020.4.5.1 to bust the CI cache, just in case.
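The conversion itself is essentially one broadcasted addition. A hedged sketch follows; the default epoch shown (the ICESat-2 ATLAS Standard Data Product epoch of 2018-01-01) and the exact signature are my assumptions of what the function looks like:

```python
import numpy as np
import xarray as xr


def deltatime_to_utctime(
    dataarray: xr.DataArray,
    start_epoch: np.datetime64 = np.datetime64("2018-01-01T00:00:00.000000"),
) -> xr.DataArray:
    """Convert ICESat-2 delta_time (timedelta64 since the ATLAS SDP epoch)
    into UTC datetime64 values.

    Thanks to NEP18 dispatch, the addition behaves identically whether the
    DataArray wraps a numpy or a dask array, and xarray preserves the
    dimensions and coordinates of the input.
    """
    return start_epoch + dataarray
```

Because nothing here touches `.values`, a dask-backed input stays lazy until the caller computes it.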
weiji14 force-pushed the geo_to_spatiotemporal branch from 2551cdb to 9956922 on May 30, 2020 00:05
weiji14 force-pushed the geo_to_spatiotemporal branch from 4b64234 to 88a7553 on May 30, 2020 03:24
Collapse the geographic reprojection code into a one-liner! Basically wraps around pyproj, and handles lazy dask.DataFrame and xarray.DataArray objects by including the yet-to-be-released workaround for handling __array__ objects (scheduled for pyproj 3.0). Reinstated the 'catalog' variable in atl06_play.ipynb, as it's used further down the notebook. Also hashing python files in deepicedrain to check whether we should bust the CI cache to reinstall `deepicedrain`, instead of manually bumping dependencies each time. That said, we'll bump pyzmq from 19.0.0 to 19.0.1 and keep doing random bumps until this branch is merged into master.
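A sketch of what such a one-liner wrapper could look like, assuming output in Antarctic Polar Stereographic (EPSG:3031); the function name matches the PR discussion, but the exact signature and target CRS are assumptions:

```python
import numpy as np
from pyproj import Transformer


def lonlat_to_xy(longitude, latitude):
    """Reproject longitude/latitude (EPSG:4326) to Antarctic Polar
    Stereographic (EPSG:3031) x/y coordinates in one line.

    np.asarray coerces array-likes up front; with pyproj 3.0's __array__
    handling, lazy dask- or xarray-backed inputs can be passed directly.
    """
    return Transformer.from_crs(
        crs_from="EPSG:4326", crs_to="EPSG:3031", always_xy=True
    ).transform(xx=np.asarray(longitude), yy=np.asarray(latitude))
```

`always_xy=True` keeps the longitude-first axis order regardless of the CRS's declared axis convention, which avoids a classic pyproj 2+ gotcha.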
weiji14 force-pushed the geo_to_spatiotemporal branch from 88a7553 to be98dde on May 30, 2020 03:54
Provide an example of using `deepicedrain` on the main README.md page. Added a YUML diagram showing how data flows from ATL06 to ATL11. Bumped pyparsing from 2.4.6 to 2.4.7 for good measure to bust the CI cache. Also listed a few related ICESat-2 projects on GitHub.
weiji14 added a commit that referenced this pull request on Jun 3, 2020
Patches #40. The deltatime_to_utctime converter didn't handle pandas.Series properly, as the start_epoch variable would have an index of 0, and the datetime + timedelta operation would only get applied at index 0 instead of along the whole column. Calling squeeze() converts the pandas.Series to a pandas.Timestamp, so that the addition operation is broadcast to the whole column. This also works on an xarray.DataArray and numpy.array. Doesn't work for a dask.Series, but we can work that out when the need arises.
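The index-alignment pitfall and the squeeze() fix can be seen in a few lines (synthetic values, not actual ICESat-2 data):

```python
import pandas as pd

# A 1-element Series, as start_epoch ends up being, with index 0
start_epoch = pd.Series([pd.Timestamp("2018-01-01")])
delta_time = pd.Series(pd.to_timedelta([0, 60], unit="s"))

# Buggy: Series + Series aligns on index labels, so only index 0
# gets a real value and every other row becomes NaT
buggy = start_epoch + delta_time

# Fixed: squeeze() collapses the 1-element Series to a scalar
# pd.Timestamp, which then broadcasts across the whole column
fixed = start_epoch.squeeze() + delta_time
```

The same scalar + array addition also broadcasts over an xarray.DataArray or numpy.array, which is why the one fix covers all three cases.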
weiji14 added a commit that referenced this pull request on Jun 10, 2020
Patches #40. Allow for converting a single numpy timedelta64 value to datetime64, instead of raising a "ValueError: Could not convert object to NumPy timedelta".
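One way to make a lone numpy timedelta64 take the same code path as an array is to wrap it with np.atleast_1d first. This is my sketch of the idea, not necessarily the actual patch:

```python
import numpy as np

# ATLAS Standard Data Product epoch that ICESat-2 delta_time counts from
ATLAS_SDP_EPOCH = np.datetime64("2018-01-01T00:00:00.000000")


def to_utctime(deltatime) -> np.ndarray:
    """Convert delta_time values, including a bare scalar np.timedelta64,
    into UTC np.datetime64 values."""
    # np.atleast_1d turns a 0-d scalar into a 1-element array, so the same
    # vectorised addition works for scalars and arrays alike
    return ATLAS_SDP_EPOCH + np.atleast_1d(deltatime)
```

Callers that passed a single value before would have hit the conversion error; after the wrap, both forms return a datetime64 array.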
Because time 🕛 matters! Found a better name than geo.py - spatiotemporal.py! Name inspired by the SpatioTemporal Asset Catalog, which is itself a cool project! Also making a proper 'data package' following https://intake.readthedocs.io/en/latest/data-packages.html, so that we can reuse our data in our jupyter notebook scripts and tests. Basically allowing for:

TODO in this PR:

TODO in future PRs:

- lonlat_to_xy function via the __array__ method, see ENH: Support objects with __array__ method pyproj4/pyproj#625. Edit: Done at Bump PROJ from 6.3.1 to 7.0.0, pyproj from 2.6.0 to 3.0.dev2 #164!

References: