Calculate dhdt with ICESat-2 ATL11 data over Antarctica #41

Merged
weiji14 merged 5 commits into master from atlxi_dhdt
Jun 2, 2020

Conversation

weiji14

@weiji14 weiji14 commented May 31, 2020

Finding change where it's at over Antarctica, and quickly before the ice melts! Calculating the rate of ice surface elevation change over time (dhdt) using ICESat-2 ATL11 data! For that, we'll do a simple linear regression to fit a trend line through the elevation points over time. Will focus on point locations where there is significant (>0.5 m) elevation change.

Linear Regression example from scipy.stats.linregress

Amazing to think how far we've come since ICESat-1, especially in terms of parallel compute code like dask. There's more than an order of magnitude increase in data, but it feels like we can do so much more now too.

ATL11 Rate of Height Change over Time dhdt at Antarctica

TODO:

Citations:

  • Horgan, H. J., Anandakrishnan, S., Jacobel, R. W., Christianson, K., Alley, R. B., Heeszel, D. S., Picotti, S., & Walter, J. I. (2012). Subglacial Lake Whillans—Seismic observations of a shallow active reservoir beneath a West Antarctic ice stream. Earth and Planetary Science Letters, 331–332, 201–209. https://doi.org/10.1016/j.epsl.2012.02.023
  • Smith, B. E., Fricker, H. A., Joughin, I. R., & Tulaczyk, S. (2009). An inventory of active subglacial lakes in Antarctica detected by ICESat (2003–2008). Journal of Glaciology, 55(192), 573–595. https://doi.org/10.3189/002214309789470879

References:

Finding change where it's at, and quickly! There's nothing really special about calculating height range, it's just subtracting minimum height from maximum height. The `nanptp` function (which numpy doesn't have) merely accounts for NaN values, and I've put in the extra effort to parallelize it on dask using xr.apply_ufunc, so that it takes minutes to run on ~1 trillion points. Took a long detour packaging up `deepicedrain` properly before this dhdt notebook could be released, but it's worth it to polish out most of the cruft and reduce the amount of boilerplate preprocessing code. Also removed the need to close the atl11 test dataset, as using to_dask() instead of read() seems to be nicer.
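
For readers skimming this thread, here is a minimal NumPy-only sketch of the height range idea described above. It is not the actual `deepicedrain` implementation: the function body and the sample heights are assumptions made up for illustration, and the dask parallelization via xr.apply_ufunc is left out.

```python
import numpy as np


def nanptp(a, axis=None):
    """Peak-to-peak range (max minus min) along an axis, ignoring NaN values.

    Illustrative sketch only; the real deepicedrain function may differ.
    """
    return np.nanmax(a, axis=axis) - np.nanmin(a, axis=axis)


# Hypothetical heights in metres: 2 points (rows) x 4 cycles (columns),
# with NaN marking cycles that have no valid measurement
heights = np.array(
    [
        [10.0, np.nan, 12.5, 11.0],
        [5.0, 5.2, np.nan, 4.9],
    ]
)

# Height range per point, computed over the last (cycle) axis
h_range = nanptp(heights, axis=-1)
print(h_range)
```

Because the function only reduces over one axis, it is the kind of function that xr.apply_ufunc can map over chunks of a dask-backed array.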
@weiji14 weiji14 added the feature 🚀 Brand new feature label May 31, 2020
weiji14 added 3 commits May 31, 2020 21:12
Performing linear regression in parallel, on 10 million points, in about 3 minutes, plus a few extra minutes of preprocessing time. Again, the rate of height change over time is just based on an ordinary least squares linear regression algorithm, nothing too fancy. The `nan_linregress` function (which wraps around scipy.stats.linregress) accounts for NaN values by masking them out, and the linregress results are placed into a single numpy.ndarray so that we can parallelize it using xr.apply_ufunc. This is based on a lot of research, looking at Stack Overflow answers and people's GitHub code snippets (all linked in the Pull Request).
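
A rough sketch of the approach described above, not the actual `deepicedrain` code: the function body, the dimension names (`ref_pt`, `cycle`, `stat`) and the sample values are all assumptions for illustration. It masks NaN pairs, packs the five linregress results into one array, and maps the function over every point with xr.apply_ufunc.

```python
import numpy as np
import scipy.stats
import xarray as xr


def nan_linregress(x, y):
    """OLS fit of y on x after masking out NaN pairs.

    Returns one array: [slope, intercept, rvalue, pvalue, stderr].
    Illustrative sketch only; the real function may differ.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mask = ~np.isnan(x) & ~np.isnan(y)
    if mask.sum() < 2:  # not enough valid points to fit a line
        return np.full(5, np.nan)
    res = scipy.stats.linregress(x[mask], y[mask])
    return np.array([res.slope, res.intercept, res.rvalue, res.pvalue, res.stderr])


# Hypothetical data: 2 points x 5 cycles, time in years and height in metres
t = xr.DataArray(
    np.array([[0.0, 0.25, 0.5, 0.75, 1.0], [0.0, 0.25, 0.5, 0.75, 1.0]]),
    dims=("ref_pt", "cycle"),
)
h = xr.DataArray(
    np.array([[100.0, 100.5, np.nan, 101.5, 102.0], [50.0, np.nan, 49.0, 48.5, 48.0]]),
    dims=("ref_pt", "cycle"),
)

# Apply the 1D regression along the cycle dimension for every point
stats = xr.apply_ufunc(
    nan_linregress,
    t,
    h,
    input_core_dims=[["cycle"], ["cycle"]],
    output_core_dims=[["stat"]],
    vectorize=True,
)
dhdt_slope = stats.isel(stat=0)  # rate of height change in metres per year
print(dhdt_slope.values)
```

Returning a single array keeps the output to one core dimension. With dask-backed inputs, the same call would also need `dask="parallelized"` along with the output dtype and the size of the new `stat` dimension.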

Will need to refactor a lot of elements in the coming week to keep things DRY, e.g. collapsing the datashade functionality into a one-liner. Probably need more tests too, and I've added test_nanptp_with_nan for good measure. Also patched the GitHub Actions CI again (see 4aabf6e), which was missing the actual `--no-root` statement.
Add a README.md file in the deepicedrain directory, listing out what each of the files (atlas_catalog.yaml, deltamath.py, spatiotemporal.py) is for! Also shifted usage instructions up on the main README.md, and updated the teaser image to one of dhdt over Antarctica!
Putting the datashader functionality into the Region class, so that we can make use of the bounding box information! Takes in a pandas.DataFrame table of x, y, z points, and outputs an xarray.DataArray grid for visualization purposes at the pre-set scale. Does some simple algebra to set the correct aspect ratio with only plot_width as input. Standardized on the variable names to be ds_* for xarray.Datasets and df_* for pandas.DataFrames. Storing all of the intermediate Zarr and Parquet data files in an ATLXI folder. Will update plots in another commit once I sort out some issues, and maybe start a new file called visualization.py to handle the plotting code.
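
The aspect ratio algebra mentioned above can be sketched as follows. This is a guess at the idea rather than the Region class code itself: the helper name `plot_dimensions` and the bounding box numbers are hypothetical.

```python
def plot_dimensions(xmin, xmax, ymin, ymax, plot_width):
    """Derive the plot height from the bounding box so pixels stay square.

    Hypothetical helper for illustration; not part of deepicedrain.
    """
    aspect = (ymax - ymin) / (xmax - xmin)  # height-to-width ratio of the region
    plot_height = int(round(plot_width * aspect))
    return plot_width, plot_height


# Made-up bounding box in projected metres, twice as wide as it is tall
width, height = plot_dimensions(
    xmin=0, xmax=2_000_000, ymin=0, ymax=1_000_000, plot_width=800
)
print(width, height)
```

The resulting pair could then be passed as the canvas width and height when rasterizing the points, so that only plot_width has to be chosen by hand.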
Tidy up our rate of height change over time (dhdt) code, putting the results into an xarray.Dataset with proper names, and keeping things snappy by using chunks when reading intermediate Zarr stores. The hrange and dhdt plots have been updated to use the correct datashaded image aspect ratio as promised, both in the notebooks and the README.md! Also fixed the coordinates of Kamb Ice Stream, as they were actually at Whillans Ice Stream, a rookie x/y mistake.
@weiji14 weiji14 marked this pull request as ready for review June 2, 2020 08:40
@weiji14 weiji14 closed this in 8be8687 Jun 2, 2020
@weiji14 weiji14 merged commit 8be8687 into master Jun 2, 2020
@weiji14 weiji14 deleted the atlxi_dhdt branch June 2, 2020 08:47
@weiji14 weiji14 added this to the v0.2.0 milestone Jun 11, 2020