Create a demcoreg_init.sh wrapper for all of the get_*.sh scripts #17
Comments
Based on discussion with @scottyhq, I am listing what data we fetch, whether they are available on "Earth on AWS", and, if not available, where we download them from without a login.
None of these is currently available on "Earth on AWS", so maybe including the auxiliary data in a docker image is the better option. Just to confirm, all this will come into play if we set up a binder hub for lightweight computations, right @dshean? Or is the thought here that the user can fetch the relevant compressed .tif files stored on the docker image directly? Note: the repository does have shell scripts to fetch these locally.
OK, thanks for checking. The idea was to have them ready to go "locally" in the docker image with all of the necessary dependencies. Also, it's not just about downloading the files; there are also some processing steps in the shell scripts to prepare for dem_align.py or dem_mask.py. Some of these steps may be less relevant for newer versions of the products, and we could probably come up with a better solution for combining all relevant RGI region shp files.
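For reference, the combining step mentioned above might look something like the following with GDAL's ogr2ogr. This is a minimal sketch, assuming the per-region RGI shapefiles have already been downloaded; the directory and filename pattern are hypothetical.

```bash
#!/bin/bash
# Sketch: merge per-region RGI shapefiles into a single layer with ogr2ogr.
# Assumes GDAL/OGR is installed; $rgi_dir and the glob pattern are made up.
rgi_dir=rgi60
out=rgi60_merged.shp
first=true
for shp in "$rgi_dir"/*rgi60*.shp; do
    if $first; then
        # Create the output layer from the first region
        ogr2ogr -f "ESRI Shapefile" "$out" "$shp"
        first=false
    else
        # Append each subsequent region to the merged layer
        ogr2ogr -f "ESRI Shapefile" -update -append "$out" "$shp" \
            -nln "$(basename "$out" .shp)"
    fi
done
```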
After talking with @ShashankBice, I spent a couple of hours with the ASP docker image and ran through the demcoreg beginners doc. To some extent we already have a nice solution for the preconfigured computing environment; I just tried it with the geohackweek tutorial contents.
Embedding in the image could be practical for data volumes < 1 GB, but it seems all these datasets could easily be 10 GB+. So my suggestion is to let users run get_X.sh as needed, or host "analysis-ready" data (unzipped, etc.) externally on S3 or elsewhere. Perhaps some code refactoring could allow streaming only portions of these global datasets from agency servers or FTP locations.
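As a rough illustration of the streaming idea, GDAL's /vsicurl/ driver can read just a subwindow of a remote GeoTIFF, provided the server supports HTTP range requests. The URL and bounds below are placeholders, not real hosting locations.

```bash
# Sketch: extract only a subwindow of a remote global GeoTIFF via /vsicurl/,
# avoiding a full download. URL and bounding box are hypothetical.
url="https://example.com/data/global_bareground.tif"
# -projwin takes ulx uly lrx lry in the raster's coordinate system
gdal_translate -projwin -122.5 47.5 -120.5 46.0 \
    "/vsicurl/$url" rainier_bareground_subset.tif
```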
I didn't try all the get_*.sh scripts (just nlcd, rgi, and bareground), and it looks like the bareground hosting URL changed:
Thanks for taking a look. I remember this coming up in Jan 2020 in an email thread with @cmcneil-usgs. Here are my notes:
@scottyhq, I agree with all of these thoughts. The simplest solution is to maintain the existing get_*.sh scripts. For the datasets with tif tiles on the web (like the new link for the bareground dataset), I expect we could prepare and distribute a VRT that would do the trick. Anybody want to do a quick test with a few tiles?
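Such a quick test might look like the following. This is only a sketch: the tile URLs are placeholders, and it assumes the tiles are plain GeoTIFFs reachable over HTTP.

```bash
# Sketch: build a VRT mosaic over a few remote tiles without downloading them.
# Tile URLs are placeholders for wherever the bareground tiles are hosted.
gdalbuildvrt bareground.vrt \
    /vsicurl/https://example.com/bareground/tile_N40W125.tif \
    /vsicurl/https://example.com/bareground/tile_N40W120.tif
# Any GDAL-based tool (e.g., dem_mask.py) can then read the VRT directly
gdalinfo bareground.vrt
```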
I launched Pangeo Binder and, while upload speed is not great, successfully uploaded a >100 MB DEM. Seems like we can recommend this for new users who have a one-off application, though we should disable RGI glacier masking by default (I'll create a separate issue). I played around with a fresh install and successfully ran the Rainier DEM samples from the geohackweek raster tutorial (great idea!). Let's keep hacking on this and update the README/doc with a simple example...
Done in bd48b4f
I will add this example as an extension to the ASP DEM tutorial over the weekend.
Sounds great @ShashankBice! Probably best to keep it separate from the core ASP processing tutorial though - modular is good. What if we had a separate tutorial in demcoreg?
makes sense :)!
@scottyhq I think you're using https://github.com/uw-cryo/asp-binder-dev/blob/master/binder/postBuild. Looks like it pulls the latest source from GitHub and does a dev install. Strangely, I'm not seeing the latest commits when launching via Pangeo Binder. Firing up a terminal and running ...
Good catch, it was a bit of a hack solution to try things out. Anything in ... I tried moving those pip install commands to the ... That seems to work @dshean; you can keep using the same Binder link and you'll have the latest from GitHub:
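The kind of postBuild arrangement being discussed might look like this. A sketch only: repo2docker runs binder/postBuild at image build time, and the package list here is an assumption, not the actual asp-binder-dev contents.

```bash
#!/bin/bash
# Illustrative binder/postBuild: install the latest source from GitHub at
# image build time, so a rebuilt Binder image carries current commits.
pip install git+https://github.com/dshean/demcoreg.git
pip install git+https://github.com/dshean/pygeotools.git
```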
Nice! That makes sense, and seems like a good solution. Thanks for looking into it!
Ideally, we would fetch all of these layers (e.g., NLCD, bareground) on the fly through services like the "Earth on AWS" registry: https://aws.amazon.com/earth/

At present, they still require local download, extraction, and processing.

We should give the user the option to get all demcoreg layers in one shot, or instructions on how to run the necessary get_*.sh script. Right now, when a user runs dem_align.py, it starts with a bunch of downloads - no good. Alternatively, we could include all auxiliary data in a docker image, or store it ourselves in the cloud. Should discuss with @scottyhq.