Install missing packages to pangeo #898
Comments
@wildintellect The packages listed above, notably those for Pangeo, were included in the original working list we used to decide what to add to our workspaces in our recent release. However, columns F and G were not filled in for these, which led me to believe they shouldn't be added to the Pangeo environment, unlike other packages that were marked for addition. Apologies if this was a mistake on my part. If these are in fact needed, can the data/users indicate which ones should be added? We want to keep the images lean, and the more packages we add, the harder it becomes to manage the build dependency issues that @grallewellyn and I worked on extensively in the last release.
Can we keep separate tickets for the different workspace types?
I agree. New issue here: #902, and I've updated this issue.
@wildintellect Any input from the user working group regarding which packages listed in this issue need to be added to the pangeo workspace?
I've asked @jsignell to have a look at this. Nothing urgent has come up, so you can probably go ahead with your planned changes and we'll revisit in another update cycle if the users report needing additional libs.
Using comments and other information from the Google doc, I've compiled the list below of packages we should be able to omit from Pangeo, each with the reason given. If any of these should not be on the omit list, please point it out.
After omitting the packages above, the list of packages to potentially add to our Pangeo workspace is the following. We need clarification from @wildintellect and the user working group members about which of these packages should be added.
@wildintellect I saw your most recent message after I posted the list seen in the post above. Before we proceed with this change we'll need confirmation from you (based on some of your comments in the original Google document) and from the user working group on which of these packages to add. We can't tell which are must-haves vs should-haves or nice-to-haves and we're trying to avoid bloating the workspace with unneeded packages. cc: @jsignell
I'm a little confused about the goal of this ticket. My understanding is that the value of the pangeo-notebook environment is that:
Is it potentially bloated? Sure. But I would argue that that doesn't actually matter too much.
We would like to streamline the image because the size of the container directly affects performance during data processing (each worker node downloads the container image at the start of a job). If the additional libraries add 5 GB to the image and we launch a cluster of 1000 nodes, that's an additional 5 TB of unnecessary data transfer (albeit free), plus compute/wait time for that download. @anilnatha Please quantify the size difference between an optimized build and an unoptimized build.
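For reference, the arithmetic above can be written out as a short sketch. The 5 GB and 1000-node figures come from the comment; the 1 Gbit/s per-node bandwidth is purely an assumed value for illustration, and the function names are hypothetical.

```python
def extra_transfer_gb(extra_image_gb: float, nodes: int) -> float:
    """Total extra data pulled if every node downloads the image once."""
    return extra_image_gb * nodes


def extra_wait_seconds(extra_image_gb: float, bandwidth_gbit_per_s: float) -> float:
    """Extra download time per node, ignoring compression and layer caching."""
    return extra_image_gb * 8 / bandwidth_gbit_per_s


if __name__ == "__main__":
    total_gb = extra_transfer_gb(5, 1000)      # figures from the comment above
    print(f"{total_gb / 1000:.1f} TB extra transfer")
    # Assumed bandwidth: 1 Gbit/s per node (not a measured value).
    print(f"{extra_wait_seconds(5, 1.0):.0f} s extra wait per node")
```

Real numbers will differ because registries transfer compressed layers and nodes may already have some layers cached, so measured image sizes are the better input here.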
You might have already done all this, but here are a bunch of other ways to shrink images without removing packages. It looks like the standard pangeo-notebook image already implements the suggestions from https://jcristharif.com/conda-docker-tips.html and ends up with a Docker image that compresses to 1.92 GB (https://hub.docker.com/r/pangeo/pangeo-notebook/tags)
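One way to check whether such cleanup would help a given image is to measure how much space removable file types occupy inside the environment. The sketch below is only an estimate under assumptions: the suffix list (static libraries, bytecode, JavaScript source maps) is an assumed set of common cleanup targets, not a verbatim copy of the linked article, and `reclaimable_bytes` is a hypothetical helper name.

```python
import os

# Assumed cleanup targets; adjust to whatever the build actually strips.
CANDIDATE_SUFFIXES = (".a", ".pyc", ".js.map")


def reclaimable_bytes(prefix, suffixes=CANDIDATE_SUFFIXES):
    """Sum file sizes under `prefix`, grouped by matching suffix."""
    totals = {suffix: 0 for suffix in suffixes}
    for root, _dirs, files in os.walk(prefix):
        for name in files:
            for suffix in suffixes:
                if name.endswith(suffix):
                    path = os.path.join(root, name)
                    # Skip symlinks so a target is not counted twice.
                    if not os.path.islink(path):
                        totals[suffix] += os.path.getsize(path)
                    break
    return totals
```

Running it against the environment prefix inside the container (for example `reclaimable_bytes("/opt/conda")`, where the path is an assumption) gives a per-suffix byte count to compare against the total image size.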
(fyi) I've issued a PR that takes care of adding the images listed below. If there are additional packages that are needed, we can try to squeeze them into this release, or they will have to be added in the next release.
Added to base image (used in DPS)
Added to jupyterlab image (ADE)
FYI something is wrong with the Python Paths in
@wildintellect Opened a new ticket with that problem here: #956 and commented with what I am seeing
Packages to install in Pangeo that were suggested in the original ticket, from links:
Also, these packages were in the spreadsheet but are missing. We need to know from the UWG whether to put them in DPS or ADE:
Update in these spreadsheets
Pangeo: https://docs.google.com/spreadsheets/d/1krnOZ1SFW-GA_jOiL-nWzhNAA3JKTNqFrG0IpObHsBg/edit?usp=sharing
Vanilla/ kinda isce3: https://docs.google.com/spreadsheets/d/18Orw1cZbqUdOPBy9hFwXm43Pb0MZL7n1KEJuWO9UuIY/edit?usp=sharing
R: https://docs.google.com/spreadsheets/d/1mrQ3gdcxZHZNTksUmLz6qqqNSNxhAoonB9znLU0c0pk/edit?usp=sharing