Add a JupyterBook "Library" #54

Open
mgrover1 opened this issue May 11, 2021 · 21 comments

Comments

@mgrover1
Contributor

It would be neat to add a gallery of Jupyter Books related to different workflows within CGD or, more generally, NCAR. We could format it as a gallery, but really it would function as a "Library of Books", providing templates at the top of the page so users can easily create their own books.

@bonnland
Contributor

https://github.com/NCAR/notebook-gallery

This was intended as a place to share notebooks. It is already registered with the pangeo gallery here:

https://gallery.pangeo.io/repos/NCAR/notebook-gallery/

It also has nightly builds for notebooks that can be run on a non-HPC machine.

It could use more attention and vision than it currently has, however.

@mgrover1
Contributor Author

The plan would be to still use that gallery. My vision here was more about JupyterBooks, which would include both markdown and notebooks, with a larger project in mind (e.g. scientific papers, model documentation, etc.). I think a good example here would be the Pythia Foundations book, https://foundations.projectpythia.org/landing-page.html, with a gallery of current books/resources here: https://projectpythia.org/pages/links.html.

A couple books we could add would be:

  • CESM MARBL book - including model + data documentation, as well as diagnostics
  • CESM end-to-end example, from case creation to post processing to diagnostics, linking to existing documentation where possible (this will likely be renamed to prevent confusion with the existing cesm-workflow package)

@dcherian
Contributor

dcherian commented May 13, 2021

Maybe the notebook gallery should become a "book gallery" and things like a "MARBL book" be a "part" with "chapters". IIRC jupyter-book allows this kind of organization.

@mgrover1
Contributor Author

What do you mean by IIRC?

@dcherian
Contributor

"if I remember correctly" :)

@mgrover1
Contributor Author

Coming back to this after yesterday's WIP discussion: would it be helpful to have an "esds-examples" gallery for work-in-progress notebooks AND beef up the NCAR notebook gallery, which would hold more polished examples?

Perhaps adding categories (to start) for:

  • CESM Diagnostics
  • Machine Learning
  • WRF

Both of these would be in the JupyterBook format, so one could easily query + find examples... we would likely need to make sure DOIs are associated with these still, but open to suggestions/feedback.

We could have a local copy on GLADE of this so if people want to copy the notebook there, they can - or access from the ESDS site?

Any thoughts @dcherian @andersy005 @bonnland @matt-long

@dcherian
Contributor

> would it be helpful to have an "esds-examples" gallery

I'm not sure this would really help.

I thought the bottleneck is super-easy sharing of notebooks which would look like:

  1. User A clicks button in jupyterlab to upload to gist.github.com
  2. User B clicks button to either:
    1. render notebook nicely in nbviewer
    2. open gist notebook on binder on cheyenne with some sensible default environment.
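Step 2.1 can be sketched in a few lines. This is only an illustration, assuming the notebook already lives in a public gist; the helper name `nbviewer_gist_url` and the example user and gist id are made up here, and the Binder-on-Cheyenne option in 2.2 is not covered.

```python
from urllib.parse import quote

# nbviewer renders gists under /gist/<user>/<gist_id>.
NBVIEWER_BASE = "https://nbviewer.org"


def nbviewer_gist_url(user: str, gist_id: str) -> str:
    """Return the nbviewer URL that renders a notebook stored in a gist.

    Hypothetical helper for illustration; it only builds the link and
    does not check that the gist exists.
    """
    if not user or not gist_id:
        raise ValueError("both user and gist_id are required")
    return f"{NBVIEWER_BASE}/gist/{quote(user)}/{quote(gist_id)}"


if __name__ == "__main__":
    # Placeholder user and gist id, not a real gist.
    print(nbviewer_gist_url("someuser", "abc123"))
```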

I use the notebook buddy extension for step 2.1 pretty frequently (https://github.com/iArunava/NoteBook-Buddy) and it's great. For most reading/debugging/sharing use-cases, just a quick way to see the notebook is perfect.

@mgrover1
Contributor Author

@dcherian would you see this being a part of the Jupyterhub then?

@dcherian
Contributor

Somewhat related: https://jupyterbook.org/interactive/launchbuttons.html

There's a way to launch to a jupyterhub. So I may be coming around to your idea.

I'm assuming that having to specify an env file is the real roadblock step. Also that people like to have local .py files with utility functions used in the notebook. What do we do about that?

@dcherian
Contributor

dcherian commented Jun 23, 2021

Can we write a command line tool that takes a path and converts it to a jupyterhub link that opens that notebook in an "esds-kitchen-sink" environment?

Then anyone can run the notebook and get access to local .py files. The submitter can use the tool to make sure the kitchen-sink environment has everything needed. If not, a small update will be needed, but the environment should asymptotically approach a real kitchen sink :D

I'm also thinking that only one copy of this environment exists in a shared path that everyone can access (under either xdev or esds), that somehow gets used.
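The path-to-link part of this idea could look roughly like the sketch below. It assumes a JupyterHub whose single-user servers run JupyterLab with the filesystem root as the server root, so that `/hub/user-redirect/lab/tree/<path>` resolves to the file. The hub address and the function name `notebook_hub_url` are placeholders, and selecting the "esds-kitchen-sink" kernel is not something the URL itself expresses.

```python
from pathlib import PurePosixPath
from urllib.parse import quote


def notebook_hub_url(hub_base: str, notebook_path: str) -> str:
    """Turn an absolute notebook path into a JupyterHub link.

    Assumes the user's JupyterLab server root is "/", so the path below
    /lab/tree/ is the absolute path without its leading slash.
    """
    path = PurePosixPath(notebook_path)
    if not path.is_absolute():
        raise ValueError(f"expected an absolute path, got {notebook_path!r}")
    if path.suffix != ".ipynb":
        raise ValueError(f"expected a .ipynb file, got {notebook_path!r}")
    relative = quote(str(path).lstrip("/"))  # keeps "/" separators intact
    return f"{hub_base.rstrip('/')}/hub/user-redirect/lab/tree/{relative}"


if __name__ == "__main__":
    # hub.example.org is a placeholder, not the real hub address.
    print(notebook_hub_url("https://hub.example.org", "/glade/work/someuser/demo.ipynb"))
```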

@mgrover1
Contributor Author

> Can we write a command line tool that takes a path and converts it to a jupyterhub link that opens that notebook in an "esds-kitchen-sink" environment?

Would this be run while on Casper/Cheyenne then?

> Then anyone can run the notebook and get access to local .py files. The submitter can use the tool to make sure the kitchen-sink environment has everything needed. If not, a small update will be needed, but the environment should asymptotically approach a real kitchen sink :D

Would this modify the notebook that is already stored there, or would this essentially copy over the existing code to some gist?

> I'm also thinking that only one copy of this environment exists in a shared path that everyone can access (under either xdev or esds), that somehow gets used

This seems like a great idea... just trying to think about the execution side - where would be the best place to test this out?

@dcherian
Contributor

> Would this be run while on Casper/Cheyenne then?

Yes. I'm thinking of the debugging/quick sharing use-case.

> Would this modify the notebook that is already stored there, or would this essentially copy over the existing code to some gist?

No, it just opens jupyterlab in the right directory on GLADE.

> This seems like a great idea... just trying to think about the execution side - where would be the best place to test this out?

I really don't know; just throwing out ideas here. We "just" need a world-readable directory on glade which is easy to do...

@dcherian
Contributor

cc @kmpaul @matt-long

What do you think about my "easy sharing of notebook" proposal above?

@kmpaul
Contributor

kmpaul commented Jul 6, 2021

@dcherian: I think this idea is fantastic. And I think the path forward is to demonstrate a use case with jupyter-forward. What is the minimum viable workflow to demonstrate this? Then we can discuss how to optimize the workflow.

@dcherian
Contributor

dcherian commented Jul 6, 2021

Here's a notebook: /glade/work/dcherian/etpac/notebooks/jforward_demo.ipynb

My idea is that anyone with access to cheyenne should be able to run, for example, nbload casper /glade/work/dcherian/etpac/notebooks/jforward_demo.ipynb and it should mostly work.

I think this should be an alias for something like

jupyter-forward casper --conda-env /glade/work/xdev/envs/esds-kitchen/sink --load-file /glade/work/dcherian/etpac/notebooks/jforward_demo.ipynb

It should be smart enough to check if the file is accessible first and raise a nice error message asking the user to make the file world-readable.
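A minimal sketch of that accessibility check, using only the standard library. The function name `check_world_readable` and the wording of the messages are illustrative; nothing here is taken from jupyter-forward, and the sketch assumes it runs on the machine that hosts the file.

```python
import os
import stat


def check_world_readable(path: str) -> None:
    """Raise a descriptive error unless `path` exists and is world-readable."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        raise FileNotFoundError(f"{path} does not exist.") from None
    except PermissionError:
        # A parent directory is not searchable by the current user.
        raise PermissionError(
            f"Cannot reach {path}. Ask the owner to make the parent "
            "directories world-searchable (chmod o+x <dir>)."
        ) from None
    # S_IROTH is the "others may read" permission bit.
    if not mode & stat.S_IROTH:
        raise PermissionError(
            f"{path} is not world-readable. Ask the owner to run: chmod o+r {path}"
        )


if __name__ == "__main__":
    import sys

    check_world_readable(sys.argv[1])
    print("ok")
```

Checking the permission bit rather than simply trying to open the file means the owner gets the same answer other users would, which is useful when the submitter runs the check before sharing.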

@andersy005
Contributor

> @dcherian: I think this idea is fantastic. And I think the path forward is to demonstrate a use case with jupyter-forward. What is the minimum viable workflow to demonstrate this? Then we can discuss how to optimize the workflow.

@NCAR/xdev, here's a working, minimum viable prototype. For anyone interested in test driving this, make sure you are using the main branch of jupyter-forward:

$ jupyter-forward --version 
Jupyter Forward CLI Version: 2021.5.11.post2

$ jupyter-forward casper \
> --notebook /glade/u/home/mgrover/southern_ocean_zooprod_CESM-LE.ipynb \
> --conda-env /glade/p/cisl/iowa/xdev/mambaforge/envs/esds-kitchen-sink

@andersy005
Contributor

andersy005 commented Jul 7, 2021

This option introduces one limitation:

  • If --notebook points to a notebook for which the user has no write permission, the user can still run the notebook but won't be able to save it

@dcherian
Contributor

dcherian commented Jul 8, 2021

Anderson, that's amazing work (and quick!).

I did get this warning when I ran my notebook:

WARNING:param.main: pandas could not register all extension types imports failed with the following error: cannot import name 'ABCIndexClass' from 'pandas.core.dtypes.generic' (/glade/p/cisl/iowa/xdev/mambaforge/envs/esds-kitchen-sink/lib/python3.8/site-packages/pandas/core/dtypes/generic.py)

with this cell:

%matplotlib inline
%load_ext autoreload
%load_ext watermark

import cf_xarray as cfxr
import cmocean as cmo
import dask
import distributed
import holoviews as hv
import hvplot.xarray
import matplotlib as mpl
import matplotlib.pyplot as plt
import ncar_jobqueue
import numpy as np
import pandas as pd
import xarray as xr
import xgcm

%watermark -iv

using

jupyter-forward casper --port 9999 --notebook /glade/work/dcherian/etpac/notebooks/jforward_demo.ipynb --conda-env /glade/p/cisl/iowa/xdev/mambaforge/envs/esds-kitchen-sink

@dcherian
Contributor

dcherian commented Jul 8, 2021

Honestly this works so well I don't think we need another CLI utility. When --notebook is specified and --conda-env is not, jupyter-forward could just use this kitchen-sink environment.

@andersy005
Contributor

andersy005 commented Jul 8, 2021

> I did get this warning when I ran my notebook:
>
> WARNING:param.main: pandas could not register all extension types imports failed with the following error: cannot import name 'ABCIndexClass' from 'pandas.core.dtypes.generic' (/glade/p/cisl/iowa/xdev/mambaforge/envs/esds-kitchen-sink/lib/python3.8/site-packages/pandas/core/dtypes/generic.py)

It appears that this issue has to do with hvplot. I upgraded hvplot to an alpha release, and the warning went away.

> Honestly this works so well I don't think we need another CLI utility. When --notebook is specified and --conda-env is not, jupyter-forward could just use this kitchen-sink environment.

I concur. I was thinking that we could introduce a configuration file to hold jupyter-forward user defaults such as --conda-env, host, etc. This would allow us to avoid special-casing NCAR clusters + environments in jupyter-forward itself...
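As a rough illustration of the defaults idea (not how jupyter-forward is implemented), the sketch below reads user defaults from a JSON file and lets explicit command-line values override them. The file format, the key names, and the helpers `load_defaults` and `resolve_options` are all assumptions made for this example.

```python
import json
from pathlib import Path


def load_defaults(config_path) -> dict:
    """Read user defaults from a JSON file; a missing file means no defaults."""
    path = Path(config_path)
    if not path.is_file():
        return {}
    with path.open() as handle:
        return json.load(handle)


def resolve_options(cli_options: dict, defaults: dict) -> dict:
    """Merge options, letting explicitly given CLI values win.

    A CLI value of None means "not given on the command line".
    """
    resolved = dict(defaults)
    resolved.update({k: v for k, v in cli_options.items() if v is not None})
    return resolved


if __name__ == "__main__":
    # Hypothetical defaults, standing in for the contents of a config file.
    defaults = {"host": "casper", "conda_env": "/shared/envs/kitchen-sink"}
    # --notebook given, --conda-env omitted: the default environment is used.
    print(resolve_options({"notebook": "demo.ipynb", "conda_env": None}, defaults))
```

With this split, the tool itself stays free of site-specific paths: an NCAR user's config file would name the shared kitchen-sink environment, and anyone else would supply their own.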

@andersy005
Contributor

> It appears that this issue has to do with hvplot. I upgraded hvplot to an alpha release, and the warning went away.

I take this back :-). The issue is still there (not sure how I previously missed it). Will look into it later today.


5 participants