Save rendered SigPlot images with the notebook #26
I'm going to generalize your point for a second to all related use cases and pose them as questions:
Aside from (3), where the answer is export to PNG in a similar vein as matplotlib, (1) and (2) require more thought. As a first pass, I think (1) and (2) should simply be export to PNG, which you referenced in the below paragraph. The final path will require more thought -- how do you preserve an interactive SigPlot with a large amount of data? Should you bother writing out that multi-MB HTML file? What do you do when the original resource is not available?
Yep, I thought our intern had a working PNG export this past summer, but it looks like it might've been lost in a branch or something.
Let me know if I'm misrepresenting your point here, but I'm reading this as: if a user plots from a websocket or an href, there is an expectation of persistence (even if the original resource is no longer available). (For an href, this should come in the form of a downloaded file.) If I've interpreted your point correctly, let's take a step back and discuss expected and reasonable use cases of Jupyter Notebook. I have always observed Jupyter Notebook used as a "playground" -- i.e., an area to do exploratory data analysis or an area to begin prototyping a capability -- or as a pedagogical tutorial builder/interactive documentation. In all of these cases, any data used in the notebook should reside in the notebook environment/directory. If this is the case, perhaps there's a use case of which I'm unaware. (cc @maihde)
I agree, these are the right questions to ask. I think the additional use cases that I've observed, and that potential users have expressed to me, include:
In both cases, I notice that a primary activity is reading as opposed to exploratory analysis:
Even if the data is still available, it's helpful to be able to transparently read a notebook rather than Run Cell / Run All (and potentially deal with any environmental changes). In the case of sharing a notebook, it may be tricky to get at the original data if it resides on the other side of a firewall.
I think the reading use case can apply to pedagogical notebooks too. Although you may well want to recreate the plot if possible, you'd like to know what it should look like when you do. (See https://github.com/rhiever/Data-Analysis-and-Machine-Learning-Projects/blob/master/example-data-science-notebook/Example%20Machine%20Learning%20Notebook.ipynb as an example of something you might want to read first, run later.)
Even the jupyter-sigplot demo notebook, as rendered by Github, is an example of this reader conop: if a casual reader could see what SigPlot would look like for the given inputs, they'd be better positioned to determine whether it's a potential fit for their problem. Binder sometimes takes several minutes to load, and one may have a clone of the extension on a network that can't reach your CDN.
The Now, Bokeh treats (1) and (2) a little differently, in that its figures tend to be "live" even without the original data available. SigPlot is more capable for interactive tasks on its own than Matplotlib/Bokeh (which require server-side logic to "do" anything). In the end game, it would be powerful if all that interactivity were available in statically rendered notebooks like the HTML case. I don't have a feel for whether that use case (fully armed and operational SigPlot from a standalone loaded document) is central or fringe. I suspect that full interactivity will end up going along with access to the input data for the playground / exploratory analysis and training cases, and will not be too missed for the simple reading / reporting cases.
The scientific paper may be obsolete, but if the data's not available, a notebook can still be useful, like a whitepaper. If you package the data with your notebook, you get a much richer experience, at the minor cost of re-running the notebook cells.
I wonder also if there are some different classes of interactivity that we might be able to support? Zooming and panning require access to the full input data. CX mode, trace style, abscissa/index, and maybe some scaling could operate on the subset of the data that's in the viewport (including compressed).
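To make the "subset of the data that's in the viewport (including compressed)" idea concrete, here is a minimal sketch of one possible reduction: keeping a (min, max) pair per display bin. This is only an illustration; `minmax_decimate` is a hypothetical helper, and nothing here is taken from SigPlot's own compression logic.

```python
def minmax_decimate(samples, n_bins):
    """Reduce a 1-D sequence to at most n_bins (min, max) pairs.

    Hypothetical helper: one way a viewport-sized summary of a large
    trace could be kept instead of the full input data.
    """
    if n_bins <= 0:
        raise ValueError("n_bins must be positive")
    n = len(samples)
    if n == 0:
        return []
    # Never use more bins than samples, so every bin is non-empty.
    n_bins = min(n_bins, n)
    reduced = []
    for b in range(n_bins):
        lo = b * n // n_bins
        hi = (b + 1) * n // n_bins
        chunk = samples[lo:hi]
        reduced.append((min(chunk), max(chunk)))
    return reduced
```

A summary like this would be enough to redraw a trace at the saved zoom level or change its style, but not to zoom in further, which matches the split between the two classes of interactivity described above.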
@maihde would be able to speak to that best.
Consolidating the |
I don't know if it's a red herring, but I noticed that there's an option to "save widget state" in the Web notebook's toolbar. Might be worth digging into what this means and see if it could be leveraged to get even an interactive rendered SigPlot saved with the notebook file. (It would be important to investigate behavior when plotting very large files.)
I've been working on getting a PNG to stay with the notebook, like There are a few other cool display classes available to us here. I wonder if the |
I think the basic approach is sound: have the client grab a PNG from SigPlot, put it in a traitlet for the server to store, display the PNG on the client. Does the notebook already serialize the PNG in the I think both Image and Javascript can be set as rich reprs. The Image repr is better for printing and reading, and some sharing (all you need is the .ipynb file). So the first serialization to support seems like Image; then users can re-evaluate cells to get a live widget if needed. Maybe a future enhancement would be to dynamically and automatically replace the image with a live widget, if all the libraries and data are available to a running kernel. Or, like Bokeh, maybe we could save enough code to render a Javascript rich repr. I still don't really know how widgets and rich reprs interact. |
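As a rough sketch of the server-side half of that approach: assuming the client sends the image as a base64 `data:image/png;base64,...` URL (an assumption, not something confirmed in this thread), the kernel would need to turn that string into raw PNG bytes before storing or displaying it. The helper name `png_from_data_url` is hypothetical.

```python
import base64

# Every PNG file starts with this fixed 8-byte signature.
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def png_from_data_url(data_url: str) -> bytes:
    """Decode a base64 PNG data URL into raw PNG bytes.

    Hypothetical helper: assumes the front end sends the image
    as a 'data:image/png;base64,...' string.
    """
    prefix = "data:image/png;base64,"
    if not data_url.startswith(prefix):
        raise ValueError("expected a base64 PNG data URL")
    raw = base64.b64decode(data_url[len(prefix):], validate=True)
    if not raw.startswith(PNG_MAGIC):
        raise ValueError("decoded payload is not a PNG")
    return raw
```

The resulting bytes are the form an Image rich repr would need, so a check like this could sit between the traitlet update and whatever stores the image.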
I did not modify the I've noticed that there is an issue when you re-run the entire notebook, the |
That makes sense. We have the same race in the other direction with I think the basic idea we currently have implemented is sensible: queue up events that need to happen after render (or overlay, in the case of |
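The "queue up events that need to happen after render" idea can be sketched in a few lines. This is a generic illustration only; `RenderQueue` and its method names are hypothetical and are not taken from the jupyter-sigplot source.

```python
class RenderQueue:
    """Hold calls until the plot has rendered, then run them in order.

    Hypothetical sketch of the deferred-event pattern described above.
    """

    def __init__(self):
        self._rendered = False
        self._pending = []

    def call(self, fn, *args):
        # After render, run immediately; before render, remember the call.
        if self._rendered:
            fn(*args)
        else:
            self._pending.append((fn, args))

    def mark_rendered(self):
        # Flush everything that was queued while waiting, oldest first.
        self._rendered = True
        pending, self._pending = self._pending, []
        for fn, args in pending:
            fn(*args)
```

Swapping the pending list out before flushing means a callback that queues more work during the flush runs that work directly rather than mutating the list being iterated.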
I'm back to looking into this enhancement. The hope is to be able to control the cell that has the plot object when the base64 image finally gets to the python kernel. @mrecachinas had previously recommended seeing if we could change the background of that cell to the png representation so when a notebook is loaded, we have the stored png available. I'm looking to do that with |
Just noting here that I finally got a chance to play with this. I see the race condition you're talking about in the example notebooks. When I run the code in some of my own notebooks, I get a "Memory" object and no "UUID" element, and the images don't save with the notebook. Still very promising!
@sterre What versions of ipywidgets, jupyter, notebook, traitlets, and ipython are installed?
This was using Anaconda 2019-10 for all infrastructure, jupyter-sigplot freshly built |
We've talked about this some in person and on Slack. This issue is just trying to capture some of what we've discussed, with no strong organization.
Currently, when a saved notebook is re-opened, SigPlot widgets do not reliably show a rendered image without re-evaluating the generating cell and all dependencies. This is especially vexing in cases like nbviewer or GitHub / GitLab, and is likely a showstopper if the original data is no longer available.
There's some nascent logic in the extension around `done` and `imageOutput` that looks like it wants to capture a PNG from SigPlot and save it to the client for rich representation. This seems like a solid approach, with the only question being how to make that PNG repr display at the right time. It's possible that widgets and rich representation don't mix--this from a very quick experiment where I tried to add an HTML representation to the hello world widget.
Libraries like Matplotlib/Seaborn and Bokeh seem to address this by using a Javascript rich representation instead of a bona fide widget. I make this claim based on observing what's saved with a notebook containing figures from each library.
`%matplotlib notebook` generates a Javascript and image representation. On load, the image is displayed until the cell is re-evaluated.
`%matplotlib inline` just generates an image repr.
I thought D3 might be a reasonable analog to SigPlot, so went looking for some examples of D3 in a notebook. Here's what I found. None of these is as complete as we might like.
It appears that Javascript reprs take precedence over other rich reprs. This may depend on whether the notebook is trusted. If you return `None` from a `_repr_*` function, that repr is not used, which could potentially allow us to wait until a PNG was available before rendering it.
Whatever the representation in the saved notebook, it needs to deal gracefully with very large input data. Matplotlib and Bokeh do this by serializing the figure instead of the data. (There's a size inflation for small data sets, but a big saving on larger data.) A DataShader-style approach may also be relevant.
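The `None`-returning `_repr_*` behaviour mentioned above could be used roughly as follows. This is a minimal sketch: the class `DeferredPngRepr` and its `set_png` hook are hypothetical, and only the `_repr_png_` method name is part of IPython's rich display protocol.

```python
class DeferredPngRepr:
    """Expose a PNG rich repr only once image bytes are available.

    Hypothetical holder object, not the jupyter-sigplot widget itself.
    """

    def __init__(self):
        self._png = None  # raw PNG bytes, filled in later

    def set_png(self, data: bytes) -> None:
        # Hypothetical hook: called when the client sends back a rendered image.
        self._png = data

    def _repr_png_(self):
        # IPython skips a _repr_*_ method that returns None, so no image
        # output is produced until a PNG has actually arrived.
        return self._png
```

One caveat with this sketch: the repr is evaluated when the object is displayed, so a PNG that arrives after the cell has finished would presumably still need the object to be displayed again before it ends up in the saved notebook.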
JupyterLab has a different extension model, and also restricts Javascript content.