Plot is very slow to display large array via overlay_array #44
I tested this at home, and it worked significantly faster. A 200K-element array only took a few seconds to plot. The system that is running slow is actually an OpenStack VM, but it has 8 cores and 15 GB of RAM, both of which exceed my home workstation. The OpenStack instance is running a hardened CentOS 7, and my home machine is running Ubuntu 18.04. Any suggestions on what to test for bottlenecks, and how?
I know that WebSockets in general are capable of transmitting binary data. If I get time, I'd like to look into whether the Jupyter Comm abstraction has somewhere we could hook into for binary transport instead of JSON.
@nispio if you can, open up the Chrome Developer Tools and make a HAR file capture of the slow plot. Then send me the HAR file privately or attach it to this issue. This will help us understand where the slowdown is coming from.
Also @nispio, what version of Python, which browser, and which browser version are you using? (Even though we are fairly confident it's the JSON serialization, it would still be useful to have this environment information.)
@nispio I'm able to reproduce this behavior as well. In my case the slowdown is definitely on the Python side and is part of the Jupyter COMM subsystem, which uses Traitlets to synchronize the arrays from the backend engine to the front end. From what I can tell so far, it's either (a) the serialization of the array to JSON or (b) the Traitlet comparison of objects. My environment is:
Here is a link that shows a technique we might be able to use to transfer binary data instead of JSON: https://gist.github.com/maartenbreddels/40fa030fdb922e6d2074282ceed6b753
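To make the "hook into the Comm for binary transport" idea concrete, here is a rough kernel-side sketch (the target name and message fields are made up for illustration, not the widget's actual protocol). It relies on the fact that `Comm.send` accepts a `buffers` argument, which travels over the WebSocket as raw binary frames rather than JSON:

```
# Kernel-side sketch only; 'sigplot_binary' and the message shape are hypothetical.
import numpy as np
from ipykernel.comm import Comm

def send_array_binary(arr):
    payload = np.ascontiguousarray(arr, dtype=np.float32)
    comm = Comm(target_name='sigplot_binary')  # hypothetical comm target
    # `buffers` bypass JSON serialization entirely, so 20M samples are never
    # converted to a giant Python list or JSON text
    comm.send(
        data={'dtype': 'float32', 'length': int(payload.size)},
        buffers=[payload.tobytes()],
    )
```

The front end would register a matching comm target and read `msg.buffers[0]` into a typed array before handing it to SigPlot.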
Profiling
The full file is attached: prun1.txt. It's sorted by cumulative time, so naturally the outer functions took the longest, but tracing down, it does look like most of the time was spent in the JSON serialization.

Options: There are a few options @maihde and I have been discussing, and this ties in nicely with @desean1625's comments on Slack:
Right now option (3) is the most feasible and simplest, but (2) has been discussed before for high-latency (or low-throughput) network situations, keeps coming up in similar discussions around SigPlot (e.g., @desean1625's Slack comments), and clearly reaches into design decisions in SigPlot (e.g., XCMP) and the saving of plots in Jupyter-SigPlot. Option (1) is a nice intermediary that should be explored as well, because we all agree serializing to JSON is not the optimal solution performance-wise, and (1) will need to be implemented before we can implement (2).
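For reference, a profile like prun1.txt can be reproduced with something along these lines (self-contained; it profiles the JSON-serialization path rather than the actual widget call, and uses 1M points so the demo stays quick):

```
# Profile the JSON serialization path, sorted by cumulative time.
import cProfile
import json
import pstats
import random

data = [random.random() for _ in range(int(1e6))]

cProfile.run("json.dumps(data)", "prun_demo.out")
pstats.Stats("prun_demo.out").sort_stats("cumulative").print_stats(15)

# In a notebook, the equivalent for the actual widget call would be roughly:
#   %prun -s cumulative -T prun1.txt plot.overlay_array(data)
# (where `plot` and `data` stand in for whatever the notebook defines)
```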
One difference between Matplotlib and SigPlot is that Matplotlib actually does all the rendering on the server (kernel), whereas SigPlot, being native JavaScript, can (and currently does) render on the client.

When we've been talking about saving rendered images with the notebook, my understanding is that we were mainly thinking the client would generate a PNG after rendering and send it back to the server to store as a rich representation, allowing saved notebooks to be viewed with embedded figures via things like nbviewer. This is the same shape as Matplotlib, but in reverse. Just wanted to double-check that we're talking about the same thing.

Just as I think Bokeh is worth investigating in the context of saving, it may also be worth a look in the context of Comms. (I don't know/remember exactly what it does; in some cases, I'm certain that it sends a JavaScript representation of the figure in question, which is definitely no better than JSON.)

Regarding the binary transport for SigPlot, I think it would have the additional virtue that SigPlot already internally traffics in ArrayBuffers (iirc), so there'd be one less translation client-side, over and above the deserialization savings.

On yet another point, even if we use a binary transport, large files may take a while to arrive at the server. Some kind of progress indicator would be nice to address the OP's observation that nothing obvious occurs during loading--but note that the progress in question is in the Comm, not SigPlot proper. Imagine that SigPlot grows a way to indicate loading progress, possibly updated by XHR events in a purely Web context. Then I suggest that we have the notebook widget trigger that same mechanism, rather than also adding UI to display notebook Comm progress (presuming that we can hook into Comm progress in the first place).

And finally, on the detached header point: I bet it would not be too much work to add an "overlay href data only" capability, since we already have headermod / header-only push, and the expression in the constructor is "data" plus "overrides" (header). In the notebook context, we could keep the header in a traitlet, allowing a nice Python syntax for headermod.
A quick correction/update: I incorrectly stated before that I was plotting 200,000 points, but I later realized (to my embarrassment) that arange(0, 200e3, 0.01) creates 20 million points. However, when I was testing at home, I was coding up the test notebook from scratch, and I used r_[:200e3]*0.01 without realizing that I was not comparing apples to apples. Plotting 20 million points at home gave performance similar to that described in the problem case on OpenStack.

The two takeaways from this correction:

1. The minute-plus array-plotting times were occurring for 20M points (not 200k as first stated).
2. The performance difference between my two setups was not nearly as drastic as I initially thought.
20M points will definitely be very slow to serialize from Python to JSON and back. I'm glad you caught that, because I overlooked the same detail (also to my embarrassment) when attempting to reproduce your issue.

On the plus side, SigPlot can handle 20M points in either binary form or JavaScript (see this fiddle: https://jsfiddle.net/kwbm1oj6/1/) with minimal delay.

So it's still worth us trying to fix this in Jupyter, because there is something internal to Jupyter that is *very* slow to do the serialization.
```
#!/usr/bin/env python
import json
import random
import time

npoints = int(20e6)

# build a 20M-element list of random floats
data = []
for _ in range(npoints):
    data.append(random.random())

# time a plain json.dumps of the list
s = time.time()
dat = json.dumps(data)
e = time.time()
print("serialization took", e - s)
```
That only takes 22 seconds, so something under the hood in traitlets or Jupyter is causing a lot of extra delay.

Regards,
~Michael
Yep. There are instances where we could benefit from server-side rendering, or at least from cleverly sending only what can actually be rendered on the screen (a rough decimation sketch is at the end of this comment).
That's correct. They both are technically widgets---the only difference is where the rendering happens.
Yep. In general, sending pre-rendered images will be easier to persist. I was merely noting that (2) would enable us to do this.
Bokeh appears to have the same issue we have. The following code took >10 minutes and eventually crashed my (Chrome v72) browser every time I ran it:

```
In[1]: from bokeh.plotting import figure, show, output_notebook
       import numpy as np
       output_notebook()

In[2]: x = np.arange(0, 200e3, 0.01)
       sinx = np.sin(x)
       cosx = np.cos(x)

In[3]: %%timeit
       p = figure(width=300, height=300)
       p.multi_line(xs=[x, x], ys=[sinx, cosx])
       show(p)
```
Exactly.
Yep, we could do something like (not tested):

```
from __future__ import division

import requests

url = ...

# `stream=True` lets us stream over the response
r = requests.get(url, stream=True)

# get the total file size
total_size = int(r.headers.get('content-length', 0))

# we'll want to iterate over the file by chunks
block_size = 1024

# how much we've written locally (kernel-side)
wrote = 0

# "stream" the remote asset to ``local_file``
with open(local_file, 'wb') as f:
    for data in r.iter_content(block_size):
        # keep track of how much we've written
        f.write(data)
        wrote += len(data)

        # update the ``progress`` traitlet, which we will handle on the
        # client side in some loading notification (e.g., loading bar via
        # tqdm?, spinny wheel, etc.)
        progress = wrote / total_size
```
Agreed that it wouldn't be difficult.
That seems reasonable.
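Circling back to the "send only what the screen can usefully show" point above, here is a rough, purely illustrative sketch (none of this exists in jupyter-sigplot): decimate kernel-side to roughly pixel resolution, keeping per-bucket min/max so narrow spikes survive, and only serialize the reduced array.

```
# Illustrative decimation helper; not part of jupyter-sigplot.
import numpy as np

def decimate_minmax(y, max_points=4000):
    # Keep each bucket's min and max so narrow spikes stay visible.
    n = len(y)
    if n <= max_points:
        return y
    buckets = max_points // 2
    trimmed = y[: (n // buckets) * buckets].reshape(buckets, -1)
    out = np.empty(buckets * 2, dtype=y.dtype)
    out[0::2] = trimmed.min(axis=1)
    out[1::2] = trimmed.max(axis=1)
    return out

y = np.sin(np.arange(0, 200e3, 0.01))   # ~20M points, like the original report
small = decimate_minmax(y)              # a few thousand points actually serialized
```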
@maihde See the profiling output I posted above. It doesn't look like the `json.dumps` call itself is the main cost, but rather ipykernel's `jsonutil.py` and its `json_clean` function. I ran the following:

```
import json
import random
import time

from ipykernel import jsonutil

npoints = int(20e6)

data = []
for _ in range(npoints):
    data.append(random.random())

s = time.time()
# what I added
data = jsonutil.json_clean(data)
dat = json.dumps(data)
e = time.time()
print("serialization took", e - s)
```

The combination of `json_clean` and `json.dumps` takes far longer than `json.dumps` alone. So why does Jupyter call all these methods multiple times? Still tracking that down...
This should plot a detached header file:

```
var container = document.getElementById('plot');
var plot = new sigplot.Plot(container, {});

fetch("./path_to_detached_data").then(function(response) {
    return response.arrayBuffer();
}).then(function(buffer) {
    var bf = sigplot.m.initialize();
    bf.setData(buffer);
    plot.overlay_bluefile(bf);
});
```
Given that we control the horizontal and the vertical in this instance, it seems like we'd be able to get a good improvement just by implementing our own serialize/deserialize to bypass `json_clean`! I still like the ArrayBuffer approach better overall, of course.

FWIW, if we replace the data in the "big data" fiddle with `Math.random()` instead of a ramp, the full range of data takes more like 10 sec to render (compared with ~2 sec for the ramp).
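As a minimal sketch of what "our own serialize" might look like (hypothetical widget and trait names, not the actual jupyter-sigplot code), ipywidgets' custom-serializer hook can be used so that `json_clean` never walks the 20M-element list: returning bytes (or a memoryview) from a trait's `to_json` ships the value as a binary buffer instead of JSON.

```
# Hypothetical sketch: a custom to_json serializer for the data trait.
import numpy as np
from ipywidgets import DOMWidget
from traitlets import Instance

def _array_to_json(value, widget):
    # bytes returned here are sent as a binary buffer, not JSON text
    if value is None:
        return None
    return np.ascontiguousarray(value, dtype=np.float32).tobytes()

class ArrayPlot(DOMWidget):
    data = Instance(np.ndarray, allow_none=True).tag(
        sync=True, to_json=_array_to_json
    )
```

The front-end deserializer would then wrap the received buffer in a `Float32Array` before handing it to SigPlot.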
@maihde @sterre This also could be driving the slowness... We're incurring two syncs (and therefore two JSON serializations) each time we plot.

@desean1625 Perfect. Thanks!
I agree with your results. SigPlot has an optimization to not render duplicate points if they occupy the same pixel (a cheater `xcmp`). With a ramp this overlap happens more often, so we are actually drawing fewer points and lines. Per this discussion and many others, I'm planning on implementing an `xcmp` feature for Layer1D so that regardless of _what_ you are plotting, the number of points plotted is the same. Depending on your needs this can be good or bad, so it will stay with the legacy XPLOT behavior by default.
@sterre One other thing to add: the Jupyter Python kernel takes upwards of two minutes or more to serialize and transfer the 20e6 points while spinning the CPU at 100%. This is all before anything hits the browser and is transferred via the WebSocket.

So the overall ten seconds in pure JavaScript isn't bad.

The odd thing is that Python JSON outside of Jupyter is still slow... but not on the order of multiple minutes.
I wouldn't be surprised if that double sync is a big part of it.
This is already a little confusing because the timing aspect of it is not documented. I can think of a couple of ideas for reducing the number of syncs as well as clarifying the intent a bit. One approach might be to just remember, in the client, what we've already overlaid (e.g., by index into the synced list of arrays), and only overlay anything newer than that.

However, because we're syncing the entire history of all arrays that have ever been overlaid, we're doing the most possible work we can whenever we add a new array--we'll have to re-serialize all previously-overlaid arrays! So we might be better off using our traitlet only to communicate new arrays, and devise a scheme where the client retains what it's received historically in local variables rather than traitlets. We may want a unique id (UID), like a one-up counter, for each array that we overlay; then the client could just ack those UIDs with a quick traitlet update (which might even include the UIDs for all arrays currently being plotted).

All this touches somewhat on the idea of sending more generic commands, like small command tuples, rather than syncing whole data structures.
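To make the UID/ack idea concrete, here is a rough sketch using plain ipywidgets traitlets (all names are hypothetical; none of this exists in jupyter-sigplot today). Only unacknowledged arrays are ever re-serialized; once the client acks a UID, the kernel drops that entry from the synced dict.

```
# Hypothetical sketch of the "only sync new arrays, client acks by UID" scheme.
from ipywidgets import DOMWidget
from traitlets import Dict, List, Unicode, observe

class IncrementalPlot(DOMWidget):
    # arrays the client has NOT yet acknowledged, keyed by a one-up UID
    pending_arrays = Dict().tag(sync=True)
    # UIDs the client reports it has received and stored locally
    acked_uids = List(Unicode()).tag(sync=True)

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self._next_uid = 0

    def overlay_array(self, data):
        uid = str(self._next_uid)
        self._next_uid += 1
        # reassign (rather than mutate) so traitlets notices the change and syncs
        self.pending_arrays = {**self.pending_arrays, uid: list(data)}

    @observe("acked_uids")
    def _drop_acked(self, change):
        # once the client holds an array in a local variable, stop re-serializing it
        acked = set(change["new"])
        self.pending_arrays = {
            uid: arr for uid, arr in self.pending_arrays.items() if uid not in acked
        }
```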
Addresses #13, #36, #44, and indirectly #11. This changes the name from `sigplot.SigPlot` to `sigplot.Plot` and modifies the API to match the JS. This also includes adding a traitlet for progress when downloading a remote asset, but the visual component (progress bar, incrementing counter, etc.) hasn't been figured out yet.
Optimized in 1d44f0d. It uses binary serialization/deserialization for the array rather than JSON array serialization/deserialization, and it was much, much faster for the 20-million-point case. If this is still an issue, feel free to reopen this or open a new issue.
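For a feel of why the binary path wins (illustrative only, not the benchmark behind 1d44f0d; exact numbers depend on the machine):

```
# Illustrative comparison of JSON vs. raw-bytes serialization for 20M points.
import json
import time

import numpy as np

data = np.random.random(int(20e6))

t0 = time.time()
as_json = json.dumps(data.tolist())           # JSON path: 20M Python floats -> text
t1 = time.time()
as_bytes = data.astype(np.float32).tobytes()  # binary path: one contiguous copy
t2 = time.time()

print("json.dumps: %.1f s, %d characters" % (t1 - t0, len(as_json)))
print("tobytes:    %.3f s, %d bytes" % (t2 - t1, len(as_bytes)))
```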
When I use overlay_array to plot a large array (e.g. 20M points), it takes a significant amount of time (over one minute) for the data to be displayed in the widget. This is basically what my notebook looks like:

In[1]:
In[2]:
In[4]:
When I evaluate the second cell, a blank plot is displayed right away, but the data takes over a minute to show up, and there is no indication that any background work is taking place.
When I evaluate the fourth cell, the full plot is displayed almost instantly, including the data.