Make serializers synchronous and guarantee to return a snapshot of the data #1270
Conversation
WIP, as this causes problems for binary arrays - basically we’ll need an explicit serializer on attributes that have binary arrays now, which can choose to implement the snapshot however they think best. The default serializer will make a copy.
This is work towards solving fundamental problems exposed at #1044.
CC @maartenbreddels - this affects the binary serialization.
Can you explain a bit more what the problem is? |
This is buried in #1044. Because we have message throttling, state sync messages are not always sent immediately when save_changes is called. The problem is that we serialize the state attributes we are syncing when we send the message, rather than when we request the save. This means that if we sync some attributes, then change the object, the sync message could be sent after the change, which means we sent the wrong values.
What this PR does is serialize the object to get a snapshot of the synced attributes when the save is requested. That means that the serializers must return a snapshot of the state that won't change. It's easy to accomplish this with JSON-able values - just stringify and parse to get a cheap deep copy. However, if there are binary arrays in the JSON state, there's no way to get a snapshot. I imagine that some libraries will not take a snapshot of binary data, but will just sync the current state of the binary array and hope for the best.
I suppose another approach is to make the state sync call explicitly just a request that the then-current state of the object be synced at some point in the future. This means that we then have to be careful to never leave the object in an inconsistent intermediate state, which sounds tenuous at best. |
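A minimal sketch of the idea (illustrative names like saveChanges and flush, not the actual widget code): the snapshot is taken when the save is requested, so later changes to the model cannot leak into a message still sitting in the throttle queue.

```ts
// Hypothetical sketch, assuming JSON-able state with no binary buffers.
const queue: Array<{[key: string]: any}> = [];

function saveChanges(state: {[key: string]: any}): void {
  // Cheap deep copy: the queued message shares no references with the
  // live model, so later mutations cannot change what gets sent.
  queue.push(JSON.parse(JSON.stringify(state)));
}

function flush(send: (msg: object) => void): void {
  // Called later, e.g. once the kernel can handle the next message.
  while (queue.length > 0) {
    send(queue.shift()!);
  }
}

const model = {value: 1};
saveChanges(model);              // snapshot taken here
model.value = 2;                 // model changes before the send...
flush(msg => console.log(msg));  // ...but {value: 1} is what goes out
```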
Actually, can't we just use the code that I wrote for extracting the buffers to do this? I spent effort not to clone the whole thing, but if we rip that part out, it clones everything (except the arrays). |
Sure, the JSON parse/stringify is just an easy way to deep copy an object. We could use a different deep copy implementation instead, though it's likely to be much slower (maybe we should time it, though). I think the underlying issue with using it to do a deep copy is that we'd then be condoning this sort of async bug as expected behavior for binary data. (I wish we had simple copy-on-write ArrayBuffers, so we could get a real snapshot of the binary data.) |
Is it an option to remove the throttling, or do it differently? |
and officially sanction this async problem for binary arrays? (I wish we had some sort of copy-on-write binary view of an array. I just looked into the Chrome source, and if I understand it correctly, they don't implement copy-on-write copies: https://cs.chromium.org/chromium/src/v8/src/builtins/builtins-arraybuffer.cc?l=281 ) |
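Without copy-on-write buffers, a real snapshot of binary data means an explicit copy, roughly like this (a sketch; snapshotBytes is a made-up helper):

```ts
// TypedArray.prototype.slice() copies the viewed bytes into a fresh
// buffer, so the snapshot is independent of the live array.
function snapshotBytes(view: Uint8Array): Uint8Array {
  return view.slice();
}

const live = new Uint8Array([1, 2, 3]);
const snap = snapshotBytes(live);
live[0] = 99;
console.log(snap[0]); // still 1
```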
Good question. I think it makes sense to have some sort of throttling implemented somewhere. A typical use case is dragging a slider, which may cause a lot of computation to be done, maybe drawing a plot. The nice thing about the pending-messages throttle is that it dynamically adjusts for the delay - sync messages are sent exactly as often as the kernel can handle them. What is the problem with snapshotting the data the widget wants to sync? |
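Roughly how a pending-message throttle behaves (a sketch with assumed names, not the actual implementation): only one sync message is in flight at a time, and while it is pending newer snapshots collapse into a single buffered one, so messages go out exactly as fast as the kernel acknowledges them.

```ts
class SyncThrottle {
  private pending = false;
  private buffered: object | null = null;

  constructor(private send: (msg: object) => void) {}

  // Called with a snapshot of the state to sync.
  sync(snapshot: object): void {
    if (this.pending) {
      this.buffered = snapshot;  // collapse intermediate states
    } else {
      this.pending = true;
      this.send(snapshot);
    }
  }

  // Called when the kernel signals it has processed the last message.
  acknowledged(): void {
    if (this.buffered !== null) {
      const next = this.buffered;
      this.buffered = null;
      this.send(next);           // stay pending until the next ack
    } else {
      this.pending = false;
    }
  }
}
```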
Actually, the throttling/debouncing is something I raised a while ago in #663 and recently updated with some code (on the Python side). Maybe it's something that needs to go into the interact decorator, and we just handle this purely on the Python side. |
We could, it will take up memory, but maybe we should accept that. |
That's where we maybe look the other way if someone chooses to not hand us back a snapshot of the state. Or rather, widget authors can decide if they really need to hand back a copy or not. But by default, we make a copy. |
The end result is that if you have an attribute that has binary data, and you don't care about this snapshotting business, you just declare that attribute to have the identity serializer, i.e., hand back the attribute as-is. |
|
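A sketch of what that looks like for a widget author (the import path and exact API shape here are assumptions; treat it as the pattern rather than the precise jupyter-js-widgets API):

```ts
import { WidgetModel } from 'jupyter-js-widgets'; // assumed import path

class MyModel extends WidgetModel {
  static serializers = {
    ...WidgetModel.serializers,
    // Identity serializer: hand the attribute back as-is, accepting that
    // it may change before the throttled message is actually sent.
    raw_data: { serialize: (value: any) => value },
    // Or take a real snapshot by copying the bytes explicitly.
    copied_data: { serialize: (value: Uint8Array) => value.slice() },
  };
}
```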
Ping @maartenbreddels - I think we should be okay with merging this after all. If you didn't care about making a copy, you could just give an identity function as a serializer. |
Hmm, I don't quite follow when […] |
Great question. When someone calls for a sync (e.g., with […]), […]. Does that answer your question? |
OK, thanks, that explains a lot. I think I understand the issue now. If you don't give a serializer, it will try to clone your data, and fail if there is a binary buffer (that's fine I guess). In that case you have to give a serializer, which could just do nothing. So in effect the default serializer is a clone using JSON.stringify/JSON.parse; if you don't want cloning, you need to specify so. I think that is good default behaviour. |
Exactly! Do you mind looking this over and/or trying out this PR and giving me thumbs up/down? It's a fundamental enough change that I'm hesitant to merge it without someone else looking at it. |
Also looking at it today
|
Will do, give me 24h
(from mobile phone)
|
ping @SylvainCorlay, @maartenbreddels :) |
Give me 4 more hours! 😊
(from mobile phone)
|
No pressure! |
jupyter-js-widgets/src/widget.ts
Outdated
   * have toJSON called if possible, and the final result should be a
   * primitive object that is a snapshot of the widget state that may have
   * binary array buffers.
   */
  serialize(state) {
    const serializers = (this.constructor as typeof WidgetModel).serializers || {};
    for (const k of state) {
      if (serializers[k] && serializers[k].serialize) {
        state[k] = (serializers[k].serialize)(state[k], this);
      } else {
        // the default serializer just deep-copies the object
        // TODO: this won't work if the object is a primitive object with binary buffers!
        // How should we handle those? One way is to declare a serializer for fields
        // that could have binary that copies what fields it can, and makes a decision
        // about whether to copy the ArrayBuffer
        state[k] = JSON.parse(JSON.stringify(state[k]));
Can we maybe have a warning here? Say, when an exception occurs, give a hint about what may have gone wrong?
There will be an error from the JSON engine. However, we can catch it and give it some context.
|
Controller works, the code looks good, ipyvolume doesn't break. The issue in #1195 is still present, though. I can kind of reproduce it, though not easily, but we should fix that before 7.0: it's a large performance penalty, since it seems to send back binary buffers especially (for some reason). |
Yes, exactly; something like console.error('serializing failed for ...') and then rethrowing the exception would be helpful already. |
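A sketch of that suggestion (not necessarily the exact code that was merged):

```ts
function defaultSerialize(key: string, value: any): any {
  try {
    // Default snapshot: cheap deep copy via a JSON round trip.
    return JSON.parse(JSON.stringify(value));
  } catch (e) {
    // Give the error some context, then rethrow it.
    console.error(`Error serializing widget attribute '${key}'`, e);
    throw e;
  }
}
```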
Interesting: JSON.stringify already works for ArrayBuffers:
|
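For reference, a rough illustration of that behavior (my own example, not the one originally posted): stringify does not throw on binary values, but what survives the round trip differs between a typed array and a bare ArrayBuffer.

```ts
const typed = new Uint8Array([1, 2, 3]);
console.log(JSON.stringify(typed));              // '{"0":1,"1":2,"2":3}'
console.log(JSON.stringify(typed.buffer));       // '{}' (the bytes are lost)
console.log(JSON.parse(JSON.stringify(typed)));  // a plain object, not a Uint8Array
```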
I added a message for context for serialization errors. |
Excellent 👍 |