
Make serializers synchronous and guarantee to return a snapshot of the data #1270

Merged · 2 commits · Apr 12, 2017

Conversation

jasongrout
Member

WIP, as this causes problems for binary arrays - basically we’ll need an explicit serializer on attributes that have binary arrays now, which can choose to implement the snapshot however they think best. The default serializer will make a copy.

This is work towards solving fundamental problems exposed at #1044.

CC @maartenbreddels - this affects the binary serialization.

@maartenbreddels
Member

Can you explain a bit more what the problem is?

@jasongrout
Member Author

jasongrout commented Apr 7, 2017

This is buried in #1044. Because we have message throttling, state sync messages are not always sent immediately when save_changes is called. The problem is that we serialize the state attributes we are syncing when we send the message, rather than when we request the save. This means that if we sync some attributes, then change the object, the sync message could be sent after the change, which means we sent the wrong values.

What this PR does is serialize the object to get a snapshot of the synced attributes when the save is requested. That means that the serializers must return a snapshot of the state that won't change. It's easy to accomplish this with JSONable values - just stringify and parse to get a cheap deep copy. However, if there are binary arrays in the json state, there's no way to get a snapshot. I imagine that some libraries will not do a snapshot of binary data, but will just sync the current state of the binary array and hope for the best.
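The JSON round-trip snapshot can be sketched like this (a minimal illustration, not the widget code; `snapshot` is a hypothetical helper name):

```javascript
// Minimal sketch of the snapshot idea: a cheap deep copy via JSON
// round-tripping. Works for JSON-able state; loses typed arrays.
function snapshot(state) {
    return JSON.parse(JSON.stringify(state));
}

const state = { value: 5, items: [1, 2, 3] };
const saved = snapshot(state);
state.value = 99;          // a later mutation...
console.log(saved.value);  // ...does not affect the snapshot: 5
```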

I suppose another approach is to make the state sync call explicitly just be a request that the future current state of the object be synced at some point in the future. This means that we then have to be careful to never leave the state of the object in some inconsistent intermediate state, which sounds tenuous at best.

@maartenbreddels
Member

Actually, can't we just use the code that I wrote for extracting the buffers to do this? I spent effort not to clone the whole thing, but if we rip that part out, it clones everything (except the arrays...)

@jasongrout
Member Author

Sure, the JSON parse/stringify is just an easy way to deep copy an object. We could use a different deep-copy implementation instead, though it's likely to be much slower (maybe we should time it, though). I think the underlying issue with using it to do a deep copy is that we would then be condoning this sort of async bug as expected behavior for binary data.

(I wish we had simple copy-on-write ArrayBuffers, so we could get a real snapshot of the binary data.)

@maartenbreddels
Member

Is it an option to remove the throttling, or do it differently?

@jasongrout
Member Author

Actually, can't we just use the code that I wrote for extracting the buffers to do this? I spent effort not to clone the whole thing, but if we rip that part out, it clones everything (except the arrays...)

and officially sanction this async problem for binary arrays?

(I wish we had some sort of copy-on-write binary view of an array. I just looked into the Chrome source, and if I understand it correctly, they don't implement copy-on-write copies: https://cs.chromium.org/chromium/src/v8/src/builtins/builtins-arraybuffer.cc?l=281 )
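For context, `TypedArray.prototype.slice` makes an eager byte copy rather than a copy-on-write view, which is why there is no cheap way to get a true snapshot of binary data:

```javascript
// slice() on a typed array eagerly copies the underlying bytes
// (O(n) time and memory), so the copy is fully independent --
// there is no copy-on-write sharing to exploit.
const buf = new Int16Array([1, 2, 3, 4]);
const copy = buf.slice();
buf[0] = 42;
console.log(copy[0]);  // still 1: the copy is independent
```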

@jasongrout
Member Author

Is it an option to remove the throttling, or do it differently?

Good question. I think it makes sense to have some sort of throttling implemented somewhere. A typical use case is dragging a slider, which may cause a lot of computation to be done, maybe drawing a plot. The nice thing about the pending-messages throttle is that it dynamically adjusts for the delay - sync messages are sent exactly as often as the kernel can handle them.

What is the problem with snapshotting the data the widget wants to sync?

@maartenbreddels
Member

maartenbreddels commented Apr 7, 2017

Actually, the throttling/debouncing is something I raised a while ago in #663 and recently updated with some code (on the Python side). Maybe it's something that needs to go into the interact decorator, and we just handle this purely on the Python side.

@maartenbreddels
Member

What is the problem with snapshotting the data the widget wants to sync?

We could; it will take up memory, but maybe we should accept that.

@jasongrout
Member Author

We could; it will take up memory, but maybe we should accept that.

That's where we can maybe look the other way if someone chooses not to hand us back a snapshot of the state. Or rather, widget authors can decide whether they really need to hand back a copy or not. But by default, we make a copy.

@jasongrout
Member Author

The end result is that if you have an attribute that has binary data, and you don't care about this snapshotting business, you just declare that attribute to have the identity serializer, i.e., hand back the attribute as-is.
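A minimal sketch of the identity-serializer idea (the names and the shape of the serializers map here are illustrative, not the library's exact API):

```javascript
// An identity serializer hands the attribute back as-is, opting that
// attribute out of the default snapshot copying. The caller accepts
// the async-mutation risk for binary data in exchange for no copy.
const identity = {
    serialize: (value) => value,
};

// e.g. attached to a model's serializers map (illustrative):
const serializers = {
    data: identity,  // 'data' holds binary arrays
};

const arr = new Float32Array([1.5, 2.5]);
console.log(serializers.data.serialize(arr) === arr);  // true: same object, no copy
```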

@jasongrout
Member Author

1. Actually, thinking about it, I don't think there will really be that much change after all. Already, if you have binary data, you'll probably need a custom serializer that understands how to package up that binary data, so the automatic copying is not done with the current PR. Whether you actually make a copy or not is on your head (it's correct to make a copy, but maybe it's okay to fudge it a bit...).

2. This is the sort of thing I was talking about on the original binary PR: that we should just treat every message we split up as a copy. However, your implementation will work even if people fudge things a bit and don't return copies. It's maybe a good thing to be robust against that.

@jasongrout
Member Author

Ping @maartenbreddels - I think we should be okay with merging this after all. If you didn't care about making a copy, you could just give an identity function as a serializer.

@maartenbreddels
Member

Hmm, I don't quite follow when send_sync_message and when serialize are being used; can you explain this a bit?

@jasongrout
Member Author

jasongrout commented Apr 10, 2017

Great question. When someone calls for a sync (e.g., with save_changes()), the sync method gets called. Inside the sync method, we first serialize a message (which should snapshot the current state of the attributes we are going to sync). Then we check whether there are any pending sync messages still in flight (i.e., a sync message that was sent, but for which we haven't received a corresponding execution status idle message). If there is no pending sync message still being processed by the kernel, we directly send the message (calling send_sync_message). If there is a pending sync message still being processed by the kernel, we store the serialized, snapshotted attributes in this.msg_buffer. When we get back a status idle message for the pending message, we call send_sync_message directly to send the message we've been saving up: https://github.com/jasongrout/ipywidgets/blob/20e00848139a83659f80cf027d313cdf6ac00adf/jupyter-js-widgets/src/widget.ts#L243.
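The flow described above can be sketched as follows (illustrative names, not the actual widget.ts implementation):

```javascript
// Sketch of the pending-message throttle: snapshot the attributes at
// sync time; if a message is still in flight, buffer only the latest
// snapshot and send it when the kernel reports idle.
class Syncer {
    constructor(send) {
        this.send = send;       // transport function
        this.pending = false;   // a sync message is in flight
        this.msgBuffer = null;  // latest snapshot waiting to go out
    }
    sync(attributes) {
        const msg = JSON.parse(JSON.stringify(attributes)); // snapshot now
        if (this.pending) {
            this.msgBuffer = msg;  // coalesce: keep only the latest snapshot
        } else {
            this.pending = true;
            this.send(msg);
        }
    }
    onIdle() {  // kernel reports idle for the in-flight message
        if (this.msgBuffer !== null) {
            const msg = this.msgBuffer;
            this.msgBuffer = null;
            this.send(msg);  // still pending: wait for the next idle
        } else {
            this.pending = false;
        }
    }
}
```

Note how intermediate states are dropped: three rapid syncs produce only two messages, the first and the last.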

Does that answer your question?

@maartenbreddels
Member

Ok, thanks, that explains a lot. I think I understand the issue now. If you don't give a serializer, it will try to clone your data, and fail if there is a binary buffer (that's fine, I guess). In that case you have to give a serializer, which could just do nothing. So in effect the default serializer is a clone using JSON.stringify/JSON.parse; if you don't want cloning, you need to say so. I think that is good default behaviour.

@jasongrout
Member Author

Exactly! Do you mind looking this over and/or trying out this PR and giving me thumbs up/down? It's a fundamental enough change that I'm hesitant to merge it without someone else looking at it.

@jasongrout jasongrout modified the milestone: 7.0 Apr 11, 2017
@SylvainCorlay
Member

SylvainCorlay commented Apr 11, 2017 via email

@maartenbreddels
Member

maartenbreddels commented Apr 11, 2017 via email

@jasongrout
Member Author

ping @SylvainCorlay, @maartenbreddels :)

@maartenbreddels
Member

maartenbreddels commented Apr 12, 2017 via email

@jasongrout
Member Author

Give me 4 more hours! 😊

No pressure!

```javascript
// How should we handle those? One way is to declare a serializer for fields
// that could have binary that copies what fields it can, and makes a decision
// about whether to copy the ArrayBuffer
state[k] = JSON.parse(JSON.stringify(state[k]));
```
Member

Can we maybe have a warning here? Say, when an exception occurs, give a hint about what may have gone wrong.

@jasongrout
Member Author

jasongrout commented Apr 12, 2017 via email

@maartenbreddels
Member

The controller works, the code looks good, and ipyvolume doesn't break. The issue in #1195 is still present, though. I can kind of reproduce it (not easily), but we should fix that before 7.0; it is a large performance penalty, since it sends back (for some reason) especially binary buffers, it seems.

@maartenbreddels
Member

maartenbreddels commented Apr 12, 2017

Yes, exactly: saying console.error('serializing failed for ...') and then rethrowing the exception would already be helpful.
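A sketch of that suggestion, wrapping the JSON round-trip in a hypothetical `snapshotAttribute` helper (illustrative name, not the PR's actual code):

```javascript
// Log which attribute failed to serialize, then rethrow so the
// failure still surfaces to the caller instead of being swallowed.
function snapshotAttribute(state, k) {
    try {
        return JSON.parse(JSON.stringify(state[k]));
    } catch (e) {
        console.error(`Could not serialize attribute '${k}'`, e);
        throw e;
    }
}
```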

@jasongrout
Member Author

Interesting: JSON.stringify already works for typed arrays:

```javascript
> JSON.stringify(new Int16Array(4))
'{"0":0,"1":0,"2":0,"3":0}'
```

@jasongrout
Member Author

I added a message giving context for serialization errors.

@maartenbreddels
Member

Excellent 👍

@jasongrout jasongrout merged commit fe6a8f1 into jupyter-widgets:master Apr 12, 2017