comms: handle known-resolved promise in remote-to-kernel message correctly #2640

warner · 2021-03-15T00:17:10Z

I suspect that we currently suffer from a bug due to the changes we landed in #2516, but writing a test is the way to make sure. The scenario I think we don't handle correctly starts with a comms vat that has two remotes (remoteA and remoteB) and the local kernel. Then:

remoteA sends a message to remoteB, which references a new (exported) promise
- comms adds a promise table entry for it, with A as the decider, and B as a subscriber
- note that the local kernel is not a subscriber (yet)
remoteA resolves the promise
- comms marks the promise table entry as resolved, and records the resolution data
- comms sends the resolution to remoteB (because it is a subscriber)
- the ACK (Cope with asynchrony in remote retirement of promise IDs #2509) from remoteB has not been received yet
comms receives a message from remoteB, for the local kernel, which references the promise
- remoteB must have emitted this message before it received the resolution, else it would make a new (ephemeral) promise ID
- the messages crossed on the comms-remoteB wire
comms translates the message into kernel-facing identifiers
- the promise is known to be resolved
- comms creates an ephemeral promise ID to send to the kernel
- comms does syscall.send into the local kernel, referencing the new vpid
- BUG: comms is responsible for promptly doing syscall.resolve for that vpid, but I suspect it does not
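
The sequence above can be modelled with a toy promise table. This is only a sketch of the expected behaviour, not the comms vat's real data structures: `makeToyComms`, the entry fields, and the ID strings (`lp1`, `o-1`, `p+40`) are all hypothetical names chosen for illustration.

```javascript
// Toy model of the scenario; every name here is hypothetical, not the comms vat API.
function makeToyComms() {
  const promises = new Map(); // local promise ID -> table entry
  const syscalls = [];        // kernel-facing syscalls, in the order they were made
  let nextVpid = 40;          // arbitrary starting number for ephemeral vpids

  return {
    syscalls,
    // remoteA -> remoteB message that introduces a new promise
    addPromise(id, decider, subscriber) {
      promises.set(id, { decider, subscribers: [subscriber], resolved: false });
    },
    // remoteA resolves the promise; comms records the resolution data
    resolvePromise(id, data) {
      const entry = promises.get(id);
      entry.resolved = true;
      entry.data = data;
    },
    // remoteB -> kernel message that references the already-resolved promise
    deliverToKernel(target, method, id) {
      const entry = promises.get(id);
      const vpid = `p+${nextVpid}`;
      nextVpid += 1;
      syscalls.push(['send', target, method, vpid]);
      if (entry.resolved) {
        // The step the issue suspects is missing: resolve the ephemeral vpid
        // promptly, and only after the send that introduced it.
        syscalls.push(['resolve', vpid, entry.data]);
      }
      return vpid;
    },
  };
}

const comms = makeToyComms();
comms.addPromise('lp1', 'remoteA', 'remoteB');
comms.resolvePromise('lp1', 'resolution-data');
comms.deliverToKernel('o-1', 'foo', 'lp1');
console.log(comms.syscalls);
```

In this model the kernel sees a send followed by a resolve for the same vpid. The suspected bug corresponds to the `if (entry.resolved)` branch never running.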

Previously, this prompt syscall.resolve was handled by a Promise.resolve().then(_ => syscall.resolve(...)) in provideKernelForLocal, as the promise ID was translated into a vpid. The translation was trivial, because we used the same IDs for both "local space" and "kernel-facing space", but it provided a convenient time to check the resolution status, and arrange to tell the kernel about it. We used Promise.resolve() to ensure the syscall.resolve didn't happen until after the syscall.send, because the kernel is certainly not prepared to receive a resolution before the export.
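
The ordering trick described above can be illustrated in isolation. This is a minimal sketch assuming a plain array stands in for the kernel's view of syscalls; `translateThenSend` and its parameters are hypothetical and do not reproduce provideKernelForLocal.

```javascript
// Sketch of the "notify after translate" ordering; names are hypothetical.
function translateThenSend(log, vpid, resolution) {
  // During translation: schedule the resolve for a later microtask...
  Promise.resolve().then(() => log.push(`resolve ${vpid} ${resolution}`));
  // ...so the send, issued synchronously afterwards, is recorded first.
  log.push(`send referencing ${vpid}`);
}

const log = [];
translateThenSend(log, 'p+40', 'data');
console.log(log); // at this point only the send has been recorded
Promise.resolve().then(() => console.log(log)); // by now the resolve has followed
```

Because the callback passed to `.then` runs only after the current synchronous turn finishes, the resolve can never be recorded ahead of the send that exports the vpid.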

This "notify after translate" clause overlapped with changes (maybe from #2358?) that also did a syscall.resolve for ancillary promises after translating a remote resolve message. The overlap caused duplicate resolution syscalls for the remote resolve case. Removing the one in provideKernelForLocal reduced the number of resolves in the remote-resolve case from 2 to 1, but I think it would also reduce the number of resolves in the remote-send case from 1 to 0, and that will be a bug.

The task here is to write a test case that exercises the scenario above (probably added to test-comms.js, and following its pattern of "real comms vat, fake everything else", rather than attempting to build a test-message-patterns.js approach). Then see if it fails, and fix it. I think the fix will come out of the work in #2363, where we decide what follow-on resolve messages need to be sent after translation, rather than during.
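
For the "fake everything else" side of such a test, a recording syscall object plus a small checker might look like the following. This is a hedged sketch: `makeFakeSyscall` and `unresolvedAfterSend` are hypothetical helpers, not taken from test-comms.js, and the simplified `send`/`resolve` signatures are assumptions that do not mirror the real syscall interface.

```javascript
// Hypothetical test helpers with simplified, assumed signatures.
function makeFakeSyscall() {
  const log = [];
  return {
    log,
    send: (target, method, vpids) => log.push({ type: 'send', target, method, vpids }),
    resolve: (vpid, data) => log.push({ type: 'resolve', vpid, data }),
  };
}

// Return the vpids that a send introduced but that no later resolve settled.
// A resolve recorded before the send does not count, so that case is flagged too.
function unresolvedAfterSend(log) {
  const pending = new Set();
  for (const entry of log) {
    if (entry.type === 'send') {
      entry.vpids.forEach(vpid => pending.add(vpid));
    } else if (entry.type === 'resolve') {
      pending.delete(entry.vpid);
    }
  }
  return [...pending];
}

const syscall = makeFakeSyscall();
syscall.send('o-1', 'foo', ['p+40']);
console.log(unresolvedAfterSend(syscall.log)); // the vpid is still pending here
syscall.resolve('p+40', 'data');
console.log(unresolvedAfterSend(syscall.log)); // nothing is pending any more
```

A test for the scenario would drive the real comms vat through the crossed-message sequence and then assert that the checker reports no pending vpids.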

I'm going to throw this at @FUDCo because he's already working in this space, but feel free to grab me for help writing the test case.

FUDCo · 2021-04-10T00:20:33Z

Closed by #2752

warner added the bug and SwingSet labels Mar 15, 2021

warner assigned FUDCo Mar 15, 2021

FUDCo mentioned this issue Mar 29, 2021 in "Support promise retirement in comms vat" #2752 (merged)

FUDCo closed this as completed Apr 10, 2021
