cross-machine calls in vm #803

Closed
dckc opened this issue Mar 29, 2020 · 16 comments
Labels
wontfix (This will not be worked on), xsnap (the XS execution tool)

Comments

@dckc
Member

dckc commented Mar 29, 2020

A goal arising from discussion with @dtribble @erights @warner @michaelfig and co. is to use some XS mechanism that lets each vat run in its own unit of resource allocation and be terminated independently. Meanwhile, the kernel has to be able to call them synchronously (and vice versa?).

I started working on a special host object that contained a new xsMachine. It was going OK, but then I wondered if I'm reinventing Workers poorly and yup, it sure looks like it.

One snag with just using Workers out of the box is that they're currently tied to a windowing system for debugging and such. When I try to build the worker example on our xs-cli-lin platform, I get:

# cc pcWorker.c.o
/home/connolly/projects/moddable/modules/base/worker/pcWorker.c:2:10: fatal error: screen.h: No such file or directory
 #include "screen.h"

This is perhaps a duplicate of #516; feel free to close this if you'd rather I took my notes there. Or perhaps I should raise a "Workers tied to windowing screen" issue against the Moddable SDK?

https://github.com/dckc/moddable/tree/xmachine a57a6e9

@michaelfig
Member

Hi! IMO, the dependency on the windowing system should be stubbed out somehow in xs-cli-lin. This is because we'll still want to capture logs and put them somewhere special.

It may be better to narrow the interface somewhat and have the worker depend on that restricted API just for the things it needs. Not sure how much extra stuff is in screen.h, as I haven't looked at it, but if that seems like a better approach, I'd suggest raising it with moddable.

@dckc
Member Author

dckc commented Mar 29, 2020 via email

@dckc
Member Author

dckc commented Apr 4, 2020

how to fxAbort() just one worker?

Stubbing out the screen dependency was reasonably straightforward (1f45af6).

Then I built an example with alice and bob vats that play nice and mallet that blows up the stack. The stack overflow code called fxAbort which I had just defined as exit(1). This of course took down alice and bob along with mallet.

So I'm trying to work out how to terminate just mallet in fxAbort. (I wonder if this goes against proposal-oom-fails-fast.) There's code that maintains a list of workers, and it doesn't seem to accommodate terminating a worker preemptively. I wonder about doing something at the gio thread level.

https://github.com/dckc/moddable/tree/xmachine f085562
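
To illustrate the difference between exiting the process and ending a single thread, here is a minimal sketch using POSIX threads rather than the GLib/GIO threads mentioned above. The `Vat` struct, `vat_abort`, and `run_vats` are hypothetical names, not part of XS or the Moddable SDK. The only point is that `pthread_exit` ends the calling thread, whereas `exit(1)` would end every thread in the process.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

#define MAX_VATS 8

/* Hypothetical stand-in for a vat; not an XS or Moddable type. */
typedef struct {
    const char *name;
    int misbehaves;   /* 1 = simulate a fatal error such as stack overflow */
    int finished;     /* set to 1 only if the vat completes its work */
} Vat;

/* Stand-in for a host abort hook: end only the calling thread.
   Calling exit(1) here instead would take down every other vat too. */
static void vat_abort(Vat *vat)
{
    fprintf(stderr, "aborting %s only\n", vat->name);
    pthread_exit(NULL);
}

static void *vat_main(void *arg)
{
    Vat *vat = arg;
    if (vat->misbehaves)
        vat_abort(vat);   /* does not return */
    vat->finished = 1;
    return NULL;
}

/* Run each vat on its own thread and return how many finished. */
int run_vats(Vat *vats, int count)
{
    pthread_t threads[MAX_VATS];
    int i, done = 0;

    assert(count >= 0 && count <= MAX_VATS);
    for (i = 0; i < count; i++)
        pthread_create(&threads[i], NULL, vat_main, &vats[i]);
    for (i = 0; i < count; i++) {
        pthread_join(threads[i], NULL);
        done += vats[i].finished;
    }
    return done;
}
```

With alice and bob well behaved and mallet misbehaving, `run_vats` reports two finished vats and the process keeps running. Depending on the toolchain, this may need to be built with `-pthread`.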

@erights
Member

erights commented Apr 4, 2020

Hi @dckc, good question. With separate vats being separate xs machines, with separate resource budgets, and only asynchronously coupled, it's clear that these are uopts --- units of preemptive termination. With them being arbitrarily synchronously coupled, this would be much trickier or impossible. A single call stack could thread through an arbitrary number of these, and it's not clear what to do when one of them terminates.

What we actually have is an odd mix, a solvable special case between these two: Each vat is directly coupled only to the swingset kernel. Via the swingset kernel, each vat is only asynchronously coupled to every other vat. The synchronous coupling between the swingset kernel and any one vat is extremely limited and designed to support crash, revival, and deterministic replay. But we've got a layering problem. We don't want the xs engine to know about or make any special case for swingset.
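
As a rough illustration of the asynchronous coupling described here, the following sketch models a kernel run queue in plain C. It is not SwingSet code: `Kernel`, `kernel_send`, `kernel_terminate`, and `kernel_run` are hypothetical names, and the limited synchronous kernel-to-vat coupling is deliberately left out. It only shows that when vats reach each other solely through queued messages, terminating one vat leaves deliveries to the others unaffected.

```c
#include <assert.h>

#define MAX_VATS 4
#define QUEUE_CAP 16

/* Hypothetical message: the target vat and an opaque payload. */
typedef struct { int target; int payload; } Message;

typedef struct {
    int alive[MAX_VATS];       /* 1 while the vat may receive deliveries */
    int delivered[MAX_VATS];   /* deliveries each vat has processed */
    Message queue[QUEUE_CAP];  /* simple FIFO, QUEUE_CAP sends in total */
    int head, tail;
} Kernel;

void kernel_init(Kernel *k)
{
    int i;
    for (i = 0; i < MAX_VATS; i++) {
        k->alive[i] = 1;
        k->delivered[i] = 0;
    }
    k->head = k->tail = 0;
}

/* A vat never calls another vat; it asks the kernel to queue a send. */
int kernel_send(Kernel *k, int target, int payload)
{
    assert(target >= 0 && target < MAX_VATS);
    if (k->tail == QUEUE_CAP)
        return 0;              /* queue full: reject the send */
    k->queue[k->tail].target = target;
    k->queue[k->tail].payload = payload;
    k->tail++;
    return 1;
}

void kernel_terminate(Kernel *k, int vat)
{
    assert(vat >= 0 && vat < MAX_VATS);
    k->alive[vat] = 0;
}

/* Process queued messages one at a time; return how many were delivered.
   Messages for a terminated vat are dropped. */
int kernel_run(Kernel *k)
{
    int count = 0;
    while (k->head < k->tail) {
        Message m = k->queue[k->head++];
        if (!k->alive[m.target])
            continue;
        k->delivered[m.target]++;
        count++;
    }
    return count;
}
```

Because a send only queues a message, no sender's call stack ever extends into the receiver, which is what makes each vat a plausible unit of termination in this model.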

Attn @warner @Chris-Hibbert @FUDCo @phoddie

@phoddie

phoddie commented Apr 4, 2020

@dckc - XS workers have no dependency on a window system at all. The PC implementation of them is built to run in the "screen test" host simulator, which is why you see the word screen. If you look at the microcontroller implementation of workers in modWorker.c, you will see it is quite minimal.

@phoddie

phoddie commented Apr 4, 2020

@dckc - There's nothing about fxAbort that requires it to take down all machines. fxAbort is a host function precisely so a host can define the behavior it wants in that case. Our usual implementation calls exit because that's simple and safe for a microcontroller. But you can do whatever you need there.

@dckc
Member Author

dckc commented Apr 5, 2020 via email

@erights
Member

erights commented Apr 5, 2020

The mixed stack is an issue. @warner does it make sense to start with the async-only-coupled vats, to get the separate termination working there, before worrying about accommodating swingSet's very limited synchrony?

@phoddie

phoddie commented Apr 5, 2020

But the only argument is the xsMachine*. How do I find the worker data structure to terminate it?

That's up to you. The invocation of fxAbort doesn't know what you need from the VM. Some options:

  • In your worker host implementation, you could maintain a list of workers and their associated VMs.
  • You could store information about the worker using xsSetContext when creating the worker VM and then retrieve it with xsGetContext in fxAbort.
  • In a worker there is always a global self. If it is a host object (as in modWorker.c), you should be able to safely retrieve self inside fxAbort and then access its host data with xsGetHostData. As with xsSetContext, you can store whatever information your host needs.
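
The first option above, a host-maintained list, might be sketched as follows. This is self-contained C with stand-in types: `Machine` and `Worker` are hypothetical placeholders for xsMachine and the host's worker record, and `registry_add` and `registry_find` are illustrative names rather than XS or Moddable functions.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for illustration only; the real types live in the XS
   headers and the host's worker module. */
typedef struct Machine { int id; } Machine;
typedef struct Worker {
    const char *name;
    Machine *machine;
} Worker;

#define MAX_WORKERS 8
static Worker *registry[MAX_WORKERS];
static size_t registry_count = 0;

/* Record a worker when the host creates its VM. */
int registry_add(Worker *worker)
{
    assert(worker != NULL && worker->machine != NULL);
    if (registry_count == MAX_WORKERS)
        return 0;
    registry[registry_count++] = worker;
    return 1;
}

/* Find the worker that owns `machine`. NULL means the machine is not
   a registered worker VM (for example, the main machine). */
Worker *registry_find(const Machine *machine)
{
    size_t i;
    for (i = 0; i < registry_count; i++)
        if (registry[i]->machine == machine)
            return registry[i];
    return NULL;
}
```

An abort hook could call `registry_find` with the machine it was given and terminate only the matching worker. If workers run on their own threads, a real registry would also need a lock around both functions, which this sketch omits.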

@dckc
Member Author

dckc commented Apr 11, 2020

Thanks to @michaelfig, I'm over a hump. We can terminate mallet independently of alice and bob.

cd49457e https://github.com/dckc/moddable/tree/xmachine/examples/js/xvat

(git hash is not stable; IOU better commit message and I expect to force-push when I do)

@phoddie

phoddie commented Apr 11, 2020

Cool! How did you get that working?

@dckc
Member Author

dckc commented Apr 11, 2020

In lin_xs_cli.c, fxAbort checks to see if the->context is a worker (i.e. not null, since our main machine has a null context) and if so, calls worker_abort:

void worker_abort(xsMachine *the, txWorker *worker)
{
    // NOTE: the = the worker machine, not the owner as in _terminate
    GMainLoop* main_loop = worker->main_loop;
    GMainContext* main_context = g_main_loop_get_context(main_loop);
    // %p expects void *, so cast the typed pointers
    fprintf(stderr, "@@!!!abort worker=%p main_loop=%p main_context=%p\n",
            (void *)worker, (void *)main_loop, (void *)main_context);
    fxWorkerTerminate(worker);
    // TODO: notify owner
    g_main_loop_unref(main_loop);
    g_main_context_pop_thread_default(main_context);
    g_main_context_unref(main_context);
    fprintf(stderr, "about to kill thread...\n");
    g_thread_exit(NULL);  // ends only the calling (worker) thread
    fprintf(stderr, "already killed this thread; this should not print!\n");
}
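
The check described above, where a null context means the main machine and anything else means a worker, can be sketched in isolation like this. `Machine`, `Worker`, `choose_abort_action`, and `host_abort` are stand-ins invented for the sketch; the actual fxAbort in lin_xs_cli.c is not shown in this thread, and falling back to exiting the process for the main machine is an assumption based on the earlier exit(1) definition.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for xsMachine: only the field this sketch needs. */
typedef struct Machine { void *context; } Machine;

/* Stand-in for the host's worker record. */
typedef struct Worker { int aborted; } Worker;

typedef enum { ABORT_PROCESS, ABORT_WORKER } AbortAction;

/* Decide what the abort hook should do for this machine.
   A null context marks the main machine; any other value is
   assumed to point at the machine's worker record. */
AbortAction choose_abort_action(const Machine *the)
{
    assert(the != NULL);
    return the->context == NULL ? ABORT_PROCESS : ABORT_WORKER;
}

/* Hypothetical abort hook. A real host would call exit(1) or
   worker_abort() here; this version only records the decision so
   the logic can be exercised without ending the process. */
AbortAction host_abort(Machine *the)
{
    AbortAction action = choose_abort_action(the);
    if (action == ABORT_WORKER)
        ((Worker *)the->context)->aborted = 1;
    return action;
}
```

Keeping the decision separate from the termination makes the policy easy to test on its own, since the termination path itself never returns.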

@phoddie

phoddie commented Apr 13, 2020

Makes sense.

@dckc
Member Author

dckc commented Apr 15, 2020

The mixed stack is an issue. @warner does it make sense to start with the async-only-coupled vats ...?

If so, the analogy to worker becomes even stronger: onmessage and postmessage look an awful lot like dispatch and syscall.
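
One way to picture the analogy, assuming the async-only coupling discussed above, is a pair of endpoints that exchange queued messages. Everything here is hypothetical (`Endpoint`, `post`, `drain`, `echo_plus_one`); it is neither the Worker API nor the SwingSet vat interface, just a C model of the shape they share.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define INBOX_CAP 8

typedef struct Endpoint Endpoint;
typedef void (*Handler)(Endpoint *self, int message);

struct Endpoint {
    Handler on_message;    /* worker: onmessage; vat: dispatch */
    Endpoint *peer;        /* where post() sends: the owner or kernel */
    int inbox[INBOX_CAP];
    int inbox_len;
};

/* worker: postMessage; vat: syscall. Only queues on the peer;
   the receiver's handler is never invoked from here. */
int post(Endpoint *from, int message)
{
    Endpoint *to = from->peer;
    assert(to != NULL);
    if (to->inbox_len == INBOX_CAP)
        return 0;
    to->inbox[to->inbox_len++] = message;
    return 1;
}

/* Invoke the handler once per queued message; return the count.
   The inbox is copied first so handlers may safely queue more. */
int drain(Endpoint *e)
{
    int pending[INBOX_CAP];
    int i, n = e->inbox_len;

    memcpy(pending, e->inbox, (size_t)n * sizeof pending[0]);
    e->inbox_len = 0;
    for (i = 0; i < n; i++)
        if (e->on_message != NULL)
            e->on_message(e, pending[i]);
    return n;
}

/* Example handler: answer each delivery with message + 1. */
void echo_plus_one(Endpoint *self, int message)
{
    post(self, message + 1);
}
```

In this model the kernel posts a delivery to the vat, the vat's handler runs when its inbox is drained, and the reply lands in the kernel's inbox without either side calling into the other's stack.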

dckc self-assigned this Jan 16, 2021
@dckc
Member Author

dckc commented Jan 16, 2021

We don't need this for the plan in #2107; it may be nice to have eventually, but if we have some way of signalling that issues are postponed to a later release, this is a good candidate.

dckc added the xsnap (the XS execution tool) label Apr 28, 2021
@dckc
Member Author

dckc commented Jul 22, 2021

This is overtaken by events such as adding the vat warehouse #2277.

dckc closed this as completed Jul 22, 2021
dckc added the wontfix (This will not be worked on) label Sep 9, 2021
4 participants