Internals

General Overview

The library implementation is in sandbox.mjs.

The sandbox has two modes: 'live' and 'replay'. In live mode, all IO is recorded in the IO journal (#ioJournal). In replay mode, the journal is replayed against the sandbox to restore the state of the sandbox. When it reaches the end of the journal, it switches back to live mode.

Note that there is no need to separately restore the sandboxed scripts, because evaluating a script involves passing the script to the sandbox, which is recorded in the journal. So the script is automatically restored when the journal is replayed.

The replay cursor (#journalReplayCursor) represents the state of replay. Each action in the journal is sent to the sandbox to execute.
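The mode switch described above can be sketched roughly as follows. This is a minimal illustration, not the code in sandbox.mjs: the class name `JournalSketch`, its fields, and its methods are hypothetical, and `cursor` merely stands in for #journalReplayCursor.

```javascript
// Hypothetical sketch: only the mode names 'live' and 'replay' come from the text.
class JournalSketch {
  constructor(entries = []) {
    this.entries = entries; // recorded IO, oldest first
    this.cursor = 0;        // stands in for #journalReplayCursor
    this.mode = entries.length > 0 ? 'replay' : 'live';
  }

  // Live mode: append the entry to the journal.
  record(entry) {
    if (this.mode !== 'live') throw new Error('cannot record during replay');
    this.entries.push(entry);
  }

  // Replay mode: hand back the next recorded entry and advance the cursor.
  next() {
    if (this.mode !== 'replay') throw new Error('not in replay mode');
    const entry = this.entries[this.cursor++];
    // Reaching the end of the journal switches back to live mode.
    if (this.cursor >= this.entries.length) this.mode = 'live';
    return entry;
  }
}

const replayed = new JournalSketch([{ kind: 'action' }, { kind: 'result' }]);
replayed.next();
replayed.next();
console.log(replayed.mode); // 'live'
```

Once the cursor passes the last entry, new IO is appended to the same journal, so a later restore replays both the old and the new entries.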

Every action is assumed to have a return value. Actions include things like function calls and property-gets, which all return values. So journal entries come in pairs -- an action and a result.

The two entries of a pair may not be directly adjacent to each other. Like HTML opening and closing tags, they may imply a nested structure. In particular, a function call action to the sandbox may trigger other function calls to the host, which would show up in the journal after the initial call action but before its return value.
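To make the nesting concrete, here is a small sketch of what such a journal might look like. The entry shapes (`kind`, `dir`, `what`) and the function names `main` and `readConfig` are invented for illustration; the real journal format may differ.

```javascript
// Illustrative entry shapes only; the real journal format may differ.
const exampleJournal = [
  { kind: 'action', dir: 'to-sandbox', what: 'call main()' },
  { kind: 'action', dir: 'to-host', what: 'call readConfig()' },
  { kind: 'result', dir: 'to-sandbox', what: 'return from readConfig()' },
  { kind: 'result', dir: 'to-host', what: 'return from main()' },
];

// Returns the maximum nesting depth, or throws if the pairs are unbalanced.
function maxNestingDepth(entries) {
  let depth = 0;
  let max = 0;
  for (const entry of entries) {
    if (entry.kind === 'action') {
      depth += 1;
      max = Math.max(max, depth);
    } else {
      if (depth === 0) throw new Error('result without a pending action');
      depth -= 1;
    }
  }
  if (depth !== 0) throw new Error('action without a result');
  return max;
}

console.log(maxNestingDepth(exampleJournal)); // 2
```

The host call made by `main()` sits between the call to `main()` and its result, just as a nested tag sits between its parent's opening and closing tags.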

IO information flowing out of the sandbox theoretically doesn't need to be recorded, but it is recorded anyway so that it can be verified against the journal in replay mode as a layer of safety. If the IO information doesn't match the journal, then either the journal is corrupt or the sandbox is non-deterministic. Non-determinism might in theory result from a change in the engine version. If a sandbox is restored on an engine with different capabilities, for example, its behavior may play out differently. It can also happen if the environment (globalThis) has changed semantically, or if the app is being run in a debugger (the debugger may try to evaluate expressions in the sandbox, which would be recorded in the journal).
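The replay-time check could look something like the sketch below. The helper name `verifyAgainstJournal` and the message shapes are hypothetical, and comparing `JSON.stringify` output is a simplifying assumption: it is sensitive to property order, so it only works if both sides build their messages the same way.

```javascript
// Hypothetical helper: compares an outbound message with the recorded one.
function verifyAgainstJournal(recorded, actual) {
  const expectedJson = JSON.stringify(recorded);
  const actualJson = JSON.stringify(actual);
  if (expectedJson !== actualJson) {
    throw new Error(
      `Replay mismatch: journal has ${expectedJson} but sandbox produced ${actualJson}`
    );
  }
}

// A matching message passes silently.
verifyAgainstJournal(
  { kind: 'result', value: { type: 'primitive', value: 1 } },
  { kind: 'result', value: { type: 'primitive', value: 1 } }
);

// A diverging message indicates a corrupt journal or non-determinism.
try {
  verifyAgainstJournal({ type: 'primitive', value: 1 }, { type: 'primitive', value: 2 });
} catch (err) {
  console.log(err.message);
}
```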

Membrane - Wet and Dry

The membrane has two sides -- the wet side represents the inside of the membrane (inside the sandbox) and the dry side is outside. This is the conventional naming for membranes in JavaScript and is inspired by a living cell which is wet inside and potentially dry outside.

In the POC, a membrane side is represented by the SerializingMembrane class, of which there is a wet instance and a dry instance (the name is probably not appropriate, since two SerializingMembrane instances make up the full membrane). This class is the same for the wet side and the dry side, so the naming convention in the class is to use the terms local and remote to refer to its own side and the other side of the membrane respectively.

For messages sent between the two sides of the membrane, the terms dst and src refer to something in the destination or origin of the message respectively. Since the message moves between sides, "local" and "remote" are not appropriate names here, because these terms change meaning depending on whether it's the sender or the receiver of the message that is using them.

The wet side and dry side of the membrane communicate purely through a communications channel, which is represented by the sendAction function that is injected into each side. This communications channel is pure JSON -- it contains no direct references to functions or objects. The sendAction function is responsible for serializing the action and sending it to the other side of the membrane, as well as recording the action in the journal (or verifying it against the journal in replay mode).
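As a rough sketch of this injection, assuming a synchronous channel and live mode only: `createChannel`, the `receive` callback, and the action shape are all hypothetical; only the name sendAction comes from the text above. Replay-time verification is left out.

```javascript
// Hypothetical wiring; only the name sendAction comes from the text above.
function createChannel(journal, receive) {
  return function sendAction(action) {
    const wire = JSON.stringify(action); // nothing but JSON crosses the boundary
    journal.push(wire);                  // live mode: record the action
    const result = receive(JSON.parse(wire));
    journal.push(JSON.stringify(result)); // the matching result entry
    return result;
  };
}

const channelJournal = [];
// Stand-in for the wet side: doubles the first argument of any action it receives.
const wetSideStub = (action) => ({ type: 'primitive', value: action.args[0].value * 2 });
const sendToWet = createChannel(channelJournal, wetSideStub);

const reply = sendToWet({ kind: 'call', args: [{ type: 'primitive', value: 21 }] });
console.log(reply.value);           // 42
console.log(channelJournal.length); // 2
```

Because the receiver only ever sees the result of `JSON.parse`, it cannot accidentally obtain a direct reference to an object on the sending side.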

The SerializingMembrane.serialize and SerializingMembrane.deserialize functions serialize values to be passed across the communications channel and correspondingly deserialize them on the other side.

Primitives and Symbols

JSON primitives are passed across the boundary in the serialized form { type: 'primitive', value }, where value is the raw primitive.

Well-known symbols are passed across the boundary in the serialized form { type: 'well-known-symbol', name }, where name is an established identifier for this symbol. They are then deserialized on the other side to the equivalent symbol in the remote environment (the remote environment may also be the "future" environment if the journal is being replayed).

User-defined symbols are not supported at present.
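A minimal sketch of these two serialized forms is shown below. The function names `serializeSimple` and `deserializeSimple` are hypothetical, and using the symbol's property name on `Symbol` (for example `iterator`) as the "established identifier" is an assumption. Values such as `undefined` or non-finite numbers are not addressed here.

```javascript
// Assumption: a well-known symbol is identified by its property name on Symbol.
const wellKnownSymbolNames = Object.getOwnPropertyNames(Symbol)
  .filter((name) => typeof Symbol[name] === 'symbol');

function serializeSimple(value) {
  if (typeof value === 'symbol') {
    const name = wellKnownSymbolNames.find((n) => Symbol[n] === value);
    if (name === undefined) throw new Error('user-defined symbols are not supported');
    return { type: 'well-known-symbol', name };
  }
  if (value === null || ['string', 'number', 'boolean'].includes(typeof value)) {
    return { type: 'primitive', value };
  }
  throw new Error(`not a JSON primitive: ${typeof value}`);
}

function deserializeSimple(serialized) {
  switch (serialized.type) {
    case 'primitive':
      return serialized.value;
    case 'well-known-symbol':
      // Resolves to the equivalent symbol in the receiving environment.
      return Symbol[serialized.name];
    default:
      throw new Error(`unknown type: ${serialized.type}`);
  }
}

const onTheWire = JSON.stringify(serializeSimple(Symbol.iterator));
console.log(onTheWire); // {"type":"well-known-symbol","name":"iterator"}
console.log(deserializeSimple(JSON.parse(onTheWire)) === Symbol.iterator); // true
```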

Passing Objects and Functions

Local objects and functions are passed through the membrane in the form { type: 'src-obj', id, objectType }, where objectType is function or object. The ID does not mean anything to the receiver; it is just a unique identifier, generated by the sender, for the object or function. If the receiver then references that object again (e.g. as the target of a property access) it will use the same ID.

To refer to a remote object or function, the form is { type: 'dst-obj', id }, where id is the ID of the object or function as previously generated on the remote side. The remote side has a Map (objectsByLocalId) that it will use to resolve the ID to the actual object or function. The local side also has a map (WeakMap serializedByObject) which essentially keeps track of the remote IDs of received objects, but does so more generally by the serialized form of all objects sent across the membrane, whether local or remote.
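The round trip of an object reference can be sketched as follows. The names objectsByLocalId and serializedByObject are taken from the text; the class `MembraneSideSketch`, the ID counter, and the plain stand-in object (where a real membrane would presumably create a proxy) are hypothetical simplifications.

```javascript
// Sketch of one membrane side. objectsByLocalId and serializedByObject are
// named in the text; everything else here is a hypothetical simplification.
class MembraneSideSketch {
  constructor() {
    this.nextId = 1;
    this.objectsByLocalId = new Map();       // id -> local object or function
    this.serializedByObject = new WeakMap(); // object -> serialized form
  }

  serialize(value) {
    const known = this.serializedByObject.get(value);
    if (known) return known; // already sent or received: reuse its serialized form
    const id = this.nextId++;
    const objectType = typeof value === 'function' ? 'function' : 'object';
    const serialized = { type: 'src-obj', id, objectType };
    this.objectsByLocalId.set(id, value);
    this.serializedByObject.set(value, serialized);
    return serialized;
  }

  deserialize(serialized) {
    if (serialized.type === 'dst-obj') {
      // The sender is referring to one of our own objects by its ID.
      return this.objectsByLocalId.get(serialized.id);
    }
    // 'src-obj': an object owned by the sender. A plain stand-in is used here.
    const standIn = { remoteId: serialized.id, objectType: serialized.objectType };
    this.serializedByObject.set(standIn, { type: 'dst-obj', id: serialized.id });
    return standIn;
  }
}

const wetSide = new MembraneSideSketch();
const drySide = new MembraneSideSketch();
const original = { answer: 42 };

const received = drySide.deserialize(wetSide.serialize(original)); // wet -> dry
const backHome = wetSide.deserialize(drySide.serialize(received)); // dry -> wet
console.log(backHome === original); // true
```

Note that this sketch creates a fresh stand-in each time the same src-obj is deserialized; a fuller version would also cache stand-ins by remote ID so that identity is preserved on the receiving side.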

Control Channel

In general, the local side can only refer to a remote object if it knows its ID, which normally only happens if the remote side has sent it to the local side. But to bootstrap communication, there needs to exist an initial object in the wet side to which you can send a first message. This is called the controlChannel. Instead of using a numeric ID generated by the wet side, the control channel uses a "well known ID" which is the string controlChannel. The setup process artificially injects the control channel object into the membrane.

This controlChannel object currently has one method, which is evaluateCommonJsModule. The dry side can invoke this method through the membrane to execute a CommonJS module in the wet side.
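The bootstrap step might be sketched like this. The ID string controlChannel and the method name evaluateCommonJsModule come from the text; the message shape (`kind`, `target`, `method`, `args`) and the use of `new Function` to evaluate the module are assumptions made for illustration.

```javascript
// Hypothetical wet-side setup. The evaluation strategy is an assumption.
const wetObjectsByLocalId = new Map();

const controlChannel = {
  evaluateCommonJsModule(sourceText) {
    const mod = { exports: {} };
    const run = new Function('module', 'exports', sourceText);
    run(mod, mod.exports);
    return mod.exports;
  },
};

// Injected under a well-known ID instead of a generated numeric one.
wetObjectsByLocalId.set('controlChannel', controlChannel);

// The dry side can address it before any other object has been exchanged.
const firstMessage = {
  kind: 'call-method',
  target: { type: 'dst-obj', id: 'controlChannel' },
  method: 'evaluateCommonJsModule',
  args: [{ type: 'primitive', value: 'module.exports = { double: (x) => x * 2 };' }],
};

const targetObject = wetObjectsByLocalId.get(firstMessage.target.id);
const moduleExports = targetObject[firstMessage.method](firstMessage.args[0].value);
console.log(moduleExports.double(21)); // 42
```

In the real membrane the returned exports object would itself be serialized as a src-obj and sent back, giving the dry side its first generated ID.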

GlobalThis

Going the other direction, the globalThis object used by the wet side is also injected into the membrane as a proxy for the dry globalThis (which is in turn a proxy of the globals provided to the sandbox).

Ephemerals

When a snapshot is taken, the app may have any number of open references to host objects that might not exist in the new host or at least are not identifiable in the new host. I refer to these objects as "ephemerals" to follow the convention I used in Microvium.

When snapshotting, the snapshot includes the list of IDs used to reference host objects. When the snapshot is restored, these IDs are restored but point to revoked proxies.
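One way to picture the restored state is with the standard `Proxy.revocable` API, as in the sketch below. The function name `restoreEphemerals` and the example IDs are hypothetical, and whether the library builds its revoked proxies this way is an assumption.

```javascript
// Hypothetical restore step: every ephemeral ID maps to a revoked proxy.
function restoreEphemerals(ephemeralIds) {
  const restored = new Map();
  for (const id of ephemeralIds) {
    const { proxy, revoke } = Proxy.revocable({}, {});
    revoke(); // any later property access on the proxy throws a TypeError
    restored.set(id, proxy);
  }
  return restored;
}

const restoredEphemerals = restoreEphemerals([3, 7]);
try {
  restoredEphemerals.get(3).anything;
} catch (err) {
  console.log(err instanceof TypeError); // true
}
```

The sandboxed app keeps a valid reference under each old ID, but any attempt to use it fails loudly instead of silently reaching a host object that no longer exists.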

Arrays and prototypes

Promises

Exceptions