-
Notifications
You must be signed in to change notification settings - Fork 72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(AsyncContext): examine security of AsyncContext #1424
Conversation
So if I understand correctly the readme in this PR, a new turn can learn explicit information to where it came from, but it cannot use that information to covertly communicate or affect the integrity of other turns that were also spawned at the same time, if it didn't have a way to do so without async context. If that's the case, it's a promising abstraction. |
} | ||
harden(AsyncContext); | ||
|
||
// Exposed only to the internal `then` function |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is only used by then
, I assume that there is no reason to preserve the this
value between wrapperFn()
and fn()
?
The reasoning in the README looks good to me, after reading the AsyncContext proposal's README and studying the code in this PR. I found myself reaching for this kind of abstraction when I was beginning to implement the Jessica interpreter. It enables little languages that have their own spaghetti stacks, regardless of whether they are actually sync or async on the underlying interpreter. |
I can see why this proposal was marked for security review. Right off the bat, there are several parts that I don't understand the value of (and seems like the sort of thing that introduces security issues that violate ocap practices). The proposal says:
I may have simply not encountered enough practical scenarios where this was not possible, as it sounds like Michael Fig has (above, writing an interpreter). Using ergonomics as a justification to violate explicit reference passing as a security model reminds me of a classic xkcd: But if it is actually necessary, and can be proven safe, then I guess I shouldn't object. Meta process comment: In a perfect world, it would be nice if it were a proposer's responsibility to prove safety of a new language feature, rather than the security-conscious "opposition"'s responsibility to hastily craft an insecurity proof. Especially since the cost of approving an insecure change is so high, and it seems like the cost of not implementing this change (to me) is very low. At some point in a language's maturity it seems like it would be a good standard for a change to meet. It's getting a bit late here, and I agree this is a bit hard to reason about. I'll try to grok it better tomorrow. |
On the process point, while I agree in general, I don't think this case could have gone any other way. For all ocap rules we ever would have thought to write down, their proposal is trivially unsafe: The primordial But the champions have done a great job at internalizing ocap safety concerns, and crafting a proposal that is maximally friendly to them, but for this irreducible violation. Having internalized our concerns, they intuited that their proposal satisfies our real concerns, even though it violates the only rules we've ever been able to articulate. That is all they can do from their end. It was, and must be, up to us to test their conjecture against our tacit, inarticulate, real concerns. The result: This PR tentatively seems to represent our discovery of an articulate understanding of our concerns that validates their conjecture. I don't think there was any other way to get here. But again, this conclusion is tentative. Violating the rules which we normally rely on for safety is to venture into dangerous territory. This is why I've asked so many of you to examine the case I'm making here skeptically, and catch remaining reasons for concern I may have missed. |
which reasoning? ah...
Ouch. That hurts my brain. I have looked at systems where memory (heap allocation) and CPU time were treated as explicit capabilities, but I have never considered the ability to emit messages as a capability. This is going to take some perspective shift to think thru. |
This is not coming from a place of good understanding of the entire matter, but here it goes. I've got some observations stemming from the AsyncContext being just like node’s AsyncLocalStorage (note even the 11th slide says so) that already exists and we:
Based on the main use cases for AsyncLocalStorage I know and the details (struggles) of AsyncLocalStorage getting implemented on top of async_hooks in node, I believe AsyncContexts could be used to escape compartments in practice assuming some of the implementation difficulties transfer over to them or if they're used naively (which is likely the case in the wild already)
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My review:
We're referencing the wrap
only in context of the internal then
function, but our intention is to provide wrap as a public static API on the AsyncContext
as in slide 14. Do you see a problem with the public API? The reason it's exposed is because there will be more queueing primitives in the ecosystem than just then
(eg, an EventEmitter
or web worker's postMessage
).
It's great that we're building on SyncContext
. The reasoning that I've built up is based on that. SyncContext
works in a secure system, and essentially approximates a stack value on the call stack. Setting a value on the stack seems totally fine. And we can rework that example several ways, eg as a closed over variable wrapped in a () => val
lambda that passed downwards. We've hoisted that closure to the module scope as a Stack
, and closing a new variable to pass downwards instead pushes (and automatically pops) to that stack (there are some massive performance benefits to this, besides being the only way to accomplish some APIs).
The only new superpower introduced by AsyncContext
is essentially language integration. Propagation of state will happen automatically, but it's the same functionality that a userland SyncContext
would perform.
Nit: adding a 0-sync-context…
to the filename to ensure they're reviewed in order would be helpful. We're trying to build from original, to weak, transposed, etc on top of prior examples.
Addressing other reviews:
So if I understand correctly the readme in this PR, a new turn can learn explicit information to where it came from, but it cannot use that information to covertly communicate or affect the integrity of other turns that were also spawned at the same time, if it didn't have a way to do so without async context.
Correct. You can only affect turns that spawn from the current turn, and never parents or siblings. It's only 1-way communication in the same way as a call stack can manipulate stack values.
Having internalized our concerns, they intuited that their proposal satisfies our real concerns, even though it violates the only rules we've ever been able to articulate. That is all they can do from their end. It was, and must be, up to us to test their conjecture against our tacit, inarticulate, real concerns. The result: This PR tentatively seems to represent our discovery of an articulate understanding of our concerns that validates their conjecture. I don't think there was any other way to get here.
This is exactly why we've included the SyncContext
building blocks in the slide deck. We think that this is safe in a sync context (and trivially implementable in userland even under a hardened security model). We want to build on that model to reason about this the same way you have in this PR.
I believe AsyncContexts could be used to escape compartments in practice assuming some of the implementation difficulties transfer over to them or if they're used naively (which is likely the case in the wild already)
I don't believe 1) and 2) are unique to AsyncContext
, though, they're a flaw of global state. Building on the SyncContext
example, both of these exist. Yes, we'll be adding a new class that will make it easier to use these types of state, but sometimes they're necessary.
The first slides example builds on console.log
, where we cannot introduce a new parameter value and introducing our own API won't gain real adoption (devs will need to remember to use it, the entire ecosystem will need to be rewritten to use it, etc).
3. correct propagation of AsyncLocalStorage in node depends on a lot of instrumentation, including some of the libraries having to manually keep the right context going while doing _weird async stuff_™.
It's still possible to lose the async context if you do weird things (we'll be exploring Web APIs and how they'll snapshot the state). This is actually another reason we desperately need a single ecosystem primitive to perform this functionality. Once they're one way to do it correctly, the ecosystem can update to take advantage of it and make sure the state is correctly propagated across all weird pauses. This is part of the reason we've exposed AsyncContext.wrap
, so that everyone can preserve the state as necessary.
Hi @jridgewell , thanks. Exposing An unencapsulated If you think that sync-context-original.js breaks the safety rules. (Primordial access to global mutable state.) The way we demonstrate that it is safe anyway is by equivalence to sync-context-shallow.js, which doesn't break any safety rules. Of course, these are only for the synchronous case. But we can also consider a |
No, I don't think it's possible to implement wrap directly from I'd argue that class AsyncContext {
static wrap(cb) {
// Due to async ticks introduced before the next `wrapped` can be executed,
// we have to store state multiple calls until the queue is drained.
const queue = [];
// We need capture to capture the current global state. We immediately do
// `capture.then()` so that the `then` will capture the state to be restored
// when we next invoke the wrapped cb.
let capture = new Deferred(Promise);
// This is a little magical. `.then` snapshots the current global
// AsyncContext state when `.then` is invoked (but not its callback).
// So wee need to immediately invoke it to capture the current state.
// We'll create new promises in future turns that capture that restored
// global state. ✨
capture.promise.then(function drainQueue() {
// Set up capture _again_, capturing the current global state (which is
// the restored state), so that the next time wrapped is called, we can
// restore it all again.
capture = new Deferred(Promise);
capture.promise.then(drainQueue);
// Drain all queued calls. Should be wrapped in try-catches,
// but psuedocode...
const drain = queue.splice(0, Infinity);
for (const { next, args } of drain) {
next.resolve(cb(...args));
}
});
return (...args) => {
// Calling `resolve` will invoke any chained `then` cbs, which is our
// `drainQueue` function. That'll restore the global state at the time
// `.then(drainQueue)` was registered, giving us the appearance of snapshots.
capture.resolve();
// We have to wait at least a tick for the capture to drain, so we need
// promises now.
let next = new Deferred(FakePromise);
queue.push({ next, args });
// Let's ignore return values for the moment. It causes complications and
// requires fleshing out a fake promise impl that doesn't capture global
// state at the moment `.then(cb)` is called.
};
}
}
// Just a helper
class Deferred {
constructor(PromConstructor) {
this.promise = new PromConstructor((r) => {
this.resolve = r;
});
}
} It's not perfect (and introduces some awkward complications), but the ability to snapshot all the current global state for some future turn would already be possible. |
|
||
get() { | ||
let keys = __keys__; | ||
while (keys !== undefined) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this could be open to a timing attack for communicating across 2 parties Eve and Mallory
Assuming Eve has access to a timer, she can use setTimeout to measure event loop delay caused by a get()
call.
Meanwhile, when a condition Mallory wants to communicate to Eve is met, Mallory uses run()
significantly many times to make a noticeable delay.
@jridgewell Regarding a While wasteful, this should have the same effect as synchronising on a global state while the only state used is local state with coutinuity between wrapping and calling maintained by the KEY originating from the scope of the |
Right, I meant |
a3fdd80
to
b7c8171
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I understand what's going on here, and I'm OK with it.
}; | ||
harden(makeBob); | ||
|
||
const makeCarol = secretForAlice => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this attack poses 2 important questions (and by answering them we'll understand that it's not actually an attack at all):
- Can Alice and Bob communicate without Carol knowing about it?
- No, Carol can use
AsyncContext
herself to detect the communication
- No, Carol can use
- Can Carol censor this communication?
- Yes, Carol can use
wrap
to force her own empty state onto Alice and Bob.
- Yes, Carol can use
The crux of this attack is that Alice has received 2 callbacks from Bob (they're actually the same ===
callback, but run in different fluid var contexts), which she stores. After receiving these callbacks, she can then invoke either. Carol cannot tell from the callback's identity which is invoked. Once invoked, Bob can tell from the fluid var state whether the first callback or second callback is invoked.
- Can Alice and Bob communicate without Carol knowing about it?
This would be a successful attack if Carol could not determine which was invoked. However, she can, we just haven't added to code to do so yet. In order to determine which fluid context is being used when Alice finally invokes, Carol needs use her own AsyncContext
before allowing messages to pass between Alice and Bob.
const makeCarol = secretForAlice => {
// …
let id = 0;
const bobSideCtx = new AsyncContext();
const aliceSideCtx = new AsyncContext();
const forBobFromAlice = () => {
log.push('alice.forBob() {');
aliceSideCtx.run(id++, () => {
log.push('bobSideCtx = ' + bobSideCtx.get());
alice.forBob();
});
log.push('alice.forBob }');
};
const forAliceFromBob = () => {
log.push('bob.forAlice() {');
bobSideCtx.run(id++, () => {
log.push('aliceSideCtx = ' + aliceSideCtx.get());
bob.forAlice();
})
log.push('bob.forAlice }');
};
// …
}
I'm modifying the forBobFromAlice
and forAliceFromBob
because they approximate the membrane wrapping if this were a full setup. By adding in a aliceSideCtx
and bobSideCtx
, and running the proxy calls inside of incrementing contexts, we'll now see that it's impossible for Alice and Bob to communicate without Carol being aware of it.
- Can Carol censor this communication?
So, this leads us to whether Carol can prevent this form of communication entirely, and the answer is yes! Using a wrap
ped callback (which itself receives a callback and invokes it in the now restored context), we can entirely censor communication on both sides. This might break things, but that's a bug of the programs when running inside the membrane.
const makeCarol = secretForAlice => {
// …
const bobSideCtx = wrap((cb) => cb());
const aliceSideCtx = wrap((cb) => cb());
const forBobFromAlice = () => {
log.push('alice.forBob() {');
aliceSideCtx(() => {
alice.forBob();
});
log.push('alice.forBob }');
};
const forAliceFromBob = () => {
log.push('bob.forAlice() {');
bobSideCtx(() => {
bob.forAlice();
})
log.push('bob.forAlice }');
};
// …
}
And now we can see that Carol can prevent Alice and Bob from communicating via async contexts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not following this closely enough to answer this question myself, so please bear with me if the answer naturally follows:
Can a program protect its own invariants from co-tenant programs that use AsyncContext without itself using or being aware of the existence of AsyncContext?
Does this attack/riposte instead illustrate that new programs that do use AsyncContext must employ it as both sword and shield to preserve their own integrity?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can a program protect its own invariants from co-tenant programs that use AsyncContext without itself using or being aware of the existence of AsyncContext?
In same-realm, no it's not possible without the membrane using an AyncContext
. In cross-realm, yes it is possible, because Bob's fluid var will not be preserved by Alice wrapping the callback she received.
Does this attack/riposte instead illustrate that new programs that do use AsyncContext must employ it as both sword and shield to preserve their own integrity?
Yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In cross-realm, yes it is possible, because Bob's fluid var will not be preserved by Alice wrapping the callback she received.
Can you clarify what you mean here? I assume you refer to cross Worker which has to go through an async API, where the host implementation would implicitly use AsyncContext itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see why a ShadowRealm based synchronous interaction would be any different than a same realm one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cross-realm case (independent of whether it is legacy realms or shadow realms) hinges on one aspect of our model code: In 6-async-context-transpose.js
, for the __get__
variable, how global is it? Certainly, looking at this model code as JavaScript, this variable can only be per-realm. But this is a model of something for the language engine to implement.
SO FAR, independently created realms have no observantly-shared mutable state. But, one could decide to adopt a model purposely violating this, where __get__
was understood as modeling state global to the agent, i.e, per agent rather than per realm, and implicitly shared by all realms within that agent. The same cross-compartment attack illustrated by test-attack.js
would become a cross-realm attack, but not a cross-agent attack.
In for a penny, in for a pound. If this is too scary for the cross-realm case --- which I sympathize with! --- then isn't it equally too scary for the cross-compartment case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would deepen the analogy between the hidden state behind then
and the hidden state behind AsyncContext
. I lied above when I said "independently created realms have no observantly-shared mutable state". The Job queue is shared by all realms within the same agent. If Alice and Bob are in two separate realms, and Alice's code gets control first, if Bob ever gets control, he has observed that Alice did not engage in an infinite loop.
This is the Anthropic Side Channel, which communicates more than zero bits but less than one bit, since Bob can never observe Alice's other possible decision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't realize wrap
was Realm-specific. Is this behavior necessary? Also, how do we expect it to be implemented?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about this some more: I'm deeply skeptical of wrap
being Realm-specific, since it seems that it would rule out host environments making their own built-in, not-directly-accessible, cross-Realm AsyncContext variables (and, as @erights explains, the job queue intertwines Realms at this level anyway). I'd prefer we make wrap
per-agent, if we reason that this is acceptable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The just-added fps-transform explanation would further support making this per-agent and cross-realm.
f22f844
to
1b85fc0
Compare
get: () => state, | ||
}); | ||
}; | ||
harden(makeSyncContext); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this is useful, but "thinking out loud" in code.
Here's a wrap function for 2-sync-context-shallow where calling wrap doesn't mutate any globals, but uses message passing to trigger local mutations.
Creating a context appends to a global in the implementation, but this feels meaningfully different so I thought I'd share.
https://gist.github.com/naugtur/72c7a73314a99892b2bbc9fb43d78d1f
I found a way to perform the attack that @erights designed, but using Defending against it is actually not straightforward at all, and requires the membrane to force all interactions between alice and bob to be asynchronous (unless |
7860ede
to
0735be9
Compare
0735be9
to
142155c
Compare
to | ||
```js | ||
(F, _x, _y) => _f(F, _a, _b); | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be useful to add an pre/post example using AsyncContext
to demonstrate what is transformed.
// Pre
const ctx = new AsyncContext();
ctx.run(1, () => {
ctx.get() === 1;
});
// Post
const F = [[Fluid Context]];
const ctx = new AsyncContext(F);
ctx.run(F, 1, (F1) => {
ctx.get(F1) === 1;
});
run: (F2, _val, _cb, _args = []) => { | ||
const key = harden({}); | ||
transposedMap.set(key, _val); | ||
const F3 = harden(m => (m.has(key) ? m.get(key) : F2(m))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const F3 = harden(m => (m.has(key) ? m.get(key) : F2(m))); | |
const F3 = harden((F7, _m) => (_m.has(key) ? _m.get(key) : F2(_m))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @jridgewell I don't think this is correct. Remember that this code is written manually in the post-fps language. The functions, keys, and maps involved at this line have no reality in the pre-fps language.
const _wrapper = (F6, ...args) => { | ||
return _fn(F5, ...args); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const _wrapper = (F6, ...args) => { | |
return _fn(F5, ...args); | |
const _wrapper = (F6, ..._args) => { | |
return _fn(F5, ..._args); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one is appropriate. Done.
This PR has served its purpose. AsyncContext is proceeding well. Closing. |
The security of @legendecas' AsyncContext proposal, best explained by @jridgewell's Slide 11, is hard to reason about. They have done everything they can to make it as ocap safe as possible while still providing the motivating feature of the proposal. It may very well be safe enough that we should allow it to proceed to standardization. Or, it may not be. We need to figure that out quickly. This PR adds a number of files for exactly that purpose.
SES Meeting: AsyncContext Security Review, Part 2, despite the name, is the best place to start to follow our discussions of this topic.
I have added these files to the
@endo/eventual-send
package because I also want to explore whether we can reimplement track-turns in terms of this abstraction.@legendecas @jridgewell @littledan @Jack-Works I cannot add you to the formal reviewers list, but please consider yourselves reviewers as well. I would love your comments.
If the reasoning in the added README.md is sound, then we would withdraw all objections to the AsyncContext proposal.
For those watching the Jan 4, 2023 SES Meeting: AsyncContext Security Review: Part 1 recording, #1428 is a snapshot of the state of this PR as of that recording. These should now be considered for historical interest only. For all other purposes, prefer this PR itself.