Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Editorial: Can-call-user-code annotation #2548

Merged
merged 2 commits into from
Jan 8, 2022
Merged

Conversation

syg
Copy link
Contributor

@syg syg commented Oct 8, 2021

This PR adds an annotation to call sites of AOs and SDOs that can call user code, possibly transitively.

Ecmarkup starts with a set of "roots" that can call user code, and propagates the effect to all reachable AO and SDO invocations.

For spec authors:

  • Roots and false negatives can be manually tagged with <emu-meta effects="user-code">root cause</emu-meta>.
  • Most invocations of internal methods (e.g. [[Get]]) are roots.
  • The phrase "the result of evaluating" is automatically considered a root.
  • False positives can be manually suppressed with <emu-meta suppress-effects="user-code">suppressed</emu-meta>.
  • Steps can be manually fenced with 1. [fence-effects="user-code"] Step, which fences any child step effects from propagating.
  • Some spec constructs in algorithm steps are like nested functions and should fence effects from propagating. Abstract Closures are automatic fences.

@ljharb ljharb marked this pull request as draft October 8, 2021 22:01
spec.html Outdated Show resolved Hide resolved
@syg syg added the editor call to be discussed in the next editor call label Oct 8, 2021
@ljharb
Copy link
Member

ljharb commented Oct 8, 2021

A false positive is https://ci.tc39.es/preview/tc39/ecma262/sha/a5ba1ae3a48e8cffb1e6051bc97dceace591af20/#sec-createregexpstringiterator.

iow, technically if you called that AO with a user-provided regex, the annotations are correct; but it's only ever called with a spec-fiction-regex, so they're not. there's probably a similar case with async iteration.

@syg syg removed the editor call to be discussed in the next editor call label Oct 20, 2021
@bakkot
Copy link
Contributor

bakkot commented Nov 10, 2021

As a concrete example of a place the analysis fails for abstract closures, %TypedArray%.prototype.entries has as its third step Return CreateArrayIterator(_O_, ~key+value~)., which does not directly invoke user code; rather, it returns an abstract closure which does so.


Another thing is Yield, introduced in #2045, which we use both for a literal yield expression and to specify built-in iterators like CreateArrayIterator. Yield just defers to GeneratorYield or AsyncGeneratorYield depending on which kind of generator we're in. It's is only marked as can-call-user-code because in the async case it calls Await, which calls PromiseResolve, which [[Get]]s the user-controlled .constructor of the passed value. In the sync case, it does not call user code (although it does pause execution, such that user code can be called in the mean time).

Other than the yield in ES code, I'm pretty sure we always know at the callsite whether we're in a sync or an async generator. (Even for the user code one, we can figure it out by calling GetGeneratorKind().) In fact, because we don't currently have any async iterators in the spec, for the built-in iterators it's always sync. I think we should probably just call GeneratorYield directly, but failing that, we should mark all the Yields in built-in iterators as not calling user code (since they're all sync).


The various Reflect methods call the MOP methods of their argument and do not mark that as calling user code. For example, Reflect.get has Return ? target.[[Get]](key, receiver). and that line is not marked as UC.


Similarly, IsExtensible calls O.[[IsExtensible]](), which is not marked as UC despite potentially invoking the isExtensible trap on proxies.

@syg
Copy link
Contributor Author

syg commented Dec 9, 2021

Another thing is Yield, introduced in #2045, which we use both for a literal yield expression and to specify built-in iterators like CreateArrayIterator. Yield just defers to GeneratorYield or AsyncGeneratorYield depending on which kind of generator we're in. It's is only marked as can-call-user-code because in the async case it calls Await, which calls PromiseResolve, which [[Get]]s the user-controlled .constructor of the passed value. In the sync case, it does not call user code (although it does pause execution, such that user code can be called in the mean time).

Other than the yield in ES code, I'm pretty sure we always know at the callsite whether we're in a sync or an async generator. (Even for the user code one, we can figure it out by calling GetGeneratorKind().) In fact, because we don't currently have any async iterators in the spec, for the built-in iterators it's always sync. I think we should probably just call GeneratorYield directly, but failing that, we should mark all the Yields in built-in iterators as not calling user code (since they're all sync).

So by my reading GeneratorYield() can also never return abrupt completions, yeah?

@bakkot
Copy link
Contributor

bakkot commented Dec 9, 2021

So by my reading GeneratorYield() can also never return abrupt completions, yeah?

From the perspective of the consumer of GeneratorYield, that's not correct: the completion record which GeneratorYield returns can be a throw or return completion. This happens when calling .throw or .return, which will explicitly construct and pass in such completions using the "resume ... using" language in GeneratorResumeAbrupt step 10, which is basically a jump to GeneratorYield step 7.a.

("Return" is kind of a funny word here, because of the execution context stack manipulation - the "Return" in GeneratorYield step 8 isn't an actual return. I have a PR fixing this in #2429, but it needs rebasing. Still, you might find this easier to follow by reading the version of GeneratorYield in that PR.)

@syg
Copy link
Contributor Author

syg commented Dec 9, 2021

Ah, of course, when it "returns" upon abnormal resumption.

@syg
Copy link
Contributor Author

syg commented Dec 9, 2021

The Abstract Closure stuff won't be correct in the preview until tc39/ecmarkup#382 is merged and bumped here.

@bakkot
Copy link
Contributor

bakkot commented Dec 9, 2021

The Abstract Closure stuff won't be correct in the preview until tc39/ecmarkup#382 is merged and bumped here.

Done.

@syg syg requested a review from a team December 9, 2021 02:19
@bakkot
Copy link
Contributor

bakkot commented Dec 9, 2021

Below lists are not comprehensive.

False negatives:

  • all calls to the Evaluate and ExecuteModule concrete methods for cyclic module records
  • every use of the phrase "the result of evaluating", except for the one in JSON.parse - these are the actual roots, even though they're mostly intermediated through internal methods (this phrase probably deserves explicit support in ecmarkup)
  • most callsites of the HasBinding, CreateMutableBinding, InitializeBinding, SetMutableBinding, GetBindingValue, and DeleteBinding concrete methods of environment records (which are user-code-y because of with(proxy)) - these don't propagate for the same reason the [[-style internal methods don't propagate
  • calls to IsExtensible - I don't know what's up with that; the IsExtensible method itself is correctly marked as containing user code in its body
  • calls to [[GetPrototypeOf]], [[PreventExtensions]], and [[OwnPropertyKeys]] - I'm guessing your regex for internal methods just missed these

False positives:

  • the eight instances of Perform DefinePropertyOrThrow without a ! or ? (which should just get a !; presumably that will be fixed by Editorial: remove implicit wrapping/unwrapping of completion records #2547)
  • the first step in RegExpCreate (which should be !; it can't throw here because it's using the built-in RegExp constructor)
  • the call to RegExpCreate in the evaluation semantics for regexp literals (RegExpCreate only calls user code if one of the arguments is an object with a trapped ToString)
  • AllocateTypedArrayBuffer step 5 (it's ? because CreateByteDataBlock can throw, but it doesn't call user code; it uses the built-in ArrayBuffer constructor)
  • SortCompare steps 7 and 9 - IsLessThan can only call user code when called with objects, not strings

Other issues:

  • the call to ToString in CreateMappedArgumentsObject step 19.b.ii.3 is marked as UC, but it's supposed to be associated with the surrounding _map_.[[DefineProperty]] call (which is explicitly marked as user code). This is presumably a bug in ecmarkup.
  • the call to GetThisValue in GetValue step 4.c has the same issue and should have the same fix (probably there's others)

@syg
Copy link
Contributor Author

syg commented Dec 9, 2021

every use of the phrase "the result of evaluating", except for the one in JSON.parse - these are the actual roots, even though they're mostly intermediated through internal methods (this phrase probably deserves explicit support in ecmarkup)

My preference is to add the <dt>effects</dt><dd>user-code</dd> to the AOs that contain that phrasing than to special case it for now, because all the roots now are manually marked like that. There's no special handling of an AO that has an <emu-meta effects="user-code">...</emu-meta> somewhere, and I'm kind of worried about having a pass that looks through the contents of every AO, until it's no longer maintainable to manually mark all the roots.

Edit: Actually, I think I misunderstood your suggestion for special casing. You're not talking about just marking a surrounding AO that has it as a root, but actually rendering the UC for the phrase "the result of evaluating |Foo|". That sounds right and does deserve ecmarkup handling, there are a hundreds of occurrences. But still it does the extra pass that I'm not thrilled about...

@bakkot
Copy link
Contributor

bakkot commented Dec 9, 2021

You're not talking about just marking a surrounding AO that has it as a root, but actually rendering the UC for the phrase "the result of evaluating |Foo|".

Right.

But still it does the extra pass that I'm not thrilled about...

Ehh... you can just stick .replace(/the result of evaluating ([a-zA-Z_|0-9]+)/g, '<emu-meta effects="user-code">$1</emu-meta>) at the end of this line and call it good, I think.

We could do it properly, probably in ecmarkdown, but that's... more work.

@syg
Copy link
Contributor Author

syg commented Dec 10, 2021

  • all calls to the Evaluate and ExecuteModule concrete methods for cyclic module records

Done

  • every use of the phrase "the result of evaluating", except for the one in JSON.parse - these are the actual roots, even though they're mostly intermediated through internal methods (this phrase probably deserves explicit support in ecmarkup)
    most callsites of the HasBinding, CreateMutableBinding, InitializeBinding, SetMutableBinding, GetBindingValue, and DeleteBinding concrete methods of environment records (which are user-code-y because of with(proxy)) - these don't propagate for the same reason the [[-style internal methods don't propagate

Done in tc39/ecmarkup#385

calls to IsExtensible - I don't know what's up with that; the IsExtensible method itself is correctly marked as containing user code in its body

Done in tc39/ecmarkup#385

calls to [[GetPrototypeOf]], [[PreventExtensions]], and [[OwnPropertyKeys]] - I'm guessing your regex for internal methods just missed these

Done

False positives:

I opted to leave these to be automatically fixed later when they get a !.

  • the first step in RegExpCreate (which should be !; it can't throw here because it's using the built-in RegExp constructor)
    the call to RegExpCreate in the evaluation semantics for regexp literals (RegExpCreate only calls user code if one of the arguments is an object with a trapped ToString)

Done

  • AllocateTypedArrayBuffer step 5 (it's ? because CreateByteDataBlock can throw, but it doesn't call user code; it uses the built-in ArrayBuffer constructor)

Done

  • SortCompare steps 7 and 9 - IsLessThan can only call user code when called with objects, not strings

Done

Other issues:

  • the call to ToString in CreateMappedArgumentsObject step 19.b.ii.3 is marked as UC, but it's supposed to be associated with the surrounding map.[[DefineProperty]] call (which is explicitly marked as user code). This is presumably a bug in ecmarkup.

Done in tc39/ecmarkup#385

  • the call to GetThisValue in GetValue step 4.c has the same issue and should have the same fix (probably there's others)

Done in tc39/ecmarkup#385

@bakkot
Copy link
Contributor

bakkot commented Dec 10, 2021

Few more things:

  • bac10dd can get reverted given tc39/ecmarkup@80959f0
  • apparently we invoke some SDOs as Foo for |Bar| instead of Foo of |Bar| - e.g. in FunctionDeclarationInstantiation step 25.a./26.a, we Perform ? IteratorBindingInitialization for _formals_. That should get marked as user code, but doesn't, presumably because ecmarkup doesn't know you can invoke SDOs with "for" (I didn't either - we only use this form about 50 times total, compared to ~1000 uses with "of").
  • despite appearances, GeneratorStart doesn't actually execute user code; it just sets up code to be executed later (the first time .next is invoked). Not sure how best to deal with that - put no-user-code in the dl header, maybe? Or special case "set the code evaluation state" like we did for "a new Abstract Closure"?
    • if we do special case "set the code evaluation state", we'll have to make sure AsyncBlockStart still counts as executing user code, since it immediately jumps into the context it has just constructed
  • on the other hand, GeneratorResume effectively does execute user code, by explicitly pushing and resuming an execution context which can contain user code. Ditto GeneratorResumeAbrupt.
  • Re: GeneratorResume, %StringIteratorPrototype%.next, %MapIteratorPrototype%.next, and %SetIteratorPrototype%.next don't invoke user code despite calling GeneratorResume, because the execution contexts which they're resuming do not ever invoke user code. On the other hand, %ArrayIteratorPrototype%.next and %RegExpStringIteratorPrototype%.next do invoke user code.

Didn't spot anything else wrong in some random spot-checks, except for the aforementioned DefinePropertyOrThrow issues (plus CreateMappedArgumentsObject step 19.b.ii.3, which should really be a ! DefinePropertyOrThrow).

@syg
Copy link
Contributor Author

syg commented Dec 10, 2021

apparently we invoke some SDOs as Foo for |Bar| instead of Foo of |Bar| - e.g. in FunctionDeclarationInstantiation step 25.a./26.a, we Perform ? IteratorBindingInitialization for formals. That should get marked as user code, but doesn't, presumably because ecmarkup doesn't know you can invoke SDOs with "for" (I didn't either - we only use this form about 50 times total, compared to ~1000 uses with "of").

I'll make a separate PR that changes these to "of" instead.

@syg
Copy link
Contributor Author

syg commented Dec 10, 2021

Note that 9a55c12 depends on tc39/ecmarkup#388.

spec.html Outdated Show resolved Hide resolved
@bakkot
Copy link
Contributor

bakkot commented Dec 11, 2021

You missed GeneratorResumeAbrupt, I think.

Edit: pushed a commit fixing.

Copy link
Contributor

@bakkot bakkot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found a few false positives in the internal methods for various exotic objects.

  • In the internal methods for Arguments exotic objects, none of the things which operate on [[ParameterMap]] (usually spelled _map_) can invoke user code, since [[ParameterMap]] is an entirely spec-fictional object. Nor can the OrdinaryDelete call in the [[Delete]] section, since a user can't trap that call. (The Get and Set calls on the arguments object itself can still invoke user code, since a user can add getters/setters to the arguments object.)

  • In the [[Delete]] semantics for Integer-Indexed exotic objects, the call to OrdinaryDelete can't call user code, because the only step in OrdinaryDelete which calls user code is the invocation of [[GetOwnProperty]], and we know the [[GetOwnProperty]] semantics for Integer-Indexed exotic objects doesn't call user code. (In fact it also doesn't throw, so I think the ? OrdinaryDelete here could become ! OrdinaryDelete.)

  • None of the internal methods of module namespace exotic objects can invoke use code.

Otherwise LGTM. I'm sure there's remaining false positives and potentially false negatives, but I didn't see any except for those already mentioned when spot-checking stuff.

A surprisingly small diff, in the end, and would be smaller still if ecmarkup understood concrete and/or internal methods.

spec.html Outdated Show resolved Hide resolved
@syg
Copy link
Contributor Author

syg commented Dec 16, 2021

In the internal methods for Arguments exotic objects, none of the things which operate on [[ParameterMap]] ...

I feel like this is also one of those things better fixed by correct prefixes of ! to those calls than <emu-meta> overrides slate in this PR.

Also interested in @michaelficarra's thoughts whether we should be more eager with manual suppressions for cases that could be getting fixed via !.

In the [[Delete]] semantics for Integer-Indexed exotic objects, the call to OrdinaryDelete can't call user code ...

This is just a 1 liner so I've changed this to a ! OrdinaryDelete. I agree it can't call a Proxy trap since the receiver is guaranteed to be an IIEO.

None of the internal methods of module namespace exotic objects can invoke use code.

I'd also like these to be automatically fixed by switching to ! prefixing.

@bakkot
Copy link
Contributor

bakkot commented Dec 16, 2021

I feel like this is also one of those things better fixed by correct prefixes of ! to those calls than <emu-meta> overrides slate in this PR.

Fine with me.

None of the internal methods of module namespace exotic objects can invoke use code.

I'd also like these to be automatically fixed by switching to ! prefixing.

Well, in a couple cases cases ([[DefineOwnProperty]] step 2, e.g.) these methods are invoking other internal methods, and those sites are explicitly marked as calling user code by this PR. So those cases wouldn't be automatically fixed.

And in a couple cases the operation can actually throw - calls to [[Get]] can throw if the module isn't linked (though I don't remember how that can happen...) or via the HostResolveImportedModule hook.

@syg
Copy link
Contributor Author

syg commented Dec 17, 2021

Well, in a couple cases cases ([[DefineOwnProperty]] step 2, e.g.) these methods are invoking other internal methods, and those sites are explicitly marked as calling user code by this PR. So those cases wouldn't be automatically fixed.

Good point, done.

@syg syg marked this pull request as ready for review December 20, 2021 16:38
@syg syg added editor call to be discussed in the next editor call and removed editor call to be discussed in the next editor call labels Dec 20, 2021
@syg syg requested a review from a team December 22, 2021 23:12
Copy link
Member

@michaelficarra michaelficarra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM otherwise. Only missing suppressions I saw while reviewing the rendering will be suppressed by ! in the future. All the manual suppressions and annotations in this PR look correct. I didn't check for missing suppressions/annotations. I also didn't check whether the suppressions/annotations were necessary; I assume ecmarkup would complain for any it could determine through propagation.

@bakkot
Copy link
Contributor

bakkot commented Dec 29, 2021

I also didn't check whether the suppressions/annotations were necessary; I assume ecmarkup would complain for any it could determine through propagation.

That's not currently implemented, but I've added it to the backlog for the linter. Since there wouldn't be any difference in the rendered output, I don't think landing that check needs to block this PR. (I don't think there are currently any redundant ones, though.)

Copy link
Contributor

@bakkot bakkot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Still some known false positives but that's fine.

@bakkot bakkot added the ready to merge Editors believe this PR needs no further reviews, and is ready to land. label Jan 6, 2022
@syg
Copy link
Contributor Author

syg commented Jan 6, 2022

@ljharb When you merge this, please squash. There's no value in keeping the separate commits for this.

@ljharb ljharb merged commit 4cba6f7 into tc39:main Jan 8, 2022
@domenic
Copy link
Member

domenic commented Jan 26, 2022

I may be blind, but is this supposed to be visible in the output? I cannot seem to find any differences in the live spec, although if I inspect element I do see various <emu-meta>s.

@bakkot
Copy link
Contributor

bakkot commented Jan 26, 2022

@domenic Hit u to toggle the annotation (you can also hit ? pop up a modal listing such shortcuts).

jmdyck added a commit to jmdyck/ecma262 that referenced this pull request Oct 19, 2024
(PR tc39#2548 unfolded many invocations of Yield,
but didn't change the accompanying Note-steps.)
jmdyck added a commit to jmdyck/ecma262 that referenced this pull request Oct 19, 2024
(PR tc39#2548 unfolded many invocations of Yield,
but didn't change the accompanying Note-steps.)
michaelficarra pushed a commit to jmdyck/ecma262 that referenced this pull request Oct 21, 2024
(PR tc39#2548 unfolded many invocations of Yield,
but didn't change the accompanying Note-steps.)
ljharb pushed a commit to jmdyck/ecma262 that referenced this pull request Nov 8, 2024
…#3454)

(PR tc39#2548 unfolded many invocations of Yield, but didn't change the accompanying Note-steps.)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready to merge Editors believe this PR needs no further reviews, and is ready to land.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants