Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

a fresh node startup has populated RegExp statics #43740

Closed
ljharb opened this issue Jul 8, 2022 · 16 comments · Fixed by #43741
Closed

a fresh node startup has populated RegExp statics #43740

ljharb opened this issue Jul 8, 2022 · 16 comments · Fixed by #43741

Comments

@ljharb
Copy link
Member

ljharb commented Jul 8, 2022

Version

v18.5.0

Platform

n/a

Subsystem

No response

What steps will reproduce the bug?

node -pe 'RegExp.$_' will print out WeakRef.

node, and then RegExp.$_, will print out '\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'.

This also applies to $&.

How often does it reproduce? Is there a required condition?

No response

What is the expected behavior?

No response

What do you see instead?

I expect the empty string for all RegExp statics.

This is a problem for a few reasons:

  1. it exposes internal regular expression usage in node core
  2. the value might change from any version to any version, so if anyone's foolish enough to rely on it, they'll break in a non-major
  3. it's unseemly

If there's a point, right before node starts running user code, and right before the repl becomes available, then we could execute a trivial regular expression operation to clear all the statics.

Alternatively, this may be exposing non-primordial regex usage, and it's likely that switching to primordials (since I believe they'd update the statics on the primordial RegExp instead of the global RegExp) would solve this as well.

Additional information

No response

@ljharb
Copy link
Member Author

ljharb commented Jul 8, 2022

cc @aduh95

@hemanth
Copy link
Contributor

hemanth commented Jul 8, 2022

I would like to work on the fix for this. Looks like it is a side of this.

@ljharb
Copy link
Member Author

ljharb commented Jul 8, 2022

ah, maybe the statics are populated by the same realm then :-/ that would be great for the repl, but wouldn't address node itself.

@richardlau
Copy link
Member

Duplicate of #18931?

@ljharb
Copy link
Member Author

ljharb commented Jul 8, 2022

The repl part, yes - but not the node -pe part.

@aduh95
Copy link
Contributor

aduh95 commented Jul 8, 2022

it's likely that switching to primordials (since I believe they'd update the statics on the primordial RegExp instead of the global RegExp) would solve this as well.

Before user code is executed, primordials.RegExp === globalThis.RegExp, so that wouldn't help. Creating an empty RegExp sounds fine though.

@hemanth if you want to work on the REPL side, that'd be great, I have a fix ready for the other problem.

@hemanth
Copy link
Contributor

hemanth commented Jul 8, 2022

@aduh95 Trying to understand the intent behind this.

regExMatcher.exec(savedRegExMatches.join(sep)) results in:

[
  '\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00',
  '',
  '',
  '',
  '',
  '',
  '',
  '',
  '',
  '',
  index: 0,
  input: '\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00',
  groups: undefined
]

@aduh95
Copy link
Contributor

aduh95 commented Jul 8, 2022

My understanding is it's trying to clear up whatever RegEx node internals may have run between two REPL entries if that makes sense.

  1. The user enters a regex operation, and hit ENTER to eval their entry.
  2. Node.js internals run other regex operations that overrides RegExp statics.
  3. The user expect to find RegExp statics in the same state their original operation should have left them.

I think that works for RegExp.$1 to RegExp.$9, but not the other ones. Not sure this could be achieve without re-running the user code.

@hemanth
Copy link
Contributor

hemanth commented Jul 8, 2022

Node.js internals run other regex operations that overrides RegExp statics.

If we use primordial SafeRegExp we can avoid this side effect?

@aduh95
Copy link
Contributor

aduh95 commented Jul 8, 2022

Using SafeRegExp is not very practical because that would mean we can no longer use regex literals in core. Trying to migrate all of core to SafeRegExp seems like an unreasonable amount of work.

@hemanth
Copy link
Contributor

hemanth commented Jul 9, 2022

@aduh95 Hmm, We should revive #20549 ?

@aduh95
Copy link
Contributor

aduh95 commented Jul 9, 2022

From my point of view #20549 (comment) and #20549 (comment) is a good TL;DR for this thread. I'd say let's try this approach and run benchmark to see how it performs. IMO it would be worth creating a SafeRegExp class in primordials rather than the getInternalGlobal hack.

I wonder if ShadowRealms integration gives a new solution to this problem, allowing us to run REPL code on another Realm sounds like something that should also fix the issue. Should we move this discussion to #18931 since only the REPL is at stake now?

@benjamingr
Copy link
Member

Isn't RegExp.$_ a standard web-compatibility legacy feature? Spec wise are we safe to remove and not support it?

@aduh95
Copy link
Contributor

aduh95 commented Jul 9, 2022

It's implemented by V8, I'm sure we can easily disable it – but yes, as far as ECMAScript is concerned, it doesn't exist.

@benjamingr
Copy link
Member

@aduh95 what about https://github.com/tc39/proposal-regexp-legacy-features/#additional-properties-of-the-regexp-constructor ? I thought there was a proposal to make it standard but it looks abandoned?

Anyway - no strong feelings from me, I figured it's one of these things like String.prototype.bold

@hemanth
Copy link
Contributor

hemanth commented Jul 10, 2022

Legacy RegExp proposals is in stage 3!

So, are we leaning towards removing RegExp.$_ or SafeRegExp?

@aduh95 Nods, it makes sense to move the discussion to #18931

aduh95 added a commit to aduh95/node that referenced this issue Jul 12, 2022
aduh95 added a commit to aduh95/node that referenced this issue Jul 14, 2022
aduh95 added a commit to aduh95/node that referenced this issue Jul 17, 2022
aduh95 added a commit to aduh95/node that referenced this issue Aug 1, 2022
aduh95 added a commit to aduh95/node that referenced this issue Aug 4, 2022
aduh95 added a commit that referenced this issue Aug 5, 2022
Fixes: #43740

PR-URL: #43741
Reviewed-By: Ruben Bridgewater <[email protected]>
Reviewed-By: Geoffrey Booth <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
aduh95 added a commit to aduh95/node that referenced this issue Aug 16, 2022
Fixes: nodejs#43740

PR-URL: nodejs#43741
Reviewed-By: Ruben Bridgewater <[email protected]>
Reviewed-By: Geoffrey Booth <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
aduh95 added a commit to aduh95/node that referenced this issue Aug 23, 2022
Fixes: nodejs#43740

PR-URL: nodejs#43741
Reviewed-By: Ruben Bridgewater <[email protected]>
Reviewed-By: Geoffrey Booth <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
Fyko pushed a commit to Fyko/node that referenced this issue Sep 15, 2022
Fixes: nodejs#43740

PR-URL: nodejs#43741
Reviewed-By: Ruben Bridgewater <[email protected]>
Reviewed-By: Geoffrey Booth <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
RafaelGSS pushed a commit that referenced this issue Sep 23, 2022
Fixes: #43740

Backport-PR-URL: #43741
Reviewed-By: Ruben Bridgewater <[email protected]>
Reviewed-By: Geoffrey Booth <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
PR-URL: #44247
Reviewed-By: Rafael Gonzaga <[email protected]>
RafaelGSS pushed a commit that referenced this issue Sep 26, 2022
Fixes: #43740

Backport-PR-URL: #43741
Reviewed-By: Ruben Bridgewater <[email protected]>
Reviewed-By: Geoffrey Booth <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
PR-URL: #44247
Reviewed-By: Rafael Gonzaga <[email protected]>
juanarbol pushed a commit that referenced this issue Oct 3, 2022
Fixes: #43740

PR-URL: #43741
Reviewed-By: Ruben Bridgewater <[email protected]>
Reviewed-By: Geoffrey Booth <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
juanarbol pushed a commit that referenced this issue Oct 4, 2022
Fixes: #43740

PR-URL: #43741
Reviewed-By: Ruben Bridgewater <[email protected]>
Reviewed-By: Geoffrey Booth <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
juanarbol pushed a commit that referenced this issue Oct 7, 2022
Fixes: #43740

PR-URL: #43741
Reviewed-By: Ruben Bridgewater <[email protected]>
Reviewed-By: Geoffrey Booth <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
juanarbol pushed a commit that referenced this issue Oct 10, 2022
Fixes: #43740

PR-URL: #43741
Reviewed-By: Ruben Bridgewater <[email protected]>
Reviewed-By: Geoffrey Booth <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
juanarbol pushed a commit that referenced this issue Oct 11, 2022
Fixes: #43740

PR-URL: #43741
Reviewed-By: Ruben Bridgewater <[email protected]>
Reviewed-By: Geoffrey Booth <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
guangwong pushed a commit to noslate-project/node that referenced this issue Jan 3, 2023
Fixes: nodejs/node#43740

PR-URL: nodejs/node#43741
Reviewed-By: Ruben Bridgewater <[email protected]>
Reviewed-By: Geoffrey Booth <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
guangwong pushed a commit to noslate-project/node that referenced this issue Jan 3, 2023
Fixes: nodejs/node#43740

PR-URL: nodejs/node#43741
Reviewed-By: Ruben Bridgewater <[email protected]>
Reviewed-By: Geoffrey Booth <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants