Correctly escape non-unicode regex #218

Merged
sebastienros merged 5 commits into main from sebros/unicoderegex on Jan 1, 2022

Conversation

sebastienros
Owner

No description provided.

@lahma
Collaborator

lahma commented Dec 24, 2021

Something seems to break with these changes.

  • I replaced the NuGet reference in Jint.csproj with a local reference to this branch
  • Enabled unicode tests in Tests262Test.cs (search for unicode)
  • Enabled unicode tests in Jint.Tests.Tests262\test\skipped.json (search for unicode)

Observations:

  • Jint.Tests.CommonScripts now has a failing test in the Babel tests:
    • babel-standalone.js - Line 79828: Invalid regular expression
  • Jint.Tests.Ecma now has 8 failing tests
    • one example: Error: #0: var arr = /\u0000/.exec(\u0000); arr[0] === "\u0000". Actual. null
  • Jint.Tests.Tests262 had 20 failing tests with unicode enabled, now it has 22
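
For reference, the Ecma example quoted above reduces to the check below. This is only a sketch of the expected behaviour in a conforming engine, not the actual test file; the variable name `matched` is made up for illustration.

```javascript
// A regex literal containing the escape \u0000 should match a string
// that consists of the single NUL character (U+0000).
const arr = /\u0000/.exec("\u0000");

// The reported failure was "Actual. null", i.e. exec() found no match.
// A conforming engine returns a match whose first element is the NUL character.
const matched = arr !== null && arr[0] === "\u0000";

console.log(matched);
```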

@lahma
Collaborator

lahma commented Dec 26, 2021

Looks better now. I still see two test cases failing in the old Ecma tests:

  • Ecma Chapter 7(sourceFile: ch07/7.8/7.8.5/S7.8.5_A1.1_T1.js)
  • Ecma Chapter 7(sourceFile: ch07/7.8/7.8.5/S7.8.5_A2.1_T1.js)

And then the ones failing in the new test suite when unicode is enabled (they shouldn't be blockers, as these are new tests):

  • built-ins\RegExp(sourceFile: built-ins/RegExp/dotall/without-dotall-unicode.js)
  • built-ins\RegExp(sourceFile: built-ins/RegExp/dotall/without-dotall.js)
  • built-ins\RegExp(sourceFile: built-ins/RegExp/property-escapes/character-class.js)
  • built-ins\RegExp(sourceFile: built-ins/RegExp/prototype/Symbol.match/builtin-infer-unicode.js)
  • built-ins\RegExp(sourceFile: built-ins/RegExp/prototype/Symbol.match/builtin-success-u-return-val-groups.js)
  • built-ins\RegExp(sourceFile: built-ins/RegExp/prototype/Symbol.match/u-advance-after-empty.js)
  • built-ins\RegExp(sourceFile: built-ins/RegExp/prototype/Symbol.replace/u-advance-after-empty.js)
  • built-ins\RegExp(sourceFile: built-ins/RegExp/prototype/Symbol.search/u-lastindex-advance.js)
  • built-ins\RegExp(sourceFile: built-ins/RegExp/unicode_restricted_brackets.js)
  • built-ins\RegExp(sourceFile: built-ins/RegExp/unicode_restricted_character_class_escape.js)
  • built-ins\RegExp(sourceFile: built-ins/RegExp/unicode_restricted_identity_escape.js)
  • built-ins\RegExp(sourceFile: built-ins/RegExp/unicode_restricted_identity_escape_alpha.js)
  • built-ins\RegExp(sourceFile: built-ins/RegExp/unicode_restricted_identity_escape_c.js)
  • built-ins\RegExp(sourceFile: built-ins/RegExp/unicode_restricted_identity_escape_u.js)
  • built-ins\RegExp(sourceFile: built-ins/RegExp/unicode_restricted_incomple_quantifier.js)
  • built-ins\RegExp(sourceFile: built-ins/RegExp/unicode_restricted_octal_escape.js)
  • built-ins\RegExp(sourceFile: built-ins/RegExp/unicode_restricted_quantifiable_assertion.js)
  • built-ins\RegExp(sourceFile: built-ins/RegExp/character-class-escape-non-whitespace.js)
  • built-ins\RegExp(sourceFile: built-ins/RegExp/dotall/with-dotall-unicode.js)
  • language\white-space(sourceFile: language/white-space/mongolian-vowel-separator-eval.js)
    • I think this has some changed spec behind it, if I recall right

@sebastienros
Owner Author

I marked some of these explicitly in Jint for now. Some rules are not implemented in esprima, like verifying that []{}() are balanced when the u flag is set. Will attempt to fix these too. Haven't checked the old Ecma ones yet, thanks.
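
For context, a minimal JavaScript sketch of the bracket rule mentioned above. In a conforming engine, a lone `{`, `}` or `]` is accepted as a literal character in a legacy (non-`u`) pattern, but is a SyntaxError once the `u` flag is set. The helper `compiles` is hypothetical and exists only for this illustration; it is not part of Jint or esprima.

```javascript
// Hypothetical helper: reports whether a pattern compiles with the given flags.
function compiles(pattern, flags) {
  try {
    new RegExp(pattern, flags);
    return true;
  } catch (e) {
    // Only a SyntaxError means "invalid pattern"; anything else is unexpected.
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// Lone brackets and an incomplete quantifier: tolerated without the u flag,
// rejected with it.
const results = ["{", "}", "]", "a{1"].map((pattern) => ({
  pattern,
  legacy: compiles(pattern, ""),
  unicode: compiles(pattern, "u"),
}));

console.log(results);
```

These cases appear to correspond to what test files such as unicode_restricted_brackets.js and unicode_restricted_incomple_quantifier.js exercise, though the sketch does not reproduce those tests.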

Collaborator

@lahma lahma left a comment

I'll trust that you know what you are doing ;)

@sebastienros sebastienros enabled auto-merge (squash) January 1, 2022 23:53
@sebastienros sebastienros merged commit 5867b4d into main Jan 1, 2022
@sebastienros sebastienros deleted the sebros/unicoderegex branch January 1, 2022 23:54