Add regexp/require-unicode-sets-regexp rule #598

Merged (6 commits) on Sep 21, 2023
5 changes: 5 additions & 0 deletions .changeset/sour-feet-explain.md
@@ -0,0 +1,5 @@
---
"eslint-plugin-regexp": minor
---

Add `regexp/require-unicode-sets-regexp` rule
1 change: 1 addition & 0 deletions README.md
@@ -166,6 +166,7 @@ The `plugin:regexp/all` config enables all rules. It's meant for testing, not fo
| [prefer-regexp-exec](https://ota-meshi.github.io/eslint-plugin-regexp/rules/prefer-regexp-exec.html) | enforce that `RegExp#exec` is used instead of `String#match` if no global flag is provided | | | | |
| [prefer-regexp-test](https://ota-meshi.github.io/eslint-plugin-regexp/rules/prefer-regexp-test.html) | enforce that `RegExp#test` is used instead of `String#match` and `RegExp#exec` | | | 🔧 | |
| [require-unicode-regexp](https://ota-meshi.github.io/eslint-plugin-regexp/rules/require-unicode-regexp.html) | enforce the use of the `u` flag | | | 🔧 | |
| [require-unicode-sets-regexp](https://ota-meshi.github.io/eslint-plugin-regexp/rules/require-unicode-sets-regexp.html) | enforce the use of the `v` flag | | | 🔧 | |
| [sort-alternatives](https://ota-meshi.github.io/eslint-plugin-regexp/rules/sort-alternatives.html) | sort alternatives if order doesn't matter | | | 🔧 | |
| [use-ignore-case](https://ota-meshi.github.io/eslint-plugin-regexp/rules/use-ignore-case.html) | use the `i` flag if it simplifies the pattern | ✅ | | 🔧 | |

1 change: 1 addition & 0 deletions docs/rules/index.md
@@ -73,6 +73,7 @@ sidebarDepth: 0
| [prefer-regexp-exec](prefer-regexp-exec.md) | enforce that `RegExp#exec` is used instead of `String#match` if no global flag is provided | | | | |
| [prefer-regexp-test](prefer-regexp-test.md) | enforce that `RegExp#test` is used instead of `String#match` and `RegExp#exec` | | | 🔧 | |
| [require-unicode-regexp](require-unicode-regexp.md) | enforce the use of the `u` flag | | | 🔧 | |
| [require-unicode-sets-regexp](require-unicode-sets-regexp.md) | enforce the use of the `v` flag | | | 🔧 | |
| [sort-alternatives](sort-alternatives.md) | sort alternatives if order doesn't matter | | | 🔧 | |
| [use-ignore-case](use-ignore-case.md) | use the `i` flag if it simplifies the pattern | ✅ | | 🔧 | |

68 changes: 68 additions & 0 deletions docs/rules/require-unicode-sets-regexp.md
@@ -0,0 +1,68 @@
---
pageClass: "rule-details"
sidebarDepth: 0
title: "regexp/require-unicode-sets-regexp"
description: "enforce the use of the `v` flag"
---
# regexp/require-unicode-sets-regexp

🔧 This rule is automatically fixable by the [`--fix` CLI option](https://eslint.org/docs/latest/user-guide/command-line-interface#--fix).

<!-- end auto-generated rule header -->

> enforce the use of the `v` flag

## :book: Rule Details

This rule reports regular expressions without the `v` flag.

It will automatically replace the `u` flag with the `v` flag in regular expressions that already use the `u` flag, when doing so is statically guaranteed to be safe. In all other cases, the developer has to check that adding the `v` flag doesn't cause the regex to behave incorrectly.

If you want to automatically add the `v` flag to legacy regular expressions that don't use the `u` flag, use this rule together with the [regexp/require-unicode-regexp] rule.
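
The reporting and fixing behavior described above can be sketched as a small decision function. This is a simplified illustration, not the rule's implementation: `needsVFlag` and `suggestFlags` are hypothetical helper names, and the `compatible` argument is an assumed input standing in for the static safety check that the rule performs on the pattern.

```typescript
/** Hypothetical helper: true when the flags string lacks the `v` flag. */
function needsVFlag(flags: string): boolean {
    return flags.indexOf("v") === -1
}

/**
 * Hypothetical helper: returns the flags an autofix would produce, or null
 * when there is nothing to fix automatically.
 * `compatible` is an assumed input standing in for the static safety check.
 */
function suggestFlags(flags: string, compatible: boolean): string | null {
    if (!needsVFlag(flags)) {
        return null // already uses `v`; nothing to change
    }
    if (flags.indexOf("u") === -1 || !compatible) {
        return null // legacy regex or unsafe conversion: report only
    }
    // `u` and `v` cannot be combined, so `u` is dropped when `v` is added.
    return flags.replace(/u/g, "") + "v"
}

console.log(suggestFlags("iu", true)) // "iv"
console.log(suggestFlags("i", true)) // null (no `u` flag, so no autofix)
```

Under these assumptions, a regex such as `/[a-z]/i` is still reported, but it only becomes auto-fixable after the `u` flag has been added first.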

<eslint-code-block fix>

```js
/* eslint regexp/require-unicode-sets-regexp: "error" */

/* ✓ GOOD */
var foo = /foo/v;
var foo = /a\s+b/v;

/* ✗ BAD */
var foo = /foo/;
var foo = RegExp("a\\s+b");
var foo = /[a-z]/i;
var foo = /\S/;
var foo = /foo/u;
var foo = RegExp("a\\s+b", 'u');
var foo = /[a-z]/iu;
var foo = /\S/u;
```

</eslint-code-block>

## :wrench: Options

Nothing.

## :couple: Related rules

- [regexp/require-unicode-regexp]

[regexp/require-unicode-regexp]: ./require-unicode-regexp.md

## :books: Further reading

- [require-unicode-regexp]

[require-unicode-regexp]: https://eslint.org/docs/rules/require-unicode-regexp

## :rocket: Version

:exclamation: <badge text="This rule has not been released yet." vertical="middle" type="error"> ***This rule has not been released yet.*** </badge>

## :mag: Implementation

- [Rule source](https://github.com/ota-meshi/eslint-plugin-regexp/blob/master/lib/rules/require-unicode-sets-regexp.ts)
- [Test source](https://github.com/ota-meshi/eslint-plugin-regexp/blob/master/tests/lib/rules/require-unicode-sets-regexp.ts)
143 changes: 143 additions & 0 deletions lib/rules/require-unicode-sets-regexp.ts
@@ -0,0 +1,143 @@
import type { RegExpVisitor } from "@eslint-community/regexpp/visitor"
import type { RegExpContext } from "../utils"
import { createRule, defineRegexpVisitor } from "../utils"
import { RegExpParser, visitRegExpAST } from "@eslint-community/regexpp"
import { toUnicodeSet } from "regexp-ast-analysis"

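// Character sequences that change meaning inside a character class under the
// `v` flag. `--` (set subtraction) is included alongside the reserved double
// punctuators.
const CLASS_SET_RESERVED_DOUBLE_PUNCTUATORS = [
    "&&",
    "!!",
    "##",
    "$$",
    "%%",
    "**",
    "++",
    ",,",
    "..",
    "::",
    ";;",
    "<<",
    "==",
    ">>",
    "??",
    "@@",
    "^^",
    "``",
    "~~",
    "--",
]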

/**
 * Returns whether the regex would keep its behavior if the v flag were to be
 * added.
 */
function isCompatible(regexpContext: RegExpContext): boolean {
    const INCOMPATIBLE = {}

    const { flags, patternAst, pattern } = regexpContext

    try {
        const flagsWithV = { ...flags, unicodeSets: true, unicode: false }
        visitRegExpAST(patternAst, {
            onCharacterClassEnter(node) {
                const us = toUnicodeSet(node, flags)
                const vus = toUnicodeSet(
                    { ...node, unicodeSets: true },
                    flagsWithV,
                )
                if (!us.equals(vus)) {
                    throw INCOMPATIBLE
                }
                if (
                    CLASS_SET_RESERVED_DOUBLE_PUNCTUATORS.some((punctuator) =>
                        node.raw.includes(punctuator),
                    )
                ) {
                    throw INCOMPATIBLE
                }
            },
        })
    } catch (error) {
        if (error === INCOMPATIBLE) {
            return false
        }
        // just rethrow
        throw error
    }

    try {
        // The `v` flag has more strict escape characters.
        // To check whether it can be converted to a pattern with the `v` flag,
        // parse the pattern with the `v` flag and check for errors.
        new RegExpParser().parsePattern(pattern, undefined, undefined, {
            unicodeSets: true,
        })
    } catch (_error) {
        return false
    }

    return true
}

export default createRule("require-unicode-sets-regexp", {
    meta: {
        docs: {
            description: "enforce the use of the `v` flag",
            category: "Best Practices",
            recommended: false,
        },
        schema: [],
        fixable: "code",
        messages: {
            require: "Use the 'v' flag.",
        },
        type: "suggestion",
    },
    create(context) {
        /**
         * Create visitor
         */
        function createVisitor(
            regexpContext: RegExpContext,
        ): RegExpVisitor.Handlers {
            const {
                node,
                flags,
                flagsString,
                getFlagsLocation,
                fixReplaceFlags,
            } = regexpContext

            if (flagsString === null) {
                // This means that there are flags (probably) but we were
                // unable to evaluate them.
                return {}
            }

            if (!flags.unicodeSets) {
                context.report({
                    node,
                    loc: getFlagsLocation(),
                    messageId: "require",
                    fix: fixReplaceFlags(() => {
                        if (
                            // Only patterns with the u flag are auto-fixed.
                            // When migrating from legacy, first add the `u` flag with the `require-unicode-regexp` rule.
                            !flags.unicode ||
                            !isCompatible(regexpContext)
                        ) {
                            return null
                        }
                        return `${flagsString.replace(/u/gu, "")}v`
                    }),
                })
            }

            return {}
        }

        return defineRegexpVisitor(context, {
            createVisitor,
        })
    },
})
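
The double-punctuator scan inside `isCompatible` can be illustrated in isolation. The following is a minimal sketch that works on the raw text of a character class only; `DOUBLE_PUNCTUATORS` and `hasDoublePunctuator` are hypothetical names chosen for this example, and the sketch leaves out the `toUnicodeSet` comparison and the re-parse with `unicodeSets: true` that the rule also performs.

```typescript
// Same entries as the list in the rule source above.
const DOUBLE_PUNCTUATORS: readonly string[] = [
    "&&", "!!", "##", "$$", "%%", "**", "++", ",,", "..", "::",
    ";;", "<<", "==", ">>", "??", "@@", "^^", "``", "~~", "--",
]

/**
 * Hypothetical helper: true when the raw source of a character class
 * contains a sequence that is read differently under the `v` flag.
 */
function hasDoublePunctuator(rawCharacterClass: string): boolean {
    return DOUBLE_PUNCTUATORS.some(
        (punctuator) => rawCharacterClass.indexOf(punctuator) !== -1,
    )
}

console.log(hasDoublePunctuator("[a&&b]")) // true
console.log(hasDoublePunctuator("[+--b]")) // true
console.log(hasDoublePunctuator("[a-z]")) // false
```

A plain substring scan is conservative: it may flag a class that would in fact be safe to convert, which only means the autofix is skipped and the regex is left for manual review.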
2 changes: 2 additions & 0 deletions lib/utils/rules.ts
@@ -72,6 +72,7 @@ import preferT from "../rules/prefer-t"
import preferUnicodeCodepointEscapes from "../rules/prefer-unicode-codepoint-escapes"
import preferW from "../rules/prefer-w"
import requireUnicodeRegexp from "../rules/require-unicode-regexp"
import requireUnicodeSetsRegexp from "../rules/require-unicode-sets-regexp"
import sortAlternatives from "../rules/sort-alternatives"
import sortCharacterClassElements from "../rules/sort-character-class-elements"
import sortFlags from "../rules/sort-flags"
@@ -153,6 +154,7 @@ export const rules = [
    preferUnicodeCodepointEscapes,
    preferW,
    requireUnicodeRegexp,
    requireUnicodeSetsRegexp,
    sortAlternatives,
    sortCharacterClassElements,
    sortFlags,
12 changes: 9 additions & 3 deletions tests/lib/rules-with-unknown-flag.ts
@@ -59,8 +59,11 @@ describe("Don't crash even if with unknown flag.", () => {
],
}

- for (const key of Object.keys(rules)) {
-     const rule = rules[key]
+ const pluginRules = Object.fromEntries(
+     Object.values(rules).map((rule) => [rule.meta.docs.ruleId, rule]),
+ )
+
+ for (const rule of Object.values(rules)) {
const ruleId = rule.meta.docs.ruleId

it(ruleId, () => {
@@ -73,11 +76,14 @@ describe("Don't crash even if with unknown flag.", () => {
rules: {
[ruleId]: "error",
"regexp/test": "error",
+ ...(ruleId === "regexp/require-unicode-sets-regexp"
+     ? { "regexp/require-unicode-regexp": "error" }
+     : {}),
},
}
// @ts-expect-error -- ignore
linter.defineParser("@typescript-eslint/parser", parser)
- linter.defineRule(ruleId, rule)
+ linter.defineRules(pluginRules)

linter.defineRule("regexp/test", TEST_RULE)
const resultVue = linter.verifyAndFix(code, config, "test.js")
2 changes: 2 additions & 0 deletions tests/lib/rules/no-useless-flag.ts
@@ -846,6 +846,7 @@ describe("Don't conflict even if using the rules together.", () => {
rulesConfig: {
"regexp/no-useless-flag": ["error"],
"regexp/require-unicode-regexp": "off",
"regexp/require-unicode-sets-regexp": "off",
"regexp/match-any": ["error", { allows: ["dotAll"] }],
},
messages: [
@@ -874,6 +875,7 @@ describe("Don't conflict even if using the rules together.", () => {
rulesConfig: {
"regexp/match-any": ["error", { allows: ["dotAll"] }],
"regexp/require-unicode-regexp": "off",
"regexp/require-unicode-sets-regexp": "off",
"regexp/no-useless-flag": ["error"],
},
messages: [
80 changes: 80 additions & 0 deletions tests/lib/rules/require-unicode-sets-regexp.ts
@@ -0,0 +1,80 @@
import { RuleTester } from "eslint"
import rule from "../../../lib/rules/require-unicode-sets-regexp"

const tester = new RuleTester({
    parserOptions: {
        ecmaVersion: "latest",
        sourceType: "module",
    },
})

tester.run("require-unicode-sets-regexp", rule as any, {
    valid: [`/a/v`],
    invalid: [
        {
            code: `/a/`,
            output: null, // It will not auto-fix if it does not have the u flag.
            errors: ["Use the 'v' flag."],
        },
        {
            code: `/a/u`,
            output: `/a/v`,
            errors: ["Use the 'v' flag."],
        },
        {
            code: String.raw`/[\p{ASCII}]/iu`,
            output: String.raw`/[\p{ASCII}]/iv`,
            errors: ["Use the 'v' flag."],
        },
        {
            code: `/[[]/u`,
            output: null, // Converting to the v flag will result in a parsing error.
            errors: ["Use the 'v' flag."],
        },
        {
            code: String.raw`/[^\P{Lowercase_Letter}]/giu`,
            output: null, // Converting to the v flag changes the behavior of the character set.
            errors: ["Use the 'v' flag."],
        },
        {
            code: String.raw`/[^\P{ASCII}]/iu`,
            output: null, // Converting to the v flag changes the behavior of the character set.
            errors: ["Use the 'v' flag."],
        },
        {
            code: String.raw`/[\P{ASCII}]/iu`,
            output: null, // Converting to the v flag changes the behavior of the character set.
            errors: ["Use the 'v' flag."],
        },
        ...[
            "&&",
            "!!",
            "##",
            "$$",
            "%%",
            "**",
            "++",
            ",,",
            "..",
            "::",
            ";;",
            "<<",
            "==",
            ">>",
            "??",
            "@@",
            "^^",
            "``",
            "~~",
        ].map((punctuator) => ({
            code: String.raw`/[a${punctuator}b]/u`,
            output: null, // Converting to the v flag changes the behavior of the character set.
            errors: ["Use the 'v' flag."],
        })),
        {
            code: String.raw`/[+--b]/u`,
            output: null, // Converting to the v flag changes the behavior of the character set.
            errors: ["Use the 'v' flag."],
        },
    ],
})
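
The `/[[]/u` case above, where conversion would produce a parsing error, can be reproduced outside the rule with the runtime's own `RegExp` constructor. This is only a rough sketch: `compilesWith` is a hypothetical helper, and it depends on the JavaScript engine rather than on the parser the rule uses, so on an engine without `v` flag support every check with `v` returns `false`.

```typescript
/** Hypothetical helper: true if `source` compiles with the given flags. */
function compilesWith(source: string, flags: string): boolean {
    try {
        new RegExp(source, flags)
        return true
    } catch (_error) {
        // A SyntaxError here means the pattern (or flag) was rejected.
        return false
    }
}

// With `u`, a `[` inside a character class is an ordinary character.
console.log(compilesWith("[[]", "u")) // true
// With `v`, the inner `[` opens a nested class and the outer class is never closed.
console.log(compilesWith("[[]", "v")) // false
```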