Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

args: use non-capturing groups when joining multiple patterns #2488

Closed
wants to merge 3 commits into from
Closed

Conversation

gofri
Copy link
Contributor

@gofri gofri commented Apr 13, 2023

Fixes #2480

Fix the joined pattern so that multiple VALID patterns cannot affect each other.
e.g. worked before but fixed now:
echo FooBar | rg -e '(?i)notfoo' -e bar # no result

Wontfix the joined pattern to handle some cases of multiple INVALID patterns that are "coordinated" to fix each other.
e.g. still works but is unexpected:
echo 'FooBar' | rg -e '(' -e ')' # matches with an empty string

also, introduces a new class of invalid patterns that work now but wouldn't work before:
e.g. got parse error before but not now:
echo 'FooBar' | rg ')(' # matches with an empty string

edit: pushed another commit to avoid the regerssion for a single pattern.
the new class still applies for multiple patterns:
e.g. got parse error before but not now:
echo 'FooBar' | rg -e ')(' -e 'x' # matches with an empty string

@gofri
Copy link
Contributor Author

gofri commented Apr 13, 2023

@BurntSushi I figured the new case just after creating the PR:

also, introduces a new class of invalid patterns that work now but wouldn't work before:
e.g. got parse error before but not now:
echo 'FooBar' | rg ')(' # matches with an empty string

IMHO:
It's not too bad to have this new class for multiple patterns, but I really don't like the fact that it also affects the detection for a single pattern (namely, )().
let me know if you think I should handle len(patterns) == 1 differently

edit: I pushed the commit to fix it, so let me know if you prefer that I revert it.

@BurntSushi BurntSushi added the rollup A PR that has been merged with many others in a rollup. label Jul 7, 2023
BurntSushi pushed a commit that referenced this pull request Jul 8, 2023
This was originally fixed by using non-capturing groups when joining
patterns in crates/core/args.rs, but before that landed, it ended up
getting fixed via a refactor in the course of migrating to regex 1.9.
Namely, it's now fixed by pushing pattern joining down into the regex
layer, so that patterns can be joined in the most effective way
possible.

Still, #2488 contains a useful test, so we bring that in here. The
test actually failed for `rg -e ')('`, since it expected the command to
fail with a syntax error. But my refactor actually causes this command
to succeed. And indeed, #2488 worked around this by special casing a
single pattern. That work-around fixes it for the single pattern case,
but doesn't fix it for the -w or -X or multi-pattern case. So for now,
we're content to leave well enough alone. The only real way to fix this
for real is to parse each regexp individual and verify that each is
valid on its own. It's not clear that doing so is worth it.

Fixes #2480, Closes #2488
@BurntSushi BurntSushi closed this in 36194c2 Jul 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rollup A PR that has been merged with many others in a rollup.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Patterns may affect each other when multiple patterns are provided
2 participants