feat: make regexp parser throw an error if the global flag is missing #55
Comments
Or even add this flag automatically, so that library consumers don't have to care about this constraint: `RegExp(re.source, re.global ? re.flags : re.flags + 'g')`
While this option looks convenient, it has some shortcomings:
As for the first option... I'm generally not into the idea of throwing, because effects suck. So how about returning a failure and halting the parsing? As a side note: I've been thinking about implementing some sort of "levels" for errors, like
Hmm, I wonder how
Magic indeed, and I agree that it's not great, but on the other hand I'd prefer this kind of magic over the risk of forgetting the global flag and spending time debugging it. Making the Nevertheless, it's not a major issue, rather a matter of opinion and convenience. Please let me know which approach you prefer and I'm happy to implement it:
If none of these approaches suits this library, you can just close this issue.
Sigma doesn't "handle" parsers per se; all of them are essentially just pure functions which call other parsers and combinators and pass the state further. The whole process is recursive and happens top-down, from the root parser, hence "recursive descent parsers".
I experimented a bit more with it, and actually if you do something like this:

    export function regexp(rs: RegExp, expected: string): Parser<string> {
      const re = rs.global ? rs : new RegExp(rs.source, rs.flags + 'g')
      return {
        parse(input, pos) {
          // snip
        }
      }
    }

The overhead is barely noticeable, something like:

    Without auto-inject average: 795.65 ops/sec
    With auto-inject average: 787.95 ops/sec
    Without auto-inject median: 797 ops/sec
    With auto-inject median: 792 ops/sec
    ---
    Diff average: 0.972%
    Diff median: 0.629%

These are aggregated results over 20 runs x 100 samples per run. Looks okay, so I guess we could just auto-inject the `g` flag, just as you've suggested. It needs to be mentioned in the docs as well. Feel free to make a PR. (I can't commit this change myself right now, because I'm on a diverged branch with a hot experimental mess.)
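
To make the auto-inject idea concrete, here is a minimal self-contained sketch of what the snipped `parse` body could look like. The `Result` and `Parser` types below are simplified stand-ins written for this example, not Sigma's actual definitions, and the matching logic is an assumption about how a `lastIndex`-based parser would typically work.

```typescript
// Simplified, hypothetical stand-ins for the library's types; the real ones may differ.
type Result<T> =
  | { isOk: true; pos: number; value: T }
  | { isOk: false; pos: number; expected: string }

interface Parser<T> {
  parse(input: string, pos: number): Result<T>
}

function regexp(rs: RegExp, expected: string): Parser<string> {
  // Appending 'g' to flags that already contain it throws a SyntaxError, hence the check.
  const re = rs.global ? rs : new RegExp(rs.source, rs.flags + 'g')

  return {
    parse(input, pos) {
      // With the 'g' flag set, exec() starts searching at lastIndex.
      re.lastIndex = pos
      const m = re.exec(input)

      // Accept only a match that begins exactly at the current position.
      if (m !== null && m.index === pos) {
        return { isOk: true, pos: pos + m[0].length, value: m[0] }
      }

      return { isOk: false, pos, expected }
    }
  }
}

// Usage: the consumer passes a regex without 'g' and it still works.
const digits = regexp(/\d+/, 'digits')
const ok = digits.parse('ab123', 2)
const fail = digits.parse('ab123', 0)
```

Building the new `RegExp` once, outside `parse`, keeps the cost out of the hot path, which is consistent with the small overhead measured above.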
In this issue I've learnt that the `g` flag must be used in order to make the `regexp` parser work correctly.

Should the `regexp` parser check for itself whether the `g` flag was used? It would be a minor improvement, but it would prevent many mistakes. I think there is always a chance of forgetting to add this flag, even if you know about the rule.

Maybe the paragraph in the documentation explaining this requirement should be emphasized to make it more obvious. But I don't think documentation alone would be enough; the rule should be validated programmatically.
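
For context, the reason the flag matters is standard JavaScript behavior: `RegExp.prototype.exec` only honors `lastIndex` when the regex has the `g` (or `y`) flag. The short sketch below illustrates this; `searchFrom` is a hypothetical helper written for this example, not part of the library.

```typescript
// Hypothetical helper: seed lastIndex with `pos`, run exec(), and report where the match starts.
function searchFrom(re: RegExp, input: string, pos: number): number | null {
  re.lastIndex = pos
  const m = re.exec(input)
  return m === null ? null : m.index
}

const sample = 'abc abc'

// Without 'g' (or 'y'), exec() ignores lastIndex and searches from index 0.
const withoutG = searchFrom(/abc/, sample, 4)

// With 'g', the search starts at lastIndex, so the second 'abc' is found.
const withG = searchFrom(/abc/g, sample, 4)

console.log(withoutG, withG) // 0 4
```

A parser that relies on `lastIndex` to resume at the current position would therefore silently match from the start of the input whenever the flag is missing.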
Adding this check somewhere would be a great addition:
sigma/src/parsers/regexp.ts
Line 17 in 567b6f9
I can submit a PR myself if you agree with me on this point!
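
A minimal sketch of the check proposed here, assuming it would run once when the parser is constructed. The function name `assertGlobal` and the error message are illustrative and not taken from the library.

```typescript
// Hypothetical guard: throws early if the regex lacks the 'g' flag, otherwise returns it unchanged.
function assertGlobal(re: RegExp): RegExp {
  if (!re.global) {
    throw new TypeError(
      `regexp parser requires the 'g' flag, got /${re.source}/${re.flags}`
    )
  }

  return re
}

// Usage: a global regex passes through, a non-global one fails fast.
const accepted = assertGlobal(/\d+/g)

let rejected = false
try {
  assertGlobal(/\d+/)
} catch (error) {
  rejected = error instanceof TypeError
}
```

Failing at construction time surfaces the mistake where the parser is defined, instead of as a confusing mismatch later during parsing.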