-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Thoughts about using an eslint plugin for REDOS detection? #1201
Comments
Tagging @UziTech |
It would be really nice to get some sort of ReDoS test on travis. Could we make it so the server side code runs on travis instead of making http requests? |
Testing a regex is expensive (potentially 60 seconds+ per regex) so memoizing them is helpful. Then that memoized result has to live somewhere. My impression (could be wrong) is that Travis isn't a good place for permanent storage. |
In my experience http requests from travis seem to be the least reliable part of travis and "one synchronous HTTP query for each regex" could be a lot of http requests for every pr. Could we download the memoized results and check the regexes ourselves or send a bunch of regexes at once to limit the number of http requests? |
100MB+, not a good option.
I don't think you can batch nicely with an eslint plugin. The "is vulnerable?" function gets called for each regex node in the AST and doesn't have a bigger sense of the context. If you want to set up a post-eslint "security scanner" that runs as part of
About 100 requests. grep '/.*/' lib/marked.js | grep -v '\/\/' | wc -l
98 |
I would rather it be separate from eslint so we could run tests, run redos tests, then run lint. I feel like that would make it easier to see what failed in travis. |
@UziTech The eslint plugin is now available here. Here are the steps I took:
Here's what I found:
See Slack for additional notes. |
@davisjam First of all, this is amazing work! However, I agree with UziTech that this is less than ideal for every build to fire off HTTP requests. Especially for a "lint" test. Linting should be fast and it would be very odd to be working on code in a Plane without wifi and you can't run lint for formatting errors. Can this be a separate build step? Why does it need to be an eslint plugin? |
I agree with @styfle. I have my IDE running eslint in the background and underlining code as I type, it would be pretty annoying to have to wait 16 seconds to get that visual cue as I'm typing. |
If the server is inaccessible, the vuln-regex-detector module returns "INVALID" and the eslint plugin ignores that regex. So there wouldn't be linter errors. It would still be slow though.
There's no particular reason it has to be an eslint plugin, other than convenience. I was exploring an eslint plugin for REDOS detection, along the lines of eslint-plugin-safe-regex and eslint-plugin-security. eslint handles the jobs any such tool needs:
Identifying REDOS issues without false positives requires either (1) expensive local computation or (2) hitting a server. So running this at the same time as the other linting rules (e.g. for use in an IDE as @UziTech notes) is not ideal because of the performance hit. Does anyone know of a good pattern for running this kind of check during the programming cycle? What are your thoughts about adding it as part of |
|
How about this?
It's still slow because it's making synchronous requests for each regex. But since it's off the linter fast path I imagine that's OK now. If acceptable, I'll open a PR. |
Time estimate: From an AWS micro instance, it takes 25 seconds to run I'm working on a local cache of results so on dev machines the overhead would be minimal. On Travis it would still be in the "cost of 100 queries" range. |
One other thought I just had: since this database is remotely managed, we can't easily tell if a given PR introduced a vulnerability or if the database was updated and the PR was the first to run CI. Have you thought about how to solve this problem? |
@styfle I don't see a good way around that with the "query a remote server" configuration. The blame may be indiscriminate in general. If devs run If developers really cared, the vuln-regex-detector module is in the client subdir of the vuln-regex-detector project. Developers could use aspects of the larger project as an application instead (see the |
Another thought: Because we can't easily distinguish between "this PR introduced a vulnerable regex" and "some other PR did so and we're the first PR to notice", we might ship a release with vulnerable regexes and only notice afterward. That's not a great situation. But it's better than not noticing. |
@styfle I think that is a problem we will have to assess on a case by case bases. Maybe set the regex test to allow failure on travis and check it manually. @davisjam I'm curious what the chance is for a false positive (invulnerable regex seen as vulnerable)? The eslint plugin might not be the best solution in our case since we have so many regexes and especially since we construct the regexes. I would think a better way for us to test the regexes would be to run the code (so the regexes are constructed) then send those regexes to the server (possibly with fewer http requests). something like: const vulnRegexDetector = require('vuln-regex-detector');
const marked = require('../lib/marked.js');
for(const name in marked.Lexer.rules) {
const regex = marked.Lexer.rules[name];
vulnRegexDetector.test(regex)
.then((result) => {
if (result === vulnRegexDetector.responses.vulnerable) {
console.log('Regex is vulnerable');
} else if (result === vulnRegexDetector.responses.safe) {
console.log('Regex is safe');
} else {
console.log('Not sure if regex is safe or not');
}
});
} |
That only tests the block level normal rules. We should probably refactor marked so the regexes are easier to pull out for testing. |
There are no false positives (famous last words). It might have false negatives though, most obviously if the detectors aren't aware of a class of vulnerabilities.
This approach requires that we keep the test in sync with the code. The approach I proposed will test every regex, albeit somewhat slowly in a CI context. But I don't view 30-60 seconds as a huge problem given the overhead of CI in general. For |
The problem with the eslint approach is that it won't catch regexes that are constructed into vulnerable regexes. example: block.vuln = edit(/(test)+/)
.replace('test', /a+/)
.getRegex(); the eslint plugin will test |
Entirely correct. When I said "there are no false positives" I meant for the regexes it considers -- namely statically defined ones that can be identified in an eslint pass. Is this a concern in marked? |
Yes, the example above is the way most of the regexes in marked are constructed. I propose that when we modularize marked we create a Regex "class" that holds all the regexes and handles the construction so it will be trivial to test them all. |
I didn't know that. The vulnerable regexes I've seen in marked are not constructed in this way. My initial scan used a static analysis similar to that of eslint. No static analysis will be perfect, but it seems like the |
@davisjam Is there a way to send multiple regexes in one http request with the npm package? |
Not yet. See this issue. |
I thought about this problem a little more and I think the best way to solve this problem is to accept a version parameter (or a date parameter) in the API call. This way, we can run the check for every PR and a failure will tell us that the new regex in the PR caused the failure. Then once a week/month, we can create a new PR to bump the version parameter (or date) parameter so that new vulnerabilities are checked in this new PR and can be fixed before merging. It requires manual updates but it avoids the scenario where we break a build for something like docs that isn't even touching the regex code. Thoughts? |
Since we're talking about security issues here, I would rather break the build as early as possible. |
I guess we'll have to click through on PRs that fail to see if its the |
vuln-regex-detector v1.0.2 (eslint-plugin-vuln-regex-detector v1.0.3) now maintains a local persistent cache in the FS. The first In a CI setting you'll pay the full cost of a cold cache. |
Seems reasonable to me. We can discuss details in Slack. |
- 'npm run test:redos' now scans for REDOS issues - added a Travis stage for 'security scan' Fixes: markedjs#1201 (a step towards it, anyway)
- 'npm run test:redos' now scans for REDOS issues - added a Travis stage for 'security scan' Fixes: markedjs#1201 (a step towards it, anyway)
This is a question for discussion, not an issue.
Once it's ready, does anyone have concerns with adding my REDOS plugin to this project's eslint config?
The text was updated successfully, but these errors were encountered: