-
Notifications
You must be signed in to change notification settings - Fork 83
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix ReDoS Vulnerability in PermitScrubber and Add Performance Test #186
Conversation
I'm looking into the JRuby failures which are unrelated to this change, but likely a bug that has been exposed by the test. |
JRuby failures should be fixed with #188, I've rebased this branch and force-pushed it. |
@ch4n3-yoon I appreciate the fact that you wrote a benchmark test, but it seems brittle, failing often for me in development and in CI (see https://github.com/rails/rails-html-sanitizer/actions/runs/10372673519/job/28715948562?pr=186). I'm open to you putting more work into it to be less brittle, but I am also fine with removing the test. Up to you. |
…mitScrubber" This reverts commit f461d2a.
Hi @flavorjones, thank you for the feedback. Given the brittleness of the benchmark test, I've decided to remove it to prevent potential instability in development and CI. The primary fix for the ReDoS vulnerability remains effective. |
@@ -150,7 +150,7 @@ def scrub_attribute(node, attr_node) | |||
Loofah::HTML5::Scrub.scrub_attribute_that_allows_local_ref(attr_node) | |||
end | |||
|
|||
if Loofah::HTML5::SafeList::SVG_ALLOW_LOCAL_HREF.include?(node.name) && attr_name == "xlink:href" && attr_node.value =~ /^\s*[^#\s].*/m | |||
if Loofah::HTML5::SafeList::SVG_ALLOW_LOCAL_HREF.include?(node.name) && attr_name == "xlink:href" && attr_node.value =~ /^\s*[^#].*/m |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the meaning of the regexp is changed. It began to match to any character that starts with /\s/
.
I'm not sure what the original regexp want to match to. maybe:
/^[^\S\n]*[^#\s]/
: nearly or identical to the original regexp/\A\s*[^#\s]/
: matches to"\n\n a..."
but doesn't match to" #\n #\n a"
or perhaps using attr_node.value !~ allow_pattern
would be more simple?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This regular expression originally came from the html5lib
library:
https://github.com/html5lib/html5lib-ruby/blob/master/lib/html5/sanitizer.rb#L141-L143
The goal of this line is to accept "local links" (like #some-id
) but not full URLs, with initial whitespace being ignored. There are test cases for this behavior in Loofah, which rails-html-sanitizer originally copied this code from.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So @tompng is correct, this patch changes the behavior in unwanted ways. One example test case is:
<animatemotion xlink:href='\n#foo'>/</animatemotion>
Specifically, in this case the xlink:href
attribute should not be removed but is being removed because:
val = "\n#foo"
# => "\n#foo"
val =~ /^\s*[^#\s].*/
# => nil
val =~ /^\s*[^#].*/
# => 0
@@ -1,5 +1,6 @@ | |||
# frozen_string_literal: true | |||
|
|||
require "benchmark" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove this line, it's no longer necessary.
Since this patch has some unwanted behavior changes, I'd like to suggest we close this and discuss the approach in the hackerone issue. @ch4n3-yoon if you agree, I'll close this. |
This pull request addresses a Regular Expression Denial of Service (ReDoS) vulnerability in the
PermitScrubber
class of therails-html-sanitizer
library. The vulnerability, caused by a regex pattern with potential for excessive backtracking, has been mitigated by optimizing the regular expression used in thescrub_attribute
method.