Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace Sanitize with Loofah. Resolves #320 #351

Merged
merged 2 commits into from
Mar 13, 2020

Conversation

dometto
Copy link
Member

@dometto dometto commented Oct 10, 2019

Sanitize is no longer being maintained for JRuby (see #320). Loofah is the sanitizer used in the rails asset pipeline. It wasn't all that difficult to plug it in and get the tests passing, and I think it's actually a bit easier to understand what's happening than with Sanitize.

Loofah has some sane defaults for safe HTML5 elements: https://github.com/flavorjones/loofah/blob/master/lib/loofah/html5/safelist.rb#L47, so we don't have to maintain that ourselves. End users can still override these settings by overriding Loofah::HTML5::SafeList.

@dometto
Copy link
Member Author

dometto commented Oct 10, 2019

  • Sanitize's comparison:
    • The benchmarks there compare Sanitize vs an older Loofah (Sanitize wins)
    • @bartkamphorst more importantly, what do you think of the other differences?
  • And here for more recent benchmarks
    • More recent Loofah wins against Sanitize

@dometto
Copy link
Member Author

dometto commented Oct 10, 2019

From Sanitize's comparison:

Sanitize performs advanced whitelist-based CSS sanitization using Crass, a full-fledged CSS parser compliant with the CSS Syntax Module Level 3 parsing spec. Loofah and HTMLFilter both perform rudimentary regex-based CSS sanitization, but I wouldn't trust either of them to actually sanitize maliciously crafted CSS.

This appears to be out of date. Loofah also uses Crass.

@dometto
Copy link
Member Author

dometto commented Oct 10, 2019

Loofah is tested on JRuby, but see here

@dometto dometto changed the title Replace Sanitize with Loofah Replace Sanitize with Loofah. Resolves #320 Oct 10, 2019
Copy link
Member

@bartkamphorst bartkamphorst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR is looking good. Should we wait out a little longer with merging though until we hear back about JRuby support?

regexp = %r{^((#{accepted_protocols.join("|")}):)}
end
if url =~ regexp
@accepted_protocols_regex ||= %r{^((#{::Gollum::Sanitization.accepted_protocols.join('|')}):)?(//)}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this stuff about :relative no longer relevant?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can see Sanitize had a separate flag for allowing :relative, loofah just accepts it.

What I don't really understand is the difference between the two regexes. Previously, if :relative was not allowed, the regex would match URI scheme's of the form http:, and not require http://. I don't know why the double slashes were not required in that case. Any thoughts?

lib/gollum-lib/sanitization.rb Show resolved Hide resolved
lib/gollum-lib/sanitization.rb Show resolved Hide resolved
lib/gollum-lib/sanitization.rb Show resolved Hide resolved
lib/gollum-lib/sanitization.rb Show resolved Hide resolved
@dometto
Copy link
Member Author

dometto commented Oct 11, 2019

PR is looking good. Should we wait out a little longer with merging though until we hear back about JRuby support?

👍

@@ -101,7 +101,7 @@ def name
#
# Returns the fully sanitized String title.
def title
Sanitize.clean(name).strip
@wiki.sanitizer.clean(name).strip
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively we could make a direct Loofah :strip or :prune call here (no need for the postprocessing in the Sanitizer class here).

@dometto
Copy link
Member Author

dometto commented Mar 12, 2020

flavorjones/loofah#88 is showing progress on getting loofah on JRuby to pass the test suite without any errors. I am inclined to merge this and switch to loofah. There are a couple of known errors, but that seems better to me than being tied to an old version of sanitize. What do you think, @bartkamphorst?

@dometto dometto merged commit 4827d4c into gollum:gollum-lib-5.x Mar 13, 2020
@dometto dometto deleted the loofah branch March 13, 2020 13:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants