Merge branch 'jf.safelist'
flavorjones committed Sep 28, 2019
2 parents 6c5ff2d + 775ab31 commit 46daa07
Showing 13 changed files with 82 additions and 56 deletions.
11 changes: 11 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -17,6 +17,17 @@
* CSS hex values are no longer limited to lowercase hex. Previously uppercase hex were scrubbed. [#165] (Thanks, @asok!)


### Deprecations / Name Changes

The following method and constants are hereby deprecated, and will be completely removed in a future release:

* Deprecate `Loofah::Helpers::ActionView.white_list_sanitizer`, please use `Loofah::Helpers::ActionView.safe_list_sanitizer` instead.
* Deprecate `Loofah::Helpers::ActionView::WhiteListSanitizer`, please use `Loofah::Helpers::ActionView::SafeListSanitizer` instead.
* Deprecate `Loofah::HTML5::WhiteList`, please use `Loofah::HTML5::SafeList` instead.

Thanks to @JuanitoFatas for submitting these changes in #164 and for making the language used in Loofah more inclusive.
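
The constant deprecations listed above follow a common Ruby pattern: keep the old name as an alias of the new module and mark it with `Module#deprecate_constant`. A minimal sketch of that pattern, assuming a hypothetical `Demo` namespace and a made-up `ALLOWED_ELEMENTS` list (neither is Loofah's real data):

```ruby
# Hypothetical namespace for illustration only; not Loofah's actual module.
module Demo
  module SafeList
    ALLOWED_ELEMENTS = %w[a b p].freeze
  end

  # The old name points at the very same module object, so existing callers keep working.
  WhiteList = SafeList

  # Module#deprecate_constant is available on Ruby 2.3 and later; guard for older Rubies.
  deprecate_constant :WhiteList if respond_to?(:deprecate_constant)
end

# Both names resolve to one module. Referencing the old name may print a
# deprecation warning, depending on the Ruby version and warning settings.
puts Demo::SafeList::ALLOWED_ELEMENTS.inspect
puts Demo::WhiteList.equal?(Demo::SafeList)
```

Because the alias is the same object rather than a copy, anything added to the new module is visible through the old name as well.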


## 2.2.3 / 2018-10-30

### Security
2 changes: 1 addition & 1 deletion Manifest.txt
@@ -17,7 +17,7 @@ lib/loofah/html/document.rb
lib/loofah/html/document_fragment.rb
lib/loofah/html5/libxml2_workarounds.rb
lib/loofah/html5/scrub.rb
lib/loofah/html5/whitelist.rb
lib/loofah/html5/safelist.rb
lib/loofah/instance_methods.rb
lib/loofah/metahelpers.rb
lib/loofah/scrubber.rb
6 changes: 3 additions & 3 deletions README.md
@@ -19,7 +19,7 @@ documents and fragments. It's built on top of Nokogiri and libxml2, so
it's fast and has a nice API.

Loofah excels at HTML sanitization (XSS prevention). It includes some
nice HTML sanitizers, which are based on HTML5lib's whitelist, so it
nice HTML sanitizers, which are based on HTML5lib's safelist, so it
most likely won't make your codes less secure. (These statements have
not been evaluated by Netexperts.)

@@ -29,7 +29,7 @@ ActiveRecord extensions for sanitization are available in the

## Features

* Easily write custom scrubbers for HTML/XML leveraging the sweetness of Nokogiri (and HTML5lib's whitelists).
* Easily write custom scrubbers for HTML/XML leveraging the sweetness of Nokogiri (and HTML5lib's safelists).
* Common HTML sanitizing tasks are built-in:
* _Strip_ unsafe tags, leaving behind only the inner text.
* _Prune_ unsafe tags and their subtrees, removing all traces that they ever existed.
@@ -221,7 +221,7 @@ Loofah.xml_document(File.read('plague.xml')).scrub!(bring_out_your_dead)
=== Built-In HTML Scrubbers

Loofah comes with a set of sanitizing scrubbers that use HTML5lib's
whitelist algorithm:
safelist algorithm:

``` ruby
doc.scrub!(:strip) # replaces unknown/unsafe tags with their inner text
6 changes: 3 additions & 3 deletions Rakefile
@@ -70,9 +70,9 @@ task :doc_upload_to_rubyforge => :docs do
end
end

desc "generate whitelists from W3C specifications"
task :generate_whitelists do
load "tasks/generate-whitelists"
desc "generate safelists from W3C specifications"
task :generate_safelists do
load "tasks/generate-safelists"
end

Concourse.new("loofah", fly_target: "ci") do |c|
2 changes: 1 addition & 1 deletion lib/loofah.rb
@@ -5,7 +5,7 @@
require 'loofah/metahelpers'
require 'loofah/elements'

require 'loofah/html5/whitelist'
require 'loofah/html5/safelist'
require 'loofah/html5/libxml2_workarounds'
require 'loofah/html5/scrub'

16 changes: 13 additions & 3 deletions lib/loofah/helpers.rb
@@ -46,8 +46,13 @@ def full_sanitizer
@full_sanitizer ||= ::Loofah::Helpers::ActionView::FullSanitizer.new
end

def safe_list_sanitizer
@safe_list_sanitizer ||= ::Loofah::Helpers::ActionView::SafeListSanitizer.new
end

def white_list_sanitizer
@white_list_sanitizer ||= ::Loofah::Helpers::ActionView::WhiteListSanitizer.new
warn "warning: white_list_sanitizer is deprecated, please use safe_list_sanitizer instead."
safe_list_sanitizer
end
end

@@ -73,13 +78,13 @@ def sanitize html, *args
#
# To use by default, call this in an application initializer:
#
# ActionView::Helpers::SanitizeHelper.white_list_sanitizer = ::Loofah::Helpers::ActionView::WhiteListSanitizer.new
# ActionView::Helpers::SanitizeHelper.safe_list_sanitizer = ::Loofah::Helpers::ActionView::SafeListSanitizer.new
#
# Or, to generally opt-in to Loofah's view sanitizers:
#
# Loofah::Helpers::ActionView.set_as_default_sanitizer
#
class WhiteListSanitizer
class SafeListSanitizer
def sanitize html, *args
Loofah::Helpers.sanitize html
end
@@ -88,6 +93,11 @@ def sanitize_css style_string, *args
Loofah::Helpers.sanitize_css style_string
end
end

WhiteListSanitizer = SafeListSanitizer
if Object.respond_to?(:deprecate_constant)
deprecate_constant :WhiteListSanitizer
end
end
end
end
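
The helper change above keeps the old accessor alive by warning and then delegating to the new one. A minimal sketch of that shape, assuming a hypothetical `DemoHelpers` module and a trivial stand-in sanitizer class (the real sanitizer calls `Loofah::Helpers.sanitize`):

```ruby
# Hypothetical module for illustration; it mirrors the shape of the change, not Loofah itself.
module DemoHelpers
  # Stand-in sanitizer: the real class delegates to Loofah, this one only stringifies.
  class SafeListSanitizer
    def sanitize(html, *_args)
      html.to_s
    end
  end

  class << self
    # New accessor: memoizes a single sanitizer instance.
    def safe_list_sanitizer
      @safe_list_sanitizer ||= SafeListSanitizer.new
    end

    # Deprecated accessor: warns on stderr, then returns the same memoized instance.
    def white_list_sanitizer
      warn "warning: white_list_sanitizer is deprecated, please use safe_list_sanitizer instead."
      safe_list_sanitizer
    end
  end
end

sanitizer = DemoHelpers.white_list_sanitizer
puts sanitizer.equal?(DemoHelpers.safe_list_sanitizer)
```

Delegating instead of duplicating the memoization means both accessors share one instance, so callers on either name see the same object.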
11 changes: 8 additions & 3 deletions lib/loofah/html5/whitelist.rb → lib/loofah/html5/safelist.rb
@@ -3,7 +3,7 @@
module Loofah
module HTML5 # :nodoc:
#
# HTML whitelist lifted from HTML5lib sanitizer code:
# HTML safelist lifted from HTML5lib sanitizer code:
#
# http://code.google.com/p/html5lib/
#
@@ -44,7 +44,7 @@ module HTML5 # :nodoc:
# DEALINGS IN THE SOFTWARE.
#
# </html5_license>
module WhiteList
module SafeList

ACCEPTABLE_ELEMENTS = Set.new([
"a",
@@ -790,6 +790,11 @@ module WhiteList
ALLOWED_ELEMENTS_WITH_LIBXML2 = ALLOWED_ELEMENTS + TAGS_SAFE_WITH_LIBXML2
end

::Loofah::MetaHelpers.add_downcased_set_members_to_all_set_constants ::Loofah::HTML5::WhiteList
WhiteList = SafeList
if Object.respond_to?(:deprecate_constant)
deprecate_constant :WhiteList
end

::Loofah::MetaHelpers.add_downcased_set_members_to_all_set_constants ::Loofah::HTML5::SafeList
end
end
26 changes: 13 additions & 13 deletions lib/loofah/html5/scrub.rb
@@ -12,7 +12,7 @@ module Scrub
class << self

def allowed_element? element_name
::Loofah::HTML5::WhiteList::ALLOWED_ELEMENTS_WITH_LIBXML2.include? element_name
::Loofah::HTML5::SafeList::ALLOWED_ELEMENTS_WITH_LIBXML2.include? element_name
end

# alternative implementation of the html5lib attribute scrubbing algorithm
@@ -28,31 +28,31 @@ def scrub_attributes node
next
end

unless WhiteList::ALLOWED_ATTRIBUTES.include?(attr_name)
unless SafeList::ALLOWED_ATTRIBUTES.include?(attr_name)
attr_node.remove
next
end

if WhiteList::ATTR_VAL_IS_URI.include?(attr_name)
if SafeList::ATTR_VAL_IS_URI.include?(attr_name)
# this block lifted nearly verbatim from HTML5 sanitization
val_unescaped = CGI.unescapeHTML(attr_node.value).gsub(CONTROL_CHARACTERS,'').downcase
if val_unescaped =~ /^[a-z0-9][-+.a-z0-9]*:/ && ! WhiteList::ALLOWED_PROTOCOLS.include?(val_unescaped.split(WhiteList::PROTOCOL_SEPARATOR)[0])
if val_unescaped =~ /^[a-z0-9][-+.a-z0-9]*:/ && ! SafeList::ALLOWED_PROTOCOLS.include?(val_unescaped.split(SafeList::PROTOCOL_SEPARATOR)[0])
attr_node.remove
next
elsif val_unescaped.split(WhiteList::PROTOCOL_SEPARATOR)[0] == 'data'
elsif val_unescaped.split(SafeList::PROTOCOL_SEPARATOR)[0] == 'data'
# permit only allowed data mediatypes
mediatype = val_unescaped.split(WhiteList::PROTOCOL_SEPARATOR)[1]
mediatype = val_unescaped.split(SafeList::PROTOCOL_SEPARATOR)[1]
mediatype, _ = mediatype.split(';')[0..1] if mediatype
if mediatype && !WhiteList::ALLOWED_URI_DATA_MEDIATYPES.include?(mediatype)
if mediatype && !SafeList::ALLOWED_URI_DATA_MEDIATYPES.include?(mediatype)
attr_node.remove
next
end
end
end
if WhiteList::SVG_ATTR_VAL_ALLOWS_REF.include?(attr_name)
if SafeList::SVG_ATTR_VAL_ALLOWS_REF.include?(attr_name)
attr_node.value = attr_node.value.gsub(/url\s*\(\s*[^#\s][^)]+?\)/m, ' ') if attr_node.value
end
if WhiteList::SVG_ALLOW_LOCAL_HREF.include?(node.name) && attr_name == 'xlink:href' && attr_node.value =~ /^\s*[^#\s].*/m
if SafeList::SVG_ALLOW_LOCAL_HREF.include?(node.name) && attr_name == 'xlink:href' && attr_node.value =~ /^\s*[^#\s].*/m
attr_node.remove
next
end
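
The URI branch of `scrub_attributes` above can be read as a standalone predicate: unescape the value, strip control characters, then check the scheme and, for `data:` URIs, the media type. A simplified sketch follows. The constant contents, the plain `":"` separator, the control-character pattern, and the `uri_value_allowed?` name are all assumptions for illustration, not Loofah's real definitions:

```ruby
require "cgi"
require "set"

# Hypothetical, much smaller stand-ins for the SafeList constants.
ALLOWED_PROTOCOLS       = Set.new(%w[http https mailto data])
ALLOWED_DATA_MEDIATYPES = Set.new(%w[image/png image/gif text/plain])
PROTOCOL_SEPARATOR      = ":"               # simplification; the real separator is a regex
CONTROL_CHARACTERS      = /[\x00-\x20\x7f]/ # simplification of the real pattern

# Returns true when the attribute value would be kept, false when it would be removed.
def uri_value_allowed?(raw_value)
  value = CGI.unescapeHTML(raw_value).gsub(CONTROL_CHARACTERS, "").downcase

  # No scheme at all (for example a relative path): nothing to reject here.
  return true unless value =~ /\A[a-z0-9][-+.a-z0-9]*:/

  scheme, rest = value.split(PROTOCOL_SEPARATOR, 2)
  return false unless ALLOWED_PROTOCOLS.include?(scheme)
  return true unless scheme == "data"

  # For data: URIs, permit only allowed media types; a missing type is let through.
  mediatype = rest.to_s.split(";").first
  mediatype.nil? || ALLOWED_DATA_MEDIATYPES.include?(mediatype)
end

puts uri_value_allowed?("https://example.com")
puts uri_value_allowed?("javascript:alert(1)")
puts uri_value_allowed?("data:text/html;base64,AAAA")
```

Stripping control characters before matching is what catches values such as `java\tscript:`, which would otherwise slip past a naive scheme check.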
@@ -79,14 +79,14 @@ def scrub_css style
style_tree.each do |node|
next unless node[:node] == :property
next if node[:children].any? do |child|
[:url, :bad_url].include?(child[:node]) || (child[:node] == :function && !WhiteList::ALLOWED_CSS_FUNCTIONS.include?(child[:name].downcase))
[:url, :bad_url].include?(child[:node]) || (child[:node] == :function && !SafeList::ALLOWED_CSS_FUNCTIONS.include?(child[:name].downcase))
end
name = node[:name].downcase
if WhiteList::ALLOWED_CSS_PROPERTIES.include?(name) || WhiteList::ALLOWED_SVG_PROPERTIES.include?(name)
if SafeList::ALLOWED_CSS_PROPERTIES.include?(name) || SafeList::ALLOWED_SVG_PROPERTIES.include?(name)
sanitized_tree << node << CRASS_SEMICOLON
elsif WhiteList::SHORTHAND_CSS_PROPERTIES.include?(name.split('-').first)
elsif SafeList::SHORTHAND_CSS_PROPERTIES.include?(name.split('-').first)
value = node[:value].split.map do |keyword|
if WhiteList::ALLOWED_CSS_KEYWORDS.include?(keyword) || keyword =~ CSS_KEYWORDISH
if SafeList::ALLOWED_CSS_KEYWORDS.include?(keyword) || keyword =~ CSS_KEYWORDISH
keyword
end
end.compact
2 changes: 1 addition & 1 deletion lib/loofah/scrubbers.rb
@@ -1,7 +1,7 @@
module Loofah
#
# Loofah provides some built-in scrubbers for sanitizing with
# HTML5lib's whitelist and for accomplishing some common
# HTML5lib's safelist and for accomplishing some common
# transformation tasks.
#
#
4 changes: 2 additions & 2 deletions loofah.gemspec
@@ -9,10 +9,10 @@ Gem::Specification.new do |s|
s.require_paths = ["lib".freeze]
s.authors = ["Mike Dalessio".freeze, "Bryan Helmkamp".freeze]
s.date = "2018-02-12"
s.description = "Loofah is a general library for manipulating and transforming HTML/XML\ndocuments and fragments. It's built on top of Nokogiri and libxml2, so\nit's fast and has a nice API.\n\nLoofah excels at HTML sanitization (XSS prevention). It includes some\nnice HTML sanitizers, which are based on HTML5lib's whitelist, so it\nmost likely won't make your codes less secure. (These statements have\nnot been evaluated by Netexperts.)\n\nActiveRecord extensions for sanitization are available in the\n[`loofah-activerecord` gem](https://github.com/flavorjones/loofah-activerecord).".freeze
s.description = "Loofah is a general library for manipulating and transforming HTML/XML\ndocuments and fragments. It's built on top of Nokogiri and libxml2, so\nit's fast and has a nice API.\n\nLoofah excels at HTML sanitization (XSS prevention). It includes some\nnice HTML sanitizers, which are based on HTML5lib's safelist, so it\nmost likely won't make your codes less secure. (These statements have\nnot been evaluated by Netexperts.)\n\nActiveRecord extensions for sanitization are available in the\n[`loofah-activerecord` gem](https://github.com/flavorjones/loofah-activerecord).".freeze
s.email = ["[email protected]".freeze, "[email protected]".freeze]
s.extra_rdoc_files = ["CHANGELOG.md".freeze, "MIT-LICENSE.txt".freeze, "Manifest.txt".freeze, "README.md".freeze, "CHANGELOG.md".freeze, "README.md".freeze]
s.files = [".gemtest".freeze, "CHANGELOG.md".freeze, "Gemfile".freeze, "MIT-LICENSE.txt".freeze, "Manifest.txt".freeze, "README.md".freeze, "Rakefile".freeze, "benchmark/benchmark.rb".freeze, "benchmark/fragment.html".freeze, "benchmark/helper.rb".freeze, "benchmark/www.slashdot.com.html".freeze, "lib/loofah.rb".freeze, "lib/loofah/elements.rb".freeze, "lib/loofah/helpers.rb".freeze, "lib/loofah/html/document.rb".freeze, "lib/loofah/html/document_fragment.rb".freeze, "lib/loofah/html5/scrub.rb".freeze, "lib/loofah/html5/whitelist.rb".freeze, "lib/loofah/instance_methods.rb".freeze, "lib/loofah/metahelpers.rb".freeze, "lib/loofah/scrubber.rb".freeze, "lib/loofah/scrubbers.rb".freeze, "lib/loofah/xml/document.rb".freeze, "lib/loofah/xml/document_fragment.rb".freeze, "test/assets/testdata_sanitizer_tests1.dat".freeze, "test/helper.rb".freeze, "test/html5/test_sanitizer.rb".freeze, "test/integration/test_ad_hoc.rb".freeze, "test/integration/test_helpers.rb".freeze, "test/integration/test_html.rb".freeze, "test/integration/test_scrubbers.rb".freeze, "test/integration/test_xml.rb".freeze, "test/unit/test_api.rb".freeze, "test/unit/test_encoding.rb".freeze, "test/unit/test_helpers.rb".freeze, "test/unit/test_scrubber.rb".freeze, "test/unit/test_scrubbers.rb".freeze]
s.files = [".gemtest".freeze, "CHANGELOG.md".freeze, "Gemfile".freeze, "MIT-LICENSE.txt".freeze, "Manifest.txt".freeze, "README.md".freeze, "Rakefile".freeze, "benchmark/benchmark.rb".freeze, "benchmark/fragment.html".freeze, "benchmark/helper.rb".freeze, "benchmark/www.slashdot.com.html".freeze, "lib/loofah.rb".freeze, "lib/loofah/elements.rb".freeze, "lib/loofah/helpers.rb".freeze, "lib/loofah/html/document.rb".freeze, "lib/loofah/html/document_fragment.rb".freeze, "lib/loofah/html5/scrub.rb".freeze, "lib/loofah/html5/safelist.rb".freeze, "lib/loofah/instance_methods.rb".freeze, "lib/loofah/metahelpers.rb".freeze, "lib/loofah/scrubber.rb".freeze, "lib/loofah/scrubbers.rb".freeze, "lib/loofah/xml/document.rb".freeze, "lib/loofah/xml/document_fragment.rb".freeze, "test/assets/testdata_sanitizer_tests1.dat".freeze, "test/helper.rb".freeze, "test/html5/test_sanitizer.rb".freeze, "test/integration/test_ad_hoc.rb".freeze, "test/integration/test_helpers.rb".freeze, "test/integration/test_html.rb".freeze, "test/integration/test_scrubbers.rb".freeze, "test/integration/test_xml.rb".freeze, "test/unit/test_api.rb".freeze, "test/unit/test_encoding.rb".freeze, "test/unit/test_helpers.rb".freeze, "test/unit/test_scrubber.rb".freeze, "test/unit/test_scrubbers.rb".freeze]
s.homepage = "https://github.com/flavorjones/loofah".freeze
s.licenses = ["MIT".freeze]
s.rdoc_options = ["--main".freeze, "README.md".freeze]
14 changes: 7 additions & 7 deletions tasks/generate-whitelists → tasks/generate-safelists
@@ -28,12 +28,12 @@ dompurify_metadata.each { |k, v| puts "#{k}: #{v.keys}" }
require "loofah"

pairs = {
"html:tags" => [Loofah::HTML5::WhiteList::ACCEPTABLE_ELEMENTS, dompurify_metadata["tags"]["html"]],
"mathml:tags" => [Loofah::HTML5::WhiteList::MATHML_ELEMENTS, dompurify_metadata["tags"]["mathMl"]],
"svg:tags" => [Loofah::HTML5::WhiteList::SVG_ELEMENTS, dompurify_metadata["tags"]["svg"]],
"html:attrs" => [Loofah::HTML5::WhiteList::ACCEPTABLE_ATTRIBUTES, dompurify_metadata["attrs"]["html"]],
"mathml:attrs" => [Loofah::HTML5::WhiteList::MATHML_ATTRIBUTES, dompurify_metadata["attrs"]["mathMl"]],
"svg:attrs" => [Loofah::HTML5::WhiteList::SVG_ATTRIBUTES, dompurify_metadata["attrs"]["svg"]],
"html:tags" => [Loofah::HTML5::SafeList::ACCEPTABLE_ELEMENTS, dompurify_metadata["tags"]["html"]],
"mathml:tags" => [Loofah::HTML5::SafeList::MATHML_ELEMENTS, dompurify_metadata["tags"]["mathMl"]],
"svg:tags" => [Loofah::HTML5::SafeList::SVG_ELEMENTS, dompurify_metadata["tags"]["svg"]],
"html:attrs" => [Loofah::HTML5::SafeList::ACCEPTABLE_ATTRIBUTES, dompurify_metadata["attrs"]["html"]],
"mathml:attrs" => [Loofah::HTML5::SafeList::MATHML_ATTRIBUTES, dompurify_metadata["attrs"]["mathMl"]],
"svg:attrs" => [Loofah::HTML5::SafeList::SVG_ATTRIBUTES, dompurify_metadata["attrs"]["svg"]],
}

pairs.each do |name, v|
@@ -53,4 +53,4 @@ pairs.each do |name, v|
puts
end

# TODO actually generate whitelists
# TODO actually generate safelists