
Commit

Added multibyte whitespace support to :collapse_spaces option
* Now collapses all multibyte whitespace (non-breaking, joiner,
  separator characters) down to just a regular space.
* This better mimics the multibyte leading and trailing whitespace
  stripping behavior

Ref #32
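
To illustrate the behavior described in the commit message, here is a minimal sketch in Ruby. The helper name `collapse_blanks` and the sample string are hypothetical and not part of the gem; the character list is copied from the `MULTIBYTE_WHITE` constant in the diff. It assumes a UTF-8 string, where `[[:blank:]]` also matches Unicode space separators such as U+00A0.

```ruby
# Hypothetical helper (not the gem's API): replace each run of blank
# characters, including the zero-width characters listed in the diff,
# with a single regular space.
def collapse_blanks(text)
  multibyte_white = "\u180E\u200B\u200C\u200D\u2060\uFEFF"
  text.gsub(/[[:blank:]#{multibyte_white}]+/, " ")
end

# Sample input: "1", space, NBSP, space, "2", NBSP, ZERO WIDTH SPACE, "3".
sample = "1 \u00A0 2\u00A0\u200B3"

# String#squeeze(" ") only merges adjacent U+0020 characters, so it
# leaves this sample unchanged.
squeezed = sample.squeeze(" ")

# The regex-based version treats every blank run as one separator.
collapsed = collapse_blanks(sample)
```

In this sketch `squeezed` still contains the non-breaking and zero-width characters, while `collapsed` contains only regular single spaces between the digits.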
rmm5t committed May 31, 2016
1 parent f250d79 commit 14f6a35
Showing 2 changed files with 11 additions and 2 deletions.
9 changes: 7 additions & 2 deletions lib/strip_attributes.rb
@@ -34,6 +34,7 @@ module StripAttributes
# U+FEFF ZERO WIDTH NO-BREAK SPACE
MULTIBYTE_WHITE = "\u180E\u200B\u200C\u200D\u2060\uFEFF"
MULTIBYTE_SPACE = /[[:space:]#{MULTIBYTE_WHITE}]/
+MULTIBYTE_BLANK = /[[:blank:]#{MULTIBYTE_WHITE}]/
MULTIBYTE_SUPPORTED = "\u0020" == " "

def self.strip(record_or_string, options = nil)
@@ -82,8 +83,12 @@ def self.strip_string(value, options = nil)
value.gsub!(/[\r\n]+/, " ")
end

-if collapse_spaces && value.respond_to?(:squeeze!)
-  value.squeeze!(' ')
+if collapse_spaces
+  if MULTIBYTE_SUPPORTED && value.respond_to?(:gsub!) && Encoding.compatible?(value, MULTIBYTE_BLANK)
+    value.gsub!(/#{MULTIBYTE_BLANK}+/, " ")
+  elsif value.respond_to?(:squeeze!)
+    value.squeeze!(" ")
+  end
end

value
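
The `Encoding.compatible?` guard in the hunk above appears to keep the UTF-8 regex away from strings it cannot be matched against, falling back to a plain squeeze. A rough sketch of that idea follows; `collapse_with_fallback` and both sample strings are hypothetical, and the non-mutating `gsub`/`squeeze` are used here only to keep the example simple.

```ruby
# Hypothetical helper mirroring the guard in the diff (not the gem's API).
def collapse_with_fallback(value)
  blank_run = /[[:blank:]\u180E\u200B\u200C\u200D\u2060\uFEFF]+/
  if Encoding.compatible?(value, blank_run)
    # Encodings agree, so the multibyte-aware regex is safe to apply.
    value.gsub(blank_run, " ")
  else
    # Incompatible encoding: only merge adjacent regular spaces.
    value.squeeze(" ")
  end
end

# A UTF-8 string containing a non-breaking space between regular spaces.
utf8 = "a \u00A0 b"

# A binary (ASCII-8BIT) string with a non-ASCII byte; matching it against
# a UTF-8 regex would raise Encoding::CompatibilityError.
binary = "a  \xFF  b".b

utf8_result = collapse_with_fallback(utf8)
binary_result = collapse_with_fallback(binary)
```

Here the UTF-8 input takes the regex path and the binary input takes the squeeze path, which is the same split the `if`/`elsif` in the diff makes.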
4 changes: 4 additions & 0 deletions test/strip_attributes_test.rb
@@ -258,6 +258,10 @@ def test_should_collapse_spaces
assert_equal "1 2 3", StripAttributes.strip(" 1 2 3\t ", :collapse_spaces => true)
end

+def test_should_collapse_multibyte_spaces
+  assert_equal "1 2 3", StripAttributes.strip(" 1 \u00A0 2\u00A03\t ", :collapse_spaces => true)
+end

def test_should_replace_newlines
assert_equal "1 2", StripAttributes.strip("1\n2", :replace_newlines => true)
assert_equal "1 2", StripAttributes.strip("1\r\n2", :replace_newlines => true)