Option for dealing with unusual spaces #32
Do you just want to convert these non-breaking spaces to regular spaces, or do you want to potentially collapse both single and multiple non-breaking spaces down to just one (regular) space? The reason I ask is that it might be prudent to enhance the
* Now collapses all multibyte whitespace (non-breaking, joiner, separator characters) down to just a regular space.
* This better mimics the multibyte leading and trailing whitespace stripping behavior.
Ref #32
@GeorgeDewar As an experiment, I added multi-byte space collapsing support to the
Example Usage:
    class Comment < ActiveRecord::Base
      strip_attributes collapse_spaces: true
    end
To test, edit your Gemfile to point at the master branch:
    gem "strip_attributes", github: "rmm5t/strip_attributes"
If this behavior still doesn't suffice, could you please elaborate on your use-case where you want to replace non-breaking spaces, but also avoid collapsing them?
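For readers who want to see roughly what collapsing multibyte whitespace means, here is a minimal plain-Ruby sketch. It is not the gem's implementation: the helper name `collapse_unusual_spaces`, the constant `UNUSUAL_SPACE_RUN`, and the exact set of characters matched are assumptions made for illustration.

```ruby
# Hypothetical helper, NOT part of strip_attributes.
# Assumption: "unusual spaces" means the horizontal Unicode space characters
# listed below; the real gem may match a different set.
UNUSUAL_SPACE_RUN = /[ \t\u00A0\u2000-\u200A\u202F\u205F\u3000]+/

# Replaces every run of regular or unusual horizontal whitespace
# with a single regular space. Non-string values pass through unchanged.
def collapse_unusual_spaces(value)
  return value unless value.is_a?(String)

  value.gsub(UNUSUAL_SPACE_RUN, " ")
end

# Usage: two non-breaking spaces between words become one regular space.
puts collapse_unusual_spaces("hello\u00A0\u00A0world")
```

Note that this sketch deliberately leaves newlines alone, so multi-line text keeps its line breaks.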
Thanks @rmm5t! I believe this solves our problem. Collapsing consecutive spaces is not necessary for us, but not harmful either. I didn't think of the
We have an application that deals with inbound emails from various sources, and some email clients do strange things with spaces, including using non-breaking spaces instead of normal spaces, or including the occasional zero-width space (Outlook Web Access does that). These cause various problems for us when trying to process the text in certain ways.
@GeorgeDewar Great. I just published v1.8.0 to rubygems with this feature.
Strip_attributes currently supports custom regex for characters or patterns that should be removed. However, a requirement that we (and perhaps others) have is to turn non-breaking spaces (U+00A0) into spaces.
I am wondering if you think that this would sit well in strip_attributes, either as an optional normalise_special_spaces type feature, or as an option to declare a custom regex replacement (i.e. in my case, strip_attributes :replace => [/\u00A0/, " "])?
I might be willing to contribute the feature if it is desirable.
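The one-to-one replacement requested here (each non-breaking space becomes a regular space, with no collapsing) can be sketched in plain Ruby. The `:replace` option above is only a proposal in this issue, so the sketch does not use it; the helper name `replace_nbsp` and the constant `NBSP` are hypothetical.

```ruby
# Hypothetical helper, NOT part of strip_attributes.
# Swaps each non-breaking space (U+00A0) for a regular space, one for one,
# so runs of spaces keep their original length.
NBSP = "\u00A0"

def replace_nbsp(value)
  return value unless value.is_a?(String)

  value.gsub(NBSP, " ")
end

# Usage: the two non-breaking spaces become two regular spaces.
puts replace_nbsp("hello\u00A0\u00A0world").inspect
```

A helper like this could be called from a `before_validation` callback on the attributes that need it, at the cost of maintaining it outside the gem.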