Strange Unicode characters pass validation, but fail actually sending an email #100
Comments
I'll note that although the codepoint is described as a zero-width space, it does not match Ruby's
I did a little bit of digging into this… My first thought was to check what category the zero-width space belongs to in order to see if the whole category could be blacklisted. It turns out that it's a member of Cf[1] (which explains why it doesn't get caught by the space regex above), and the other characters don't seem like they ought to be in email addresses. Somewhat hopeful, I went looking for confirmation of this. But! RFC 6531, section 3.2 has this cryptic little note:
So… 😕 What does RFC 5321 say? The A dotted string is an An Atom is 1 or more instances of An It's actually in RFC 5322 section 3.2.3
Ha…nevermind!
So! It turns out that a valid email address CAN include a zero-width space, but only if the SMTP server supports the This leaves two major possibilities:
In either case, I believe that the
[1] 'Other, Format' https://www.fileformat.info/info/unicode/category/Cf/list.htm
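The category observation above can be checked directly in Ruby. This is a minimal sketch, not code from this gem: `format_chars?` is a hypothetical helper introduced only for illustration, and it assumes that rejecting every Cf character is acceptable for the application.

```ruby
# Zero-width space (U+200B), written as an escape so it stays visible.
zwsp = "\u200B"

# Hypothetical helper (an assumption, not part of valid_email): true if the
# string contains any character in Unicode general category Cf.
def format_chars?(str)
  str.match?(/\p{Cf}/)
end

address = "user#{zwsp}@example.com"

puts zwsp.match?(/\s/)           # false: \s covers ASCII whitespace only
puts zwsp.match?(/[[:space:]]/)  # false: U+200B is not Unicode whitespace
puts zwsp.match?(/\p{Cf}/)       # true: general category "Other, Format"
puts format_chars?(address)      # true
puts format_chars?("user@example.com") # false
```

Whether such a check belongs in the validator or in the application depends on whether the receiving SMTP server is expected to accept these characters.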
Thank you for the investigation, @wenley and @david-mitchell. I will try to read the other RFC to see which non-ASCII UTF-8 characters are allowed.
Hi! I wanted to revive this thread because I found another example where
I can submit a fix for the emoji issue next week!
Hi @alexevanczuk, thank you for the report. Re-reading this thread, I think one solution would be to add an option that only allows ASCII characters. That depends on the use case, of course: it could be added either to this gem or as another validation in the application.
@hallelujah I am under the impression that many non-ASCII characters are legitimately allowed in email addresses these days, but emojis are not. I was thinking the emoji change could be made to the existing API, and folks can create an "ASCII-only" API later if they want. What do you think?
Actually, looking into it more, it turns out emojis are just a type of Unicode character, hence the existence of this gem: https://github.com/ticky/ruby-emoji-regex. Given this, it probably makes sense to handle them like other Unicode characters. That said, if we can find evidence in the email RFCs that some Unicode characters (perhaps those used in non-English alphabets) are allowed but others (e.g. emojis) are not, it might make more sense to build that into the existing API without a separate flag.
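To make the two options discussed above concrete, here is a rough Ruby sketch. Both helpers are hypothetical and are not part of this gem's API. The second one assumes that general category So ("Symbol, Other") is a usable stand-in for emoji, which is only approximately true: many emoji fall in So, but so do non-emoji symbols such as ©, and some emoji sequences use characters outside it.

```ruby
# Hypothetical helpers for illustration; neither is part of valid_email.

# Strict option: accept only addresses made entirely of ASCII characters.
def ascii_only_address?(address)
  address.ascii_only?
end

# Looser option: allow non-ASCII letters, but flag characters in general
# category So ("Symbol, Other"), where many emoji live (an approximation).
def symbol_chars?(address)
  address.match?(/\p{So}/)
end

plain = "user@example.com"
latin = "jos\u00E9@example.com"      # U+00E9, a lowercase letter (Ll)
emoji = "user\u{1F615}@example.com"  # U+1F615 CONFUSED FACE (So)

puts ascii_only_address?(plain)  # true
puts ascii_only_address?(latin)  # false
puts symbol_chars?(latin)        # false
puts symbol_chars?(emoji)        # true
```

The strict check rejects legitimate internationalized addresses, while the looser one lets them through at the cost of a fuzzier definition of "emoji", so the choice depends on the use case.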
However, when attempting to send an email to that address via the mail gem, I receive this error: mikel/mail#1126
I'm not entirely sure which gem is the best place to fix this issue, but in my head, this input "should" have been caught by valid_email. Hence, I'm filing the issue here.