-
Notifications
You must be signed in to change notification settings - Fork 936
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support non-ASCII addresses (IDN) #1444
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, excellent work @c960657 Thank you! I know it's been a long time since you wrote it, but I've left some comments around the PR, I think we could tighten this up some more, let me know if you think otherwise.
else | ||
local | ||
end | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this result in two separate cases? You'll either return the local + domain or just the local. Wouldn't it be better to just return the local here and put the domain_idna check in the same area where we return the full address (ie, local plus domain) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I follow. This method just follows the same pattern as Address#address
, i.e. it does not add @
if the address does not include a domain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mikel Would you elaborate? :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mikel Ping :-)
if /[\r\n]/ =~ addr | ||
raise ArgumentError, "SMTP #{addr_name} address may not contain CR or LF line breaks: #{addr.inspect}" | ||
end | ||
|
||
if !addr.ascii_only? && Encodings.idna_supported? | ||
addr = Address.new(addr).address_idna | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, could the Address
class handle this without having to spread the idna checks in the code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the only place where Encodings.idna_supported?
is called (apart from the guard in Encodings.idna_encode
). Deciding what to do if there is no IDNa support is a tradeoff – there is no “right” thing to do. Hiding this decision inside Address, making it IDNa-decode or silently return the input string, would be confusing IMO – if this is what you are suggesting.
I could remove the addr.ascii_only?
check from here and just rely and that in Address
, but with the addition of SMTPUTF8
we need it here anyway.
After this PR was filed, an update to net-smtp (see ruby/net-smtp#34) has allowed us to use the |
Hmm, I'm not sure why the tests are failing. It seems related to Psych (v5.0.0 was released 2 days ago). |
# Conflicts: # lib/mail/smtp_envelope.rb # spec/mail/network/delivery_methods/smtp_spec.rb # spec/spec_helper.rb
The first internationalized domains (IDNs) were registered around 2003. It took more than a decade for web browsers to gain sufficient support to make these domains generally useful, but today you can announce an IDN domain to the general public and expect them to be able to connect to your website.
This is not the case for email. The standards for using IDNs for email came much later and they are still not supported by many email clients and software libraries. Let's fix that, so that eventually – many years from now – people can use their non-ASCII address everywhere like any other email address.
RFC 6531 describes how to handle non-ASCII email addresses on the SMTP level using the
SMTPUTF8
SMTP extension. Unfortunately,Net::SMTP
does not support passing parameters forRCPT TO
which is required forSMTPUTF8
. I have submitted a patch for this, but the maintainers do not seem interested.But even without
SMTPUTF8
support we can still get far:With this PR, the domain of the SMTP envelope sender and recipient is IDNA-encoded, making the email address appear like any other ASCII email to the SMTP server. The SMTP envelope sender and recipient are never shown to end-users, so there is no UX reason not to do this encoding. Even if the SMTP server has support for non-ASCII domains, there is no drawback from doing the IDNA-encoding in the client – the two encodings are equivalent from a routing perspective.
This PR does not affect email addresses with non-ASCII characters in the local part. But even today, they might just work – not because email servers intentionally support internationalized email addresses, but simply because they pass on the local part verbatimly to the remote server.
If my PR for
Net::SMTP
is eventually accepted, this library could skip the IDNA-encoding and instead pass theSMTPUTF8
parameter for servers announcing support for this extension. At least for addresses with a non-ASCII local part, this would probably improve deliverability.