Normalize some mailbox names like postmaster to lowercase per RFC 2142
JoshData committed Apr 15, 2023
1 parent de6527f commit 5c9973d
Showing 5 changed files with 25 additions and 2 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -12,6 +12,7 @@ There are no significant changes to which email addresses are considered valid/i
* The quoted-string local part syntax (e.g. multiple @-signs, spaces, etc. if surrounded by quotes) and domain-literal addresses (e.g. @[192.XXX...] or @[IPv6:...]) are now parsed but not considered valid by default. Better error messages are now given for these addresses since it can be confusing for a technically valid address to be rejected, and new allow_quoted_local and allow_domain_literal options are added to allow these addresses if you really need them.
* Some other error messages have changed to not repeat the email address in the error message.
* The `email` field on the returned `ValidatedEmail` object has been renamed to `normalized` to be clearer about its importance, but access via `.email` is also still supported.
* Some mailbox names like `postmaster` are now normalized to lowercase per RFC 2142.
* The library has been reorganized internally into smaller modules.
* The tests have been reorganized and expanded. Deliverability tests now mostly use captured DNS responses so they can be run off-line.
* The __main__ tool now reads options to validate_email from environment variables.
3 changes: 2 additions & 1 deletion README.md
@@ -311,7 +311,8 @@ literal IPv6 addresses if you have allowed them by the `allow_quoted_local`
and `allow_domain_literal` options. In quoted-string local parts, unnecessary
backslash escaping is removed and even the surrounding quotes are removed if
they are unnecessary. For IPv6 domain literals, the IPv6 address is
normalized to condensed form.
normalized to condensed form. [RFC 2142](https://datatracker.ietf.org/doc/html/rfc2142)
also requires lowercase normalization for some specific mailbox names like `postmaster@`.
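
As a rough illustration of this normalization rule, here is a minimal, self-contained sketch. It does not call the library: `normalize_local_part` is a hypothetical helper, and the set of names is assumed to be the mailbox names from RFC 2142 sections 3 to 5.

```python
# Hypothetical sketch of the rule described above; not the library's code path.
# Assumed name list: the mailbox names from RFC 2142 sections 3, 4, and 5.
CASE_INSENSITIVE_MAILBOX_NAMES = frozenset({
    "info", "marketing", "sales", "support",   # section 3
    "abuse", "noc", "security",                # section 4
    "postmaster", "hostmaster", "usenet", "news",
    "webmaster", "www", "uucp", "ftp",         # section 5
})


def normalize_local_part(local_part: str) -> str:
    """Lowercase the local part only if it is one of the RFC 2142 names."""
    lowered = local_part.lower()
    if lowered in CASE_INSENSITIVE_MAILBOX_NAMES:
        return lowered
    # Any other local part may be case-sensitive, so it is left untouched.
    return local_part


print(normalize_local_part("POSTMASTER"))      # postmaster
print(normalize_local_part("NOT-POSTMASTER"))  # NOT-POSTMASTER
```

Only an exact, whole-name match is lowercased, so a local part that merely contains one of these names keeps its original case.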

Examples
--------
7 changes: 7 additions & 0 deletions email_validator/rfc_constants.py
@@ -43,3 +43,10 @@
LOCAL_PART_MAX_LENGTH = 64
DNS_LABEL_LENGTH_LIMIT = 63 # in "octets", RFC 1035 2.3.1
DOMAIN_MAX_LENGTH = 255 # in "octets", RFC 1035 2.3.4 and RFC 5321 4.5.3.1.2

# RFC 2142
CASE_INSENSITIVE_MAILBOX_NAMES = [
    'info', 'marketing', 'sales', 'support',  # section 3
    'abuse', 'noc', 'security',  # section 4
    'postmaster', 'hostmaster', 'usenet', 'news', 'webmaster', 'www', 'uucp', 'ftp',  # section 5
]
11 changes: 10 additions & 1 deletion email_validator/validate_email.py
@@ -2,7 +2,7 @@

from .exceptions_types import EmailSyntaxError, ValidatedEmail
from .syntax import validate_email_local_part, validate_email_domain_name, validate_email_domain_literal, get_length_reason
from .rfc_constants import EMAIL_MAX_LENGTH, QUOTED_LOCAL_PART_ADDR
from .rfc_constants import EMAIL_MAX_LENGTH, QUOTED_LOCAL_PART_ADDR, CASE_INSENSITIVE_MAILBOX_NAMES


def validate_email(
@@ -92,6 +92,15 @@ def validate_email(
    ret.ascii_local_part = local_part_info["ascii_local_part"]
    ret.smtputf8 = local_part_info["smtputf8"]

    # Some local parts are required to be case-insensitive, so we should normalize
    # to lowercase.
    # RFC 2142
    if ret.ascii_local_part is not None \
       and ret.ascii_local_part.lower() in CASE_INSENSITIVE_MAILBOX_NAMES \
       and ret.local_part is not None:
        ret.ascii_local_part = ret.ascii_local_part.lower()
        ret.local_part = ret.local_part.lower()

    # Validate the email address's domain part syntax and get a normalized form.
    is_domain_literal = False
    if len(domain_part) == 0:
5 changes: 5 additions & 0 deletions tests/test_syntax.py
@@ -413,6 +413,11 @@ def test_email_test_domain_name_in_test_environment():
validate_email("[email protected]", test_environment=True)


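def test_case_insensitive_mailbox_name():
    # Use assert with == so the test actually checks the normalized value;
    # a bare assignment to .normalized would always pass.
    assert validate_email("POSTMASTER@test", test_environment=True).normalized == "postmaster@test"
    assert validate_email("NOT-POSTMASTER@test", test_environment=True).normalized == "NOT-POSTMASTER@test"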


# This is the pyIsEmail (https://github.com/michaelherold/pyIsEmail) test suite.
#
# The test data was extracted by: