Javadoc regarding default setting for email validation contradicting code, which is correct? #31

Closed
bbottema opened this issue Feb 28, 2016 · 2 comments

Comments

@bbottema
Owner

8370db1#commitcomment-16375443

The following javadoc and code show what the default email validation strictness is set to:

    /**
     * The default setting is not strictly 2822 compliant. For example, it does not include the {@link #ALLOW_DOMAIN_LITERALS} criteria, which results in
     * exclusions on single domains.
     * <p>
     * Included in the defaults are: <ul> <li>{@link #ALLOW_QUOTED_IDENTIFIERS}</li> <li>{@link #ALLOW_PARENS_IN_LOCALPART}</li> </ul>
     */
    public static final EnumSet<EmailAddressCriteria> DEFAULT = of(ALLOW_DOMAIN_LITERALS);
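
To make the mismatch concrete, here is a minimal sketch that puts the set the code declares next to the set the Javadoc describes. The enum below is a hypothetical stand-in containing only the constants named in this issue, not the library's actual `EmailAddressCriteria`, and the class and field names are made up for illustration.

```java
import java.util.EnumSet;

public class DefaultCriteriaMismatch {

    // Hypothetical stand-in: only the constants mentioned in this issue.
    enum EmailAddressCriteria {
        ALLOW_DOMAIN_LITERALS,
        ALLOW_QUOTED_IDENTIFIERS,
        ALLOW_PARENS_IN_LOCALPART
    }

    // What the code quoted above declares as the default.
    static final EnumSet<EmailAddressCriteria> DEFAULT_AS_CODED =
            EnumSet.of(EmailAddressCriteria.ALLOW_DOMAIN_LITERALS);

    // What the Javadoc quoted above says the default includes.
    static final EnumSet<EmailAddressCriteria> DEFAULT_AS_DOCUMENTED =
            EnumSet.of(EmailAddressCriteria.ALLOW_QUOTED_IDENTIFIERS,
                       EmailAddressCriteria.ALLOW_PARENS_IN_LOCALPART);

    public static void main(String[] args) {
        System.out.println("coded:      " + DEFAULT_AS_CODED);
        System.out.println("documented: " + DEFAULT_AS_DOCUMENTED);
        System.out.println("equal:      " + DEFAULT_AS_CODED.equals(DEFAULT_AS_DOCUMENTED));
    }
}
```

The two sets do not share a single element, so one of the two (Javadoc or code) has to change.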

However, I'm not sure what actually should be the default. Do we even need a default? What is its purpose?

Initially I thought a default stricter than the RFC would be needed, so that mainstream services and servers only have to handle the more mundane email strings rather than the exotic ones the RFC allows.

What should be the default?

@chconnor

Don't know if this helps, but:

I'm not sure what your use cases are for simple-java-mail, but the original options in EmailAddress were there to cover a few basic use cases:

  1. user wants to scrape as much data from a possibly-ugly address as they can and make a sensible address from it; these users typically allow all kinds of addresses (except perhaps for single-domain addresses) because in the wild, legitimate senders often violate 2822. E.g., if your goal is to parse spammy emails for analysis, you may want to allow every variation out there just so you can parse something useful.

  2. user wants to check whether an email address has proper, normal syntax; e.g. checking the value entered in a form. These users typically make everything strict, since what most people consider a "valid" email address is a drastic subset of 2822. For users with the strictest requirements, EmailAddress may not be the best fit, since it might be too 'tolerant' for their needs. (Most people use a simple [email protected] type regex, which as we of course know is rarely a good idea either: http://www.troyhunt.com/2013/11/dont-trust-net-web-forms-email-regex.html )

  3. user wants to intelligently parse a possibly-ugly address with the goal being a cleaned-up usable address that other software (MTAs, databases, whatever) can use/parse without breaking; sounds like this is maybe your use case? If so, the defaults specified at https://www.boxbe.com/freebox/jdoc/com/boxbe/pub/email/EmailAddress.html are what made sense to me (with the possible exception of ALLOW_DOT_IN_ATEXT, to taste.) In our experience they allowed "real" addresses the highest percentage of the time, and the addresses they failed on were almost all ridiculous.
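
The three use cases above could be sketched as criteria presets along these lines. Everything here is an assumption for illustration: the enum is a hypothetical stand-in with only the constants named in this thread, the method names are invented, and the third preset simply mirrors the two flags listed in the Javadoc quoted earlier rather than the authoritative defaults on the linked page.

```java
import java.util.EnumSet;

public class CriteriaPresets {

    // Hypothetical stand-in: only the constants mentioned in this thread.
    enum EmailAddressCriteria {
        ALLOW_DOMAIN_LITERALS,
        ALLOW_QUOTED_IDENTIFIERS,
        ALLOW_PARENS_IN_LOCALPART,
        ALLOW_DOT_IN_ATEXT
    }

    // Use case 1: scrape whatever is usable; allow everything except
    // (perhaps) domain literals, which the quoted Javadoc ties to single domains.
    static EnumSet<EmailAddressCriteria> lenientScraping() {
        return EnumSet.complementOf(EnumSet.of(EmailAddressCriteria.ALLOW_DOMAIN_LITERALS));
    }

    // Use case 2: form-style validation; no leniency flags at all.
    static EnumSet<EmailAddressCriteria> strictFormCheck() {
        return EnumSet.noneOf(EmailAddressCriteria.class);
    }

    // Use case 3: clean up an address so other software can consume it.
    // Assumed to match the two flags named in the quoted Javadoc;
    // ALLOW_DOT_IN_ATEXT is left out here and can be added to taste.
    static EnumSet<EmailAddressCriteria> cleanupForOtherSoftware() {
        return EnumSet.of(EmailAddressCriteria.ALLOW_QUOTED_IDENTIFIERS,
                          EmailAddressCriteria.ALLOW_PARENS_IN_LOCALPART);
    }

    public static void main(String[] args) {
        System.out.println("scraping: " + lenientScraping());
        System.out.println("strict:   " + strictFormCheck());
        System.out.println("cleanup:  " + cleanupForOtherSoftware());
    }
}
```

Each method returns a fresh `EnumSet`, so callers can add or remove flags without affecting other users of the preset.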

Again, not sure if this is what you were asking, but maybe it's useful.

@bbottema bbottema changed the title from "Javadoc regarding default setting for email validation contradicting code, but what it should be" to "Javadoc regarding default setting for email validation contradicting code, which is correct?" Feb 29, 2016
bbottema added a commit that referenced this issue Feb 29, 2016
@bbottema
Owner Author

@chconnor I included parts of your comment in the code to clarify the defaults and the use cases. That's good enough for me.

2 participants