-
-
Notifications
You must be signed in to change notification settings - Fork 69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Strict validation must be able to be turned off #146
Comments
Hi, this seems to me more of an Enhancement than a Bug. If you have more than one failing emails with different error, send me all of them in a zip please. |
@Slamdunk it's really not that you need a specific email, figure what ever invalid crap, and try to load it. I'm Eventum, I've created a separate loader for this, it tries to split headers in different ways ( It took several iterations to get where it is now because people are very creative about what and how to break: The header splitter has also own implementation because zend\mail was not able to handle variations. It probably has test input examples too of broken emails, need to find each for the corresponding PR they were added, i.e find PR's of each MailLoader.php change; Maybe this gets some ideas boiling, maybe the outcome is really not this library concern, maybe it instead could add a similar loader here with recovery mode like in eventum. So far I did not have to patch invalid mime types, maybe because the app doesn't care about them so much. |
@rikvdh as for email privacy, obfuscate them in the sense that it doesn't lose the value. usually not needed the exact value of the mail line. for mime body base64 parts, you can replace the content with |
@rikvdh please also include a minimal code how you load emails. |
@rikvdh perhaps add Storage\Message to the title to limit scope of reading messages from storage systems (imap, pop3) |
It took a while, sorry for that. @glensc I really don't load emails myself, I use IMAP and call the https://gist.github.com/rikvdh/c6d30d4d98d326445babb22026dbdf7f This is just one with an invalid header. I shall try to find more. |
We've been running into this problem too where Laminas throws exceptions for things like malformed headers. We would prefer to just get as much of the email as we can since we can't control what the sender sends. We've found that email clients like gmail and outlook accept these emails and display whatever they can. Would there be a way through configuration to disable throwing exceptions (and instead maybe log that an email has some problems) and still continue reading/parsing whatever it can? |
Hi, sorry for the late reply. I have a quick and extra-dirty solution, that not everyone will like, but can get you through: introduce a specific exception for each case with a reference to the original Header, so you can easily flatten or strip it out and re-try the parse: // Given a new laminas-mail exception
class InvalidHeaderException
extends \Laminas\Mail\Header\Exception\InvalidArgumentException
implements \Laminas\Mail\Header\Exception\ExceptionInterface
{
private $originalHeader;
public function __construct($originalHeader, $message = "", $code = 0, Throwable $previous = null)
{
parent::__construct($message, $code, $previous);
$this->originalHeader = $originalHeader;
}
public function getOriginalHeader(): string
{
return $this->originalHeader;
}
}
// The user code can look like
function parseMessage(string $message): \Laminas\Mail\Storage\Message {
try {
return new \Laminas\Mail\Storage\Message(['raw' => $message]);
} catch (InvalidHeaderException $invalidHeaderException) {
$originalHeader = $invalidHeaderException->getOriginalHeader();
// Record it
$flattenedMessage = preg_replace(
sprintf('s/^%s$/', preg_quote($originalHeader)),
'X-Broken-Header: ' . preg_replace('s/\W/', '-', $originalHeader),
$message
);
// OR remove it
$flattenedMessage = preg_replace(
sprintf('s/^%s$/', preg_quote($originalHeader)),
'',
$message
);
return parseMessage($flattenedMessage);
}
};
$message = parseMessage(file_get_contents('foo.eml')); WDYT? |
Also came across an issue like this with a non-compliant email address in the header, eg
The idea would be to still be able to process this message; and ignore this invalid address. I notice the laminas-validator does have some bypass options, but we need to be able to set them using laminas-mail. |
I was struggling with this too, so I had to extend storage (Pop3 and Imap), Storage\Message, Headers and Mime\Decode classes, so that I could override methods to try to parse headers and proceed in case of an error instead of breaking. |
I'm having issues transitioning to Laminas Mail from a very old Zend_Mail implementation (1.12). I am doing this because I began receiving deprecated warnings when using some Zend_Mail methods, so I tried migrating to Laminas, but this is preventing us to do so. I read the content of an IMAP mailbox, and it can contain all sort of things, including badly formatted emails in headers. I would like to be able to still parse and use the Message object even if some email addresses are not "standard". Currently Here are the 2 types of exception I got :
Example of header causing the second type of Exception :
|
This might not have full email parsing. I vaguely remember mime body content parsing was not implemented and I do not think anyone was working on that since. |
Bug Report
I'm using this library for parsing mails in different mailboxes for a ticketing system. While reading mail from mailboxes I face all kinds of "invalid" mails. Malformed headers, malformed e-mail addresses, invalid mime-types for multi-part mails, invalid subjects (containing non ASCII characters). But, the most important thing is, I want to recover as much as I can from these message since most of the time it is irrelevant or is read correctly.
Since I cannot control what people throw at me I want to be able to disable the 'strict' validation.
Summary
See above.
Current behavior
See above.
How to reproduce
I can provide some mails which I got, but these contain personal information. Let me know if this is needed.
Expected behavior
Give me the e-mail (for as far as it can be parsed)
The text was updated successfully, but these errors were encountered: