-
-
Notifications
You must be signed in to change notification settings - Fork 30.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-30681: Support invalid date format or value in email Date header #22090
Conversation
Hello, and thanks for your contribution! I'm a bot set up to make sure that the project can legally accept this contribution by verifying everyone involved has signed the PSF contributor agreement (CLA). CLA MissingOur records indicate the following people have not signed the CLA: For legal reasons we need all the people listed to sign the CLA before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue. If you have recently signed the CLA, please wait at least one business day You can check yourself to see if the CLA has been received. Thanks again for the contribution, we look forward to reviewing it! |
@@ -124,7 +124,10 @@ of the new API. | |||
.. function:: parsedate_to_datetime(date) | |||
|
|||
The inverse of :func:`format_datetime`. Performs the same function as | |||
:func:`parsedate`, but on success returns a :mod:`~datetime.datetime`. If | |||
:func:`parsedate`, but on success returns a :mod:`~datetime.datetime`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm looking at this now during the Python core sprint. Thanks for resurrecting this PR.
I find this description a bit confusing because it's saying that the function either returns None
or raises ValueError
when parsing an invalid date. That makes it difficult to know when I'll get None
and when I'll get an exception. I think you're saying that only some bogus strings will raise ValueError
, e.g. those where the hour is invalid. I also don't see where the tests you're adding include the test for None
return.
I'm not sure what to suggest, because I'm not sure what the intended semantics should be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually don't see how parsedate_to_datetime would ever return None. It is _parsedate_tz which returns None on bogus input, in which case parsedate_to_datetime raises a TypeError.
This is also covered in the tests, so those should be fine.
In order to continue I suggest to fix the documentation on parsedate_to_datetime, remove the mention of it returning None and replacing it with it possibly returning TypeError in case of an invalid date input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I mentioned in the ticket, I agree with you; parsedate_to_datetime()
can't return None, so let's remove that from the documentation. I also suggest checking for None
return from _parsedate_tz()
and turning that into an explicit raise of ValueError
rather than relying on the implicit TypeError
from a failing tuple unpack. Then the docs should say that ValueError
is raised on an invalid date input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your comments. I fully agree with your points, that makes it definitely cleaner.
I implemented them in my latest commit.
112eaea
to
949dbd8
Compare
Also, from the ticket:
|
Re-submitting an older PR which was abandoned but still relevant.
…it raises a TypeError on invalid input instead of it returning None
- _parsedate_tz should explicitely return None - parsedate_to_datetime should check for None as return value by _parsedate_tz and in that case raise ValueError - Fix documentation and tests accordingly
949dbd8
to
764b1dc
Compare
@sim0nx This PR looks great now, thanks! Before I approve it, I want to get a ruling from the RMs about backporting. If I don't hear from them in the next couple of days, I'll raise it on the mailing list. Either way, this will for sure land in 3.10. |
Great news, thank you! |
Actually, only @ambv needs to chime in, since this would only be a candidate for backporting to 3.8 and 3.9. |
Based on discussions in -committers, we won't back port. |
…lots1 * origin/master: (365 commits) bpo-42029: Remove IRIX code (pythonGH-23023) bpo-42143: Ensure PyFunction_NewWithQualName() can't fail after creating the func object (pythonGH-22953) bpo-34204: Use pickle.DEFAULT_PROTOCOL in shelve (pythonGH-19639) bpo-41805: Documentation for PEP 585 (pythonGH-22615) bpo-42161: Micro-optimize _collections._count_elements() (pythonGH-23008) bpo-42161: Remove private _PyLong_Zero and _PyLong_One (pythonGH-23003) bpo-42099: Fix reference to ob_type in unionobject.c and ceval (pythonGH-22829) bpo-41659: Disallow curly brace directly after primary (pythonGH-22996) bpo-6761: Enhance __call__ documentation (pythonGH-7987) bpo-42161: Modules/ uses _PyLong_GetZero() and _PyLong_GetOne() (pythonGH-22998) bpo-41474, Makefile: Add dependency on cpython/frameobject.h (pythonGH-22999) bpo-42157: Rename unicodedata.ucnhash_CAPI (pythonGH-22994) bpo-42161: Use _PyLong_GetZero() and _PyLong_GetOne() (pythonGH-22995) bpo-30681: Support invalid date format or value in email Date header (pythonGH-22090) bpo-42161: Add _PyLong_GetZero() and _PyLong_GetOne() (pythonGH-22993) bpo-42123: Run the parser two times and only enable invalid rules on the second run (pythonGH-22111) bpo-42157: Convert unicodedata.UCD to heap type (pythonGH-22991) bpo-42157: unicodedata avoids references to UCD_Type (pythonGH-22990) bpo-39101: Fixes BaseException hang in IsolatedAsyncioTestCase. (pythonGH-22654) bpo-1635741: _PyUnicode_Name_CAPI moves to internal C API (pythonGH-22713) ...
…ythonGH-22090) I am re-submitting an older PR which was abandoned but is still relevant, python#10783 by @timb07. The issue being solved () is still relevant. The original PR python#10783 was closed as the final request changes were not applied and since abandoned. In this new PR I have re-used the original patch plus applied both comments from the review, by @maxking and @pganssle. For reference, here is the original PR description: In email.utils.parsedate_to_datetime(), a failure to parse the date, or invalid date components (such as hour outside 0..23) raises an exception. Document this behaviour, and add tests to test_email/test_utils.py to confirm this behaviour. In email.headerregistry.DateHeader.parse(), check when parsedate_to_datetime() raises an exception and add a new defect InvalidDateDefect; preserve the invalid value as the string value of the header, but set the datetime attribute to None. Add tests to test_email/test_headerregistry.py to confirm this behaviour; also added test to test_email/test_inversion.py to confirm emails with such defective date headers round trip successfully. This pull request incorporates feedback gratefully received from @bitdancer, @brettcannon, @Mariatta and @warsaw, and replaces the earlier PR python#2254. Automerge-Triggered-By: GH:warsaw
I am re-submitting an older PR which was abandoned but is still relevant, #10783 by @timb07.
The issue being solved (https://bugs.python.org/issue30681) is still relevant. The original PR #10783 was closed as
the final request changes were not applied and since abandoned.
In this new PR I have re-used the original patch plus applied both comments from the review, by @maxking and @pganssle.
For reference, here is the original PR description:
In email.utils.parsedate_to_datetime(), a failure to parse the date, or invalid date components (such as hour outside 0..23) raises an exception. Document this behaviour, and add tests to test_email/test_utils.py to confirm this behaviour.
In email.headerregistry.DateHeader.parse(), check when parsedate_to_datetime() raises an exception and add a new defect InvalidDateDefect; preserve the invalid value as the string value of the header, but set the datetime attribute to None.
Add tests to test_email/test_headerregistry.py to confirm this behaviour; also added test to test_email/test_inversion.py to confirm emails with such defective date headers round trip successfully.
This pull request incorporates feedback gratefully received from @bitdancer, @brettcannon, @Mariatta and @warsaw, and replaces the earlier PR #2254.
https://bugs.python.org/issue30681
https://bugs.python.org/issue30681
Automerge-Triggered-By: GH:warsaw