datetime.strptime without a year fails on Feb 29 #70647

Open
SriramRajagopalan mannequin opened this issue Feb 29, 2016 · 17 comments
Labels
3.13 bugs and security fixes docs Documentation in the Doc dir stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@SriramRajagopalan
Mannequin

SriramRajagopalan mannequin commented Feb 29, 2016

BPO 26460
Nosy @gpshead, @abalkin, @pganssle, @tirkarthi, @nickzoic

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2016-02-29.18:02:35.000>
labels = ['3.7', '3.8', 'type-feature', 'library']
title = 'datetime.strptime without a year fails on Feb 29'
updated_at = <Date 2020-03-03.17:16:20.440>
user = 'https://bugs.python.org/SriramRajagopalan'

bugs.python.org fields:

activity = <Date 2020-03-03.17:16:20.440>
actor = 'nickzoic'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2016-02-29.18:02:35.000>
creator = 'Sriram Rajagopalan'
dependencies = []
files = []
hgrepos = []
issue_num = 26460
keywords = []
message_count = 13.0
messages = ['261014', '261024', '261027', '261028', '261033', '343085', '363123', '363202', '363215', '363217', '363223', '363257', '363280']
nosy_count = 8.0
nosy_names = ['gregory.p.smith', 'belopolsky', 'polymorphm', 'Sriram Rajagopalan', 'p-ganssle', 'xtreak', '[email protected]', 'nickzoic']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue26460'
versions = ['Python 3.6', 'Python 3.7', 'Python 3.8']

@SriramRajagopalan
Mannequin Author

SriramRajagopalan mannequin commented Feb 29, 2016

$ python
    Python 3.5.1 (default, Dec  7 2015, 12:58:09) 
    [GCC 5.2.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> 
    >>> 
    >>> 
    >>> import time
    >>> 
    >>> time.strptime("Feb 29", "%b %d")
    time.struct_time(tm_year=1900, tm_mon=2, tm_mday=29, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=0, tm_yday=60, tm_isdst=-1)
    >>> 
    >>> 
    >>> import datetime
    >>> 
    >>> datetime.datetime.strptime("Feb 29", "%b %d")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python3.5/_strptime.py", line 511, in _strptime_datetime
        return cls(*args)
    ValueError: day is out of range for month

The same issue is seen in all versions of Python

@SriramRajagopalan SriramRajagopalan mannequin added type-bug An unexpected behavior, bug, or error stdlib Python modules in the Lib dir labels Feb 29, 2016
@gpshead
Member

gpshead commented Feb 29, 2016

Python's time.strptime() behavior is consistent with that of glibc 2.19:

======= strptime_c.c =======

#define _XOPEN_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

int
main(void)
{
  struct tm tm;
  char buf[255];
  memset(&tm, 0, sizeof(struct tm));
  strptime("Feb 29", "%b %d", &tm);
  strftime(buf, sizeof(buf), "%d %b %Y %H:%M", &tm);
  puts(buf);
  exit(EXIT_SUCCESS);
}

=======

$ gcc strptime_c.c 
$ ./a.out
29 Feb 1900 00:00

I'm not saying that the behavior is a good API, but given the unfortunate API at hand, parsing a date without specifying what year it is using strptime is a bad idea.

@abalkin
Member

abalkin commented Feb 29, 2016

This is no more a bug than

>>> from datetime import *
>>> datetime.strptime('0228', '%m%d')
datetime.datetime(1900, 2, 28, 0, 0)

Naturally, as long as datetime.strptime('0228', '%m%d') is the same as datetime.strptime('19000228', '%Y%m%d'), datetime.strptime('0229', '%m%d') should raise a ValueError as long as datetime.strptime('19000229', '%Y%m%d') does.

The only improvement I can think of in this situation is to point the user to time.strptime() in the error message. The time.strptime method works just fine in recent versions (see bpo-14157).

>>> time.strptime('0229', '%m%d')[1:3]
(2, 29)
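The equivalence argument above can be checked directly. The following is a minimal sketch, not part of the original comment; the helper name `parses` is made up for illustration. It only uses formats that include a year, so it does not depend on what the default year is.

```python
import calendar
from datetime import datetime

# Gregorian rule: century years are leap years only if divisible by 400.
for year in (1900, 1904, 2000):
    print(year, calendar.isleap(year))


def parses(text, fmt):
    """Hypothetical helper: report whether datetime.strptime accepts the input."""
    try:
        datetime.strptime(text, fmt)
        return True
    except ValueError:
        return False


print(parses("19000229", "%Y%m%d"))  # False: 1900 has no Feb 29
print(parses("20000229", "%Y%m%d"))  # True: 2000 is a leap year
```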

@abalkin abalkin added type-feature A feature request or enhancement and removed type-bug An unexpected behavior, bug, or error labels Feb 29, 2016
@abalkin
Member

abalkin commented Feb 29, 2016

Python's time.strptime() behavior is consistent with that of glibc 2.19

Gregory,

I believe OP is complaining about the way datetime.datetime.strptime() behaves, not time.strptime(), which is mentioned as a (preferred?) alternative.

See msg261015 in bpo-14157 for context.

@gpshead
Member

gpshead commented Feb 29, 2016

time.strptime() is "working" (not raising an exception) as it appears not to validate the day of the month when a year is not specified, yet the return value from either of these APIs is a date which has no concept of an ambiguous year.

## Via the admittedly old Python 2.7.6 from Ubuntu 14.04: ##
# 1900 was not a leap year: it is divisible by 100 but not by 400.
>>> time.strptime("1900 Feb 29", "%Y %b %d")
ValueError: day is out of range for month
>>> time.strptime("Feb 29", "%b %d")
time.struct_time(tm_year=1900, tm_mon=2, tm_mday=29, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=0, tm_yday=60, tm_isdst=-1)

So what should the validation behavior be?

>>> datetime.datetime.strptime("Feb 29", "%b %d")
ValueError: day is out of range for month
>>> datetime.datetime.strptime("2016 Feb 29", "%Y %b %d")
datetime.datetime(2016, 2, 29, 0, 0)
>>> datetime.datetime.strptime("1900 Feb 29", "%Y %b %d")
ValueError: day is out of range for month
>>> datetime.datetime(year=1900, month=2, day=29)
ValueError: day is out of range for month

datetime objects cannot be constructed with the invalid date (as the time.strptime return value allows).

Changing the API to assume the current year, or a year within +/- 6 months of "now", when no year is parsed is likely to break existing code.
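For callers who do want the "closest to now" interpretation, it can be built outside the standard library API. This is a rough sketch under stated assumptions: `parse_near` is a hypothetical helper, `now` is a naive datetime, and the format contains no year or timezone directive.

```python
from datetime import datetime


def parse_near(text, fmt, now):
    """Hypothetical helper: parse a year-less string, choosing the year
    (previous, current, or next) whose result lies closest to `now`."""
    candidates = []
    for year in (now.year - 1, now.year, now.year + 1):
        try:
            candidates.append(
                datetime.strptime(f"{year:04d} {text}", f"%Y {fmt}")
            )
        except ValueError:
            # For example Feb 29 in a non-leap year: skip that candidate.
            continue
    if not candidates:
        raise ValueError(f"{text!r} is not a valid date near {now.year}")
    return min(candidates, key=lambda dt: abs(dt - now))


# A log line stamped "12-31" read in mid-January most likely refers to last year.
print(parse_near("12-31", "%m-%d", datetime(2024, 1, 15)))  # 2023-12-31 00:00:00
```

Note that this still raises for "02-29" when none of the three candidate years is a leap year, which is arguably the right answer for that input.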

@tirkarthi
Member

See also bpo-19376. This behavior is now documented with 56027cc

@nickzoic
Mannequin

nickzoic mannequin commented Mar 2, 2020

I suspect this is going to come up about this time of every leap year :-/

The workaround is to prepend "%Y " to the pattern and, e.g., "2020 " to the date string, but that's not very nice.

Would adding a kwarg "default_year" be an acceptable solution?
I can't think of any situation other than leap years where this is going to come up. If both "default_year" and "%Y" are present, throw an exception (so maybe just call the kwarg "year").

In the weird case where you want to do date maths involving the month as well, you can always use a safe choice like "default_year=2020" and then fix the year up afterwards:

    dt = datetime.strptime(date_str, "%b %d", default_year=2020)
    dt = dt.replace(year=2021 if dt.month > 6 else 2022)
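The prepending workaround described above works with the existing API and can be wrapped in a small helper. This is only a sketch: `parse_month_day` is a hypothetical name, not a standard library function, and the `default_year` keyword shown in the comment above is a proposal, not an existing parameter. The example assumes the default (English) locale for `%b`.

```python
from datetime import datetime


def parse_month_day(text, fmt, year=2000):
    """Hypothetical helper: parse a year-less date string with an explicit year.

    `year` defaults to 2000, a leap year, so "Feb 29" always parses.
    """
    return datetime.strptime(f"{year:04d} {text}", f"%Y {fmt}")


dt = parse_month_day("Feb 29", "%b %d")
print(dt)  # 2000-02-29 00:00:00
```

Passing a non-leap year together with Feb 29 still raises ValueError, which matches what an explicit "%Y" in the input would do.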

@nickzoic nickzoic mannequin added 3.7 (EOL) end of life 3.8 (EOL) end of life labels Mar 2, 2020
@pganssle
Member

pganssle commented Mar 2, 2020

I don't think adding a default_year parameter is the right solution here.

The actual problem is that time.strptime, and by extension datetime.strptime, has a strange and confusing interface. What should happen is either that the year is set to None (or some other marker of a missing value), or that datetime.strptime raises an exception when it is asked to construct something that does not contain a year.

Since there is no concept of a partial datetime, I think our best option would be to throw an exception, except that this has been baked into the library for ages and would start to throw exceptions even when the person has correctly handled the Feb 29th case.

I think one possible "solution" to this would be to raise a warning any time someone tries to use datetime.strptime without requesting a year to warn them that the thing they're doing only exists for backwards compatibility reasons. We could possibly eventually make that an exception, but I'm not sure it's totally worth a break in backwards compatibility when a warning should put people on notice.
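The stricter of the two options (refusing year-less formats) can be approximated in user code today. The wrapper below is a sketch, not the behavior of any Python release; `strict_strptime` and `YEAR_DIRECTIVES` are hypothetical names, and the choice of which directives count as "a year was given" is an assumption made for this example.

```python
import re
from datetime import datetime

# Assumption for this sketch: these directives mean the input carries a year.
YEAR_DIRECTIVES = {"Y", "y", "G"}


def strict_strptime(text, fmt):
    """Hypothetical wrapper: reject formats with a day of month but no year."""
    # "%%" is consumed as a literal percent sign, so it never looks like "%Y".
    directives = set(re.findall(r"%(.)", fmt))
    if "d" in directives and not directives & YEAR_DIRECTIVES:
        raise ValueError(f"format {fmt!r} has a day of month but no year")
    return datetime.strptime(text, fmt)


print(strict_strptime("2016 02 29", "%Y %m %d"))  # 2016-02-29 00:00:00
```

Failing on every year-less format, not just on Feb 29, means the problem shows up the first time the code runs rather than once every four years.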

@nickzoic
Mannequin

nickzoic mannequin commented Mar 2, 2020

Not disagreeing with you that "%b %d" timestamps with no "%Y" are execrable, but they're fairly common in the *nix world, unfortunately.

People need to parse them, and the simple and obvious way to do this breaks every four years.

I like the idea of having a warning for not including %Y *and* not setting a default_year kwarg though.

@gpshead
Member

gpshead commented Mar 2, 2020

I _doubt_ there is code expecting the default year when unspecified to actually be 1900.

Change that default to any old year with a leap year (1904?) and it'll still (a) stand out as a special year that can be looked up should it wind up being _used_ as the year in code somewhere and (b) not fail every four years for people just parsing to extract Month + Day values.

@abalkin
Member

abalkin commented Mar 3, 2020

On Mar 2, 2020, at 6:33 PM, Gregory P. Smith <[email protected]> wrote:

Change that default to any old year with a leap year (1904?)

In the 21st century, the year 2000 default makes much more sense than 1900. Luckily 2000 is also a leap year.

@gerardwalummitedu
Mannequin

gerardwalummitedu mannequin commented Mar 3, 2020

Yes, code that has been working for my organization the past two years just broke this weekend.

Meaning depends on context. The straightforward solution is that if no year is specified, the return value should default to the current year.

@nickzoic
Mannequin

nickzoic mannequin commented Mar 3, 2020

It's kind of funny that there's already consideration of this in _strptime._strptime(), which returns a tuple used by datetime.datetime.strptime() to construct the new datetime.
Search for leap_year_fix.

I think the concern though is if we changed the default year that might possibly break someone's existing code: thus my suggestion to allow the programmer to explicitly change the default.

However, I can also see that if their code is parsing dates in this way it is already wrong, and that if we're causing users pain now when they upgrade Python, we're at least saving them pain on 2024-02-29 at 00:00:01.

Taking that approach, perhaps parsing dates with no year should just throw an exception, forcing the programmer to do it right the first time. In this case though, I'd rather have a "year" kwarg to prevent the programmer having to do horrible string hacks like my code currently does.

I'm not sure: is it useful for me to produce a PR so we have something specific to consider?

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@abalkin
Member

abalkin commented Feb 8, 2023

This issue is almost a duplicate of GH-58365, which was fixed in 1682e5d. However, that fix violated the ordering of time.strptime("Feb 29", "%b %d") and time.strptime("Mar 1", "%b %d"), so the logic was adjusted to return an invalid date in 072e4a3.
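The ordering concern can be modeled with plain tuples, since struct_time values compare like tuples. This is an illustrative sketch only; the variable names are made up and the tuples keep just the (year, month, day) prefix.

```python
from datetime import datetime

feb29_in_1904 = (1904, 2, 29)  # what a "use 1904 for Feb 29" result would start with
feb29_reset = (1900, 2, 29)    # year set back to 1900: not a real calendar date
mar1_default = (1900, 3, 1)    # any other year-less date gets 1900

print(feb29_in_1904 < mar1_default)  # False: Feb 29 would sort after Mar 1
print(feb29_reset < mar1_default)    # True: ordering is preserved

# The price of preserving the ordering: datetime cannot represent the result.
try:
    datetime(*feb29_reset)
except ValueError as exc:
    print("datetime rejects it:", exc)
```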

@zhu-xiaowei

Unfortunately, we encountered this problem again recently, on 2024-02-29. It seems that the source code already accounts for 1900 not being a leap year and changes the year to 1904:

cpython/Lib/_strptime.py

Lines 490 to 496 in 2e94a66

    leap_year_fix = False
    if year is None:
        if month == 2 and day == 29:
            year = 1904  # 1904 is first leap year of 20th century
            leap_year_fix = True
        else:
            year = 1900

But the year is eventually set back to 1900:

cpython/Lib/_strptime.py

Lines 535 to 539 in 2e94a66

    if leap_year_fix:
        # the caller didn't supply a year but asked for Feb 29th. We couldn't
        # use the default of 1900 for computations. We set it back to ensure
        # that February 29th is smaller than March 1st.
        year = 1900

Then an exception occurs when we execute datetime.strptime("02-29", "%m-%d"), so what's the point of doing this?

This hidden problem occurs once every four years. Anyone who hits it once will fix their own code directly, but I still hope it gets fixed in a later version: for example, change the default year to 2000, or return 1904 only when the date is Feb 29 and explain that in the documentation.

@pganssle
Member

pganssle commented Mar 1, 2024

I would be in favor of either raising an exception if the year is unspecified or changing the default to some leap year. 2000 is fine. I think 4 would be defensible. If we could extend datetime.min back to year 0, I think the proleptic Gregorian calendar should have 0 as a leap year, and that would probably be the best option (since it would not be mistaken for an actual year).

I don't think it's a terrible string hack to have people prepend a default year to their strings, considering they know the format, so I'm not really in favor of the added complexity of something like "default_year" (or a whole host of default_x parameters).

I would say raising an exception is the "correct" thing to do, but also the most annoying fix here, so I'm on the fence about it.

Either way, we should probably get this one fixed by ~2026-2027 😛

gpshead added a commit to gpshead/cpython that referenced this issue Mar 1, 2024
…datetime.

Every four years people encounter this because it just isn't obvious.
This moves the footnote up to a note with a code example.

We'd love to change the default year value for datetime but doing
that could have other consequences for existing code.  This documented
workaround *always* works.
@gpshead
Member

gpshead commented Mar 1, 2024

If nothing else, more clearly calling this out with the recommended workaround in the docs is a good idea. Draft PR up to do that.

While changing the default datetime date feels doable because we'd like to think "who could ever be depending on its default values", it would be a public API change and thus should go through an announcement and 2+ release deprecation cycle if we're actually going to change the default or if we're going to have datetime start raising when no year is specified.

Users clearly want a library to parse partial values. Otherwise this bug wouldn't keep coming up like clockwork. So if we want to raise an exception, we also need to offer an actual API intended for partial value parsing...

The datetime type unfortunately does not have a way to represent "I dunno 🤷🏾, NaN?" for individual field values.

Perhaps a datetime.partial type is desired for this use case?
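To make the idea concrete, here is one possible shape for a partial value, written as user code. Everything in it is hypothetical: no `datetime.partial` type exists, and `MonthDay`, `parse`, and `in_year` are names invented for this sketch.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True, order=True)
class MonthDay:
    """Hypothetical partial date: a month and day with no year attached."""

    month: int
    day: int

    @classmethod
    def parse(cls, text, fmt):
        # Parse against the leap year 2000 so Feb 29 is accepted, then drop the year.
        dt = datetime.strptime(f"2000 {text}", f"%Y {fmt}")
        return cls(dt.month, dt.day)

    def in_year(self, year):
        """Attach a year; raises ValueError for Feb 29 in a non-leap year."""
        return date(year, self.month, self.day)


leap_day = MonthDay.parse("02-29", "%m-%d")
print(leap_day < MonthDay(3, 1))  # True
print(leap_day.in_year(2024))     # 2024-02-29
```

The point of the design is that validation against a specific year happens only when the caller supplies one, so parsing and ordering never depend on a default year.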

The third party dateutil.parser.parse API appears to default to the current now() when no default datetime instance is supplied. That doesn't solve the problem, but given this issue gets raised and commented on every four years on 2/29 that suggests it is what most users who are tripped up by this actually might want. That'd also be a major API change.

@gpshead gpshead removed 3.8 (EOL) end of life 3.7 (EOL) end of life labels Mar 1, 2024
gpshead added a commit to gpshead/cpython that referenced this issue Mar 1, 2024
… month.

The presence of the values in the error message gives a stronger hint as
to what went wrong.

```
>>> datetime.strptime("2.29", "%m.%d")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    datetime.strptime("2.29", "%m.%d")
    ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File ".../Lib/_strptime.py", line 565, in _strptime_datetime
    return cls(*args)
           ~~~^^^^^^^
ValueError: day 29 is out of range for month 2 in year 1900
```
@gpshead gpshead self-assigned this Mar 20, 2024
@gpshead gpshead added 3.13 bugs and security fixes docs Documentation in the Doc dir labels Mar 20, 2024
encukou pushed a commit that referenced this issue Apr 3, 2024
diegorusso pushed a commit to diegorusso/cpython that referenced this issue Apr 17, 2024
gpshead pushed a commit that referenced this issue May 11, 2024
…H-117668)

* Fix `test_strptime` raises a DeprecationWarning
* Ignore deprecation warnings where appropriate.
* Update Lib/test/datetimetester.py

This is follow on work to silence unnecessary warnings from the test suite that changes for #70647 added.
miss-islington pushed a commit to miss-islington/cpython that referenced this issue May 11, 2024
…rning (pythonGH-117668)

* Fix `test_strptime` raises a DeprecationWarning
* Ignore deprecation warnings where appropriate.
* Update Lib/test/datetimetester.py

This is follow on work to silence unnecessary warnings from the test suite that changes for python#70647 added.
(cherry picked from commit abead54)

Co-authored-by: Nice Zombies <[email protected]>
gpshead pushed a commit that referenced this issue May 11, 2024
…arning (GH-117668) (GH-118956)

gh-117655: Prevent `test_strptime` from raising a DeprecationWarning (GH-117668)

* Fix `test_strptime` raises a DeprecationWarning
* Ignore deprecation warnings where appropriate.
* Update Lib/test/datetimetester.py

This is follow on work to silence unnecessary warnings from the test suite that changes for #70647 added.
(cherry picked from commit abead54)

Co-authored-by: Nice Zombies <[email protected]>
hroncok added a commit to hroncok/jupyter_client that referenced this issue Jun 10, 2024
    ...
    /usr/lib/python3.13/site-packages/jupyter_client/jsonutil.py:31: in <module>
        datetime.strptime("1", "%d")  # noqa
    /usr/lib64/python3.13/_strptime.py:573: in _strptime_datetime
        tt, fraction, gmtoff_fraction = _strptime(data_string, format)
    /usr/lib64/python3.13/_strptime.py:336: in _strptime
        format_regex = _TimeRE_cache.compile(format)
    /usr/lib64/python3.13/_strptime.py:282: in compile
        return re_compile(self.pattern(format), IGNORECASE)
    /usr/lib64/python3.13/_strptime.py:270: in pattern
        warnings.warn("""\
    E   DeprecationWarning: Parsing dates involving a day of month without a year specified is ambiguious
    E   and fails to parse leap day. The default behavior will change in Python 3.15
    E   to either always raise an exception or to use a different default year (TBD).
    E   To avoid trouble, add a specific year to the input & format.
    E   See python/cpython#70647.

Fixes jupyter#1020
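One way to keep such a warm-up call without triggering the warning is to include a year in both the input and the format. The snippet below is a sketch of that idea only and is not necessarily the change the referenced commit made.

```python
import warnings
from datetime import datetime

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # The year is part of the format, so the parse is unambiguous.
    datetime.strptime("2000-01-01", "%Y-%m-%d")

deprecations = [w for w in caught if issubclass(w.category, DeprecationWarning)]
print(len(deprecations))
```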
hroncok added a commit to hroncok/ipykernel that referenced this issue Jun 10, 2024
    ...
    /usr/lib/python3.13/site-packages/ipykernel/jsonutil.py:29: in <module>
        datetime.strptime("1", "%d")
    /usr/lib64/python3.13/_strptime.py:573: in _strptime_datetime
        tt, fraction, gmtoff_fraction = _strptime(data_string, format)
    /usr/lib64/python3.13/_strptime.py:336: in _strptime
        format_regex = _TimeRE_cache.compile(format)
    /usr/lib64/python3.13/_strptime.py:282: in compile
        return re_compile(self.pattern(format), IGNORECASE)
    /usr/lib64/python3.13/_strptime.py:270: in pattern
        warnings.warn("""\
    E   DeprecationWarning: Parsing dates involving a day of month without a year specified is ambiguious
    E   and fails to parse leap day. The default behavior will change in Python 3.15
    E   to either always raise an exception or to use a different default year (TBD).
    E   To avoid trouble, add a specific year to the input & format.
    E   See python/cpython#70647.

See also jupyter/jupyter_client#1020
hroncok added a commit to hroncok/nbclient that referenced this issue Jun 10, 2024
    ...
    /usr/lib/python3.13/site-packages/nbclient/jsonutil.py:29: in <module>
        datetime.strptime("1", "%d")
    /usr/lib64/python3.13/_strptime.py:573: in _strptime_datetime
        tt, fraction, gmtoff_fraction = _strptime(data_string, format)
    /usr/lib64/python3.13/_strptime.py:336: in _strptime
        format_regex = _TimeRE_cache.compile(format)
    /usr/lib64/python3.13/_strptime.py:282: in compile
        return re_compile(self.pattern(format), IGNORECASE)
    /usr/lib64/python3.13/_strptime.py:270: in pattern
        warnings.warn("""\
    E   DeprecationWarning: Parsing dates involving a day of month without a year specified is ambiguious
    E   and fails to parse leap day. The default behavior will change in Python 3.15
    E   to either always raise an exception or to use a different default year (TBD).
    E   To avoid trouble, add a specific year to the input & format.
    E   See python/cpython#70647.

See also jupyter/jupyter_client#1020
estyxx pushed a commit to estyxx/cpython that referenced this issue Jul 17, 2024
…rning (pythonGH-117668)

* Fix `test_strptime` raises a DeprecationWarning
* Ignore deprecation warnings where appropriate.
* Update Lib/test/datetimetester.py

This is follow on work to silence unnecessary warnings from the test suite that changes for python#70647 added.
minrk pushed a commit to jupyter/jupyter_client that referenced this issue Sep 17, 2024
...
    /usr/lib/python3.13/site-packages/jupyter_client/jsonutil.py:31: in <module>
        datetime.strptime("1", "%d")  # noqa
    /usr/lib64/python3.13/_strptime.py:573: in _strptime_datetime
        tt, fraction, gmtoff_fraction = _strptime(data_string, format)
    /usr/lib64/python3.13/_strptime.py:336: in _strptime
        format_regex = _TimeRE_cache.compile(format)
    /usr/lib64/python3.13/_strptime.py:282: in compile
        return re_compile(self.pattern(format), IGNORECASE)
    /usr/lib64/python3.13/_strptime.py:270: in pattern
        warnings.warn("""\
    E   DeprecationWarning: Parsing dates involving a day of month without a year specified is ambiguious
    E   and fails to parse leap day. The default behavior will change in Python 3.15
    E   to either always raise an exception or to use a different default year (TBD).
    E   To avoid trouble, add a specific year to the input & format.
    E   See python/cpython#70647.

Fixes #1020
Projects
Status: In Progress
5 participants