Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Fix to_datetime not respecting dayfirst #58876

Merged

Conversation

Aloqeely
Copy link
Member

@Aloqeely Aloqeely commented Jun 1, 2024

@Aloqeely Aloqeely requested a review from MarcoGorelli as a code owner June 1, 2024 02:12
@mroeschke mroeschke added Datetime Datetime data dtype Warnings Warnings that appear or should be added to pandas labels Jun 3, 2024
Copy link
Contributor

github-actions bot commented Jul 4, 2024

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Jul 4, 2024
Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for your PR!

Are you sure that dateutil doesn't take dayfirst into account? it's an argument that's passed in, and it seems to still be respected in cases when the dateutil path is used:

In [11]: pd.to_datetime(['foo', '10.03.2000'], dayfirst=True, errors='coerce')
<ipython-input-11-69ca28852760>:1: UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.
  pd.to_datetime(['foo', '10.03.2000'], dayfirst=True, errors='coerce')
Out[11]: DatetimeIndex(['NaT', '2000-03-10'], dtype='datetime64[s]', freq=None)

In [12]: pd.to_datetime(['foo', '10.03.2000'], errors='coerce')
<ipython-input-12-8697ff36715a>:1: UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.
  pd.to_datetime(['foo', '10.03.2000'], errors='coerce')
Out[12]: DatetimeIndex(['NaT', '2000-10-03'], dtype='datetime64[s]', freq=None)

but it's been a while since I looked at this code too closely, so I might be missing something

@Aloqeely
Copy link
Member Author

Aloqeely commented Jul 4, 2024

Ah, I guess the dateutil path was not being used here.

@Aloqeely Aloqeely removed Warnings Warnings that appear or should be added to pandas Stale labels Jul 4, 2024
@Aloqeely Aloqeely changed the title ENH: Warn when to_datetime falls back to dateutil when dayfirst is True BUG: Fix to_datetime not respecting dayfirst Jul 4, 2024
@Aloqeely
Copy link
Member Author

Aloqeely commented Jul 7, 2024

The formatting makes it seem like I made a lot of changes but essentially I just added an if not dayfirst check before using string_to_dts per your suggestion.

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @Aloqeely , this looks better!

the test you've added hits the conversion.pyx code you've modified

however, I don't think it hits the line you've modified in parsing.pyx? in which case, could you add such a test please?

@Aloqeely
Copy link
Member Author

Aloqeely commented Jul 16, 2024

The test fails if I revert the changes in parsing.pyx, I believe this is the line of code in the test that hits the changes:

result1, _ = parsing.parse_datetime_string_with_reso(
date_str, dayfirst=dayfirst, yearfirst=yearfirst
)

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah nice, so the test already covers both paths

looks good, just a merge conflict to resolve

@mroeschke mroeschke added this to the 3.0 milestone Jul 17, 2024
@mroeschke mroeschke merged commit 288af5f into pandas-dev:main Jul 17, 2024
45 checks passed
@mroeschke
Copy link
Member

Thanks @Aloqeely

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Datetime Datetime data dtype
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: to_datetime behaves differently depending of the format of the string provided
3 participants