fix: timezone issue in Pandas 2 #24955

Merged
merged 2 commits into from
Aug 11, 2023

betodealmeida (Member) commented Aug 11, 2023

SUMMARY

Pandas 2.0 raises a ValueError when we create a series with dtype datetime64[ns] from a timezone-aware datetime:

>>> from datetime import datetime, timezone
>>> import pandas as pd
>>> pd.__version__
'2.0.3'
>>> d = datetime.now(timezone.utc)
>>> pd.Series([d], dtype="datetime64[ns]")
Traceback (most recent call last):
  File "/Users/beto/.pyenv/versions/superset-3.9.2/lib/python3.9/site-packages/pandas/core/dtypes/cast.py", line 1243, in maybe_cast_to_datetime
    dta = DatetimeArray._from_sequence(value, dtype=dtype)
  File "/Users/beto/.pyenv/versions/superset-3.9.2/lib/python3.9/site-packages/pandas/core/arrays/datetimes.py", line 291, in _from_sequence
    return cls._from_sequence_not_strict(scalars, dtype=dtype, copy=copy)
  File "/Users/beto/.pyenv/versions/superset-3.9.2/lib/python3.9/site-packages/pandas/core/arrays/datetimes.py", line 343, in _from_sequence_not_strict
    _validate_tz_from_dtype(dtype, tz, explicit_tz_none)
  File "/Users/beto/.pyenv/versions/superset-3.9.2/lib/python3.9/site-packages/pandas/core/arrays/datetimes.py", line 2392, in _validate_tz_from_dtype
    raise ValueError(
ValueError: cannot supply both a tz and a timezone-naive dtype (i.e. datetime64[ns])

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/beto/.pyenv/versions/superset-3.9.2/lib/python3.9/site-packages/pandas/core/series.py", line 509, in __init__
    data = sanitize_array(data, index, dtype, copy)
  File "/Users/beto/.pyenv/versions/superset-3.9.2/lib/python3.9/site-packages/pandas/core/construction.py", line 599, in sanitize_array
    subarr = _try_cast(data, dtype, copy)
  File "/Users/beto/.pyenv/versions/superset-3.9.2/lib/python3.9/site-packages/pandas/core/construction.py", line 756, in _try_cast
    return maybe_cast_to_datetime(arr, dtype)
  File "/Users/beto/.pyenv/versions/superset-3.9.2/lib/python3.9/site-packages/pandas/core/dtypes/cast.py", line 1247, in maybe_cast_to_datetime
    raise ValueError(
ValueError: Cannot convert timezone-aware data to timezone-naive dtype. Use pd.Series(values).dt.tz_localize(None) instead.

This used to work in 1.5.3, which only emitted a FutureWarning (also tested in 1.3.4):

>>> from datetime import datetime, timezone
>>> import pandas as pd
>>> pd.__version__
'1.5.3'
>>> d = datetime.now(timezone.utc)
>>> pd.Series([d], dtype="datetime64[ns]")
sys:1: FutureWarning: Data is timezone-aware. Converting timezone-aware data to timezone-naive by passing dtype='datetime64[ns]' to DataFrame or Series is deprecated and will raise in a future version. Use `pd.Series(values).dt.tz_localize(None)` instead.
0   2023-08-11 01:34:05.638490
dtype: datetime64[ns]

This PR fixes it.
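
For readers hitting the same error, here is a minimal sketch of the workaround that the pandas error message itself suggests: build the series as timezone-aware first, then drop the timezone with `.dt.tz_localize(None)`. The helper name `to_naive_utc_series` is hypothetical and exists only for illustration; this is not the code merged in this PR, and it assumes that normalizing every value to UTC before dropping the timezone is the behavior you want.

```python
from datetime import datetime, timezone

import pandas as pd


def to_naive_utc_series(values):
    """Build a timezone-naive datetime Series from timezone-aware datetimes.

    Hypothetical helper for illustration only; not the code from this PR.
    """
    # utc=True converts every value to UTC, so inputs with different
    # offsets end up on a single, consistent timeline.
    aware = pd.Series(pd.to_datetime(values, utc=True))
    # Drop the timezone while keeping the UTC wall-clock time. Unlike
    # passing dtype="datetime64[ns]" directly, this does not raise in
    # pandas 2.
    return aware.dt.tz_localize(None)


d = datetime(2023, 8, 11, 1, 34, 5, 638490, tzinfo=timezone.utc)
s = to_naive_utc_series([d])
print(s)
```

Converting to UTC before dropping the timezone is a design choice: it keeps values comparable when the inputs carry different offsets, at the cost of losing the original local wall-clock time.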

codecov bot commented Aug 11, 2023

Codecov Report

Merging #24955 (378dc1a) into master (ce65a3b) will decrease coverage by 10.67%.
The diff coverage is 0.00%.

@@             Coverage Diff             @@
##           master   #24955       +/-   ##
===========================================
- Coverage   69.03%   58.36%   -10.67%     
===========================================
  Files        1905     1905               
  Lines       74136    74136               
  Branches     8212     8212               
===========================================
- Hits        51181    43272     -7909     
- Misses      20832    28741     +7909     
  Partials     2123     2123               
Flag      Coverage Δ
hive      ?
mysql     ?
postgres  ?
presto    54.08% <0.00%> (ø)
python    61.06% <0.00%> (-22.32%) ⬇️
sqlite    ?
unit      55.06% <0.00%> (ø)

Flags with carried forward coverage won't be shown.

Files Changed            Coverage Δ
superset/result_set.py   80.98% <0.00%> (-16.91%) ⬇️

... and 293 files with indirect coverage changes

betodealmeida added the merge-if-green label Aug 11, 2023
betodealmeida merged commit aca006f into master Aug 11, 2023

hughhhh (Member) left a comment

LGTM

sadpandajoe pushed a commit to preset-io/superset that referenced this pull request Aug 11, 2023

sadpandajoe (Member) commented

🏷️ preset:2023.31

sebastianliebscher (Contributor) commented

Thanks @betodealmeida and sorry for the trouble with my PR. Next time I'll try to test my PRs more and not just rely on unit / integration tests and my own datasets. I hope this was the last regression caused by #24768 🙏

mistercrunch added the 🏷️ bot and 🚢 3.1.0 labels Mar 8, 2024
mistercrunch deleted the sc_73474 branch March 26, 2024 18:04
vinothkumar66 pushed a commit to vinothkumar66/superset that referenced this pull request Nov 11, 2024
Labels: 🏷️ bot, merge-if-green, size/M, 🚢 3.1.0
7 participants