Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(sqla): apply time grain to all temporal groupbys #16318

Merged
merged 1 commit into from
Aug 18, 2021

Conversation

villebro
Copy link
Member

SUMMARY

Currently the time grain only applies to the main temporal column, making it impossible to have multiple time grained groupbys in the same chart. This PR changes the SqlAlchemy model to apply the time grain to all selected temporal columns. By applying this change it will no longer be possible to mix time grains and original values from two different temporal columns. However, it is unlikely that anyone would prefer that behavior over the proposed new flow, given that in the old flow it was impossible to apply a time grain to other than the main temporal column.

AFTER

Table chart:
image
image

Pivot table v2:
image
image

BEFORE

Table chart:
image
image

Pivot table v2:
image
image

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

Copy link
Member

@zhaoyongjie zhaoyongjie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A really useful feature for the time column.

@villebro villebro merged commit 9075e42 into apache:master Aug 18, 2021
@villebro villebro deleted the villebro/sqla-time-grain branch August 18, 2021 08:47
@john-bodley
Copy link
Member

john-bodley commented Oct 26, 2021

@villebro this seems like a breaking change and thus my sense is it should be behind a feature flag.

Note I understand the use case but it could be seen as a regression in terms of UX, i.e., it's not apparent to the user that the group-by temporal columns are also subject to the granularity which historically only applies to the temporal column. Also as you mentioned there may be existing cases which this pattern breaks—like when the time grains need to be decoupled.

An alternative approach—which requires more user work and isn't as dynamic—is to add custom columns which are granularity specific or have a mechanism to choose which columns the granularity (which really is a transformation) applies to.

@john-bodley
Copy link
Member

Also @villebro this breaks the logic in normalize_dttm_col given that resulting transformed data will be formatted as a TIMESTAMP (or similar) per the time grain transformations and thus no longer adhere to the Python date format say, hence we see errors of the form:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/pandas/core/tools/datetimes.py", line 456, in _convert_listlike_datetimes
    values, tz = conversion.datetime_to_datetime64(arg)
  File "pandas/_libs/tslibs/conversion.pyx", line 350, in pandas._libs.tslibs.conversion.datetime_to_datetime64
TypeError: Unrecognized value type: <class 'str'>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/srv/superset-internal/superset-fork/superset/viz.py", line 543, in get_df_payload
    df = self.get_df(query_obj)
  File "/srv/superset-internal/superset-fork/superset/viz.py", line 284, in get_df
    utils.normalize_dttm_col(
  File "/srv/superset-internal/superset-fork/superset/utils/core.py", line 1728, in normalize_dttm_col
    df[DTTM_ALIAS] = pd.to_datetime(
  File "/usr/local/lib/python3.8/dist-packages/pandas/core/tools/datetimes.py", line 805, in to_datetime
    values = convert_listlike(arg._values, format)
  File "/usr/local/lib/python3.8/dist-packages/pandas/core/tools/datetimes.py", line 460, in _convert_listlike_datetimes
    raise e
  File "/usr/local/lib/python3.8/dist-packages/pandas/core/tools/datetimes.py", line 423, in _convert_listlike_datetimes
    result, timezones = array_strptime(
  File "pandas/_libs/tslibs/strptime.pyx", line 150, in pandas._libs.tslibs.strptime.array_strptime
ValueError: unconverted data remains: -01-01T00:00:00.000Z

john-bodley added a commit that referenced this pull request Oct 26, 2021
john-bodley added a commit that referenced this pull request Oct 29, 2021
…17239)

* Revert "feat(sqla): apply time grain to all temporal groupbys (#16318)"

This reverts commit 9075e42.

* Update models.py
opus-42 pushed a commit to opus-42/incubator-superset that referenced this pull request Nov 14, 2021
QAlexBall pushed a commit to QAlexBall/superset that referenced this pull request Dec 28, 2021
AAfghahi pushed a commit that referenced this pull request Jan 10, 2022
…17239)

* Revert "feat(sqla): apply time grain to all temporal groupbys (#16318)"

This reverts commit 9075e42.

* Update models.py
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 1.4.0 labels Mar 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels preset-io size/S 🚢 1.4.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants