-
Notifications
You must be signed in to change notification settings - Fork 14k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(explore): Prevent unnecessary series limit subquery #21154
Conversation
Codecov Report
@@ Coverage Diff @@
## master #21154 +/- ##
==========================================
- Coverage 66.39% 66.37% -0.03%
==========================================
Files 1781 1781
Lines 67885 67885
Branches 7244 7244
==========================================
- Hits 45073 45059 -14
- Misses 20953 20967 +14
Partials 1859 1859
Flags with carried forward coverage won't be shown. Click here to find out more.
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
/testenv up FEATURE_GENERIC_CHART_AXES=true |
@jinghua-qa Ephemeral environment spinning up at http://54.200.17.110:8080. Credentials are |
superset/connectors/sqla/models.py
Outdated
@@ -1410,7 +1410,7 @@ def get_sqla_query( # pylint: disable=too-many-arguments,too-many-locals,too-ma | |||
col=selected, template_processor=template_processor | |||
) | |||
groupby_all_columns[outer.name] = outer | |||
if not series_column_names or outer.name in series_column_names: | |||
if outer.name in series_column_names: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you update this line for the helpers.py
as well?
I'm currently refactoring this piece of code for SIP-69 models via a Mixin
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should work - legacy timeseries charts will have set is_timeseries
to True
, and new charts will no longer be using it:
if outer.name in series_column_names: | |
if ( | |
is_timeseries and not series_column_names | |
) or outer.name in series_column_names: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@villebro That makes sense to me, thanks! I can do some double-checking with the legacy timeseries charts – which ones are you talking about?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The legacy time series line, bar etc charts. Essentially the ones with is_timeseries
set to True
and that are served by viz.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok great, I'll update the PR ASAP!
@codyml thanks for working on this - this is indeed an important fix. One of the reasons for that check is because legacy timeseries charts don't populate |
@villebro Thanks for the info, that's the kind of thing I was worried about. One idea I had was to extend the query object to include the |
IMO including But back to the problem at hand - I have a hypothesis that I will validate now that should be uninvasive. Let me try it out and post my findings so we can get this fix merged asap. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me know what you think of the proposed solution
superset/connectors/sqla/models.py
Outdated
@@ -1410,7 +1410,7 @@ def get_sqla_query( # pylint: disable=too-many-arguments,too-many-locals,too-ma | |||
col=selected, template_processor=template_processor | |||
) | |||
groupby_all_columns[outer.name] = outer | |||
if not series_column_names or outer.name in series_column_names: | |||
if outer.name in series_column_names: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should work - legacy timeseries charts will have set is_timeseries
to True
, and new charts will no longer be using it:
if outer.name in series_column_names: | |
if ( | |
is_timeseries and not series_column_names | |
) or outer.name in series_column_names: |
3c6ee9e
to
0a6e102
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Adding an integration test would be great, but as this appears to be a critical hotfix we can probably wait a few days before adding one (I can help if needed).
Storybook has completed and can be viewed at |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK with temporarily skip the test and would like to see it fixed asap.
Ephemeral environment shutdown and build artifacts deleted. |
Nice! thank you for fixing this! |
* Prevent series limit when no series limit columns specified. * Add timeseries check for legacy charts. * Apply fix to helpers.py. * Skip Cypress color consistency tests.
* Prevent series limit when no series limit columns specified. * Add timeseries check for legacy charts. * Apply fix to helpers.py. * Skip Cypress color consistency tests.
* Prevent series limit when no series limit columns specified. * Add timeseries check for legacy charts. * Apply fix to helpers.py. * Skip Cypress color consistency tests.
* Prevent series limit when no series limit columns specified. * Add timeseries check for legacy charts. * Apply fix to helpers.py. * Skip Cypress color consistency tests.
SUMMARY
This PR attempts to fix a bug with the series limit control when
GENERIC_CHART_AXES
feature is turned on. Steps to reproduce bug:Upon investigation, it looks like the SQL query builder is adding a perhaps unnecessary join against a subquery whenever Series Limit is set, regardless of if there is actually a dimension set that results in series needing limiting. This PR attempts to make the join against the subquery only included when the frontend includes values in the
series_columns
array in the query context.Update: Thanks to @villebro's input, this PR now handles legacy time-series charts as before by checking the
is_timeseries
param.BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
Before:
After:
TESTING INSTRUCTIONS
GENERIC_CHART_AXES
disabled:GENERIC_CHART_AXES
and test the same:ADDITIONAL INFORMATION
GENERIC_CHART_AXES