-
Notifications
You must be signed in to change notification settings - Fork 13.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: don't strip SQL comments in Explore - 2nd try #28753
Conversation
@michael-s-molina could use a failing query or more input on the pattern identified |
/testenv up |
|
|
|
Reading the examples, pattern seems to be simply some dash-dash comment anywhere in the query. Hoping I can recreate it on top of the example database. Will keep you posted here. |
186e925
to
f0b7202
Compare
Here I'm trying to recreate the issue that led to revert #28363 in #28567, likely some sort of trailing comment of some kind. This is DRAFT for now until I can reproduce the issue, write a covering test for it, and address it. First attempt at creating a problematic virtual dataset was the following ->
tests/integration_tests/conftest.py
Outdated
UNION ALL | ||
SELECT 8 as col1, 'i' as col2, 1.8, NULL, '2000-01-09 00:00:00', 9 | ||
UNION ALL | ||
SELECT * FROM cte --COMMENT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@michael-s-molina this is the only case where I was able to reproduce a clear failure given the previous logic. Basically this particular test case fails on the previous PR that we merge/reverted, but passes here.
Happy to add more test cases here if you'd like to build more confidence before merging this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mistercrunch I cherry-picked your latest changes into our staging environment and was able to confirm the issue was fixed. Thanks for the investigation and added tests.
@@ -1450,7 +1450,7 @@ def get_from_clause( | |||
if not self.is_virtual: | |||
return self.get_sqla_table(), None | |||
|
|||
from_sql = self.get_rendered_sql(template_processor) | |||
from_sql = self.get_rendered_sql(template_processor) + "\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the fix compared to the original PR, basically forcing a line break after the FROM clause subquery.
"""
-- BEFORE, notice how the `) AS` is in-comment
{...}
FROM cte --COMMENT) AS virtual_table GROUP BY col1, col2
-- AFTER, notice the line break after the comment, allowing proper aliasing of the subquery
{...}
FROM cte --COMMENT
) AS virtual_table GROUP BY col1, col2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My only suggestion would be to add a comment in the code explaining why do we need the line break.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mistercrunch just to verify, does this logic work for /* ... */
multiline comments?
@@ -1106,7 +1104,7 @@ def get_from_clause( | |||
CTE, the CTE is returned as the second value in the return tuple. | |||
""" | |||
|
|||
from_sql = self.get_rendered_sql(template_processor) | |||
from_sql = self.get_rendered_sql(template_processor) + "\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i noticed (again) that superset/connectors/sqla/models.py and superset/models/helpers.py have a fair amount of copy/pasta/dup-logic. I'll try and fix that in a future PR.
@@ -24,7 +24,7 @@ jobs: | |||
mysql+mysqldb://superset:[email protected]:13306/superset?charset=utf8mb4&binary_prefix=true | |||
services: | |||
mysql: | |||
image: mysql:5.7 | |||
image: mysql:8.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if this was intended as part of this PR 🤷🏼♂️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I should have added a note, but my new unit test test_virtual_dataset_with_comments
uses "in-a-subquery CTE", which apparently isn't supported in mysql 5.7
, failing that one test against old mysql. That CTE is one way that I was able to reproduce some of the issues from the previous reverted PR.
Given the wikipedia page, 5.7 isn't supported anymore and 8.0 is the next release. I'm unclear about the gap but might have considered doing a smaller bump if there was an option. Bumping to 8.0 seems like the right move.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code owner stamp of approval :)
(cherry picked from commit 514eda8)
Here I'm trying to recreate the issue that led to revert #28363 in #28567, likely some sort of trailing comment of some kind. This is DRAFT for now until I can reproduce the issue, write a covering test for it, and address it.
First attempt at creating a problematic virtual dataset was the following ->