fix(sqllab): sqllab/execute returns 500 when user only has schema access #28357
@@ -1582,7 +1582,7 @@ def extract_tables_from_jinja_sql(sql: str, database: Database) -> set[Table]:
    return (
        tables
        | ParsedQuery(
            sql_statement=processor.process_template(template),
@toniphan21 I think the current logic (which I added) might be correct, i.e., we first have to render/sanitize the template in order for it to be "proper" SQL so that SQLGlot can actually parse it.
In your PR description you mentioned,
There are 2 ways to extract table set:
- Extract any tables referenced within the confines of specific Jinja macros.
- Parse SQL and get table.
whereas in actuality we need to do both*, which is what the current code does.
* In actuality, for (2) we need to parse "sanitized" SQL, i.e., with the Jinja templates rendered or removed, so that SQLGlot is able to parse it.
That said, it's interesting that Mypy didn't barf on the fact that the first argument to process_template is of type nodes.Template rather than str. Here's where the argument is processed, where the from_string() method accepts either a nodes.Template or str (albeit being named to only accept the latter).
The TL;DR is I think the code is correct, though the process_template() method could be updated to include a union of str | nodes.Template.
Sorry for the confusion (I updated the description). From the code, I understand that this function will get tables from both ways. However, the problem lies in ParsedQuery.stripped(), which is supposed to strip whitespace characters out of the given sql. If you pass nodes.Template, it will throw an exception because there is no strip() method in nodes.Template.
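The failure mode described above can be sketched without Superset or Jinja installed. This is only an illustration: TemplateNode is a hypothetical stand-in for nodes.Template, and stripped() is assumed to behave roughly like ParsedQuery.stripped(), i.e. it calls strip() on whatever it was given.

```python
class TemplateNode:
    """Hypothetical stand-in for jinja2's nodes.Template: it has no strip()."""


def stripped(sql):
    # Assumption: roughly what ParsedQuery.stripped() does with its input.
    return sql.strip(" \t\r\n;")


# A plain SQL string is stripped as expected.
print(stripped("SELECT 1;\n"))

# A non-string object fails as soon as strip() is looked up.
try:
    stripped(TemplateNode())
except AttributeError as ex:
    print(f"AttributeError: {ex}")
```

Whether this path is actually reached depends on what process_template() returns, which is the point under discussion in this thread.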
Please correct me if I'm wrong, but I understand that in the first part, you already "Extract any tables referenced within the confines of specific Jinja macros." Then, in this part, the given sql should be processed via the processor to create a raw sql_statement, which is then put into the ParsedQuery class to extract tables. If that is the purpose, then the variable passed to processor.process_template(...) should be the given SQL, which may include Jinja macros.
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #28357       +/-   ##
===========================================
+ Coverage   60.48%   77.56%   +17.07%
===========================================
  Files        1931      521     -1410
  Lines       76236    37570    -38666
  Branches     8568        0     -8568
===========================================
- Hits        46114    29140    -16974
+ Misses      28017     8430    -19587
+ Partials     2105        0     -2105
Flags with carried forward coverage won't be shown.
Relevant to this issue as well: #28218
Looks like there are some failing unit tests. Do you have time to take a look, @toniphan21?
e4d0612 to e85728a
Then, in this part, the given sql should be processed via the processor to create a raw sql_statement, which is then put into the ParsedQuery class to extract tables. If that is the purpose, hence the variable passed to processor.process_template(...) should be a given SQL, which may include Jinja macros.
@toniphan21 Given the following:
In actuality, for (2) we need to parse "sanitized" SQL, i.e., with the Jinja templates rendered or removed, so that SQLGlot is able to parse it.
We can't pass sql to processor.process_template because we need the processed template with the following:
# Replace the potentially problematic Jinja macro with some benign SQL.
node.__class__ = nodes.TemplateData
node.fields = nodes.TemplateData.fields
node.data = "NULL"
The question that we need to answer is why Template.render is not returning a str in accordance with its signature:
def render(self, *args: t.Any, **kwargs: t.Any) -> str:
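To make the node replacement above concrete, here is a minimal sketch using plain Jinja2, outside Superset. The query, the latest_partition macro name, and the check on nodes.Name are assumptions chosen for illustration; the real code in extract_tables_from_jinja_sql() may match different nodes. It relies on Environment.from_string() accepting a parsed nodes.Template as well as a str.

```python
from jinja2 import Environment, nodes

env = Environment()

# Hypothetical query; latest_partition is deliberately NOT defined in the
# environment, so rendering the untouched template would fail.
sql = "SELECT * FROM logs WHERE ds = {{ latest_partition('logs') }}"

# Parse to an AST (a nodes.Template) instead of compiling the string directly.
template = env.parse(sql)

for node in template.find_all(nodes.Call):
    if isinstance(node.node, nodes.Name) and node.node.name == "latest_partition":
        # Replace the potentially problematic Jinja macro with some benign SQL.
        node.__class__ = nodes.TemplateData
        node.fields = nodes.TemplateData.fields
        node.data = "NULL"

# from_string() compiles the already-parsed AST; render() returns a str.
rendered = env.from_string(template).render()
print(rendered)
print(type(rendered).__name__)
```

In this sketch the rendered result is an ordinary string with the macro call replaced by NULL, which is the "sanitized" SQL a parser such as SQLGlot could then handle.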
SUMMARY
When a user has no admin access and is trying to run a query in SQL Lab, Superset needs to check:
To be able to check schema access, Superset needs to know which tables the user is trying to query, and it uses extract_tables_from_jinja_sql() to get them. The function extracts tables in both ways:
In the line I changed, there is simply a bug that uses template, which has type Template. The correct one should be sql, which is a string sent by the user.
TESTING INSTRUCTIONS
Reproduce steps:
/api/v1/sqllab/execute/ endpoint returns 500
Impact