-
Notifications
You must be signed in to change notification settings - Fork 14.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add validate_sql_json
endpoint for checking that a given sql query is valid for the chosen database
#7422
feat: Add validate_sql_json
endpoint for checking that a given sql query is valid for the chosen database
#7422
Conversation
@bearcage looks like there is a lint error we want to fix https://travis-ci.org/apache/incubator-superset/jobs/526725518 |
A couple of high-level points and suggestions for consideration:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of general comments:
- Would you mind adding some unit tests?
- Superset supports (and encourages) the use of typing (see
CONTRIBUTING.md
). Would you mind adding type hints to the new functions you added? There’s also a note about how to document exceptions (if still relevant).
Initially I had this in a separate endpoint, but it seemed like a lot of duplication of effort to pull over all the other conditional modifications we do on queries before executing them (limit, ctas, and templating). Templating in particular was what initially convinced me to move this in to the query execution flow. That said, I'm not at all used to OSS projects, so I hadn't considered the rhetorical angle. I'll split it back out. Point well taken on the name. How about Happy to add tests and typing — I have some tests which I'd planned to stack on this commit. I'll squash them instead. Since this portion of the code isn't particularly liberal with types, do y'all prefer py3-only annotations, or comment annotations? |
Please use Python 3 only type annotations as Superset only supports Python 3.6+. |
This is the first step toward supporting live as-you-type SQL validation against databases that support doing it (for starters just presto). I'm trying to stick as much to the existing query machinery as possible to avoid introducing too much new state for handling this, so I plumbed it through as a db spec parameter and an option you can set on an individual query. As of this commit: - [X] Plumb validation db_spec options - [X] Add validation_only option and transform to the sql_json view - [ ] Tests for the above on the backend - [ ] Debounced requestValidationQuery in onSqlChange - [ ] Check incoming results -- full query vs validation - [ ] Tests for the above on the frontend - [ ] Highlight error lines - [ ] Print error message in alert-danger box above toolbar controls
11179fa
to
d076e0b
Compare
I don't think it is much more effort... you can still use all of that. I think it would literally just become:
And that type would just do the last bit of prep that the method currently does, for that impl. The main thing here is this would move it to a configuration option and provide abstractions for impls, rather than make it an intrinsic property of database types. Why am I pushing on this? LOL... I would love to see this be a feature that works well for all databases eventually, not just a patchwork or one-off if the customer has the right thing. A consistent, delightful experience that isn't tied to the db in use, or at least leaving that possibility open. |
Tagged [WIP] since I need to update unit tests to go with the review changes. |
Factors out the validation process to a separate class which does validation inline, rather than passing it through the existing query flow implicitly. This is meant to address Dave's feedback requesting that the validation flow not be explicitly tied to a query transform since that's uniquely a presto-ism. Next up in this stack: unit tests.
f16789f
to
ed9ee8d
Compare
Just to demo it and see if it's cleaner this wak.
The new validator files are >90% covered, and the coverage on the core view is back up to about where it was before this changeset.
@DiggidyDave @mistercrunch since you've both mentioned it at one point or another, I pulled validate out to a separate endpoint. I think this is probably the right call in the long run since the more I think about it the less it makes sense to fuse validation in to the query path. For presto it's quite similar, but for any other validation scheme I can come up with it won't necessarily be. This also avoids conflating validation requests with query requests if you e.g. track 5xx response rate by endpoint. I've also fixed up the unit tests and picked that commit in to this PR for review together. One remaining question on the backend side of this change — what's the right way to expose, from backend-to-frontend, whether or not the frontend should bother trying to call validate_sql_json at all for a given selected db? In other words, how do I expose the db engine config map (or something derived from it) to the frontend so we don't spam the endpoint with calls that will always error out? |
`tox -e flake8` is clean now
validate_sql_json
endpoint for checking that a given sql query is valid for the chosen database
It's convinced these are the wrong type. They aren't.
validate_sql_json
endpoint for checking that a given sql query is valid for the chosen databasevalidate_sql_json
endpoint for checking that a given sql query is valid for the chosen database
Codecov Report
@@ Coverage Diff @@
## lyft-release-sp8 #7422 +/- ##
===================================================
+ Coverage 64.96% 65.1% +0.13%
===================================================
Files 430 433 +3
Lines 21164 21290 +126
Branches 2342 2342
===================================================
+ Hits 13750 13861 +111
- Misses 7290 7305 +15
Partials 124 124
Continue to review full report at Codecov.
|
@DiggidyDave @mistercrunch @xtinec @john-bodley I think all the review comments are addressed here — does anybody have any other issues to raise? |
👍 |
validate_sql_json
endpoint for checking that a given sql query is valid for the chosen databasevalidate_sql_json
endpoint for checking that a given sql query is valid for the chosen database
This builds on apache#7422 to build check-as-you-type sql query validation in Sql Lab. This closes apache#6707 too. It adds a (debounced) call to the validate_sql_json API endpoint with the querytext, and on Lyft infra is able to return feedback to the user (end to end) in $TBD seconds. At present feedback is provided only through the "annotations" mechanism build in to ACE, although I'd be open to adding full text elsewhere on the page if there's interest.
…query is valid for the chosen database (apache#7422) merge from lyft-release-sp8 to master
…query is valid for the chosen database (apache#7422) merge from lyft-release-sp8 to master
* [WIP] Live query validation, where supported This builds on #7422 to build check-as-you-type sql query validation in Sql Lab. This closes #6707 too. It adds a (debounced) call to the validate_sql_json API endpoint with the querytext, and on Lyft infra is able to return feedback to the user (end to end) in $TBD seconds. At present feedback is provided only through the "annotations" mechanism build in to ACE, although I'd be open to adding full text elsewhere on the page if there's interest. * fix: Unbreak lints and tests
* filter out all nan series (#7313) * improve not rich tooltip (#7345) * Create issue_label_bot.yaml (#7341) * fix: do not save colors without a color scheme (#7347) * [wtforms] Strip leading/trailing whitespace (#7084) * [schema] Updating the datasources schema (#5451) * limit tables/views returned if schema is not provided (#7358) * limit tables/views returned if schema is not provided * fix typo * improve code performance * handle the case when table name or view name does not present a schema * Add type anno (#7342) * Updated local dev instructions to include missing step * First pass at type annotations * [schema] Updating the base column schema (#5452) * Update 937d04c16b64_update_datasources.py (#7361) * Feature flag for client cache (#7348) * Feature flag for client cache * Fix integration test * Revert "Fix integration test" This reverts commit 58434ab. * Feature flag for client cache * Fix integration tests * Add feature flag to config.py * Add another feature check * Fix more integration tests * Fix raw HTML in SliceAdder (#7338) * remove backendSync.json (#7331) * [bubbles] issue when using duplicated metrics (#7087) * SUPERSET-7: Docker compose config version breaks on Ubuntu 16.04 (#7359) * SUPERSET-8: Update text in docs copyright footer (#7360) * SUPERSET-7: Docker compose config version breaks on Ubuntu 16.04 * SUPERSET-8: Extra text in docs copyright footer * [schema] Adding commits and removing unnecessary foreign-key definitions (#7371) * Store last selected dashboard in sessionStorage (#7181) * Store last selected dashboard in sessionStorage * Fix tests * [schema] Updating the base metric schema (#5453) * Fix NoneType bug & fill the test recipients with original recipients if empty (#7365) * Added living goods as among the users of Superset (#7407) * Added living goods as among the users of Superset Living Goods is a non profit organisation with operation in africa and the middle east. We work in community health use data heavily on day to day. Superset is our platform of choice for dashboards. * Update README.md * [dashboard] allow user re-order top-level tabs (#7390) * [SQL Lab] Increase timeout threshold for offline check (#7411) * Bump FAB to 2.0.0 (#7323) * Bump FAB to 2.0.0 * [tests] whitelist SecurityApi login and refresh endpoints * [style] Fix, C812 missing trailing commas * [security] Remove SUPERSET_UPDATE_PERMS flag Registering sources needs to be performed after the views are initialized on UPDATE_PERMS=False configuration * [docs] New, FAB_UPDATE_PERMS and flask fab cli * [docs] Fix, db upgrade needs to come first, create-admin needs a db * [cli] New, superset init bootstraps all permissions for FAB and Superset * [style] Fix, flakes * [annotations] Improves UX on annotation validation, start_dttm, end_dttm (#7326) * Setting renderTrigger on label_colors (#7410) * Refactor out controlUtils.js module + unit tests (#7350) * [WiP]refactor out a controlUtils.js file * unit tests * add missing license * Addressing comments * feature: see Presto row and array data types (#7413) * Merge lastest from master into lyft-release-sp8 (#7405) * filter out all nan series (#7313) * improve not rich tooltip (#7345) * Create issue_label_bot.yaml (#7341) * fix: do not save colors without a color scheme (#7347) * [wtforms] Strip leading/trailing whitespace (#7084) * [schema] Updating the datasources schema (#5451) * limit tables/views returned if schema is not provided (#7358) * limit tables/views returned if schema is not provided * fix typo * improve code performance * handle the case when table name or view name does not present a schema * Add type anno (#7342) * Updated local dev instructions to include missing step * First pass at type annotations * [schema] Updating the base column schema (#5452) * Update 937d04c16b64_update_datasources.py (#7361) * Feature flag for client cache (#7348) * Feature flag for client cache * Fix integration test * Revert "Fix integration test" This reverts commit 58434ab. * Feature flag for client cache * Fix integration tests * Add feature flag to config.py * Add another feature check * Fix more integration tests * Fix raw HTML in SliceAdder (#7338) * remove backendSync.json (#7331) * [bubbles] issue when using duplicated metrics (#7087) * SUPERSET-7: Docker compose config version breaks on Ubuntu 16.04 (#7359) * SUPERSET-8: Update text in docs copyright footer (#7360) * SUPERSET-7: Docker compose config version breaks on Ubuntu 16.04 * SUPERSET-8: Extra text in docs copyright footer * [schema] Adding commits and removing unnecessary foreign-key definitions (#7371) * Store last selected dashboard in sessionStorage (#7181) * Store last selected dashboard in sessionStorage * Fix tests * [schema] Updating the base metric schema (#5453) * Fix NoneType bug & fill the test recipients with original recipients if empty (#7365) * feat: see Presto row and array data types (#7391) * feat: see Presto row and array data types * fix: address PR comments * fix: lint and build issues * fix: add types * add stronger type hints where possible * fix: lint issues and add select_star func in Hive * add missing pkg init * fix: build issues * fix: pylint issues * fix: use logging instead of print * Removed --console-log and superset runserver (#7421) * Fixes dashboard export button missing download and #7353 (#7427) * Added additional German translations to string file (#6604) * Added additional German translations to string file Updates to German translation files as per directions * Removed messages.json * [fix] Fixing SQL parsing issue (#7374) * add chinese translate (#7402) * Quick fix to address deadlock issue (#7434) * feat: view presto row objects in data grid (#7445) * Merge lastest from master into lyft-release-sp8 (#7405) * filter out all nan series (#7313) * improve not rich tooltip (#7345) * Create issue_label_bot.yaml (#7341) * fix: do not save colors without a color scheme (#7347) * [wtforms] Strip leading/trailing whitespace (#7084) * [schema] Updating the datasources schema (#5451) * limit tables/views returned if schema is not provided (#7358) * limit tables/views returned if schema is not provided * fix typo * improve code performance * handle the case when table name or view name does not present a schema * Add type anno (#7342) * Updated local dev instructions to include missing step * First pass at type annotations * [schema] Updating the base column schema (#5452) * Update 937d04c16b64_update_datasources.py (#7361) * Feature flag for client cache (#7348) * Feature flag for client cache * Fix integration test * Revert "Fix integration test" This reverts commit 58434ab. * Feature flag for client cache * Fix integration tests * Add feature flag to config.py * Add another feature check * Fix more integration tests * Fix raw HTML in SliceAdder (#7338) * remove backendSync.json (#7331) * [bubbles] issue when using duplicated metrics (#7087) * SUPERSET-7: Docker compose config version breaks on Ubuntu 16.04 (#7359) * SUPERSET-8: Update text in docs copyright footer (#7360) * SUPERSET-7: Docker compose config version breaks on Ubuntu 16.04 * SUPERSET-8: Extra text in docs copyright footer * [schema] Adding commits and removing unnecessary foreign-key definitions (#7371) * Store last selected dashboard in sessionStorage (#7181) * Store last selected dashboard in sessionStorage * Fix tests * [schema] Updating the base metric schema (#5453) * Fix NoneType bug & fill the test recipients with original recipients if empty (#7365) * feat: see Presto row and array data types (#7391) * feat: see Presto row and array data types * fix: address PR comments * fix: lint and build issues * fix: add types * Incorporate feedback from initial PR (prematurely merged to lyft-release-sp8) (#7415) * add stronger type hints where possible * fix: lint issues and add select_star func in Hive * add missing pkg init * fix: build issues * fix: pylint issues * fix: use logging instead of print * feat: view presto row objects in data grid * fix: address feedback * fix: spacing * feat: Scheduling queries from SQL Lab (#7416) (#7446) * Merge lastest from master into lyft-release-sp8 (#7405) * filter out all nan series (#7313) * improve not rich tooltip (#7345) * Create issue_label_bot.yaml (#7341) * fix: do not save colors without a color scheme (#7347) * [wtforms] Strip leading/trailing whitespace (#7084) * [schema] Updating the datasources schema (#5451) * limit tables/views returned if schema is not provided (#7358) * limit tables/views returned if schema is not provided * fix typo * improve code performance * handle the case when table name or view name does not present a schema * Add type anno (#7342) * Updated local dev instructions to include missing step * First pass at type annotations * [schema] Updating the base column schema (#5452) * Update 937d04c16b64_update_datasources.py (#7361) * Feature flag for client cache (#7348) * Feature flag for client cache * Fix integration test * Revert "Fix integration test" This reverts commit 58434ab. * Feature flag for client cache * Fix integration tests * Add feature flag to config.py * Add another feature check * Fix more integration tests * Fix raw HTML in SliceAdder (#7338) * remove backendSync.json (#7331) * [bubbles] issue when using duplicated metrics (#7087) * SUPERSET-7: Docker compose config version breaks on Ubuntu 16.04 (#7359) * SUPERSET-8: Update text in docs copyright footer (#7360) * SUPERSET-7: Docker compose config version breaks on Ubuntu 16.04 * SUPERSET-8: Extra text in docs copyright footer * [schema] Adding commits and removing unnecessary foreign-key definitions (#7371) * Store last selected dashboard in sessionStorage (#7181) * Store last selected dashboard in sessionStorage * Fix tests * [schema] Updating the base metric schema (#5453) * Fix NoneType bug & fill the test recipients with original recipients if empty (#7365) * feat: see Presto row and array data types (#7391) * feat: see Presto row and array data types * fix: address PR comments * fix: lint and build issues * fix: add types * Incorporate feedback from initial PR (prematurely merged to lyft-release-sp8) (#7415) * add stronger type hints where possible * fix: lint issues and add select_star func in Hive * add missing pkg init * fix: build issues * fix: pylint issues * fix: use logging instead of print * feat: view presto row objects in data grid * fix: address feedback * fix: spacing * Workaround for no results returned (#7442) * feat: view presto row objects in data grid (#7436) * feat: view presto row objects in data grid * fix: address feedback * fix: spacing * feat: Scheduling queries from SQL Lab (#7416) * Lightweight pipelines POC * Add docs * Minor fixes * Remove Lyft URL * Use enum * Minor fix * Fix unit tests * Mark props as required * feat: Add `validate_sql_json` endpoint for checking that a given sql query is valid for the chosen database (#7422) (#7462) merge from lyft-release-sp8 to master * Adds missing metric sum__SP_RUR_TOTL (#7452) * Late import for optional lib pyhive (#7471) * Late import for optional lib pyhive * fix * fix: calendar heatmap examples (#7375) Fixing a set of examples that trip on ValueError vs TypeError * bugfix: Improve support for special characters in schema and table names (#7297) * Bugfix to SQL Lab to support tables and schemas with characters that require quoting * Remove debugging prints * Add uri encoding to secondary tables call * Quote schema names for presto * Quote selected_schema on Snowflake, MySQL and Hive * Remove redundant parens * Add python unit tests * Add js unit test * Fix flake8 linting error * [dashboard] After update filter, trigger new queries when charts are visible (#7233) * trigger query when chart is visible * add integration test * fix: alter sql columns to long text #7463 (#7476) Merge lyft-release-sp8@7bfe7bc to master * Refactor ConsoleLog (#7428) * Revised Chinese translation (#7464) * add chinese translate * edit chinese translation * druid connector: avoid using 'dimensions' for scan queries (#7377) After the following PyDruid change (contained in version 0.5.2) the Superset Histogram charts rendered with Druid data are broken: druid-io/pydruid@0a59a70 Bump the pydruid requirements accordingly in setup.py Issue: #7368
* [WIP] Live query validation, where supported This builds on apache#7422 to build check-as-you-type sql query validation in Sql Lab. This closes apache#6707 too. It adds a (debounced) call to the validate_sql_json API endpoint with the querytext, and on Lyft infra is able to return feedback to the user (end to end) in $TBD seconds. At present feedback is provided only through the "annotations" mechanism build in to ACE, although I'd be open to adding full text elsewhere on the page if there's interest. * fix: Unbreak lints and tests
* [WIP] Live query validation, where supported This builds on #7422 to build check-as-you-type sql query validation in Sql Lab. This closes #6707 too. It adds a (debounced) call to the validate_sql_json API endpoint with the querytext, and on Lyft infra is able to return feedback to the user (end to end) in $TBD seconds. At present feedback is provided only through the "annotations" mechanism build in to ACE, although I'd be open to adding full text elsewhere on the page if there's interest. * fix: Unbreak lints and tests
) * [WIP] Live query validation, where supported This builds on apache#7422 to build check-as-you-type sql query validation in Sql Lab. This closes apache#6707 too. It adds a (debounced) call to the validate_sql_json API endpoint with the querytext, and on Lyft infra is able to return feedback to the user (end to end) in $TBD seconds. At present feedback is provided only through the "annotations" mechanism build in to ACE, although I'd be open to adding full text elsewhere on the page if there's interest. * fix: Unbreak lints and tests
* [WIP] Live query validation, where supported This builds on #7422 to build check-as-you-type sql query validation in Sql Lab. This closes #6707 too. It adds a (debounced) call to the validate_sql_json API endpoint with the querytext, and on Lyft infra is able to return feedback to the user (end to end) in $TBD seconds. At present feedback is provided only through the "annotations" mechanism build in to ACE, although I'd be open to adding full text elsewhere on the page if there's interest. * fix: Unbreak lints and tests
CATEGORY
Choose one
SUMMARY
This is the first step toward supporting live as-you-type
SQL validation against databases that support doing it (for
starters just presto).
I'm trying to stick as much to the existing query machinery
as possible to avoid introducing too much new state for handling
this, so I plumbed it through as a db spec parameter and an
option you can set on an individual query.
As of this commit:
TEST PLAN
Tests incoming in another commit on this PR
ADDITIONAL INFORMATION
REVIEWERS
@xtinec @betodealmeida @khtruong