Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: more specific presto error messages #11099

Merged
merged 1 commit into from
Sep 30, 2020

Conversation

etr2460
Copy link
Member

@etr2460 etr2460 commented Sep 29, 2020

SUMMARY

Add custom error messages, issue types, and documentation for invalid column and table errors in Presto. While these are Presto specific error types for now, if we start doing error message parsing for other db engines, then it might make sense to generalize them in the future.

Designs are from @Steejay's work, please review!

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

Screen Shot 2020-09-28 at 9 18 11 PM

Screen Shot 2020-09-28 at 9 14 46 PM

TEST PLAN

Create a chart with an invalid column and see the correct error. Do the same with an invalid table

ADDITIONAL INFORMATION

  • Has associated issue:
  • Changes UI
  • Requires DB Migration.
  • Confirm DB Migration upgrade and downgrade tested.
  • Introduces new feature or API
  • Removes existing feature or API

to: @john-bodley @villebro @graceguo-supercat
cc: @Steejay

@etr2460 etr2460 added the airbnb Airbnb related label Sep 29, 2020
@codecov-commenter
Copy link

Codecov Report

Merging #11099 into master will decrease coverage by 5.48%.
The diff coverage is 50.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #11099      +/-   ##
==========================================
- Coverage   65.84%   60.35%   -5.49%     
==========================================
  Files         816      383     -433     
  Lines       38510    24193   -14317     
  Branches     3621        0    -3621     
==========================================
- Hits        25355    14602   -10753     
+ Misses      13047     9591    -3456     
+ Partials      108        0     -108     
Flag Coverage Δ
#cypress ?
#javascript ?
#python 60.35% <50.00%> (-1.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
superset/db_engine_specs/presto.py 69.76% <43.75%> (-12.16%) ⬇️
superset/errors.py 100.00% <100.00%> (ø)
superset/db_engines/hive.py 0.00% <0.00%> (-85.72%) ⬇️
superset/databases/commands/create.py 31.91% <0.00%> (-59.58%) ⬇️
superset/db_engine_specs/hive.py 53.90% <0.00%> (-30.08%) ⬇️
superset/views/database/mixins.py 59.64% <0.00%> (-22.81%) ⬇️
superset/databases/commands/update.py 85.71% <0.00%> (-8.17%) ⬇️
superset/databases/api.py 81.38% <0.00%> (-7.98%) ⬇️
superset/databases/dao.py 94.11% <0.00%> (-5.89%) ⬇️
... and 456 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8e6f7d6...0c08776. Read the comment docs.

@etr2460 etr2460 force-pushed the erik-ritter--presto-errors branch from 0c08776 to 5b08d8e Compare September 29, 2020 04:51
@@ -43,3 +43,33 @@ This may be due to a syntax error, a bug in your query, or some other
internal failure within the database. This is usually not an
issue within Superset, but instead a problem with the underlying
database that serves your query.

## Issue 1003
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gather this is fine for now, i.e., the encoding can be mutated in the future, but I do wonder if issues should be engine specific and thus are identifiable, i.e., PRESTO-1001.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My hope was this this could actually be db agnostic, and we could use the same issue number for any db

There is a syntax error in the SQL query. Perhaps there was a misspelling or a typo.
```

Your query failed because of a syntax error within the underlying query. Please
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with this file format but why do these messages span multiple lines, i.e., contain line breaks? Shouldn't they be on a single line (unless they're paragraphs) and the presentation layer wrap the lines accordingly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no clue, but i was matching the line breaks from earlier in the file. @pkdotson could you comment?

## Issue 1004

```
The column was changed in the underlying database structure.
Copy link
Member

@john-bodley john-bodley Sep 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The term "changed" may be too general, changed name/changed type? I'm also not sure about the term "database structure" or whether "underlying" is necessary. Why not simply say,

The column no longer exists in the datasource.

Also could this error surface due to free form filters? If so the "no longer" may not be correct as it may have never existed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was from the designs that @Steejay provided. What do you think about the wording here Stephen?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@john-bodley @etr2460 thanks for these comments. Yes, agreed. The more specific we can be the better the message. I was under the assumption that we couldn't always pin point the exact cause so I ended up using more generalized terms.

Does "The column no longer exists in the datasource." mean it has been deleted?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it means it was deleted, or it never existed in the first place. so i don't think no longer is correct either

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Erik. We could break it down more and be more specific by listing more potential triggers.. though I'm hesitant to do that assuming the list can get really long.

We could also be more concise by changing the existing to: "The column was changed in the datasource." suggesting that the column still exists but there could have been a name or type change. Also adding another potential trigger/issue #: "The column doesn't exists in the datasource." suggesting the column was deleted or never existed in the first place.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to change this to The column was deleted or renamed in the database.

I think that's more clear, while still being concise. i'll also update the documentation to cover the possibilities here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the column/table never existed, then i think that's covered under the syntax error/misspelling error

## Issue 1005

```
The table was changed in the underlying database structure.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly why not,

The table no longer exists in the database.

@@ -26,6 +26,8 @@ export const ErrorTypeEnum = {

// DB Engine errors
GENERIC_DB_ENGINE_ERROR: 'GENERIC_DB_ENGINE_ERROR',
PRESTO_CANNOT_RESOLVE_COLUMN_ERROR: 'PRESTO_CANNOT_RESOLVE_COLUMN_ERROR',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The terms "cannot resolve" and "does not exist" are quite different, i.e., the first is more generic. I wonder if it's worth examining how these errors can surface and ensuring that we use the most specific possible.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, this was matching the wording from the presto error message. as mentioned before though, i wonder if this should be db engine agnostic and should just be COLUMN_DOES_NOT_EXIST_ERROR and TABLE_DOES_NOT_EXIST_ERROR?

@@ -55,6 +57,9 @@
# prevent circular imports
from superset.models.core import Database

COLUMN_NOT_RESOLVED_ERROR_REGEX = "line (.+?): .*Column '(.+?)' cannot be resolved"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this answered my previous question.

SupersetError(
error_type=SupersetErrorType.PRESTO_CANNOT_RESOLVE_COLUMN_ERROR,
message=__(
'We can\'t seem to resolve the column "%(column_name)s" at '
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use a f string here? The %(column_name)s formatting is somewhat outdated.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is how flask_babel says to do string replacement in translation functions. I'm not sure this works with f strings by default (and am uncertain how to test): https://flask-babel.tkte.ch/#using-translations

SupersetError(
error_type=SupersetErrorType.PRESTO_TABLE_DOES_NOT_EXIST_ERROR,
message=__(
'The table "%(table_name)s" does not exist. '
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See ^^.

@john-bodley
Copy link
Member

john-bodley commented Sep 29, 2020

@etr2460 I think it depends (per your comments) on whether the errors are supposed to be engine or Superset centric. If it’s the later then I think having universal errors makes sense though the wording probably needs to be more aligned with the Superset vernacular. Note per the screenshots it messages do state "Presto Error" and thus it does currently seem more engine centric.

That said my comments are non-blocking. I don't see there being any real issues if the wording of these errors changes over time not perceive there to be any major costs with updating/refactoring the code if needed.

@etr2460
Copy link
Member Author

etr2460 commented Sep 29, 2020

I think the end goal is probably to attribute the error to the correct source (e.g. Presto, Hive, Superset, etc.), but to use Superset's layer to clarify a raw error message. So I think aligning the wording with superset is probably the right thing to do here, I'll try to make things a bit more generic throughout (while still showing to the user what the root cause service was)

@etr2460 etr2460 force-pushed the erik-ritter--presto-errors branch from 5b08d8e to c0b4b45 Compare September 29, 2020 22:49
@etr2460 etr2460 force-pushed the erik-ritter--presto-errors branch from c0b4b45 to 1439aee Compare September 30, 2020 00:00
@etr2460 etr2460 merged commit fa5dab8 into apache:master Sep 30, 2020
auxten pushed a commit to auxten/incubator-superset that referenced this pull request Nov 20, 2020
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.38.0 labels Mar 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
airbnb Airbnb related 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/L 🚢 0.38.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants