fix(sqllab): table list exceeds MAX_TABLE_NAMES #21262
Codecov Report
@@ Coverage Diff @@
## master #21262 +/- ##
=======================================
Coverage 66.43% 66.43%
=======================================
Files 1784 1784
Lines 68185 68174 -11
Branches 7265 7265
=======================================
- Hits 45298 45293 -5
+ Misses 21018 21012 -6
Partials 1869 1869
Regrettably I don't think your logic is correct, nor am I sure that fixing the bug is actually desirable, i.e., people likely aren't aware that truncation even exists.
superset/views/core.py
if total_items and substr_parsed:
    max_tables = max_items * len(tables) // total_items
    max_views = max_items * len(views) // total_items
max_tables = max_items or len(tables)
@justinpark this logic isn't quite right, the `//` ensures that `max_tables + max_views = max_items`.
Furthermore this endpoint is only invoked here, which implies that `substr_parsed` is never truthy and the `MAX_TABLE_NAMES` config doesn't actually get invoked.
I think we should actually refactor this method to remove the `substr_parsed` logic and deprecate `MAX_TABLE_NAMES`, especially given that it's non-deterministic in terms of how it truncates the list.
Furthermore it seems like `schema_parsed` is always truthy (assuming I'm reading the frontend code correctly), i.e., per line #212 `currentSchema` is truthy and never "undefined", which simplifies the logic, and thus large swaths of this method can be refactored.
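To make the point about `//` concrete, here is a minimal sketch of the proportional split from the diff above. The function name `proportional_split` and its plain-integer arguments are hypothetical and only for illustration; this is not the code in `superset/views/core.py`. Because each term is floored, the two limits never add up to more than `max_items`, and can fall short of it by at most one.

```python
from typing import Tuple


def proportional_split(n_tables: int, n_views: int, max_items: int) -> Tuple[int, int]:
    """Split max_items between tables and views in proportion to their counts.

    Hypothetical helper mirroring the `//` expressions in the diff.
    """
    total_items = n_tables + n_views
    if total_items == 0:
        # Nothing to split; also avoids dividing by zero.
        return 0, 0
    max_tables = max_items * n_tables // total_items
    max_views = max_items * n_views // total_items
    return max_tables, max_views


# 40,000 tables and 10,000 views with a cap of 1,000 items.
max_tables, max_views = proportional_split(40000, 10000, 1000)
print(max_tables, max_views, max_tables + max_views)
```

With these inputs the multiplication by `len(tables)` is undone by the floor division, so the result stays within the cap instead of growing to tens of millions.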
@john-bodley you're right. It used to allow an empty `schema_parsed` option (#1466), but the current FE logic no longer allows it.
And `MAX_TABLE_NAMES` is never invoked since `substr_parsed` is never truthy.
I hope to restore the `MAX_TABLE_NAMES` option to fix our edge cases (>100k).
I posted updates to make `max_items = max_tables + max_views`.
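One possible way to get an exact sum, sketched here as an assumption and not as the update actually posted in this PR, is to floor only one side and give the remainder to the other. The name `exact_split` is hypothetical, and treating a cap of 0 as "no limit" follows the PR summary.

```python
from typing import Tuple


def exact_split(n_tables: int, n_views: int, max_items: int) -> Tuple[int, int]:
    """Split a cap between tables and views so the two limits sum to the cap.

    Hypothetical helper. A max_items of 0 is treated as "no limit".
    """
    total_items = n_tables + n_views
    if total_items == 0:
        return 0, 0
    # Never ask for more items than exist; 0 means return everything.
    limit = min(max_items, total_items) if max_items else total_items
    max_views = limit * n_views // total_items
    # The remainder goes to tables, so max_tables + max_views == limit.
    max_tables = limit - max_views
    return max_tables, max_views
```

Since `limit` never exceeds `total_items`, neither limit can exceed the number of tables or views that actually exist.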
The current FE doesn't pass `substr_parsed`, but I'll make FE changes to use `substr_parsed` for typeahead on the truncated list.
SUMMARY
Since the `max_tables` logic multiplies `max_items` by the current size (an `or` was probably intended), it always returns the entire list (e.g., with `config["MAX_TABLE_NAMES"] = 1000` and `len(tables) = 40000`, `max_tables = 40000 * 1000 = 40,000,000`).
The previous logic also truncates only if a substr is passed.
This commit always follows `MAX_TABLE_NAMES` in both cases (set `MAX_TABLE_NAMES` to 0 if the full table list is needed).
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
N/A
TESTING INSTRUCTIONS
Set MAX_TABLE_NAMES to a small number
Go to SQL Lab and check the number of tables in the list
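As a rough illustration of what the check above should show, the sketch below caps a list of names at a configured limit, with 0 meaning "return everything" as stated in the summary. The helper `cap_table_names` is hypothetical and stands in for the endpoint logic; it is not Superset's implementation.

```python
from typing import List


def cap_table_names(names: List[str], max_table_names: int) -> List[str]:
    """Return at most max_table_names entries; 0 means no limit.

    Hypothetical helper illustrating the intended cap behaviour.
    """
    limit = max_table_names or len(names)
    return names[:limit]


names = [f"table_{i}" for i in range(40000)]

# With a cap of 1000, only the first 1000 names are returned.
print(len(cap_table_names(names, 1000)))

# With a cap of 0, the full list is returned.
print(len(cap_table_names(names, 0)))

# A limit of 1000 * 40000 = 40,000,000 is larger than the list,
# so slicing with it returns every name and nothing is truncated.
print(len(names[: 1000 * len(names)]))
```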
ADDITIONAL INFORMATION
cc: @john-bodley @ktmud