[Bug report] list-table api is very slow when table quantity is very large #4089
Comments
I found that getTableObjectsByName is used here mainly to distinguish managed (internal) tables from external tables, and to filter out Iceberg tables.
We have tested it. When I execute "show tables" in Hive
Sorry, I will send you a new one.
I have tried the listTableNamesByFilter interface to filter out Iceberg tables, and it is a feasible approach. However, I did not handle the filtering of managed and external tables, because I don't know the purpose of that filtering.
Great! I think we can work this way. WDYT? @jerryshao @FANNG1
I think it's OK, because this approach seems extensible and is not limited to filtering Iceberg tables.
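For readers following the thread, here is a minimal sketch of the kind of filter string that could be passed to listTableNamesByFilter. It is not the code from the eventual PR. The prefix below is assumed to match Hive's `hive_metastoreConstants.HIVE_FILTER_FIELD_PARAMS`, and the class and method names are hypothetical. The operators the metastore accepts depend on the Hive version, so the operator is left as a parameter.

```java
public class TableFilterSketch {
    // Assumption: mirrors hive_metastoreConstants.HIVE_FILTER_FIELD_PARAMS.
    // Verify the value against the Hive version in use.
    static final String PARAMS_PREFIX = "hive_filter_field_params__";

    // Builds a predicate on a table parameter, e.g. table_type.
    // Hypothetical helper, not part of Hive or Gravitino.
    static String paramFilter(String key, String operator, String value) {
        return String.format("%s%s %s \"%s\"", PARAMS_PREFIX, key, operator, value);
    }

    public static void main(String[] args) {
        // The resulting string would be handed to
        // IMetaStoreClient.listTableNamesByFilter(dbName, filter, maxTables),
        // which returns only table names instead of full table objects.
        System.out.println(paramFilter("table_type", "like", "ICEBERG"));
    }
}
```

Because the metastore evaluates the predicate and returns names only, the client avoids fetching a full table object for each table just to inspect its parameters.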
Hi @mygrsun, is there any progress? Can I assign this issue to you?
…ble list (#4469)
### What changes were proposed in this pull request?
Fix the slow retrieval of the Hive table list by using listTableNamesByFilter in place of the getTableObjectsByName method.
### Why are the changes needed?
I found that list-table takes 300 s when a schema has 5000 tables.
Fix: #4089
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Manual testing
---------
Co-authored-by: ericqin
Co-authored-by: mchades
…ble list (#5439)
### What changes were proposed in this pull request?
Fix the slow retrieval of the Hive table list by using listTableNamesByFilter in place of the getTableObjectsByName method.
### Why are the changes needed?
I found that list-table takes 300 s when a schema has 5000 tables.
Fix: #4089
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Manual testing
Co-authored-by: mygrsun
Co-authored-by: ericqin
Co-authored-by: mchades
Version
main branch
Describe what's wrong
Through my testing, I found that list-table takes 300 s when a schema has 5000 tables.
I analyzed the code and added some logs, and found that the cause is the call to the getTableObjectsByName interface.
list-table uses getTableObjectsByName, and this metastore interface is very slow.
Error message and/or stacktrace
I added some logs at 3 positions.
The result is:
How to reproduce
Add 5000 tables to one schema.
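One possible way to set this up is to generate the DDL and run it through a Hive client. This is a sketch only: the schema name `perf_test`, the table naming pattern, and the single-column layout are assumptions for illustration and are not taken from the original report.

```java
import java.util.ArrayList;
import java.util.List;

public class CreateTablesSketch {
    // Generates `count` CREATE TABLE statements for the given schema.
    // Table names and the single INT column are hypothetical placeholders.
    static List<String> createStatements(String schema, int count) {
        List<String> ddl = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            ddl.add(String.format(
                "CREATE TABLE IF NOT EXISTS %s.t_%05d (id INT);", schema, i));
        }
        return ddl;
    }

    public static void main(String[] args) {
        // Print the statements so they can be saved to a file and executed,
        // for example with beeline -f.
        createStatements("perf_test", 5000).forEach(System.out::println);
    }
}
```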
Additional context
No response