Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Improvement]: There are slow SQL issues when querying the optimizing process #2869

Open
3 tasks done
Tracked by #3048
XBaith opened this issue May 28, 2024 · 4 comments
Open
3 tasks done
Tracked by #3048

Comments

@XBaith
Copy link
Contributor

XBaith commented May 28, 2024

Search before asking

  • I have searched in the issues and found no similar issues.

What would you like to be improved?

Improve query performance
When I execute the below SQL statement, it cost over 1min:

SELECT a.process_id, a.table_id, a.catalog_name, a.db_name, a.table_name, a.target_snapshot_id,
       a.target_change_snapshot_id, a.status, a.optimizing_type, a.plan_time, a.end_time,
       a.fail_reason, a.summary, a.from_sequence, a.to_sequence 
FROM table_optimizing_process a
INNER JOIN table_identifier b ON a.table_id = b.table_id
WHERE a.catalog_name = 'glue' 
  AND a.db_name = 'xxx' 
  AND a.table_name = 'deprecated_table_version_1716357896'
  AND b.catalog_name = 'glue' 
  AND b.db_name = 'xxx' 
  AND b.table_name = 'deprecated_table_version_1716357896'
ORDER BY process_id DESC;

The execute plan is:

id                       |estRows  |task     |access object                                                                    |operator info                                                                                                                                                                                                                          |
-------------------------+---------+---------+---------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Sort_10                  |88.57    |root     |                                                                                 |bts_arctic.table_optimizing_process.process_id:desc                                                                                                                                                                                    |
└─HashJoin_28            |88.57    |root     |                                                                                 |inner join, equal:[eq(bts_arctic.table_identifier.table_id, bts_arctic.table_optimizing_process.table_id)]                                                                                                                             |
  ├─Point_Get_29(Build)  |1.00     |root     |table:table_identifier, index:table_name_index(catalog_name, db_name, table_name)|                                                                                                                                                                                                                                       |
  └─TableReader_32(Probe)|88.57    |root     |                                                                                 |data:Selection_31                                                                                                                                                                                                                      |
    └─Selection_31       |88.57    |cop[tikv]|                                                                                 |eq(bts_arctic.table_optimizing_process.catalog_name, "glue"), eq(bts_arctic.table_optimizing_process.db_name, "xxx"), eq(bts_arctic.table_optimizing_process.table_name, "deprecated_table_version_1716357896")|
      └─TableFullScan_30 |132215.00|cop[tikv]|table:a                                                                          |keep order:false                                                                                                                                                                                                                       |

How should we improve?

Add indexes to avoid full table scans

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Subtasks

No response

Code of Conduct

@XBaith XBaith changed the title [Improvement]: There is are slow SQL issues when querying the optimizing process [Improvement]: There are slow SQL issues when querying the optimizing process May 28, 2024
@czy006 czy006 added this to the Release 0.8.0 milestone Jul 12, 2024
@XBaith
Copy link
Contributor Author

XBaith commented Sep 20, 2024

Before adding index:

id                       |estRows  |task     |access object                                                                    |operator info                                                                                                                                                                                                      |
-------------------------+---------+---------+---------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Sort_8                   |471.80   |root     |                                                                                 |arctic.table_optimizing_process.process_id:desc                                                                                                                                                                    |
└─HashJoin_19            |471.80   |root     |                                                                                 |inner join, equal:[eq(arctic.table_identifier.table_id, arctic.table_optimizing_process.table_id)]                                                                                                                 |
  ├─Point_Get_20(Build)  |1.00     |root     |table:table_identifier, index:table_name_index(catalog_name, db_name, table_name)|                                                                                                                                                                                                                   |
  └─TableReader_23(Probe)|471.80   |root     |                                                                                 |data:Selection_22                                                                                                                                                                                                  |
    └─Selection_22       |471.80   |cop[tikv]|                                                                                 |eq(arctic.table_optimizing_process.catalog_name, "dls_dfw_iceberg"), eq(arctic.table_optimizing_process.db_name, "pda_sparkdata"), eq(arctic.table_optimizing_process.table_name, "bc_usprod_kafka_sa_uctg_v1_raw")|
      └─TableFullScan_21 |248991.00|cop[tikv]|table:a                                                                          |keep order:false                                                                                                                                                                                                   |

After Adding index

id                            |estRows|task     |access object                                                                    |operator info                                                                                                                                                  |
------------------------------+-------+---------+---------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+
Sort_8                        |735.95 |root     |                                                                                 |arctic.table_optimizing_process.process_id:desc                                                                                                                |
└─HashJoin_19                 |735.95 |root     |                                                                                 |inner join, equal:[eq(arctic.table_identifier.table_id, arctic.table_optimizing_process.table_id)]                                                             |
  ├─Point_Get_20(Build)       |1.00   |root     |table:table_identifier, index:table_name_index(catalog_name, db_name, table_name)|                                                                                                                                                               |
  └─IndexLookUp_26(Probe)     |3131.00|root     |                                                                                 |                                                                                                                                                               |
    ├─IndexRangeScan_24(Build)|3131.00|cop[tikv]|table:a, index:idx_catalog_db_table(catalog_name, db_name, table_name)           |range:["dls_dfw_iceberg" "pda_sparkdata" "bc_usprod_kafka_sa_uctg_v1_raw","dls_dfw_iceberg" "pda_sparkdata" "bc_usprod_kafka_sa_uctg_v1_raw"], keep order:false|
    └─TableRowIDScan_25(Probe)|3131.00|cop[tikv]|table:a                                                                          |keep order:false                                                                                                                                               |

@klion26
Copy link
Member

klion26 commented Oct 24, 2024

@XBaith could you please help to check whether #3066 can help in this, thanks.

@XBaith
Copy link
Contributor Author

XBaith commented Nov 1, 2024

#3066

I think adding an index to the table accelerates it a bit better

@klion26
Copy link
Member

klion26 commented Nov 20, 2024

I think adding an index to the table accelerates it a bit better

Adding an index can improve performance and does not conflict with #3066. My previous reply was not to oppose adding indexes. looks forward to seeing the pr.

@Aireed Aireed mentioned this issue Dec 4, 2024
33 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants