[Improvement] Add iceberg metadata cache and support manifest file content cache #22336
run buildall
(From new machine) TeamCity pipeline, clickbench performance test result:
        return icebergMetadataCache;
    }

    public static IcebergMetadataCacheMgr get() {
Why use a singleton here?
run buildall
(From new machine) TeamCity pipeline, clickbench performance test result:
LGTM
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
LGTM
    private void initIcebergTableFileIO(Table table) {
        Map<String, String> ioConf = new HashMap<>();
        table.properties().forEach((key, value) -> {
Are table.properties() and CatalogProperty different? Could this be missing some properties, such as the S3 or HDFS HA settings?
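To make the concern concrete, here is a minimal sketch of one way the FileIO configuration could be seeded from catalog-level properties before the table-level ones are applied. It is not the code in this PR: `FileIoConfSketch` and `mergeIoConf` are hypothetical names, the property values are placeholders, and whether table properties should override catalog properties is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

class FileIoConfSketch {
    // Hypothetical helper: start from the catalog-level properties (where S3 or
    // HDFS HA settings would typically live), then apply the table-level
    // properties so that a table-level value wins on a key conflict.
    static Map<String, String> mergeIoConf(Map<String, String> catalogProps,
                                           Map<String, String> tableProps) {
        Map<String, String> ioConf = new HashMap<>(catalogProps);
        ioConf.putAll(tableProps);
        return ioConf;
    }

    public static void main(String[] args) {
        // Placeholder values, for illustration only.
        Map<String, String> catalogProps = new HashMap<>();
        catalogProps.put("s3.endpoint", "http://example-endpoint");
        catalogProps.put("dfs.nameservices", "ns1");

        Map<String, String> tableProps = new HashMap<>();
        tableProps.put("write.format.default", "parquet");

        // The merged map contains all three keys.
        System.out.println(mergeIoConf(catalogProps, tableProps));
    }
}
```

If only `table.properties()` were used, the two catalog-level keys in this example would never reach the FileIO.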
…ntent cache (apache#22336)

Cache the iceberg table. When accessing the same table, the metadata will only be loaded once.
Cache the snapshot of the table to optimize the performance of the iceberg table function.
Add cache support for iceberg's manifest file content

a simple test from 2.0s to 0.8s

before

    mysql> refresh table tb3;
    Query OK, 0 rows affected (0.03 sec)

    mysql> select * from tb3;
    +------+------+------+
    | id   | par  | data |
    +------+------+------+
    |    1 | a    | a    |
    |    2 | a    | b    |
    |    3 | a    | c    |
    ....
    |   68 | a    | a    |
    |   69 | a    | b    |
    |   70 | a    | c    |
    +------+------+------+
    70 rows in set (2.10 sec)

    mysql> select * from tb3;
    +------+------+------+
    | id   | par  | data |
    +------+------+------+
    |    1 | a    | a    |
    |    2 | a    | b    |
    |    3 | a    | c    |
    ...
    |   68 | a    | a    |
    |   69 | a    | b    |
    |   70 | a    | c    |
    +------+------+------+
    70 rows in set (2.00 sec)

after

    mysql> refresh table tb3;
    Query OK, 0 rows affected (0.03 sec)

    mysql> select * from tb3;
    +------+------+------+
    | id   | par  | data |
    +------+------+------+
    |    1 | a    | a    |
    |    2 | a    | b    |
    ...
    |   68 | a    | a    |
    |   69 | a    | b    |
    |   70 | a    | c    |
    +------+------+------+
    70 rows in set (2.05 sec)

    mysql> select * from tb3;
    +------+------+------+
    | id   | par  | data |
    +------+------+------+
    |    1 | a    | a    |
    |    2 | a    | b    |
    |    3 | a    | c    |
    ...
    |   68 | a    | a    |
    |   69 | a    | b    |
    |   70 | a    | c    |
    +------+------+------+
    70 rows in set (0.80 sec)
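The load-once behavior described in the commit message can be illustrated with a minimal sketch. This is not the cache implemented in the PR: `LoadOnceCacheSketch`, its `invalidate` method, and the string "metadata" values are hypothetical, and it uses only `ConcurrentHashMap` from the standard library, with no size limit or expiry. It assumes that a `refresh table` drops the cached entry, which would match the timings above, where the first query after a refresh is still slow and the second is fast.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class LoadOnceCacheSketch<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    LoadOnceCacheSketch(Function<K, V> loader) {
        this.loader = loader;
    }

    // Runs the loader only when the key is absent; later calls reuse the value.
    V get(K key) {
        return entries.computeIfAbsent(key, loader);
    }

    // Assumed to be what a refresh would trigger: drop the entry so the
    // next access loads it again.
    void invalidate(K key) {
        entries.remove(key);
    }

    public static void main(String[] args) {
        AtomicInteger loads = new AtomicInteger();
        LoadOnceCacheSketch<String, String> cache = new LoadOnceCacheSketch<>(name -> {
            loads.incrementAndGet(); // stands in for an expensive metadata read
            return "metadata-of-" + name;
        });

        cache.get("tb3");
        cache.get("tb3");
        System.out.println(loads.get()); // prints 1: the second access hit the cache

        cache.invalidate("tb3");
        cache.get("tb3");
        System.out.println(loads.get()); // prints 2: reloaded after invalidation
    }
}
```

A production cache would normally also bound its size and expire entries, which this sketch leaves out.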
Proposed changes
In a simple test, a repeated query on the same table dropped from 2.0s to 0.8s.
before
after