Replies: 3 comments
-
For sparse tables, inline compaction hurts write performance, so we prefer dedicated compaction. That means running a companion job for each sparse Paimon table, and the burden of managing those companion jobs keeps growing. We want to know the running status of each dedicated compaction job, including its performance: how many small files were compacted into large files, how long it took, and whether it can keep up with the write rate. We hope Amoro can support online compaction for Paimon, just as it does for Iceberg now.
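To make the status questions above concrete, here is a minimal sketch of the kind of summary we would like to see per compaction job. The `CompactionRun` record and the `summarize` function are hypothetical names for illustration only; they are not part of Paimon or Amoro, and the fields are just an assumption about what a job could report.

```python
from dataclasses import dataclass


@dataclass
class CompactionRun:
    """One finished run of a dedicated compaction job (hypothetical record)."""
    input_files: int        # small files read by the run
    output_files: int       # larger files written by the run
    duration_s: float       # wall-clock time of the run, in seconds
    new_files_written: int  # small files the writers produced during the run


def summarize(runs):
    """Aggregate runs into the numbers asked about: volume, time, and catch-up."""
    total_in = sum(r.input_files for r in runs)
    total_out = sum(r.output_files for r in runs)
    total_time = sum(r.duration_s for r in runs)
    total_new = sum(r.new_files_written for r in runs)
    return {
        "files_compacted": total_in,
        "files_produced": total_out,
        "total_duration_s": total_time,
        # Small files removed per second of compaction time.
        "compaction_rate": total_in / total_time if total_time else 0.0,
        # Small files created by writers over the same period.
        "write_rate": total_new / total_time if total_time else 0.0,
        # Compaction keeps up only if it removes at least as many files as arrive.
        "keeping_up": total_in >= total_new,
    }
```

If `keeping_up` stays false across several runs, the backlog of small files is growing and the job needs more resources.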
-
If AMS supports online compaction of Paimon tables, we should also consider the following issues:
-
I think AMS should manage the compaction of all Paimon tables.
-
Currently, Paimon has two main forms of compaction. One is inline compaction, which is embedded in the write task. The other is dedicated compaction, which starts a separate Flink job to compact a specific table. If a table has multiple writers, as with partial updates, dedicated compaction is required.
This discussion is about online compaction: AMS would discover the tables that need compaction and schedule them onto shared resources to be compacted. Paimon users are welcome to give feedback on whether this is necessary.
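As a rough illustration of the "discover and schedule" idea, the sketch below picks the most fragmented tables for a limited number of shared compaction slots. Everything here is an assumption for discussion: `TableState`, `plan_compaction`, the small-file threshold, and the slot count are hypothetical and do not reflect how AMS actually selects or schedules tables.

```python
import heapq
from dataclasses import dataclass


@dataclass
class TableState:
    """Snapshot of one table as seen by the scheduler (hypothetical)."""
    name: str
    small_files: int  # current number of small files in the table


def plan_compaction(tables, threshold, slots):
    """Return up to `slots` table names to compact, most fragmented first.

    Only tables whose small-file count reaches `threshold` are candidates.
    """
    candidates = [t for t in tables if t.small_files >= threshold]
    chosen = heapq.nlargest(slots, candidates, key=lambda t: t.small_files)
    return [t.name for t in chosen]
```

With a shared pool, a table that rarely needs compaction no longer holds a dedicated job; it only takes a slot when it crosses the threshold.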