Tracking issue for compaction offloading #1545
Comments
looks great!
I think in the first stage, we can treat the And we can start to design the specific
Thanks for your suggestion! I think it's a good idea to first implement a simple working version.
I think I guess the third step can be integration test, test is very important actually.
Maybe you will be interested when writing test
## Rationale

The subtask to support compaction offloading. See #1545

## Detailed Changes

**Compaction node support remote compaction service**

- Define `CompactionServiceImpl` to support compaction rpc service.
- Introduce `NodeType` to distinguish compaction node and horaedb node. Enable the deployment of compaction node.
- Impl `compaction_client` for horaedb node to access remote compaction node.

**Horaedb node support compaction offload**

- Introduce `compaction_mode` in analytic engine's `Config` to determine whether exec compaction offload or not.
- Define `CompactionNodePicker` trait, supporting get remote compaction node info.
- Impl `RemoteCompactionRunner`, supporting pick remote node and pass compaction task to the node.
- Add docs (e.g. `example-cluster-n.toml`) to explain how to deploy a cluster supporting compaction offload.

## Test Plan

---------

Co-authored-by: kamille <[email protected]>
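For readers following the thread, here is a rough sketch of how a `compaction_mode` switch and a node picker might fit together. It is not the code from the PR: the enum variants, the `pick_node` signature, `StaticPicker`, `CompactionTask`, `Executed`, and `run_compaction` are all hypothetical names chosen for illustration, and the actual RPC call is left out.

```rust
/// Hypothetical mirror of the `compaction_mode` config option.
#[derive(Debug, Clone, PartialEq)]
enum CompactionMode {
    /// Compact on the node that owns the table.
    Local,
    /// Hand the task to a remote compaction node.
    Offload,
}

/// Hypothetical compaction task: just the ids of the input SSTs.
#[derive(Debug, Clone)]
struct CompactionTask {
    input_ssts: Vec<u64>,
}

#[derive(Debug, PartialEq)]
enum CompactionError {
    /// Offload was requested but no compaction node is known.
    NoNodeAvailable,
}

/// Assumed shape of a node picker: returns the endpoint of a compaction node.
trait CompactionNodePicker {
    fn pick_node(&self) -> Option<String>;
}

/// Simplest possible picker (illustrative): always the first configured endpoint.
struct StaticPicker {
    endpoints: Vec<String>,
}

impl CompactionNodePicker for StaticPicker {
    fn pick_node(&self) -> Option<String> {
        self.endpoints.first().cloned()
    }
}

/// Records where a task ended up, so the dispatch decision is observable.
#[derive(Debug, PartialEq)]
enum Executed {
    Locally { sst_count: usize },
    Remotely { node: String, sst_count: usize },
}

/// Dispatch a task according to the configured mode.
/// A real remote runner would send the task over RPC at the marked point.
fn run_compaction<P: CompactionNodePicker>(
    mode: &CompactionMode,
    picker: &P,
    task: &CompactionTask,
) -> Result<Executed, CompactionError> {
    let sst_count = task.input_ssts.len();
    match mode {
        CompactionMode::Local => Ok(Executed::Locally { sst_count }),
        CompactionMode::Offload => {
            let node = picker
                .pick_node()
                .ok_or(CompactionError::NoNodeAvailable)?;
            // RPC to `node` would happen here (omitted in this sketch).
            Ok(Executed::Remotely { node, sst_count })
        }
    }
}
```

In this sketch, an empty endpoint list under `Offload` yields `NoNodeAvailable` rather than silently compacting locally. Whether to fall back to local compaction in that case is a design choice the sketch leaves open.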
Describe This Problem
We found in production that SST compaction cannot keep up with SST generation, which leads to poor query performance. However, we are unable to give more resources to compaction to solve the problem, because query and write are more important than compaction on the same node.
It is really hard to trade off resource allocation among query, write, and compaction in the LSM model. We want to compact the generated small SSTs as fast as possible, but we can't tolerate the impact on query/write. I think offloading compaction to separate nodes may be the key to solving this.
Proposal
The following is the architecture for compaction offloading.
To support compaction offloading, we need:
Additional Context
This issue replaces issue #1480. Please close issue #1480 as it is outdated.
incubator-horaedb-proto#133 is highly related to this issue.