-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
store/proxy: Deduplicate chunks on StoreAPI level. Recommend chunk sorting for StoreAPI. #2603
Conversation
973ad6a
to
8717698
Compare
715a3ed
to
4f3f53d
Compare
As expected there is some overhead of this. 3x CPU 1.5x Mem allocs. Overhead was expected but maybe not that significant. NOTE: Dataset does not have ANY duplicates, so we do in this function MORE, to do later, less (:
|
I think the strategy seems sane, but could we add a couple of benchmarks that prove the theory that this does indeed make the typical setup faster? |
…ng for StoreAPI. Also: Merge same series together on proxy level instead select. This allows better dedup efficiency. Partially fixes: #2303 Cases like overlapped data from store and sidecar and 1:1 duplicates are optimized as soon as it's possible. This case was highly visible on GitLab repro data and exists in most of Thanos setup. Signed-off-by: Bartlomiej Plotka <[email protected]>
4f3f53d
to
da3c83e
Compare
Signed-off-by: Bartlomiej Plotka <[email protected]>
da3c83e
to
f6004ab
Compare
Ok I optimized a lot and I think it's good to go. I can think of more optimizations, but it looks good.
NOTE: I just noticed this is the full overlapping benchmark. So it means 100% chunks in those "blocks" are overlapping. That's why comparison was so crucial (: This is normally a rare operation though. I extended tests with non overlap as well. Also this version is with |
34d9706
to
0ebf5e1
Compare
Ok last version uses LESS resources and is FASTER and adds MORE features =D This is compared to the code WITHOUT this PR.
Considering this done! |
Signed-off-by: Bartlomiej Plotka <[email protected]>
0ebf5e1
to
b6579bb
Compare
We got hit by my chaining practice again - it was not merged to master after all ): We need some bot helping or I can try modifying titles or something |
Blocked by: #2548
Partially fixes: #2303
It should make certain calls where we have overlapped data from StoreAPIs much faster and more efficient.
Changes:
select
. This allows for better dedup efficiency.Signed-off-by: Bartlomiej Plotka [email protected]