query: Improve deduplication of same series (e.g. from receiver) #2303
Comments
Essentially, we can dedup on the chunk level in query, without iterating over samples (!).
We should also deduplicate exactly identical series coming from store vs. Prometheus, or from many stores. This might mean it would be useful to move the deduplication layer to proxy.go.
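To make the chunk-level idea concrete, here is a minimal sketch in Go. It is not the Thanos implementation: `Chunk`, `compare`, and `MergeDedup` are hypothetical names, and the sketch assumes each input slice is already sorted by time range. It merges the chunks of one series from two sources and drops byte-identical chunks by comparing time ranges and raw bytes only, without decoding any samples.

```go
package main

import "bytes"

// Chunk is a hypothetical, simplified stand-in for an encoded chunk:
// a time range plus opaque encoded sample bytes.
type Chunk struct {
	MinTime, MaxTime int64
	Data             []byte
}

// compare orders chunks by MinTime, then MaxTime, then raw bytes.
// It returns 0 only for identical chunks and never decodes samples.
func compare(a, b Chunk) int {
	if a.MinTime != b.MinTime {
		if a.MinTime < b.MinTime {
			return -1
		}
		return 1
	}
	if a.MaxTime != b.MaxTime {
		if a.MaxTime < b.MaxTime {
			return -1
		}
		return 1
	}
	return bytes.Compare(a.Data, b.Data)
}

// MergeDedup merges two chunk slices of the same series, each assumed
// to be sorted by compare, and drops exact (1:1) duplicates.
// Overlapping but non-identical chunks are both kept; those would still
// need sample-level deduplication later.
func MergeDedup(a, b []Chunk) []Chunk {
	out := make([]Chunk, 0, len(a)+len(b))
	i, j := 0, 0
	for i < len(a) || j < len(b) {
		var next Chunk
		switch {
		case j == len(b):
			next = a[i]
			i++
		case i == len(a):
			next = b[j]
			j++
		case compare(a[i], b[j]) <= 0:
			next = a[i]
			i++
		default:
			next = b[j]
			j++
		}
		// Skip the chunk if it is identical to the last one emitted.
		if n := len(out); n > 0 && compare(out[n-1], next) == 0 {
			continue
		}
		out = append(out, next)
	}
	return out
}
```

Calling `MergeDedup` with the chunks one source returns and the chunks another source returns for the same series yields a single sorted list in which each identical chunk appears once. Doing this where responses are merged means later stages see fewer chunks to decode.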
…rting for StoreAPI. (#2603)
* Deduplicate chunk dups on proxy StoreAPI level. Recommend chunk sorting for StoreAPI.
  Also: Merge same series together on proxy level instead of select. This allows better dedup efficiency.
  Partially fixes: #2303
  Cases like overlapped data from store and sidecar and 1:1 duplicates are optimized as soon as possible. This case was highly visible on GitLab repro data and exists in most Thanos setups.
  Signed-off-by: Bartlomiej Plotka
* Optimized algorithm to combine series only on start.
  Signed-off-by: Bartlomiej Plotka
* Optimized chunk comparison for overlaps.
  Signed-off-by: Bartlomiej Plotka
…rting for StoreAPI + Optimized iter chunk dedup. (#2710)
* Deduplicate chunk dups on proxy StoreAPI level. Recommend chunk sorting for StoreAPI.
  Also: Merge same series together on proxy level instead of select. This allows better dedup efficiency.
  Partially fixes: #2303
  Cases like overlapped data from store and sidecar and 1:1 duplicates are optimized as soon as possible. This case was highly visible on GitLab repro data and exists in most Thanos setups.
  Signed-off-by: Bartlomiej Plotka
* Optimized algorithm to combine series only on start.
  Signed-off-by: Bartlomiej Plotka
* Optimized chunk comparison for overlaps.
  Signed-off-by: Bartlomiej Plotka
* Optimized deduplication for deduplicated chunk on query level as well. Never use proto .String() in fast path!
  Signed-off-by: Bartlomiej Plotka
# Conflicts:
# CHANGELOG.md
# pkg/store/storepb/custom.go
# pkg/store/storepb/custom_test.go
…rting for StoreAPI + Optimized iter chunk dedup. (#2710) (#2711)
* Deduplicate chunk dups on proxy StoreAPI level. Recommend chunk sorting for StoreAPI.
  Also: Merge same series together on proxy level instead of select. This allows better dedup efficiency.
  Partially fixes: #2303
  Cases like overlapped data from store and sidecar and 1:1 duplicates are optimized as soon as possible. This case was highly visible on GitLab repro data and exists in most Thanos setups.
  Signed-off-by: Bartlomiej Plotka
* Optimized algorithm to combine series only on start.
  Signed-off-by: Bartlomiej Plotka
* Optimized chunk comparison for overlaps.
  Signed-off-by: Bartlomiej Plotka
* Optimized deduplication for deduplicated chunk on query level as well. Never use proto .String() in fast path!
  Signed-off-by: Bartlomiej Plotka
# Conflicts:
# CHANGELOG.md
# pkg/store/storepb/custom.go
# pkg/store/storepb/custom_test.go
Right now deduplication works for both Prometheus replicas and receiver replicas, but there are some optimizations we can make when working on receiver-only data.