querier: Avoid global sort for dedup when possible + proposal. #5988
Performing last benchmarks & tests, but otherwise good for review.
Tests still seem to fail? 🤔
Should be fixed now. It's ready for your eyes (:
}

// NewDedupResponseHeap returns a wrapper around ProxyResponseHeap that merged duplicated series messages into one.
// It also deduplicates identical chunks identified by the same checksum from each series message.
func NewDedupResponseHeap(h *ProxyResponseHeap) *dedupResponseHeap {
Why does this code need to change? From what I understand we just need to fix the dedup iterator so that it splits overlapping chunks into separate streams. I don't understand what the changed version of dedup heap does now.
- There was a bug in aggregating the series into one series with non-series responses in between.
- Two big complexity points (`Next` and `At`), instead of focusing complexity on one place (`Next`) and keeping `At` trivial (less cognitive load).
- I found previous code less readable than my version (too ambiguous state, too many if else, complex deferred logic), but I guess it's a biased opinion. 🙈
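To illustrate the shape described in the list above, here is a minimal sketch of an iterator where `Next` carries all the merging logic and `At` only returns a buffered value. It is not the Thanos implementation: the `response` type, the string-based labels and checksums, and the `pending` queue used to carry non-series responses across a merge are all assumptions made for this example.

```go
package main

import "fmt"

// response is a hypothetical stand-in for a store response: either a series
// (Labels set) or a non-series message such as a warning.
type response struct {
	Labels  string   // Series identity; empty for non-series responses.
	Chunks  []string // Chunk checksums, simplified to strings.
	Warning string
}

// dedupIter wraps responses whose series are already sorted by Labels.
type dedupIter struct {
	in      []response
	pos     int
	pending []response // Non-series responses seen while merging a series.
	curr    response
}

// Next does all the work: it merges consecutive series with equal labels,
// drops chunks whose checksum was already seen, and queues any non-series
// responses found in between so they are emitted right after the series.
func (d *dedupIter) Next() bool {
	if len(d.pending) > 0 {
		d.curr, d.pending = d.pending[0], d.pending[1:]
		return true
	}
	if d.pos >= len(d.in) {
		return false
	}
	first := d.in[d.pos]
	d.pos++
	if first.Labels == "" {
		d.curr = first
		return true
	}

	merged := response{Labels: first.Labels}
	seen := map[string]struct{}{}
	add := func(chunks []string) {
		for _, c := range chunks {
			if _, ok := seen[c]; ok {
				continue
			}
			seen[c] = struct{}{}
			merged.Chunks = append(merged.Chunks, c)
		}
	}
	add(first.Chunks)

	for d.pos < len(d.in) {
		next := d.in[d.pos]
		if next.Labels == "" {
			d.pending = append(d.pending, next)
			d.pos++
			continue
		}
		if next.Labels != first.Labels {
			break
		}
		add(next.Chunks)
		d.pos++
	}
	d.curr = merged
	return true
}

// At stays trivial: it returns whatever Next prepared.
func (d *dedupIter) At() response { return d.curr }

func main() {
	it := &dedupIter{in: []response{
		{Labels: "a", Chunks: []string{"c1", "c2"}},
		{Warning: "partial data"},
		{Labels: "a", Chunks: []string{"c2", "c3"}},
		{Labels: "b", Chunks: []string{"c4"}},
	}}
	for it.Next() {
		fmt.Printf("%+v\n", it.At())
	}
}
```

Keeping the state transitions in `Next` means a caller can call `At` any number of times without side effects, which is the "less cognitive load" trade-off mentioned above.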
MinTime: minTime,
MaxTime: maxTime,
SupportsSharding: true,
SupportsWithoutReplicaLabels: false, // TODO(bwplotka): Add support for efficiency.
Why not implement this for every store in the same PR? It should involve just calling a single function on the external labels?
Yea... "just" and then 1000 lines to fix and have tests (: Next PR!
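For context on why this is more than one function call, here is a rough sketch of what a store would have to do to serve series without replica labels. The `Label` type and the `withoutLabels`, `compare`, and `stripAndSort` helpers are hypothetical names for this example, not the Thanos API. The point it shows: dropping a label can change the relative order of series, so the store has to re-sort after stripping.

```go
package main

import (
	"fmt"
	"sort"
)

// Label is a simplified name/value pair (hypothetical, for illustration).
type Label struct{ Name, Value string }

// withoutLabels returns a copy of lset with the named labels removed.
func withoutLabels(lset []Label, drop map[string]struct{}) []Label {
	out := make([]Label, 0, len(lset))
	for _, l := range lset {
		if _, ok := drop[l.Name]; ok {
			continue
		}
		out = append(out, l)
	}
	return out
}

// compare orders two label sets pair by pair: name first, then value.
func compare(a, b []Label) int {
	for i := 0; i < len(a) && i < len(b); i++ {
		if a[i].Name != b[i].Name {
			if a[i].Name < b[i].Name {
				return -1
			}
			return 1
		}
		if a[i].Value != b[i].Value {
			if a[i].Value < b[i].Value {
				return -1
			}
			return 1
		}
	}
	return len(a) - len(b)
}

// stripAndSort removes the given labels from every series and re-sorts,
// because the input order is only valid with those labels present.
func stripAndSort(series [][]Label, drop map[string]struct{}) [][]Label {
	out := make([][]Label, 0, len(series))
	for _, s := range series {
		out = append(out, withoutLabels(s, drop))
	}
	sort.SliceStable(out, func(i, j int) bool { return compare(out[i], out[j]) < 0 })
	return out
}

func main() {
	// Sorted while "replica" is present: replica="1" sorts before replica="2".
	series := [][]Label{
		{{"a", "1"}, {"replica", "1"}, {"z", "2"}},
		{{"a", "1"}, {"replica", "2"}, {"z", "1"}},
	}
	drop := map[string]struct{}{"replica": {}}
	// Without "replica", z="1" must come before z="2", so the order flips.
	fmt.Println(stripAndSort(series, drop))
}
```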
@bwplotka mind resolving conflicts so we can proceed with this?
Yes, will do this week!
* Proposal
* Removed deprecated fields from internal proxy series usage.

Signed-off-by: bwplotka <[email protected]>
➜ store git:(pre-sort-auto) ✗ benchstat v1.txt v2.txt
name                  old time/op    new time/op    delta
SortWithoutLabels-12  4.02ms ± 2%    1.06ms ± 5%    -73.54%  (p=0.016 n=5+4)

name                  old alloc/op   new alloc/op   delta
SortWithoutLabels-12  1.04MB ± 0%    0.00MB ±13%    -99.99%  (p=0.029 n=4+4)

name                  old allocs/op  new allocs/op  delta
SortWithoutLabels-12  30.0k ± 0%     0.0k ± 0%      -99.99%  (p=0.000 n=5+4)

Signed-off-by: bwplotka <[email protected]>
Co-authored-by: Filip Petkovski <[email protected]>
Signed-off-by: Bartlomiej Plotka <[email protected]>
Done @fpetkovski, restarting (hopefully flaky) e2e test.
I restarted the CI job again, and it looks like some test fails consistently.
Hard to say which test... Not a flake I guess.
@bwplotka do you mind if I take this over and try to finish it? Forced global sorting is a huge performance bottleneck 😄
Trying to fix this since it's so close.
Ensure labels are ordered in each time series.

Signed-off-by: Giedrius Statkevičius <[email protected]>
Fixed the problem with tests: the root cause is that our test data was invalid. Labels were not ordered, hence grouping key generation in the PromQL engine was not working correctly. Also, I've merged the newest
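
A small sketch of why label order matters for a grouping key. The `groupingKey` function below is a hypothetical stand-in, not the PromQL engine's code; it assumes, as order-dependent key functions typically do, that labels arrive sorted by name.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Label is a simplified name/value pair (hypothetical, for illustration).
type Label struct{ Name, Value string }

// groupingKey builds a key by walking the labels in the order given.
// It assumes the caller passes labels sorted by name.
func groupingKey(lset []Label) string {
	var b strings.Builder
	for _, l := range lset {
		b.WriteString(l.Name)
		b.WriteByte('=')
		b.WriteString(l.Value)
		b.WriteByte(';')
	}
	return b.String()
}

// sortLabels returns a copy of lset sorted by label name.
func sortLabels(lset []Label) []Label {
	out := append([]Label(nil), lset...)
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

func main() {
	// The same label set, once unordered (invalid test data) and once ordered.
	unordered := []Label{{"job", "x"}, {"instance", "y"}}
	ordered := []Label{{"instance", "y"}, {"job", "x"}}

	// Different keys for the same series: grouping breaks.
	fmt.Println(groupingKey(unordered) == groupingKey(ordered)) // false

	// After sorting, both inputs produce the same key.
	fmt.Println(groupingKey(sortLabels(unordered)) == groupingKey(ordered)) // true
}
```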
🚀
Thank you so much for putting this over the line 💪🏽
…s-io#5988)

* querier: Avoid global sort for dedup when possible.

* Proposal
* Removed deprecated fields from internal proxy series usage.

Signed-off-by: bwplotka <[email protected]>

* Optimized storeWithoutLabels.

➜ store git:(pre-sort-auto) ✗ benchstat v1.txt v2.txt
name                  old time/op    new time/op    delta
SortWithoutLabels-12  4.02ms ± 2%    1.06ms ± 5%    -73.54%  (p=0.016 n=5+4)

name                  old alloc/op   new alloc/op   delta
SortWithoutLabels-12  1.04MB ± 0%    0.00MB ±13%    -99.99%  (p=0.029 n=4+4)

name                  old allocs/op  new allocs/op  delta
SortWithoutLabels-12  30.0k ± 0%     0.0k ± 0%      -99.99%  (p=0.000 n=5+4)

Signed-off-by: bwplotka <[email protected]>

* Added back dedup with simple hack for tmp use.

Signed-off-by: bwplotka <[email protected]>

* Heap fix.

Signed-off-by: bwplotka <[email protected]>

* Dedup is now working on all dimensions.

Signed-off-by: bwplotka <[email protected]>

* Fixed tests.

Signed-off-by: bwplotka <[email protected]>

* Fixed tests.

Signed-off-by: bwplotka <[email protected]>

* Apply suggestions from code review

Co-authored-by: Filip Petkovski <[email protected]>
Signed-off-by: Bartlomiej Plotka <[email protected]>

* test/e2e: fix test

Ensure labels are ordered in each time series.

Signed-off-by: Giedrius Statkevičius <[email protected]>

---------

Signed-off-by: bwplotka <[email protected]>
Signed-off-by: Bartlomiej Plotka <[email protected]>
Signed-off-by: Giedrius Statkevičius <[email protected]>
Co-authored-by: Filip Petkovski <[email protected]>
Co-authored-by: Giedrius Statkevičius <[email protected]>
See proposal in Diff to learn more about this PR.
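
As a rough illustration of the idea in the PR title, the sketch below merges several already-sorted streams with a heap instead of concatenating them and sorting globally. It assumes each input stream is sorted on the same key; the `cursor`, `cursorHeap`, and `mergeSorted` names are made up for this example and series are reduced to plain strings.

```go
package main

import (
	"container/heap"
	"fmt"
)

// cursor tracks the read position in one sorted stream.
type cursor struct {
	items []string
	pos   int
}

// cursorHeap is a min-heap of cursors ordered by their current item.
type cursorHeap []*cursor

func (h cursorHeap) Len() int            { return len(h) }
func (h cursorHeap) Less(i, j int) bool  { return h[i].items[h[i].pos] < h[j].items[h[j].pos] }
func (h cursorHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *cursorHeap) Push(x interface{}) { *h = append(*h, x.(*cursor)) }
func (h *cursorHeap) Pop() interface{} {
	old := *h
	n := len(old)
	x := old[n-1]
	*h = old[:n-1]
	return x
}

// mergeSorted performs a k-way merge of sorted streams. With k streams and
// n total items this costs O(n log k), and the output can be produced
// incrementally, unlike a global sort that needs all items first.
func mergeSorted(streams ...[]string) []string {
	h := &cursorHeap{}
	for _, s := range streams {
		if len(s) > 0 {
			*h = append(*h, &cursor{items: s})
		}
	}
	heap.Init(h)

	var out []string
	for h.Len() > 0 {
		c := (*h)[0] // Stream whose current item is smallest.
		out = append(out, c.items[c.pos])
		c.pos++
		if c.pos == len(c.items) {
			heap.Pop(h) // Stream exhausted, remove it.
		} else {
			heap.Fix(h, 0) // Restore heap order for the advanced cursor.
		}
	}
	return out
}

func main() {
	fmt.Println(mergeSorted(
		[]string{"a", "c"},
		[]string{"b", "d"},
	))
}
```

The merge is only correct if every stream is sorted on the same key the consumer expects, which is why stripping replica labels before sorting matters for deduplication.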
E2e benchmark looks good (check UI latency):
Without this PR:
With this PR: