Optimize postings fetching by checking postings and series size #6465
We don't have a way to know finalSeriesMatched but in worst case we can assume it is 0.
Isn't that the best case that we don't have to fetch any series? Perhaps the worst case is when we have to fetch all of the series identified by each fetched series reference? But yeah, your formula sounds about right.
@GiedriusS Yeah, you got what I meant. Worst/best case is relative to the two plans.
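To make the "relative to the two plans" point concrete, here is a minimal sketch of how the download sizes of the two plans could be compared. The function names, the 4-bytes-per-posting constant, and both cost formulas are illustrative assumptions, not code taken from this PR.

```go
package main

// bytesPerPosting assumes every posting (series reference) occupies 4 bytes.
const bytesPerPosting = 4

// lazyPlanBytes estimates the bytes downloaded when only the smallest posting
// group (p1 postings) is fetched and every series it references is downloaded,
// with the remaining matchers applied to the series afterwards.
func lazyPlanBytes(p1, seriesSize int64) int64 {
	return p1*bytesPerPosting + p1*seriesSize
}

// eagerPlanBytes estimates the bytes downloaded when both posting groups are
// fetched and intersected first, so that only finalSeriesMatched series
// remain to be downloaded.
func eagerPlanBytes(p1, p2, finalSeriesMatched, seriesSize int64) int64 {
	return (p1+p2)*bytesPerPosting + finalSeriesMatched*seriesSize
}
```

Under these assumptions, setting finalSeriesMatched to 0 makes the eager plan as cheap as it can be, which is the least favourable comparison for the lazy plan. For example, with p1 = 100, p2 = 1000000 and a series size of 1000 bytes, the lazy plan costs 100400 bytes while the eager plan costs 4000400 bytes even with zero matched series, so skipping the large posting group wins regardless of the real match count.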
I am going to rebase this on top of #6532.
@GiedriusS I updated the PR a little bit more. I still need to add more test cases, but I would really appreciate it if you could help review the algorithm first.
removeKeys []string
addAll     bool
name       string
matchers   []*labels.Matcher
Do we need to worry about the matchers in the merge function now?
For example, cpu_utilization{pod="a.+", pod!="aaabb"} will be converted into one postingGroup, but I don't see where we merge those matchers.
Line 2415 in 266a760
func (pg postingGroup) merge(other *postingGroup) *postingGroup {
Oh, OK.
Would it make more sense to add those matchers in the merge function? As it stands, the resulting merged posting group is incomplete.
Good catch. I don't know if there is a good way to merge matchers, as it requires additional sorting and probably a map to dedup.
In 27d04c8 I changed the method to only merge keys.
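As an illustration of why keys are easier to merge than matchers, the sketch below merges two key slices without a map, assuming both inputs are already sorted and free of duplicates. The helper name mergeSortedKeys is hypothetical, and how the PR's merge method treats addAll and removeKeys is not shown here.

```go
package main

// mergeSortedKeys merges two sorted, duplicate-free key slices into a single
// sorted, duplicate-free slice. It walks both inputs once, so no map and no
// extra sorting pass is needed. Hypothetical helper for illustration only.
func mergeSortedKeys(a, b []string) []string {
	out := make([]string, 0, len(a)+len(b))
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] < b[j]:
			out = append(out, a[i])
			i++
		case a[i] > b[j]:
			out = append(out, b[j])
			j++
		default:
			// Same key on both sides: keep one copy and advance both.
			out = append(out, a[i])
			i++
			j++
		}
	}
	// At most one of the two tails is non-empty at this point.
	out = append(out, a[i:]...)
	out = append(out, b[j:]...)
	return out
}
```

Matchers do not come with such an ordering, which is presumably why merging them would need the extra sort and dedup map mentioned above.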
Is this ready for a review?
@GiedriusS Yes, it is ready for review.
A few comments, still haven't finished digging into the algorithm
if r == indexheader.NotFoundRange {
	continue
}
pg.cardinality += (r.End - r.Start - 4) / 4
The -4 is because the postings length is encoded in the 4 bytes, right? Perhaps it is worth adding a comment here?
Added a comment about the 4 bytes
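The estimate discussed in this thread can be sketched as follows. The byteRange type and the size guard are stand-ins invented for the example (the reviewed code compares against indexheader.NotFoundRange instead), and the assumption that each posting takes 4 bytes is inferred from the division by 4 in the snippet.

```go
package main

// byteRange is a hypothetical stand-in for the postings byte range returned
// by the index header for one label name/value pair.
type byteRange struct {
	Start, End int64
}

const (
	// lengthPrefixBytes is the 4-byte length field mentioned in the review.
	lengthPrefixBytes = 4
	// bytesPerPosting assumes each posting entry is a fixed 4 bytes.
	bytesPerPosting = 4
)

// estimateCardinality sums the estimated number of postings across ranges
// using only the range sizes, without downloading any postings data.
func estimateCardinality(ranges []byteRange) int64 {
	var total int64
	for _, r := range ranges {
		size := r.End - r.Start
		// Assumption: ranges too small to hold the length prefix are
		// treated like a "not found" range and skipped.
		if size < lengthPrefixBytes {
			continue
		}
		total += (size - lengthPrefixBytes) / bytesPerPosting
	}
	return total
}
```

For example, a 44-byte range holds the 4-byte prefix plus 40 bytes of postings, giving an estimate of 10 postings.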
@MichaHoffmann I also enabled lazy expanded postings in the labels acceptance test.
Signed-off-by: Ben Ye <[email protected]>
Added 2 more metrics.
lgtm!
Hi @GiedriusS, maybe you wanna take another look before I merge it?
Great improvement, LGTM. Thank you for the comments in the code explaining what is happening
r,
postingGroups,
int64(r.block.estimatedMaxSeriesSize),
0.5, // TODO(yeya24): Expose this as a flag.
Perhaps in the future we could use some kind of heuristics to set this automatically 🤔
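For readers wondering what role the 0.5 plays, below is a rough sketch of one way a series match ratio could drive the choice of which posting groups to fetch eagerly and which to expand lazily. It is a simplified greedy rule written for illustration: the group type, the splitLazy function, and the stopping condition are assumptions and not necessarily what the PR implements.

```go
package main

import "sort"

// group is a hypothetical, reduced view of a posting group: a name and the
// estimated number of postings it contains.
type group struct {
	name        string
	cardinality int64
}

// bytesPerPosting assumes every posting occupies 4 bytes.
const bytesPerPosting = 4

// splitLazy sorts groups by cardinality and always fetches the smallest one.
// Each further group is fetched only while downloading its postings costs
// fewer bytes than the series bytes the intersection is expected to save,
// assuming every intersection keeps matchRatio of the remaining series.
func splitLazy(groups []group, seriesSize int64, matchRatio float64) (eager, lazy []group) {
	if len(groups) == 0 {
		return nil, nil
	}
	sorted := append([]group(nil), groups...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].cardinality < sorted[j].cardinality
	})
	// Series bytes expected if only the smallest group were fetched.
	seriesBytes := float64(sorted[0].cardinality * seriesSize)
	n := 1
	for ; n < len(sorted); n++ {
		saved := seriesBytes * (1 - matchRatio)
		cost := float64(sorted[n].cardinality * bytesPerPosting)
		if cost >= saved {
			break
		}
		seriesBytes *= matchRatio
	}
	return sorted[:n], sorted[n:]
}
```

With groups of 100, 200, and 1000000 postings, a series size of 1000 bytes, and a ratio of 0.5, this sketch fetches the two small groups and leaves the large one lazy. A lower ratio makes intersections look more valuable and so fetches more groups, which is why a fixed value is a candidate for a flag or a heuristic.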
…os-io#6465)
* optimize postings fetching by checking postings and series size
* address some review comments
* add acceptance test and fixed bug of skipping posting groups with add keys
* add lazy postings param to block series client
* switch to use block estimated max series size
* added two more metrics
Signed-off-by: Ben Ye <[email protected]>
Signed-off-by: Giedrius Statkevičius <[email protected]> * Add dialer_timeout field to HTTP TransportConfig (#6786) * set dialer timeout to 5s in NewRoundTripperFromConfig Signed-off-by: Walther Lee <[email protected]> * add dialer_timeout field to HTTP TransportConfig Signed-off-by: Walther Lee <[email protected]> --------- Signed-off-by: Walther Lee <[email protected]> Co-authored-by: Walther Lee <[email protected]> * api/blocks: fix race between get/set (#6791) Running tests with -race shows that there is a race between bapi.blocks() and bapi.SetLoaded/SetGlobal() because the latter is called continuously and asynchronously in a different thread. blocks() is called through the HTTP API. Since block info is immutable, it is enough to add a lock here to fix this problem. Signed-off-by: Giedrius Statkevičius <[email protected]> * Bucket reader: Initialize new query stats struct at each goroutine (#6787) * initialize new query stats struct at each goroutine Signed-off-by: Ben Ye <[email protected]> * remove comment Signed-off-by: Ben Ye <[email protected]> * address feedback Signed-off-by: Ben Ye <[email protected]> * fix lint Signed-off-by: Ben Ye <[email protected]> --------- Signed-off-by: Ben Ye <[email protected]> * use larger histogram bucket for thanos_bucket_store_series_result_series metric (#6792) Signed-off-by: Ben Ye <[email protected]> * api/query: create engines once (#6793) Fix a race where GetPrometheusEngine or GetThanosEngine is called twice at the same time from multiple HTTP requests. 
This fixes the race: ``` 10:29:50 querier-query: ================== 10:29:50 querier-query: WARNING: DATA RACE 10:29:50 querier-query: Write at 0x00c0005fa0f8 by goroutine 285: 10:29:50 querier-query: github.com/thanos-io/thanos/pkg/api/query.(*QueryEngineFactory).GetPrometheusEngine() 10:29:50 querier-query: /go/src/github.com/thanos-io/thanos/pkg/api/query/v1.go:105 +0x1f9 10:29:50 querier-query: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).parseEngineParam() 10:29:50 querier-query: /go/src/github.com/thanos-io/thanos/pkg/api/query/v1.go:325 +0x109 10:29:50 querier-query: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).query() 10:29:50 querier-query: /go/src/github.com/thanos-io/thanos/pkg/api/query/v1.go:626 +0x605 10:29:50 querier-query: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).query-fm() ... 10:29:50 querier-query: Previous read at 0x00c0005fa0f8 by goroutine 287: 10:29:50 querier-query: github.com/thanos-io/thanos/pkg/api/query.(*QueryEngineFactory).GetPrometheusEngine() 10:29:50 querier-query: /go/src/github.com/thanos-io/thanos/pkg/api/query/v1.go:101 +0x13d 10:29:50 querier-query: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).parseEngineParam() 10:29:50 querier-query: /go/src/github.com/thanos-io/thanos/pkg/api/query/v1.go:325 +0x109 10:29:50 querier-query: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).query() 10:29:50 querier-query: /go/src/github.com/thanos-io/thanos/pkg/api/query/v1.go:626 +0x605 10:29:50 querier-query: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).query-fm() ... ``` Signed-off-by: Giedrius Statkevičius <[email protected]> * store/proxy: fix label values span (#6795) Each tracing.StartSpan() writes a value into the given context so there's a race if we keep reusing the same context. Fix this by starting a new span in each goroutine. This also makes logical sense. 
Fixes the following race: ``` 15:21:13 querier-1: WARNING: DATA RACE 15:21:13 querier-1: Read at 0x00c0009c5050 by goroutine 328: 15:21:13 querier-1: context.(*valueCtx).Value() 15:21:13 querier-1: /usr/local/go/src/context/context.go:751 +0x76 15:21:13 querier-1: github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors/tracing.newClientSpanFromContext() 15:21:13 querier-1: /go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware/[email protected]/interceptors/tracing/client.go:87 +0x241 15:21:13 querier-1: github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors/tracing.(*opentracingClientReportable).ClientReporter() 15:21:13 querier-1: /go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware/[email protected]/interceptors/tracing/client.go:51 +0x195 15:21:13 querier-1: github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors/tracing.UnaryClientInterceptor.UnaryClientInterceptor.func1() 15:21:13 querier-1: /go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware/[email protected]/interceptors/client.go:19 +0x1a9 15:21:13 querier-1: github.com/thanos-io/thanos/pkg/extgrpc.StoreClientGRPCOpts.ChainUnaryClient.func4.1.1() 15:21:13 querier-1: /go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware/[email protected]/chain.go:74 +0x10a 15:21:13 querier-1: github.com/thanos-io/thanos/pkg/extgrpc.StoreClientGRPCOpts.(*ClientMetrics).UnaryClientInterceptor.func3() 15:21:13 querier-1: /go/pkg/mod/github.com/grpc-ecosystem/[email protected]/client_metrics.go:112 +0x126 15:21:13 querier-1: github.com/thanos-io/thanos/pkg/extgrpc.StoreClientGRPCOpts.ChainUnaryClient.func4.1.1() 15:21:13 querier-1: /go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware/[email protected]/chain.go:74 +0x10a 15:21:13 querier-1: github.com/thanos-io/thanos/pkg/extgrpc.StoreClientGRPCOpts.ChainUnaryClient.func4() 15:21:13 querier-1: /go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware/[email protected]/chain.go:83 +0x17b 15:21:13 querier-1: 
google.golang.org/grpc.(*ClientConn).Invoke() 15:21:13 querier-1: /go/pkg/mod/google.golang.org/[email protected]/call.go:35 +0x25d 15:21:13 querier-1: github.com/thanos-io/thanos/pkg/store/storepb.(*storeClient).LabelValues() 15:21:13 querier-1: /go/src/github.com/thanos-io/thanos/pkg/store/storepb/rpc.pb.go:1034 +0xe5 15:21:13 querier-1: github.com/thanos-io/thanos/pkg/query.(*endpointRef).LabelValues() 15:21:13 querier-1: <autogenerated>:1 +0xa1 15:21:13 querier-1: github.com/thanos-io/thanos/pkg/store.(*ProxyStore).LabelValues.func1() 15:21:13 querier-1: /go/src/github.com/thanos-io/thanos/pkg/store/proxy.go:586 +0x323 15:21:13 querier-1: golang.org/x/sync/errgroup.(*Group).Go.func1() 15:21:13 querier-1: /go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:75 +0x76 15:21:13 querier-1: Previous write at 0x00c0009c5050 by goroutine 325: 15:21:13 querier-1: context.WithValue() 15:21:13 querier-1: /usr/local/go/src/context/context.go:718 +0xce 15:21:13 querier-1: github.com/opentracing/opentracing-go.ContextWithSpan() 15:21:13 querier-1: /go/pkg/mod/github.com/opentracing/[email protected]/gocontext.go:17 +0xec 15:21:13 querier-1: github.com/thanos-io/thanos/pkg/tracing.StartSpan() 15:21:13 querier-1: /go/src/github.com/thanos-io/thanos/pkg/tracing/tracing.go:73 +0x238 15:21:13 querier-1: github.com/thanos-io/thanos/pkg/store.(*ProxyStore).LabelValues() 15:21:13 querier-1: /go/src/github.com/thanos-io/thanos/pkg/store/proxy.go:567 +0xb25 15:21:13 querier-1: github.com/thanos-io/thanos/pkg/query.(*querier).LabelValues() 15:21:13 querier-1: /go/src/github.com/thanos-io/thanos/pkg/query/querier.go:422 +0x3f5 15:21:13 querier-1: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).labelValues() 15:21:13 querier-1: /go/src/github.com/thanos-io/thanos/pkg/api/query/v1.go:1092 +0x17d1 15:21:13 querier-1: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).labelValues-fm() 15:21:13 querier-1: <autogenerated>:1 +0x45 15:21:13 querier-1: 
github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).Register.GetInstr.func1.1() ``` Signed-off-by: Giedrius Statkevičius <[email protected]> * compact: return metas copy from syncer (#6801) Return copy of the map because the compactor runs garbage collector concurrently that deletes entries from the original map. Fixes race: ``` 10:55:35 compact-working-dedup: ================== 10:55:35 compact-working-dedup: WARNING: DATA RACE 10:55:35 compact-working-dedup: Write at 0x00c001822150 by goroutine 220: 10:55:35 compact-working-dedup: runtime.mapdelete() 10:55:35 compact-working-dedup: /usr/local/go/src/runtime/map.go:696 +0x0 10:55:35 compact-working-dedup: github.com/thanos-io/thanos/pkg/compact.(*Syncer).GarbageCollect() 10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/pkg/compact/compact.go:201 +0x324 10:55:35 compact-working-dedup: github.com/thanos-io/thanos/pkg/compact.(*BucketCompactor).Compact() 10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/pkg/compact/compact.go:1422 +0x60f 10:55:35 compact-working-dedup: main.runCompact.func7() 10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/cmd/thanos/compact.go:426 +0xfa 10:55:35 compact-working-dedup: main.runCompact.func8.1() 10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/cmd/thanos/compact.go:481 +0x69 10:55:35 compact-working-dedup: github.com/thanos-io/thanos/pkg/runutil.Repeat() 10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/pkg/runutil/runutil.go:74 +0xc3 10:55:35 compact-working-dedup: main.runCompact.func8() 10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/cmd/thanos/compact.go:480 +0x224 10:55:35 compact-working-dedup: github.com/oklog/run.(*Group).Run.func1() 10:55:35 compact-working-dedup: /go/pkg/mod/github.com/oklog/[email protected]/group.go:38 +0x39 10:55:35 compact-working-dedup: github.com/oklog/run.(*Group).Run.func2() 10:55:35 compact-working-dedup: 
/go/pkg/mod/github.com/oklog/[email protected]/group.go:39 +0x4f 10:55:35 compact-working-dedup: Previous read at 0x00c001822150 by goroutine 223: 10:55:35 compact-working-dedup: runtime.mapiternext() 10:55:35 compact-working-dedup: /usr/local/go/src/runtime/map.go:867 +0x0 10:55:35 compact-working-dedup: github.com/thanos-io/thanos/pkg/compact.(*DefaultGrouper).Groups() 10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/pkg/compact/compact.go:289 +0xfd 10:55:35 compact-working-dedup: main.runCompact.func16.1() 10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/cmd/thanos/compact.go:626 +0x4ae 10:55:35 compact-working-dedup: github.com/thanos-io/thanos/pkg/runutil.Repeat() 10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/pkg/runutil/runutil.go:74 +0xc3 10:55:35 compact-working-dedup: main.runCompact.func16() 10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/cmd/thanos/compact.go:591 +0x3f9 10:55:35 compact-working-dedup: github.com/oklog/run.(*Group).Run.func1() 10:55:35 compact-working-dedup: /go/pkg/mod/github.com/oklog/[email protected]/group.go:38 +0x39 10:55:35 compact-working-dedup: github.com/oklog/run.(*Group).Run.func2() 10:55:35 compact-working-dedup: /go/pkg/mod/github.com/oklog/[email protected]/group.go:39 +0x4f 10:55:35 compact-working-dedup: Goroutine 220 (running) created at: 10:55:35 compact-working-dedup: github.com/oklog/run.(*Group).Run() 10:55:35 compact-working-dedup: /go/pkg/mod/github.com/oklog/[email protected]/group.go:37 +0xad 10:55:35 compact-working-dedup: main.main() 10:55:35 compact-working-dedup: /go/src/github.com/thanos-io/thanos/cmd/thanos/main.go:159 +0x2964 ``` Signed-off-by: Giedrius Statkevičius <[email protected]> * build(deps): bump golang.org/x/net from 0.14.0 to 0.17.0 (#6805) Bumps [golang.org/x/net](https://github.com/golang/net) from 0.14.0 to 0.17.0. 
- [Commits](https://github.com/golang/net/compare/v0.14.0...v0.17.0) --- updated-dependencies: - dependency-name: golang.org/x/net dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Updates busybox SHA (#6808) Signed-off-by: GitHub <[email protected]> Co-authored-by: fpetkovski <[email protected]> * fix head series limiter trigger (#6802) Signed-off-by: Thibault Mange <[email protected]> * preallocate series map size (#6807) Signed-off-by: Ben Ye <[email protected]> * Fix matchersToPostingGroups vals variable shadow bug (#6817) * fix matchersToPostingGroups vals variable shadow bug Signed-off-by: Ben Ye <[email protected]> * update changelog Signed-off-by: Ben Ye <[email protected]> --------- Signed-off-by: Ben Ye <[email protected]> * Store: fix prometheus store label values for matches on external labels (#6816) External Labels should also be tested for matches against the matchers. Signed-off-by: Michael Hoffmann <[email protected]> * optimize inmemory index cache WithLabelValues call (#6806) Signed-off-by: Ben Ye <[email protected]> * add keepalive to EndpointGroupGRPCOpts (#6810) Signed-off-by: Walther Lee <[email protected]> * Cut patch release `v0.32.5` (#6820) (#6822) * Build with Go 1.21 (#6615) * Build with Go 1.21 * Update tools --------- * update go alpine image to 3.18 (#6750) * build(deps): bump golang.org/x/net from 0.14.0 to 0.17.0 (#6805) Bumps [golang.org/x/net](https://github.com/golang/net) from 0.14.0 to 0.17.0. - [Commits](https://github.com/golang/net/compare/v0.14.0...v0.17.0) --- updated-dependencies: - dependency-name: golang.org/x/net dependency-type: direct:production ... 
* Updates busybox SHA (#6808) * Fix matchersToPostingGroups vals variable shadow bug (#6817) * fix matchersToPostingGroups vals variable shadow bug * update changelog --------- * fix head series limiter trigger (#6802) * Store: fix prometheus store label values for matches on external labels (#6816) External Labels should also be tested for matches against the matchers. * Cut patch release v0.32.5 * Revert "Fix matchersToPostingGroups vals variable shadow bug (#6817)" This reverts commit 4ed9bb0317122e9dc31c2548581972c27d4e2e33. --------- Signed-off-by: Saswata Mukherjee <[email protected]> Signed-off-by: Coleen Iona Quadros <[email protected]> Signed-off-by: dependabot[bot] <[email protected]> Signed-off-by: GitHub <[email protected]> Signed-off-by: Ben Ye <[email protected]> Signed-off-by: Thibault Mange <[email protected]> Signed-off-by: Michael Hoffmann <[email protected]> Co-authored-by: Coleen Iona Quadros <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: fpetkovski <[email protected]> Co-authored-by: Ben Ye <[email protected]> Co-authored-by: Thibault Mange <[email protected]> Co-authored-by: Michael Hoffmann <[email protected]> * go.mod: update promql-engine (#6823) Bring https://github.com/thanos-io/promql-engine/pull/320 into Thanos. Fixes https://github.com/thanos-io/promql-engine/issues/312. Signed-off-by: Giedrius Statkevičius <[email protected]> * receive/handler: fix label names/values race (#6825) * receive/handler: fix label names/values race There is a label name/value race in the current loop because `labelpb.ReAllocZLabelsStrings(&t.Labels, r.opts.Intern)` might be called which overwrites the original labels. At the same time, we might also be forwarding the same request through gRPC to other Receive nodes. 
Fixes the following race: <details> <summary>Trace of the race</summary> 10:53:51 receive-1: WARNING: DATA RACE 10:53:51 receive-1: Read at 0x00c001097b90 by goroutine 361: 10:53:51 receive-1: github.com/thanos-io/thanos/pkg/store/labelpb.(*ZLabel).Size() 10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/store/labelpb/label.go:273 +0x35 10:53:51 receive-1: github.com/thanos-io/thanos/pkg/store/storepb/prompb.(*TimeSeries).MarshalToSizedBuffer() 10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/store/storepb/prompb/types.pb.go:1499 +0x7c4 10:53:51 receive-1: github.com/thanos-io/thanos/pkg/store/storepb.(*WriteRequest).MarshalToSizedBuffer() 10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/store/storepb/rpc.pb.go:1318 +0x409 10:53:51 receive-1: github.com/thanos-io/thanos/pkg/store/storepb.(*WriteRequest).Marshal() 10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/store/storepb/rpc.pb.go:1286 +0x64 10:53:51 receive-1: google.golang.org/protobuf/internal/impl.legacyMarshal() 10:53:51 receive-1: /go/pkg/mod/google.golang.org/[email protected]/internal/impl/legacy_message.go:402 +0xb1 10:53:51 receive-1: google.golang.org/protobuf/proto.MarshalOptions.marshal() 10:53:51 receive-1: /go/pkg/mod/google.golang.org/[email protected]/proto/encode.go:166 +0x3a2 10:53:51 receive-1: google.golang.org/protobuf/proto.MarshalOptions.MarshalAppend() 10:53:51 receive-1: /go/pkg/mod/google.golang.org/[email protected]/proto/encode.go:125 +0x96 10:53:51 receive-1: github.com/golang/protobuf/proto.marshalAppend() 10:53:51 receive-1: /go/pkg/mod/github.com/golang/[email protected]/proto/wire.go:40 +0xce 10:53:51 receive-1: github.com/golang/protobuf/proto.Marshal() 10:53:51 receive-1: /go/pkg/mod/github.com/golang/[email protected]/proto/wire.go:23 +0x65 10:53:51 receive-1: google.golang.org/grpc/encoding/proto.codec.Marshal() 10:53:51 receive-1: /go/pkg/mod/google.golang.org/[email protected]/encoding/proto/proto.go:45 +0x66 10:53:51 receive-1: 
google.golang.org/grpc/encoding/proto.(*codec).Marshal() 10:53:51 receive-1: <autogenerated>:1 +0x53 10:53:51 receive-1: google.golang.org/grpc.encode() 10:53:51 receive-1: /go/pkg/mod/google.golang.org/[email protected]/rpc_util.go:594 +0x64 10:53:51 receive-1: google.golang.org/grpc.prepareMsg() 10:53:51 receive-1: /go/pkg/mod/google.golang.org/[email protected]/stream.go:1610 +0x1a8 10:53:51 receive-1: google.golang.org/grpc.(*clientStream).SendMsg() 10:53:51 receive-1: /go/pkg/mod/google.golang.org/[email protected]/stream.go:791 +0x284 10:53:51 receive-1: google.golang.org/grpc.invoke() 10:53:51 receive-1: /go/pkg/mod/google.golang.org/[email protected]/call.go:70 +0xf2 ... 10:53:51 receive-1: Previous write at 0x00c001097b90 by goroutine 357: 10:53:51 receive-1: github.com/thanos-io/thanos/pkg/store/labelpb.ReAllocZLabelsStrings() 10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/store/labelpb/label.go:69 +0x25e 10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.(*Writer).Write() 10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/receive/writer.go:144 +0x13e4 10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.(*Handler).fanoutForward.func2.1() 10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/receive/handler.go:672 +0x153 10:53:51 receive-1: github.com/thanos-io/thanos/pkg/tracing.DoInSpan() 10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/tracing/tracing.go:95 +0x125 10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.(*Handler).fanoutForward.func2() 10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/receive/handler.go:671 +0x1fd 10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.(*Handler).fanoutForward.func6() 10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/receive/handler.go:682 +0x61 10:53:51 receive-1: Goroutine 361 (running) created at: 10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.(*Handler).fanoutForward() 10:53:51 receive-1: 
/go/src/github.com/thanos-io/thanos/pkg/receive/handler.go:688 +0x9c7 10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.(*Handler).forward() 10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/receive/handler.go:612 +0x53a 10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.(*Handler).handleRequest() 10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/receive/handler.go:417 +0xca8 10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.(*Handler).receiveHTTP() 10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/receive/handler.go:539 +0x1d89 10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.(*Handler).receiveHTTP-fm() 10:53:51 receive-1: <autogenerated>:1 +0x51 10:53:51 receive-1: net/http.HandlerFunc.ServeHTTP() 10:53:51 receive-1: /usr/local/go/src/net/http/server.go:2136 +0x47 10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.NewHandler.RequestID.func2() 10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/server/http/middleware/request_id.go:40 +0x191 10:53:51 receive-1: github.com/thanos-io/thanos/pkg/receive.(*Handler).testReady-fm.(*Handler).testReady.func1() 10:53:51 receive-1: /go/src/github.com/thanos-io/thanos/pkg/receive/handler.go:263 +0x249 10:53:51 receive-1: net/http.HandlerFunc.ServeHTTP() 10:53:51 receive-1: /usr/local/go/src/net/http/server.go:2136 +0x47 10:53:51 receive-1: github.com/thanos-io/thanos/pkg/extprom/http.httpInstrumentationHandler.func1() </details> Signed-off-by: Giedrius Statkevičius <[email protected]> * receive/handler: remove break Signed-off-by: Giedrius Statkevičius <[email protected]> --------- Signed-off-by: Giedrius Statkevičius <[email protected]> * fix devcontainer image (#6828) Signed-off-by: Ben Ye <[email protected]> * Block: Expose fetcher and syncer metrics to be provided by depending projects (#6827) * Expose fetcher and syncer metrics to be provided by depending projects. 
Signed-off-by: Alex Le <[email protected]> * Updated CHANGELOG Signed-off-by: Alex Le <[email protected]> * Remove CHANGELOG change Signed-off-by: Alex Le <[email protected]> --------- Signed-off-by: Alex Le <[email protected]> * receive: fix limits reloading race (#6826) We are re-reading the limits configuration periodically and also reading it at the same time hence we need a lock around it. Thus, let's make that struct member private and add a getter that returns the limiter under a mutex lock. Fixes: ``` 17:14:45 receive-i3: WARNING: DATA RACE 17:14:45 receive-i3: Read at 0x00c00090aec0 by goroutine 131: 17:14:45 receive-i3: github.com/thanos-io/thanos/pkg/receive.(*headSeriesLimit).QueryMetaMonitoring() 17:14:45 receive-i3: /go/src/github.com/thanos-io/thanos/pkg/receive/head_series_limiter.go:109 +0x2fb 17:14:45 receive-i3: main.runReceive.func9.1() 17:14:45 receive-i3: /go/src/github.com/thanos-io/thanos/cmd/thanos/receive.go:402 +0x9b 17:14:45 receive-i3: github.com/thanos-io/thanos/pkg/runutil.Repeat() 17:14:45 receive-i3: /go/src/github.com/thanos-io/thanos/pkg/runutil/runutil.go:74 +0xc3 17:14:45 receive-i3: Previous write at 0x00c00090aec0 by goroutine 138: 17:14:45 receive-i3: github.com/thanos-io/thanos/pkg/receive.NewHeadSeriesLimit() 17:14:45 receive-i3: /go/src/github.com/thanos-io/thanos/pkg/receive/head_series_limiter.go:41 +0x316 17:14:45 receive-i3: github.com/thanos-io/thanos/pkg/receive.(*Limiter).loadConfig() 17:14:45 receive-i3: /go/src/github.com/thanos-io/thanos/pkg/receive/limiter.go:168 +0xd0d 17:14:45 receive-i3: github.com/thanos-io/thanos/pkg/receive.(*Limiter).StartConfigReloader.func1() 17:14:45 receive-i3: /go/src/github.com/thanos-io/thanos/pkg/receive/limiter.go:111 +0x207 17:14:45 receive-i3: github.com/thanos-io/thanos/pkg/extkingpin.(*pollingEngine).start.func1() ``` Signed-off-by: Giedrius Statkevičius <[email protected]> * query: fix hints race (#6831) Fix the following race: ``` 12:36:39 querier-1: ================== 
12:36:39 querier-1: WARNING: DATA RACE 12:36:39 querier-1: Read at 0x00c000159540 by goroutine 341: 12:36:39 querier-1: reflect.Value.String() 12:36:39 querier-1: /usr/local/go/src/reflect/value.go:2589 +0xd76 12:36:39 querier-1: github.com/gogo/protobuf/proto.(*TextMarshaler).writeAny() 12:36:39 querier-1: /go/pkg/mod/github.com/gogo/[email protected]/proto/text.go:563 +0xd86 12:36:39 querier-1: github.com/gogo/protobuf/proto.(*TextMarshaler).writeStruct() 12:36:39 querier-1: /go/pkg/mod/github.com/gogo/[email protected]/proto/text.go:325 +0x19db 12:36:39 querier-1: github.com/gogo/protobuf/proto.(*TextMarshaler).writeAny() 12:36:39 querier-1: /go/pkg/mod/github.com/gogo/[email protected]/proto/text.go:606 +0xb2a 12:36:39 querier-1: github.com/gogo/protobuf/proto.(*TextMarshaler).writeStruct() 12:36:39 querier-1: /go/pkg/mod/github.com/gogo/[email protected]/proto/text.go:453 +0xdd6 12:36:39 querier-1: github.com/gogo/protobuf/proto.(*TextMarshaler).writeAny() 12:36:39 querier-1: /go/pkg/mod/github.com/gogo/[email protected]/proto/text.go:606 +0xb2a 12:36:39 querier-1: github.com/gogo/protobuf/proto.(*TextMarshaler).writeStruct() 12:36:39 querier-1: /go/pkg/mod/github.com/gogo/[email protected]/proto/text.go:453 +0xdd6 12:36:39 querier-1: github.com/gogo/protobuf/proto.(*TextMarshaler).Marshal() 12:36:39 querier-1: /go/pkg/mod/github.com/gogo/[email protected]/proto/text.go:896 +0x5c8 12:36:39 querier-1: github.com/gogo/protobuf/proto.(*TextMarshaler).Text() 12:36:39 querier-1: /go/pkg/mod/github.com/gogo/[email protected]/proto/text.go:908 +0x92 12:36:39 querier-1: github.com/gogo/protobuf/proto.CompactTextString() 12:36:39 querier-1: /go/pkg/mod/github.com/gogo/[email protected]/proto/text.go:930 +0x8e 12:36:39 querier-1: github.com/thanos-io/thanos/pkg/store/storepb.(*SeriesRequest).String() 12:36:39 querier-1: /go/src/github.com/thanos-io/thanos/pkg/store/storepb/rpc.pb.go:316 +0x7b 12:36:39 querier-1: 
github.com/thanos-io/thanos/pkg/store.(*ProxyStore).Series() 12:36:39 querier-1: /go/src/github.com/thanos-io/thanos/pkg/store/proxy.go:277 +0x8f 12:36:39 querier-1: github.com/thanos-io/thanos/pkg/query.(*querier).selectFn() 12:36:39 querier-1: Previous write at 0x00c000159540 by goroutine 339: 12:36:39 querier-1: golang.org/x/exp/slices.insertionSortOrdered[go.shape.string]() 12:36:39 querier-1: /go/pkg/mod/golang.org/x/[email protected]/slices/zsortordered.go:15 +0x357 12:36:39 querier-1: golang.org/x/exp/slices.pdqsortOrdered[go.shape.string]() 12:36:39 querier-1: /go/pkg/mod/golang.org/x/[email protected]/slices/zsortordered.go:75 +0x72f 12:36:39 querier-1: golang.org/x/exp/slices.Sort[go.shape.[]string,go.shape.string]() 12:36:39 querier-1: /go/pkg/mod/golang.org/x/[email protected]/slices/sort.go:19 +0x45a 12:36:39 querier-1: github.com/prometheus/prometheus/promql.(*evaluator).eval() 12:36:39 querier-1: /go/pkg/mod/github.com/prometheus/[email protected]/promql/engine.go:1352 +0x432 12:36:39 querier-1: github.com/prometheus/prometheus/promql.(*evaluator).Eval() 12:36:39 querier-1: /go/pkg/mod/github.com/prometheus/[email protected]/promql/engine.go:1052 +0x105 12:36:39 querier-1: github.com/prometheus/prometheus/promql.(*Engine).execEvalStmt() 12:36:39 querier-1: /go/pkg/mod/github.com/prometheus/[email protected]/promql/engine.go:708 +0xb15 12:36:39 querier-1: github.com/prometheus/prometheus/promql.(*Engine).exec() 12:36:39 querier-1: /go/pkg/mod/github.com/prometheus/[email protected]/promql/engine.go:646 +0x4c8 12:36:39 querier-1: github.com/prometheus/prometheus/promql.(*query).Exec() 12:36:39 querier-1: /go/pkg/mod/github.com/prometheus/[email protected]/promql/engine.go:235 +0x232 12:36:39 querier-1: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).query() 12:36:39 querier-1: /go/src/github.com/thanos-io/thanos/pkg/api/query/v1.go:681 +0xdfd 12:36:39 querier-1: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).query-fm() 12:36:39 querier-1: 
<autogenerated>:1 +0x45 12:36:39 querier-1: github.com/thanos-io/thanos/pkg/api/query.(*QueryAPI).Register.GetInstr.func1.1() 12:36:39 querier-1: /go/src/github.com/thanos-io/thanos/pkg/api/api.go:212 +0x62 12:36:39 querier-1: net/http.HandlerFunc.ServeHTTP() 12:36:39 querier-1: /usr/local/go/src/net/http/server.go:2136 +0x47 12:36:39 querier-1: github.com/thanos-io/thanos/pkg/logging.(*HTTPServerMiddleware).HTTPMiddleware.func1() ``` Problem is that the same slice is sorted in the PromQL engine whereas the same hints slice could still be used in other Select() calls where String() is called and then it reads those hints. Signed-off-by: Giedrius Statkevičius <[email protected]> * Adding Grupo Olx as user (#6832) * Adding Grupo Olx as user Signed-off-by: Nelson Almeida <[email protected]> * Adding Grupo OLX logo Signed-off-by: Nelson Almeida <[email protected]> --------- Signed-off-by: Nelson Almeida <[email protected]> * Query: Add tenant label to exported metrics (#6794) * Receive: Add default tenant to HTTP metrics Previously, if the tenant header was empty/not supplied, the exported metrics would have an empty string as tenant. With this commit we instead use the default tenant as can be configured with: `--receive.default-tenant-id`. Signed-off-by: Jacob Baungard Hansen <[email protected]> * Query: Add tenant label to exported metrics With this commit we now add the tenant label to relevant metrics exported by the query component. This includes the HTTP metrics handled by the InstrumentationMiddleware and the query latency metrics. Signed-off-by: Jacob Baungard Hansen <[email protected]> --------- Signed-off-by: Jacob Baungard Hansen <[email protected]> * Nit: allocate slice capacity correctly during intersection (#6819) * Fix: Removes Deprecated ioutil (#6834) * Fix: Removes Deprecated ioutil In Go, io/ioutil has been recently deprecated in favor of the drop in replacements "io" and "os". 
With the exception of the generated code in the file marked "DO NOT EDIT", this commit addresses those instances of ioutil with the respective function replacements. Happy Hacktoberfest! Thank you for taking a moment to review my PR! Signed-off-by: donuts-are-good <[email protected]> * Adds Changelog entry Completing the request for a changelog entry. Signed-off-by: donuts-are-good <[email protected]> * Removes Changelog Entry This commit removes the ioutil changes in this PR, as they are not user-facing issues Signed-off-by: donuts-are-good <[email protected]> --------- Signed-off-by: donuts-are-good <[email protected]> * vertically shard queries by le if no histogram_quantile function (#6809) Signed-off-by: Ben Ye <[email protected]> * Expose more overridable metrics from fetcher and default grouper (#6836) * Expose more overridable metrics from fetcher and default grouper Signed-off-by: Alex Le <[email protected]> * fix test Signed-off-by: Alex Le <[email protected]> * rename new functions Signed-off-by: …
Changes
Fixes #6416
The idea behind this optimization is to fetch less data where possible. Currently we always fetch all postings, then intersect them, and finally fetch series, chunks, etc.
If a matcher has a lot of postings, we download all of them. However, this work might be unnecessary, because the final number of series matched is bounded by the highest-selectivity matcher in the query. It might be more efficient to download only the series and filter them, rather than downloading all postings.
So what we could do is start from the posting with the minimum cardinality (highest selectivity), download its series, and then apply the remaining matchers to those series.
If

`4 * minPostingCardinality + estimatesSeriesSize * minPostingCardinality < totalPostingCardinality * 4 + estimatesSeriesSize * finalSeriesMatched`

then we know it is cheaper to fetch only the min-cardinality postings and their series. We can get `minPostingCardinality` and `totalPostingCardinality` by checking the posting offset table. We use the max series size added in the block `meta.json` as `estimatesSeriesSize`. We don't have a way to know `finalSeriesMatched` in advance, so we assume it is 0, which is the worst case for the min-postings plan. That gives us a basic cost model to decide which way is better.
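The comparison above can be written as a small Go sketch. This is only an illustration of the inequality, not the code in this PR: the `costModel` type, its method names, and the `postingEntrySize` constant are hypothetical, and the factor 4 is assumed to be the size in bytes of one posting entry.

```go
package main

import "fmt"

// postingEntrySize is the factor 4 from the inequality, assumed here to be
// the size in bytes of a single posting entry.
const postingEntrySize = 4

// costModel holds the inputs of the inequality. The type and its methods are
// hypothetical names used only for this illustration.
type costModel struct {
	minPostingCardinality   int64 // entries in the most selective posting list
	totalPostingCardinality int64 // entries summed over all posting lists
	estimatesSeriesSize     int64 // estimated bytes per series (max series size)
}

// minPostingsPlanCost estimates the bytes fetched when only the
// min-cardinality postings are downloaded and every series they reference
// is fetched and filtered afterwards.
func (c costModel) minPostingsPlanCost() int64 {
	return postingEntrySize*c.minPostingCardinality + c.estimatesSeriesSize*c.minPostingCardinality
}

// allPostingsPlanCost estimates the bytes fetched when all postings are
// downloaded and intersected, and only the matched series are fetched.
func (c costModel) allPostingsPlanCost(finalSeriesMatched int64) int64 {
	return postingEntrySize*c.totalPostingCardinality + c.estimatesSeriesSize*finalSeriesMatched
}

// preferMinPostings applies the inequality with finalSeriesMatched assumed
// to be 0, since the real value is unknown before the query runs.
func (c costModel) preferMinPostings() bool {
	return c.minPostingsPlanCost() < c.allPostingsPlanCost(0)
}

func main() {
	// One very selective matcher next to a very broad one.
	selective := costModel{minPostingCardinality: 10, totalPostingCardinality: 1000000, estimatesSeriesSize: 512}
	fmt.Println(selective.preferMinPostings())

	// Matchers of similar size: fetching every series would cost more.
	similar := costModel{minPostingCardinality: 100000, totalPostingCardinality: 120000, estimatesSeriesSize: 512}
	fmt.Println(similar.preferMinPostings())
}
```

With the example numbers, the first case compares 5,160 bytes against 4,000,000 bytes, so the min-postings plan wins; the second compares 51,600,000 bytes against 480,000 bytes, so fetching all postings wins.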
In the actual implementation, I used a similar idea to implement an algorithm that finds the best way to download postings and series based on heuristics.
Verification
Need to add tests and benchmark performance.