tsdb: Document that labels.Labels should always be sorted #5880
Comments
Comparing label sets is rather expensive considering how often we do it. Without a guaranteed order, it generally requires either a nested loop, which is O(n^2), or building a hashmap of the first label set and then comparing the second against it (which allocates, and that is even worse).
The system invariant is that label sets are always sorted. The must always be created through
Lists of labels (or lists of series) are also always sorted when retrieved from the Querier. That's the invariant needed for merging of query results from multiple blocks to work. Does that make sense?
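To make the cost argument concrete, here is a minimal sketch of why sorted label sets are cheap to compare. The `Label`, `Labels`, `NewSorted`, and `EqualSorted` names are hypothetical stand-ins written for this illustration, not the actual Prometheus `labels` package API. Assuming both sets are sorted by name, equality reduces to one linear pass with no allocations.

```go
package main

import (
	"fmt"
	"sort"
)

// Label is a hypothetical name/value pair (assumption for illustration).
type Label struct {
	Name, Value string
}

// Labels is a hypothetical label set that is expected to be sorted by name.
type Labels []Label

// NewSorted copies the given labels and sorts the copy by label name,
// so every set built through it upholds the sorted invariant.
func NewSorted(ls ...Label) Labels {
	out := make(Labels, len(ls))
	copy(out, ls)
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

// EqualSorted compares two label sets position by position.
// It is only correct when both inputs are sorted by name.
func EqualSorted(a, b Labels) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func main() {
	a := NewSorted(Label{"1", "2"}, Label{"2", "1"})
	b := NewSorted(Label{"2", "1"}, Label{"1", "2"})
	fmt.Println(EqualSorted(a, b))
}
```

Because the constructor is the only place that sorts, the comparison itself never needs a nested loop or a temporary map.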
Yea, should be documented at the outer interfaces. The
@gouthamve is this still relevant? Did you want to document it somewhere in a markdown file, or is this enough: https://github.com/prometheus/prometheus/blob/master/tsdb/labels/labels.go#L35-L37
Currently {1="2",2="1"} and {2="1",1="2"} are treated as different series. This is true for headBlock, and I could not find out whether we sort and merge series while persisting. If this is true, it spreads the same series across different locations. While this may look like it increases our write throughput, we never append to the same series concurrently. It is a definite hit to query performance, as we need to read from different places for the same series.
I am pretty sure this is intentional, but I am not able to understand the rationale behind it.
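The behavior described above can be reproduced with a small sketch. It assumes, purely for illustration, that a series is identified by a key built from its labels in the order given; `Label`, `seriesKey`, and `sortedCopy` are hypothetical helpers, not how headBlock actually identifies series. Under that assumption, the two orderings produce different keys until the labels are sorted first.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Label is a hypothetical name/value pair (assumption for illustration).
type Label struct {
	Name, Value string
}

// seriesKey builds an identity string from the labels in the order given.
// It is a simplification: it does not escape separators inside values.
func seriesKey(ls []Label) string {
	var b strings.Builder
	for _, l := range ls {
		b.WriteString(l.Name)
		b.WriteByte('=')
		b.WriteString(l.Value)
		b.WriteByte(',')
	}
	return b.String()
}

// sortedCopy returns a copy of ls sorted by label name.
func sortedCopy(ls []Label) []Label {
	out := make([]Label, len(ls))
	copy(out, ls)
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

func main() {
	a := []Label{{"1", "2"}, {"2", "1"}}
	b := []Label{{"2", "1"}, {"1", "2"}}

	// Unsorted: the same logical series yields two different keys.
	fmt.Println(seriesKey(a) == seriesKey(b))

	// Sorted first: both orderings collapse to one key.
	fmt.Println(seriesKey(sortedCopy(a)) == seriesKey(sortedCopy(b)))
}
```

Sorting once at creation time is what lets every later lookup treat the two orderings as the same series.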