Put metric samples in time series (immutable tag sets) on creation #1831
Comments
#1832 (or something like it) is a prerequisite for the current issue.
As an addition, I was also thinking of a simple ordered …

I am of the opinion that the most bang for our buck will be a complete refactoring (through many smaller ones, probably including some of the above) of how metrics are handled. Currently we emit metrics (generate objects that move through the system) even if those objects will never be processed, for example because there is no output. If we can have a system where metrics are not emitted for no reason, and where we know when they are processed, we can also reuse them (and from my profiling, the GC is having problems with those as well). We will also probably want to aggregate metrics before they reach any output, effectively cutting their lifetime and probably removing the need for some of the other optimizations, if we are just traversing the graph of metrics in the "aggregator" to reach the bucket where they will be aggregated. Obviously the aggregator will then emit "new" metrics with "new" tag sets, but arguably there will be a much smaller set of those. Doing this work always, even when there isn't an output, seems wasteful, but we kind of have to, as even now we always aggregate metrics https://github.com/loadimpact/k6/blob/46d53d99deba98add23079df93579408fa7ea094/core/engine.go#L405-L428 that will be shown in the summary output or are needed for thresholds. This, though, IMO depends heavily on #1832. So it might be worth doing that first and seeing how the linked code behaves then, especially if we change it to always create submetrics for new tag sets.
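The comment above argues for aggregating samples as soon as they are ingested, so that individual sample objects have the shortest possible lifetime (and can be reused) instead of living until an output consumes them. A minimal sketch of that idea, with hypothetical type names rather than the actual k6 API:

```go
package main

import "fmt"

// Sample is a simplified stand-in for a k6 metric sample
// (hypothetical shape, not the real k6 type).
type Sample struct {
	Metric string
	Value  float64
}

// TrendSink keeps only a running aggregate, so the Sample objects
// themselves can be dropped (or reused) right after ingestion
// instead of surviving until an output flushes them.
type TrendSink struct {
	Count    int
	Sum      float64
	Min, Max float64
}

func (s *TrendSink) Add(v float64) {
	if s.Count == 0 || v < s.Min {
		s.Min = v
	}
	if s.Count == 0 || v > s.Max {
		s.Max = v
	}
	s.Count++
	s.Sum += v
}

func main() {
	sinks := map[string]*TrendSink{}
	for _, smp := range []Sample{{"http_req_duration", 120}, {"http_req_duration", 80}} {
		sink, ok := sinks[smp.Metric]
		if !ok {
			sink = &TrendSink{}
			sinks[smp.Metric] = sink
		}
		sink.Add(smp.Value)
		// smp is no longer referenced here, so it never escapes this loop.
	}
	fmt.Println(sinks["http_req_duration"].Count, sinks["http_req_duration"].Sum)
	// prints: 2 200
}
```

The point of the sketch is only the lifetime argument: once aggregation happens at ingestion time, nothing downstream needs to hold on to the raw samples.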
Not sure about time buckets being part of the tag set. They would just add cardinality to whatever we have, needlessly. And time buckets depend on the actual "user" of the metrics - the cloud output will need very different buckets than something like #1136, and the summary and normal thresholds don't need any time buckets at all. So unless I am missing something, time and tag sets seem like orthogonal concepts that shouldn't be mixed. They can easily be stacked, as they currently are in the cloud aggregation, only where necessary. 👍 for string interning. Having a graph with immutable nodes would achieve that, but so would having a global set of …
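The string interning mentioned above can be sketched as a global set where the first occurrence of each string becomes the canonical copy. This is purely illustrative, not k6 code; `sync.Map.LoadOrStore` keeps the fast path lock-free for strings that are already known, which is the common case for tag keys and values in a long-running test:

```go
package main

import (
	"fmt"
	"sync"
)

// internTable is a global interned-string set (illustrative sketch).
var internTable sync.Map

// Intern returns the canonical copy of s, storing s itself if this is
// its first occurrence. Later equal strings all share one entry, so
// duplicate tag values stop piling up on the heap.
func Intern(s string) string {
	canon, _ := internTable.LoadOrStore(s, s)
	return canon.(string)
}

func main() {
	a := Intern("status=200")
	b := Intern("status=" + fmt.Sprint(200)) // built at runtime, interned to the same entry
	fmt.Println(a == b)                      // prints: true

	n := 0
	internTable.Range(func(_, _ any) bool { n++; return true })
	fmt.Println(n) // prints: 1 - only one canonical entry exists
}
```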
Stated another way, this issue is somewhat about assigning every metric sample to a particular time series at the moment of its measurement. Relevant links:
Just adding some observations from digging into the Prometheus code. I confirm that there is hashing of labels involved in Prometheus processing: as far as I saw, those hashes are involved in some tricky staleness logic, and for caching Prometheus is primarily relying on a … It seems they bucket hashmaps by ID and keep multiple locks after all, and there are two mechanisms for lookup:
But I suspect that IDs are considered more reliable than labels, as I haven't seen … This logic appears to be somewhat tied to how the TSDB is handled, but then, its purpose is to move samples from memory to storage. Still, this can be classified as "clever hashing of the tag set values", with the primary difference being that k6 doesn't have such IDs now. OTOH, metric names are unique in k6. And yes, this definitely comes up in #1761. It doesn't make sense to add much complex logic there at this stage, but it doesn't currently use an efficient way to process labels either.
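The Prometheus-style label hashing described above can be sketched as follows: sort the label pairs, feed them through a fast non-cryptographic hash, and use the resulting `uint64` as a cheap series identity for map lookups. Names here are illustrative; Prometheus's real implementation lives in its labels package and also has to handle hash collisions, which this sketch ignores:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// labelsHash produces an order-independent uint64 identity for a label
// set by hashing the pairs in sorted key order.
func labelsHash(labels map[string]string) uint64 {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	h := fnv.New64a()
	for _, k := range keys {
		h.Write([]byte(k))
		h.Write([]byte{0xff}) // separator, so "ab"+"c" != "a"+"bc"
		h.Write([]byte(labels[k]))
		h.Write([]byte{0xff})
	}
	return h.Sum64()
}

func main() {
	a := labelsHash(map[string]string{"method": "GET", "status": "200"})
	b := labelsHash(map[string]string{"status": "200", "method": "GET"})
	fmt.Println(a == b) // prints: true - insertion order doesn't matter
}
```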
I explored this issue a bit (so many things to take into consideration 😱):
@mstoykov from executing a basic escape analysis, it seems they both rely on the required memory size, and there isn't per-type dedicated memory management, but for sure this would require a more complex investigation. If you have a reference for it, it would be helpful. So, from my understanding of the previous comments, the eventual PR should introduce 3 different types:
@na-- yes, I don’t think we can have something faster than …
@codebien As discussed in private, while implementing support for submetrics in thresholds validation, I had somewhat of an intuition: wouldn't it simplify the …? I believe it could simplify metrics/submetrics manipulation throughout the codebase. No idea if it's a good idea, or what impact it would have, though 🤔
Currently, k6 builds the set of tags for every metric measurement it emits in a very ad-hoc manner, and as a `map[string]string`. For example, building the tags set for the HTTP metrics happens here and here. It is a gradual accretion of key-value pairs, based on all of these things:

- the `tags` option
- the `group` tag from the `lib.State`, set there by `group()`
- the `tags` parameter for the specific request `Params`
- the `systemTags` global option

This is not too terrible and it mostly works, but it's not very efficient either... One big issue is that all of these string maps put a lot of GC pressure on k6 - we generate unique tag maps for every metric sample, and discard them as soon as we've crunched the data for the end-of-test summary and thresholds in the `Engine` and/or we've sent it to whatever external output is configured. Another performance problem is that we can't efficiently aggregate the metrics back, since that involves heavy string comparisons, both in the outputs and for any sub-metrics we have.

And finally, we don't really know the total number of unique metric tag sets we have in a test run. Load tests should generally have a small number of them, for easier data analysis - that's why we have things like the `name` tag and URL grouping. It would be very good UX if k6 could warn users when they exceed a certain number of unique tag sets in their script, since that is almost certainly a mistake.

To solve all of these issues, I think we can probably use some graph-based data structure to keep track of the various tag sets that exist. The process of building the tag set for a specific metric sample would be walking through the graph and adding nodes when new tag keys or values are observed that weren't present in the graph. At the end of that tag accretion process, instead of a string map, the metric sample would contain a pointer to an immutable node from the graph. And every other metric sample with the same tags would have the exact same pointer. So, comparing metrics (for aggregation in outputs or sub-metric partitioning) should become as easy as comparing the pointers of their tag set objects.
The exact data structure we want to use and how we'll ensure thread-safety (given that tag sets can be generated concurrently by multiple VUs) is something that still needs to be determined... Some sort of an STM approach seems better than adding locks to every graph node, but I'm not sure how realistic that would be for Go... There seem to exist some libraries for it (e.g. https://github.com/anacrolix/stm), but it needs a lot of further evaluation.
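One pragmatic middle ground between STM and per-node locks, sketched below purely as an assumption of how this could look (not a proposal for the final design): a single `RWMutex` around a flat registry keyed by the serialized tag set. Reads, the hot path once a test has warmed up, only take the read lock; the write lock is needed only when a brand-new tag set appears:

```go
package main

import (
	"fmt"
	"sync"
)

// TagSet is immutable once registered, so it is safe to share across VUs.
type TagSet struct{ serialized string }

// Registry interns tag sets under one coarse RWMutex.
type Registry struct {
	mu   sync.RWMutex
	sets map[string]*TagSet
}

func NewRegistry() *Registry { return &Registry{sets: map[string]*TagSet{}} }

// Get returns the canonical *TagSet for a serialized tag set,
// creating it on first sight.
func (r *Registry) Get(serialized string) *TagSet {
	r.mu.RLock()
	ts, ok := r.sets[serialized]
	r.mu.RUnlock()
	if ok {
		return ts
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	if ts, ok := r.sets[serialized]; ok { // another VU may have won the race
		return ts
	}
	ts = &TagSet{serialized: serialized}
	r.sets[serialized] = ts
	return ts
}

func main() {
	r := NewRegistry()
	var wg sync.WaitGroup
	results := make([]*TagSet, 4)
	for i := range results { // concurrent lookups, as from multiple VUs
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = r.Get("group=login,status=200")
		}(i)
	}
	wg.Wait()
	fmt.Println(results[0] == results[1] && results[1] == results[2] && results[2] == results[3])
	// prints: true
}
```

Whether a single coarse lock scales to many VUs is exactly the open question; it is just simpler to reason about than STM in Go.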
Also, it's not certain how much we can optimize the `Contains()` check that decides if a tag set is a subset of another tag set, used for sub-metric updates. It won't be slower, given that the graph nodes will probably contain the full tag sets, but it might not be possible to make it faster, if the alternative is a BFS of the graph. This depends solely on the data structure and algorithms we pick.

Alternatives to graphs should also be explored. We might be able to achieve most of the same benefits by using clever hashing of the tag set values. Or even by having a global set with values being the concatenated `key=value` pairs, so that each metric sample's tag set is a subset of the global one. For sure, before we start anything here, we should see how other projects (e.g. Prometheus) deal with similar issues in an efficient way...

Finally, it's not clear if we should try to enforce an upper limit on the number of unique tag sets. Personally, I think not - we should emit a very stern warning if some number is exceeded, say 1000 unique tag set combinations, but not try anything more fancy than that (like falling back to string-map-based tag sets).