
Put metric samples in time series (immutable tag sets) on creation #1831

Closed · na-- opened this issue on Feb 1, 2021 · 7 comments · Fixed by #2594

@na-- (Member) commented Feb 1, 2021

Currently, k6 builds the set of tags for every metric measurement it emits in a very ad-hoc manner, as a map[string]string. For example, building the tag set for the HTTP metrics happens here and here. It is a gradual accretion of key-value pairs from a number of different sources.

This is not too terrible and it mostly works, but it's not very efficient either... One big issue is that all of these string maps put a lot of GC pressure on k6 - we generate unique tag maps for every metric sample and discard them as soon as we've crunched the data for the end-of-test summary and thresholds in the Engine and/or sent them to whatever external output is configured. Another performance problem is that we can't efficiently aggregate the metrics afterwards, since that involves heavy string comparisons, both in the outputs and for any sub-metrics we have.

And finally, we don't really know the total number of unique metric tag sets we have in a test run. Load tests should generally have a small number of those, for easier data analysis; that's why we have things like the name tag and URL grouping. It would be very good UX if k6 could warn users when they exceed a certain number of unique tag sets in their script, since that almost certainly is a mistake.

To solve all of these issues, I think we can probably use some graph-based data structure to keep track of the various tag-sets that exist. The process of building the tag set for a specific metric sample would be walking through the graph and adding nodes when new tag keys or values are observed that weren't present in the graph. At the end of that tag accretion process, instead of a string map, the metric sample would contain a pointer to an immutable node from the graph. And every other metric sample with the same tags will have the exact same pointer. So, comparing metrics (for aggregation in outputs or sub-metric partitioning) should become as easy as comparing the pointers of their tag set objects.
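To make that idea a bit more concrete, here is a minimal sketch of one possible shape for such a structure. All names (TagSet, With, root) are illustrative rather than an actual k6 API, and it uses a plain per-node mutex instead of the STM approach discussed just below:

```go
package metrics

import "sync"

// TagSet is an immutable node in the tag-set graph: the key/value pairs on
// the path from the root to a node form a sample's full tag set, so the
// node's pointer can stand in for the whole set.
type TagSet struct {
	key, value string
	parent     *TagSet

	mu       sync.Mutex
	children map[[2]string]*TagSet // lazily grown cache of child nodes
}

// root is the shared empty tag set that every lookup starts from.
var root = &TagSet{}

// With returns the node representing the current set plus one more tag,
// creating it on first use. Pairs must be added in a canonical order
// (e.g. sorted by key), otherwise the same logical set built in different
// orders would end up as different nodes.
func (ts *TagSet) With(key, value string) *TagSet {
	ts.mu.Lock()
	defer ts.mu.Unlock()
	if ts.children == nil {
		ts.children = make(map[[2]string]*TagSet)
	}
	k := [2]string{key, value}
	if child, ok := ts.children[k]; ok {
		return child
	}
	child := &TagSet{key: key, value: value, parent: ts}
	ts.children[k] = child
	return child
}
```

With something like this, root.With("method", "GET").With("status", "200") called from two different VUs returns the exact same pointer, so comparing tag sets for aggregation or sub-metric matching becomes a single pointer comparison.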

The exact data structure we want to use and how we'll ensure thread-safety (given that tag sets can be generated concurrently by multiple VUs) still need to be determined... Some sort of STM (software transactional memory) approach seems better than adding locks to every graph node, but I'm not sure how realistic that would be for Go... There seem to be some libraries for it (e.g. https://github.com/anacrolix/stm), but this needs a lot of further evaluation.

Also, it's not certain how much we can optimize the Contains() check (whether one tag set is a subset of another), which is used for sub-metric updates. It won't be slower, given that the graph nodes will probably contain the full tag sets, but it might not be possible to make it faster if the alternative is a BFS of the graph. This depends entirely on the data structure and algorithms we pick.
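For reference, if the nodes do carry the full, sorted tag sets, the subset check itself is just a linear merge pass. A hedged sketch, where the Tag type and the sorted-by-key assumption are mine rather than k6's current code:

```go
// Tag is an assumed key/value pair type; both slices are assumed to be
// sorted by Key and to have unique keys.
type Tag struct{ Key, Value string }

// contains reports whether every tag in sub also appears in set, using a
// single linear merge pass: O(len(set) + len(sub)).
func contains(set, sub []Tag) bool {
	i := 0
	for _, want := range sub {
		for i < len(set) && set[i].Key < want.Key {
			i++
		}
		if i >= len(set) || set[i] != want {
			return false
		}
		i++
	}
	return true
}
```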

Alternatives to graphs should also be explored. We might be able to achieve most of the same benefits by using clever hashing of the tag set values, or even by having a global set whose values are the concatenated key=value pairs, so that each metric sample's tag set is a subset of the global one. For sure, before we start anything here, we should see how other projects (e.g. Prometheus) deal with similar issues in an efficient way...

Finally, it's not clear if we should try to enforce an upper limit on the number of unique tag sets. Personally, I think not - we should emit a very stern warning if some number is exceeded, say 1000 unique tag set combinations, but not try anything fancier than that (like falling back to string-map-based tag sets).

@na-- added the performance, refactor, and evaluation needed (proposal needs to be validated or tested before fully implementing it in k6) labels on Feb 1, 2021
@na-- (Member, Author) commented Feb 1, 2021

#1832 (or something like it) is a prerequisite for the current issue.

@mstoykov (Contributor) commented Feb 1, 2021

As an addition: while I was thinking about the same thing, I realized that at least in the cloud aggregation (and probably everywhere else) we bucket metrics by time. So the "tag set" could also be per time bucket, given that the major reason for this change, IMO, is that comparing map[string]string is highly inefficient. That would have the benefit that, if the number of tag sets grows without bound, we would at least be dropping the old ones, and the drawback that the tag sets would have to be rebuilt for each time bucket. In the end I decided that the rebuilding would probably be more costly, and that supporting an unbounded number of tag sets isn't really a useful feature anyway - I am fairly certain no system can handle a million tag sets in any way that matters.

I was also thinking of a simple ordered [][2]string where the strings are interned. This has the benefit that only the string interning is unbounded, and we could apply the trick above to it. It still requires comparing strings, but at least that is O(n), where n is the number of elements, so... that is much better ;).
Prometheus basically has []struct{name, value string}, which is better because, from what I've learned, Go is much better at keeping struct{name, value string} on the stack than [2]string: https://github.com/prometheus/prometheus/blob/adc8807851b87257964fab57e064771520751872/pkg/labels/labels.go#L38-L44. But as far as I know, they use hashes for most of the bucketing later on (citation needed).
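For reference, the linked Prometheus types look roughly like this (paraphrased from the pkg/labels code linked above, not copied verbatim):

```go
// Label is a key/value pair of strings (paraphrased from Prometheus's
// pkg/labels package linked above).
type Label struct {
	Name, Value string
}

// Labels is a sorted set of labels; keeping it sorted is the caller's job.
type Labels []Label
```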

I am of the opinion that the most bang for our buck will come from a complete refactoring (through many smaller ones, probably including some of the above) of how metrics are handled.

Currently we emit metrics (generate objects that move through the system) even if those objects will never be processed, for example because there is no output. If we had a system where metrics aren't emitted for no reason and where we know when they have been processed, we could also reuse them (and, from my profiling, the GC is having problems with those as well).

We will also probably want to aggregate metrics before they reach any output, effectively cutting their lifetime short and probably removing the need for some of the other optimizations - you would just traverse the graph of metrics in the "aggregator" to reach the bucket where they will be aggregated. Obviously the aggregator would then emit "new" metrics with "new" tag sets, but arguably those would be a much smaller set. Still, always doing this work, even when there isn't an output, seems wasteful.

But we kind of have to do it, since even now we always aggregate the metrics that will be shown in the summary output or are needed for thresholds: https://github.com/loadimpact/k6/blob/46d53d99deba98add23079df93579408fa7ea094/core/engine.go#L405-L428. This, though, IMO depends heavily on #1832, so it might be worth doing that first and seeing how the linked code behaves then, especially if we change it to always create sub-metrics for new tag sets.

@na-- (Member, Author) commented Feb 1, 2021

Not sure about time buckets being part of the tag set - they would just needlessly add cardinality to whatever we have. And time buckets depend on the actual "user" of the metrics: the cloud output will need very different buckets than something like #1136, and the summary and normal thresholds don't need any time buckets at all. So unless I am missing something, time and tag sets seem like orthogonal concepts that shouldn't be mixed. They can easily be stacked, as they currently are in the cloud aggregation, only where necessary.

👍 for string interning. Having a graph with immutable nodes would achieve that, but so would having a global set of key=value tags. Though you are right, we should probably use [2]string or a simple struct{name, value string} for that - no need to concatenate tags needlessly.
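A minimal sketch of what such a global set of interned pairs could look like, assuming a struct pair and a plain RWMutex-guarded map; none of these names are actual k6 API:

```go
package metrics

import "sync"

// Tag is an assumed interned key/value pair; none of these names are
// actual k6 API.
type Tag struct{ Name, Value string }

var (
	tagMu   sync.RWMutex
	tagPool = map[Tag]*Tag{}
)

// internTag returns the canonical *Tag for a key/value pair, so samples
// with the same pair share one pointer and can be compared without
// comparing strings.
func internTag(name, value string) *Tag {
	key := Tag{Name: name, Value: value}
	tagMu.RLock()
	t, ok := tagPool[key]
	tagMu.RUnlock()
	if ok {
		return t
	}
	tagMu.Lock()
	defer tagMu.Unlock()
	if t, ok := tagPool[key]; ok { // re-check after taking the write lock
		return t
	}
	t = &Tag{Name: name, Value: value}
	tagPool[key] = t
	return t
}
```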

@na-- (Member, Author) commented Jul 12, 2021

Stated another way, this issue is essentially about assigning every metric sample to a particular time series at the moment of its measurement.

@na-- changed the title from "Immutable metric tag sets" to "Immutable metric tag sets (attach every metric sample to a time series)" on Jul 12, 2021
@na-- changed the title from "Immutable metric tag sets (attach every metric sample to a time series)" to "Immutable metric tag sets (attach metric samples to time series on creation)" on Jul 12, 2021
@yorugac (Contributor) commented Oct 12, 2021

Just adding some observations from digging into Prometheus code.

I confirm that there is hashing of labels involved in Prometheus processing:
https://github.com/prometheus/prometheus/blob/adc8807851b87257964fab57e064771520751872/pkg/labels/labels.go#L136-L160

[As far as I saw] those hashes are involved in some tricky staleness logic, and for caching Prometheus primarily relies on a map[string]cacheEntry with the original scraped string as a key. But probably more interesting here is the stripeSeries structure:
https://github.com/prometheus/prometheus/blob/e245edf5a4c281499060ab1bd6d473b576094c04/tsdb/head.go#L1288-L1304

It seems they bucket hashmaps by ID and keep multiple locks after all. And there are two mechanisms for lookup:

func (s *stripeSeries) getByID(id uint64) *memSeries
func (s *stripeSeries) getByHash(hash uint64, lset labels.Labels) *memSeries

But I suspect that IDs are considered more reliable than labels, as I haven't seen getByHash used in the main processing.

This logic appears to be somewhat tied to how the TSDB is handled, but then its purpose is to move samples from memory to storage. Still, this can be classified as "clever hashing of the tag set values", with the primary difference that k6 doesn't have such IDs right now. OTOH, metric names are unique in k6.
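For illustration, here is a heavily simplified sketch of the striping idea: series live in hash-keyed buckets split across stripes, each stripe with its own lock, and label comparison resolves hash collisions. The fnv hash, the stripe count, and all names are assumptions of this sketch; Prometheus's real implementation differs in many details:

```go
package metrics

import (
	"hash/fnv"
	"sort"
	"sync"
)

const stripeCount = 256 // power of two, so we can mask instead of mod

type series struct {
	labels map[string]string
	// ... aggregated values, last timestamp, etc.
}

type stripedSeries struct {
	locks   [stripeCount]sync.Mutex
	buckets [stripeCount]map[uint64][]*series // label hash -> series
}

// hashLabels hashes the sorted key/value pairs with a separator byte, so
// that different key/value splits don't trivially produce the same bytes.
func hashLabels(labels map[string]string) uint64 {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	h := fnv.New64a()
	for _, k := range keys {
		h.Write([]byte(k))
		h.Write([]byte{0xff})
		h.Write([]byte(labels[k]))
		h.Write([]byte{0xff})
	}
	return h.Sum64()
}

// getOrCreate locks only one stripe, so goroutines working on different
// time series rarely contend on the same mutex.
func (s *stripedSeries) getOrCreate(labels map[string]string) *series {
	hash := hashLabels(labels)
	i := hash & (stripeCount - 1)
	s.locks[i].Lock()
	defer s.locks[i].Unlock()
	if s.buckets[i] == nil {
		s.buckets[i] = make(map[uint64][]*series)
	}
	for _, ser := range s.buckets[i][hash] { // hashes can collide
		if labelsEqual(ser.labels, labels) {
			return ser
		}
	}
	ser := &series{labels: labels}
	s.buckets[i][hash] = append(s.buckets[i][hash], ser)
	return ser
}

func labelsEqual(a, b map[string]string) bool {
	if len(a) != len(b) {
		return false
	}
	for k, v := range a {
		if b[k] != v {
			return false
		}
	}
	return true
}
```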

And yes, this definitely comes up in #1761. It doesn't make sense to add very complex logic there at this stage, but it doesn't currently use an efficient way to process labels either.

@na-- changed the title from "Immutable metric tag sets (attach metric samples to time series on creation)" to "Put metric samples in time series (immutable tag sets) on creation" on Nov 10, 2021
@codebien (Contributor) commented:

I explored this issue a bit (so many things to take into consideration 😱):

> Prometheus basically has []struct{name, value string}, which is better because, from what I've learned, Go is much better at keeping struct{name, value string} on the stack than [2]string

@mstoykov from a basic escape analysis, it seems both variants just depend on the required memory size and there isn't per-type dedicated memory management, but it would certainly need a more thorough investigation. If you have a reference for it, that would be helpful.
If we don't find an objective (or at least concrete) benefit of [2]string, then I think a struct has the advantage of keeping the code much more readable. I would prefer to read and use tag.Key instead of tag[0] all around the code, or *Tag if a pointer is required.

So, from my understanding of the previous comments, the eventual PR should introduce three different types (a rough sketch follows after the list):

  • Tag struct{Key, Value string}: it should allow us to have a pointer to the string pair in one shot
  • TagSet []*Tag: it should be the sorted type used by a Sample, and if needed for supporting aggregations it could generate the hash. Would we cache the hash somewhere?

> Also, it's not certain how much we can optimize the Contains() check (whether one tag set is a subset of another), which is used for sub-metric updates.

@na-- yes, I don't think we can have something faster than O(log n) + O(len(subset)). I'm considering the case ABC.contains(BC) == true.

  • TagMap sync.Map (map[hash()]Tag): it contains the set of unique tags and the real allocations. Considering the "immutability" and that reads dominate writes, sync.Map should be a good fit here.
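Here is the rough, non-authoritative sketch of how those three types could fit together; using the comparable Tag struct itself as the sync.Map key (instead of an explicit hash) is a simplification of this sketch, and none of it is settled API:

```go
package metrics

import "sync"

// Tag is the interned key/value pair.
type Tag struct{ Key, Value string }

// TagSet is the sorted slice of interned tags attached to a Sample.
type TagSet []*Tag

// tagMap holds the single real allocation for every unique tag; reads
// vastly outnumber writes, which is the use case sync.Map is meant for.
var tagMap sync.Map // Tag -> *Tag

// GetTag returns the canonical pointer for a key/value pair, creating it
// on first use. Concurrent first calls still converge on one pointer
// thanks to LoadOrStore.
func GetTag(key, value string) *Tag {
	t := Tag{Key: key, Value: value}
	if existing, ok := tagMap.Load(t); ok {
		return existing.(*Tag)
	}
	actual, _ := tagMap.LoadOrStore(t, &t)
	return actual.(*Tag)
}
```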

@oleiade (Member) commented Mar 24, 2022

@codebien As discussed in private, while implementing support for submetrics in thresholds validation, I had somewhat of an intuition: wouldn't it simplify the metrics package types if Metric had a Parent that could be nil, instead of having a dedicated Submetric type?

I believe it could simplify metrics/submetrics manipulation throughout the codebase. No idea if it's a good idea, and what impact it would have though 🤔
