Time Series data model #2580
Comments
The |
It's worth mentioning that we are probably not going to implement the entire refactoring in a single iteration. The first expected step should be:
// A Sample is a single measurement.
type Sample struct {
	Metric     *Metric
	Time       time.Time
	Tags       *SampleTags
	Value      float64
	TimeSeries *TimeSeries
}
This lets us start fixing the issues blocked by this concept without shipping a breaking change with a bad UX. Only in later iterations will the redundant fields be removed from the Sample struct and the |
I have some questions (in no particular order):
I would expect this will be addressed first?
Does that mean that we will forever hold all the tags + the name for each timeseries or just the hash?
Is there a particular thing from that issue that is intended to be implemented? It doesn't seem clear and having them as
After I started looking into this last week and wrote some thoughts down, I spent some time over the weekend making https://github.com/mstoykov/atlas (mostly on the benchmarks, actually 🙄). I kind of wonder whether something similar couldn't replace the |
We have parallel discussions and I expect we will have parallel implementations. We can set it as an acceptance criterion for merging, but I wouldn't set it as a requirement for starting the development. In the event that we can't achieve the final design for fixing the
We have to keep all the tags somewhere, otherwise we can't map them from the Outputs. So it is convenient to make the TimeSeries the central place where they live. (Note: this way, the proposal for the first step duplicates them between the TimeSeries and the Sample structs.)
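As a rough illustration of that first step, here is a minimal Go sketch. The types are simplified stand-ins, not the actual k6 definitions: plain maps and a `MetricName` string replace `*SampleTags` and `*Metric`, and the `tagsFor` helper is hypothetical. It only shows that an Output could read the tags through the shared TimeSeries while the Sample still carries its redundant copy.

```go
package main

import (
	"fmt"
	"time"
)

// TimeSeries is a simplified, hypothetical stand-in: one metric name
// plus one fixed set of tags, stored once and shared by many samples.
type TimeSeries struct {
	MetricName string
	Tags       map[string]string
}

// Sample mirrors the first-iteration shape: it keeps its own tags
// (redundant for now) and also points at the shared TimeSeries.
type Sample struct {
	Time       time.Time
	Tags       map[string]string // redundant copy, removed in a later step
	Value      float64
	TimeSeries *TimeSeries
}

// tagsFor shows how an output could map tags using only the series.
func tagsFor(s Sample) map[string]string {
	return s.TimeSeries.Tags
}

func main() {
	ts := &TimeSeries{
		MetricName: "http_req_duration",
		Tags:       map[string]string{"status": "200", "method": "GET"},
	}
	a := Sample{Time: time.Now(), Tags: ts.Tags, Value: 12.5, TimeSeries: ts}
	b := Sample{Time: time.Now(), Tags: ts.Tags, Value: 8.1, TimeSeries: ts}

	// Both samples point at the same series, so the tags live in one place.
	fmt.Println(a.TimeSeries == b.TimeSeries, tagsFor(a)["method"])
}
```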
Yes, the internal |
The design has evolved in a better direction; a summary of the current design can be found at #2594 (comment) and #2594 (comment). The issue's description has also been updated.
The main part has been implemented and is now available in master. I will keep this issue open until I have split its remaining parts into smaller, dedicated issues.
I opened #2735, which covers the missing part of this issue. The HDR histogram and the refactor of the metric structs already have well-known open issues.
We want to introduce the time series concept to k6 so that we can efficiently identify the combination of a metric and its related tags.
A summary of the benefits it brings across the platform:
Data model
Metrics and Series
Sink
The sinks are implemented by metric types and they keep values up to date by time series and/or aggregated views:
Storage
It will store the root of Atlas in the metrics.Registry, which will provide a method to get it for branching out a new TagSet.
Samples generation
The current sampling process is controlled by the metrics.PushIfNotDone method. All the current callers should resolve the time series from the storage before pushing a new Sample; if none is found, the storage will create and insert a new one.
This requires all the callers (e.g. executors, JavaScript modules) to depend on the time series database. Most of them already have the metrics.Registry dependency.
metrics.Ingester
The Ingester is responsible for resolving the entire set of Sinks impacted by the ingested time series; it then adds the Sample's value to the resolved Sinks.
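The resolution step could be sketched roughly as follows. This is not the k6 implementation: `Track`, `Ingest`, `contains`, and the summing `Sink` are hypothetical names, and the linear scan over seen series stands in for whatever index the real design uses. It assumes a seen series is "impacted" when it belongs to the same metric and all of its tags appear in the ingested series.

```go
package main

import "fmt"

// TimeSeries is a simplified stand-in: a metric name plus its tags.
type TimeSeries struct {
	Metric string
	Tags   map[string]string
}

// Sink is a hypothetical minimal sink that only sums the values it receives.
type Sink struct{ Sum float64 }

func (s *Sink) Add(v float64) { s.Sum += v }

// contains reports whether sub has the same metric as ts and every tag
// of sub is present in ts with the same value.
func contains(ts, sub TimeSeries) bool {
	if ts.Metric != sub.Metric {
		return false
	}
	for k, v := range sub.Tags {
		if got, ok := ts.Tags[k]; !ok || got != v {
			return false
		}
	}
	return true
}

type entry struct {
	series TimeSeries
	sink   *Sink
}

// Ingester keeps the seen series together with their sinks.
type Ingester struct{ seen []entry }

// Track registers a seen series and returns the sink bound to it.
func (i *Ingester) Track(ts TimeSeries) *Sink {
	s := &Sink{}
	i.seen = append(i.seen, entry{series: ts, sink: s})
	return s
}

// Ingest adds value to every sink whose series is contained in ts and
// returns how many sinks were updated.
func (i *Ingester) Ingest(ts TimeSeries, value float64) int {
	n := 0
	for _, e := range i.seen {
		if contains(ts, e.series) {
			e.sink.Add(value)
			n++
		}
	}
	return n
}

func main() {
	ing := &Ingester{}
	byStatus := ing.Track(TimeSeries{Metric: "http_req_duration", Tags: map[string]string{"status": "200"}})
	byMethod := ing.Track(TimeSeries{Metric: "http_req_duration", Tags: map[string]string{"method": "GET"}})
	other := ing.Track(TimeSeries{Metric: "http_req_duration", Tags: map[string]string{"method": "POST"}})

	sample := TimeSeries{Metric: "http_req_duration", Tags: map[string]string{"status": "200", "method": "GET"}}
	n := ing.Ingest(sample, 42)
	fmt.Println(n, byStatus.Sum, byMethod.Sum, other.Sum) // prints "2 42 42 0"
}
```

A linear scan costs time proportional to the number of seen series for every Sample, so a real Ingester would presumably rely on a precomputed index instead.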
Example
At a time t0 where the status of the seen time series is:
When the Ingester collects a Sample for http_req_duration{status:200,method:GET}, it resolves the dependencies with the other seen time series into a unique set that contains the time series http_req_duration{status:200} and http_req_duration{method:GET}. It can then resolve the related Metric's Sinks and invoke them, passing the Sample's value.
Known issues
k6 tags HTTP requests with the URL, which creates a high-cardinality problem for the time series data model. This should be fixed by making it possible to not store all the tags as indexable: some could be set as metadata and excluded from time series generation. An alternative workaround could be to exclude them from the default set of enabled tags.
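To make the metadata idea concrete, here is a minimal Go sketch under stated assumptions. `Store`, `GetOrCreate`, `key`, and the `Metadata` field are hypothetical names, not k6 APIs, and the string key is a simplification that assumes tag names and values contain no `|` or `=`. The store resolves or creates a series from the indexed tags only, so per-request URLs carried as metadata do not create new series.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// TimeSeries holds only the indexed tags.
type TimeSeries struct {
	Metric string
	Tags   map[string]string
}

// Store is a hypothetical series store keyed by metric plus indexed tags.
type Store struct{ series map[string]*TimeSeries }

func NewStore() *Store { return &Store{series: make(map[string]*TimeSeries)} }

// Len returns the number of distinct series created so far.
func (s *Store) Len() int { return len(s.series) }

// key builds a deterministic identifier from the metric and sorted tags.
func key(metric string, tags map[string]string) string {
	names := make([]string, 0, len(tags))
	for k := range tags {
		names = append(names, k)
	}
	sort.Strings(names)
	var b strings.Builder
	b.WriteString(metric)
	for _, k := range names {
		b.WriteString("|" + k + "=" + tags[k])
	}
	return b.String()
}

// GetOrCreate returns the series for the indexed tags, inserting it if
// it has not been seen before. Metadata is never part of the key.
func (s *Store) GetOrCreate(metric string, indexed map[string]string) *TimeSeries {
	k := key(metric, indexed)
	if ts, ok := s.series[k]; ok {
		return ts
	}
	ts := &TimeSeries{Metric: metric, Tags: indexed}
	s.series[k] = ts
	return ts
}

// Sample carries non-indexed details, such as the URL, as metadata.
type Sample struct {
	TimeSeries *TimeSeries
	Metadata   map[string]string
	Value      float64
}

func main() {
	store := NewStore()
	indexed := map[string]string{"status": "200", "method": "GET"}
	var samples []Sample
	for _, url := range []string{"https://example.com/users/1", "https://example.com/users/2"} {
		samples = append(samples, Sample{
			TimeSeries: store.GetOrCreate("http_req_duration", indexed),
			Metadata:   map[string]string{"url": url},
			Value:      10,
		})
	}
	// Two different URLs, still a single series.
	fmt.Println(store.Len(), samples[0].TimeSeries == samples[1].TimeSeries)
}
```

If the URL were part of the indexed tags instead, each distinct URL would produce its own key and therefore its own series.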