-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add an efficient labels serializer #474
base: master
Are you sure you want to change the base?
Conversation
`make_key` is not very efficient itself, so special `labels_serializer` is added with optimized scheme.
This pull request is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your PR! I have several question regarding implementation, testing and documentation. My main concern is how it will work with global labels -- for now it seems that one would need to explicilty support them in serializer, which, for example, will break the usage HTTP middleware after your change.
Since we are talking about efficiency here, it is also crucial to provide some performance comparison here. For example, I had used the following script to check your updates performance:
local clock = require('clock')
local metrics = require('metrics')
metrics.cfg{include = 'none', labels = {alias = 'myalias'}}
local custom_serializer = metrics.labels_serializer({'alias', 'l1', 'l2'})
local collector_with_old_serialization = metrics.counter('collector_with_old_serialization')
local bench_old = clock.bench(function()
for i = 1, 1e6 do
collector_with_old_serialization:inc(1, {l1 = 'v1', l2 = 'v' .. tostring(i)})
end
end)
local collector_with_new_serialization = metrics.counter('collector_with_new_serialization')
local bench_new = clock.bench(function()
for i = 1, 1e6 do
collector_with_new_serialization:inc(1, custom_serializer.wrap{l1 = 'v1', l2 = 'v' .. tostring(i)})
end
end)
print(('old: %.4f s, new: %.4f s'):format(bench_old[1], bench_new[1]))
old: 3.6006 s, new: 3.0873 s
@@ -140,4 +142,5 @@ return { | |||
unregister_callback = unregister_callback, | |||
invoke_callbacks = invoke_callbacks, | |||
set_global_labels = set_global_labels, | |||
labels_serializer = serializers.basic_labels_serializer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
labels_serializer = serializers.basic_labels_serializer | |
labels_serializer = serializers.basic_labels_serializer, |
metrics/api.lua
Outdated
@@ -125,6 +125,68 @@ local function set_global_labels(label_pairs) | |||
registry:set_labels(label_pairs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, squash your commits into the single one with a name like api: support custom labels serialization
@@ -48,15 +49,10 @@ function Shared.make_key(label_pairs) | |||
return "" | |||
end | |||
-- `label_pairs` provides its own serialization scheme, it must be used instead of default one. | |||
if label_pairs.__metrics_make_key then | |||
return label_pairs:__metrics_make_key() | |||
if label_pairs.__metrics_serialize then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, it's not a metamethod -- it's just some internal API (even though object is internal by itself). Since it is not a metamethod, I think you shouldn't use double underscore in the name: use _metrics_serialize
(or simply serialize
even).
@@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 | |||
- `metrics.cfg{}` `"all"` metasection for array `include` and `exclude` | |||
(`metrics.cfg{include={'all'}}` can be used instead of `metrics.cfg{include='all'}`, | |||
`metrics.cfg{exclude={'all'}}` can be used instead of `metrics.cfg{include='none'}`) | |||
- Allow users to decide on labels serialization scheme themselves. Add `labels_serializer`, which provides alternative efficient labels serialization scheme. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Allow users to decide on labels serialization scheme themselves. Add `labels_serializer`, which provides alternative efficient labels serialization scheme. | |
- Allow users to customize labels serialization with `metrics.labels_serializer` API. |
--- @param labels_keys string[] Label keys for the further use. | ||
--- @return LabelsSerializer | ||
local function basic_labels_serializer(labels_keys) | ||
-- we always add keys that are needed for metrics' internals. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, it is not internals -- it is histogram labeling approach (based on Prometheus design).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also have similar label -- quantile
-- for summary collectors, which are not covered here, but should be.
--- | ||
--- @param labels_keys string[] Label keys for the further use. | ||
--- @return LabelsSerializer | ||
local function basic_labels_serializer(labels_keys) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The naming is confusing here: default_labels_serializer
is an actual serializer and basic_labels_serializer
is a serializer constructor/builder/factory. Please, use name like build_labels_serializer
or similar.
|
||
-- check that builtin metrics work. | ||
local hist_serializer = metrics.labels_serializer(label_keys) | ||
local hist = metrics.histogram('hist', 'test histogram', {2}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, test with all collector types: counter, gauge, histogram and summary (shared
is an internal implementation base class, so it may be ignored here).
|
||
t.assert_equals(actual, shared.make_key(label_pairs)) | ||
t.assert_equals(actual, shared.make_key(wrapped)) | ||
t.assert_not_equals(wrapped.__metrics_serialize, nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like a unit test, so it's better to move it to somewhere else (maybe a new file). We also need to have an integration test with a full example:
- build a custom serializer,
- set it through public API,
- observe some values,
- collect the metrics.
It also would be nice ti write an example (at least in tests) for a custom serializer considering default metrics, especially since they are enabled by default in Tarantool 2.11.1 and newer. As a reference, you may refer to Telegraf labels setup here: https://www.tarantool.io/en/doc/latest/book/monitoring/grafana_dashboard/#prepare-a-monitoring-stack (code block starting with [[inputs.http]]
).
We should also do an integration test with global labels setup (AFAIU, one should cover it in serializer too).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, it seems that I have misunderstood the general approach: this PR introduces a serializer for each label pairs instead of replacing a serializer for each metrics. In that case, default metrics won't be covered. On the other hand, the question about global labels still needs to be answered.
@@ -69,6 +69,8 @@ function export.configure_default_collector(...) | |||
export.set_default_collector(export.build_default_collector(...)) | |||
end | |||
|
|||
local labels_serializer = metrics_api.labels_serializer({ "path", "method", "status" }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would likely break using the global labels.
--- @return LabelsSerializer | ||
local function basic_labels_serializer(labels_keys) | ||
-- we always add keys that are needed for metrics' internals. | ||
local __labels_keys = { "le" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It also seems non-optimized: we will have a key that works only for a histogram. Maybe it's worth to have a different factories for counter/gauge, histogram and summary?
What has been done? Why? What problem is being solved?
I found out that observe method works slow. Further investigations showed that
make_key
function sorts keys every run time.make_key
is being called often during, for instance, hist:observe. To optimize this case and both keep the default behavior, I propose two things:labels_pairs
serialization scheme with__metrics_make_key
meta-method. If it is supplied, it would be used, otherwise the default sorting scheme would be used;I didn't forget about