add an efficient labels serializer #474

ftelnov · 2023-11-03T07:51:10Z

What has been done? Why? What problem is being solved?

I found out that observe method works slow. Further investigations showed that make_key function sorts keys every run time. make_key is being called often during, for instance, hist:observe. To optimize this case and both keep the default behavior, I propose two things:

Allow user to override labels_pairs serialization scheme with __metrics_make_key meta-method. If it is supplied, it would be used, otherwise the default sorting scheme would be used;
Provide precaching labels serializer. It would order the labels properly on the client side, and only concat pairs on the metrics side. The complexity is linear then.

I didn't forget about

Tests
Changelog
Documentation (README and rst) (don't think I should alter it for such a small api?)
Rockspec and rpm spec (I don't really know what to add here)

`make_key` is not very efficient itself, so special `labels_serializer` is added with optimized scheme.

github-actions · 2024-01-02T18:03:20Z

This pull request is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days

DifferentialOrange

Thank you for your PR! I have several question regarding implementation, testing and documentation. My main concern is how it will work with global labels -- for now it seems that one would need to explicilty support them in serializer, which, for example, will break the usage HTTP middleware after your change.

Since we are talking about efficiency here, it is also crucial to provide some performance comparison here. For example, I had used the following script to check your updates performance:

local clock = require('clock')

local metrics = require('metrics')

metrics.cfg{include = 'none', labels = {alias = 'myalias'}}

local custom_serializer = metrics.labels_serializer({'alias', 'l1', 'l2'})

local collector_with_old_serialization = metrics.counter('collector_with_old_serialization')
local bench_old = clock.bench(function()
    for i = 1, 1e6 do
        collector_with_old_serialization:inc(1, {l1 = 'v1', l2 = 'v' .. tostring(i)})
    end
end)

local collector_with_new_serialization = metrics.counter('collector_with_new_serialization')
local bench_new = clock.bench(function()
    for i = 1, 1e6 do
        collector_with_new_serialization:inc(1, custom_serializer.wrap{l1 = 'v1', l2 = 'v' .. tostring(i)})
    end
end)

print(('old: %.4f s, new: %.4f s'):format(bench_old[1], bench_new[1]))

old: 3.6006 s, new: 3.0873 s

DifferentialOrange · 2024-03-15T08:01:41Z

metrics/api.lua

@@ -140,4 +142,5 @@ return {
    unregister_callback = unregister_callback,
    invoke_callbacks = invoke_callbacks,
    set_global_labels = set_global_labels,
+    labels_serializer = serializers.basic_labels_serializer


Suggested change

labels_serializer = serializers.basic_labels_serializer

labels_serializer = serializers.basic_labels_serializer,

DifferentialOrange · 2024-03-15T08:13:49Z

metrics/api.lua

@@ -125,6 +125,68 @@ local function set_global_labels(label_pairs)
    registry:set_labels(label_pairs)


Please, squash your commits into the single one with a name like api: support custom labels serialization

DifferentialOrange · 2024-03-15T08:15:59Z

metrics/collectors/shared.lua

@@ -48,15 +49,10 @@ function Shared.make_key(label_pairs)
        return ""
    end
    -- `label_pairs` provides its own serialization scheme, it must be used instead of default one.
-    if label_pairs.__metrics_make_key then
-        return label_pairs:__metrics_make_key()
+    if label_pairs.__metrics_serialize then


Well, it's not a metamethod -- it's just some internal API (even though object is internal by itself). Since it is not a metamethod, I think you shouldn't use double underscore in the name: use _metrics_serialize (or simply serialize even).

DifferentialOrange · 2024-03-15T08:16:39Z

CHANGELOG.md

@@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - `metrics.cfg{}` `"all"` metasection for array `include` and `exclude`
  (`metrics.cfg{include={'all'}}` can be used instead of `metrics.cfg{include='all'}`,
  `metrics.cfg{exclude={'all'}}` can be used instead of `metrics.cfg{include='none'}`)
+- Allow users to decide on labels serialization scheme themselves. Add `labels_serializer`, which provides alternative efficient labels serialization scheme.


Suggested change

- Allow users to decide on labels serialization scheme themselves. Add `labels_serializer`, which provides alternative efficient labels serialization scheme.

- Allow users to customize labels serialization with `metrics.labels_serializer` API.

DifferentialOrange · 2024-03-15T08:18:55Z

metrics/serializers.lua

+--- @param labels_keys string[] Label keys for the further use.
+--- @return LabelsSerializer
+local function basic_labels_serializer(labels_keys)
+    -- we always add keys that are needed for metrics' internals.


Well, it is not internals -- it is histogram labeling approach (based on Prometheus design).

We also have similar label -- quantile -- for summary collectors, which are not covered here, but should be.

DifferentialOrange · 2024-03-15T08:22:12Z

metrics/serializers.lua

+---
+--- @param labels_keys string[] Label keys for the further use.
+--- @return LabelsSerializer
+local function basic_labels_serializer(labels_keys)


The naming is confusing here: default_labels_serializer is an actual serializer and basic_labels_serializer is a serializer constructor/builder/factory. Please, use name like build_labels_serializer or similar.

DifferentialOrange · 2024-03-15T08:24:08Z

test/metrics_test.lua

+
+    -- check that builtin metrics work.
+    local hist_serializer = metrics.labels_serializer(label_keys)
+    local hist = metrics.histogram('hist', 'test histogram', {2})


Please, test with all collector types: counter, gauge, histogram and summary (shared is an internal implementation base class, so it may be ignored here).

DifferentialOrange · 2024-03-15T08:31:07Z

test/metrics_test.lua

+
+    t.assert_equals(actual, shared.make_key(label_pairs))
+    t.assert_equals(actual, shared.make_key(wrapped))
+    t.assert_not_equals(wrapped.__metrics_serialize, nil)


It seems like a unit test, so it's better to move it to somewhere else (maybe a new file). We also need to have an integration test with a full example:

build a custom serializer,

set it through public API,

observe some values,

collect the metrics.

It also would be nice ti write an example (at least in tests) for a custom serializer considering default metrics, especially since they are enabled by default in Tarantool 2.11.1 and newer. As a reference, you may refer to Telegraf labels setup here: https://www.tarantool.io/en/doc/latest/book/monitoring/grafana_dashboard/#prepare-a-monitoring-stack (code block starting with [[inputs.http]]).

We should also do an integration test with global labels setup (AFAIU, one should cover it in serializer too).

Oh, it seems that I have misunderstood the general approach: this PR introduces a serializer for each label pairs instead of replacing a serializer for each metrics. In that case, default metrics won't be covered. On the other hand, the question about global labels still needs to be answered.

DifferentialOrange · 2024-03-15T08:40:37Z

metrics/http_middleware.lua

@@ -69,6 +69,8 @@ function export.configure_default_collector(...)
    export.set_default_collector(export.build_default_collector(...))
 end

+local labels_serializer = metrics_api.labels_serializer({ "path", "method", "status" })


It would likely break using the global labels.

DifferentialOrange · 2024-03-15T08:45:07Z

metrics/serializers.lua

+--- @return LabelsSerializer
+local function basic_labels_serializer(labels_keys)
+    -- we always add keys that are needed for metrics' internals.
+    local __labels_keys = { "le" }


It also seems non-optimized: we will have a key that works only for a histogram. Maybe it's worth to have a different factories for counter/gauge, histogram and summary?

ftelnov added 8 commits November 3, 2023 09:59

add performant labels serializer

725b56c

`make_key` is not very efficient itself, so special `labels_serializer` is added with optimized scheme.

fix stackoverflow issue, add hist tests

7765182

satisfy linter

0aeaf76

add changelog note

e0fd167

automatically add needed labels

2a1901c

enhance algo by protecting against repetive labels

270cbe5

add wrapping to http_middleware

1f6d58f

move to the separate module, rename metamethod

961ec61

github-actions bot added the Stale label Jan 2, 2024

yngvar-antonsson removed the Stale label Jan 23, 2024

sergepetrenko assigned DifferentialOrange Mar 15, 2024

sergepetrenko requested a review from DifferentialOrange March 15, 2024 07:57

DifferentialOrange reviewed Mar 15, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

add an efficient labels serializer #474

add an efficient labels serializer #474

ftelnov commented Nov 3, 2023 •

edited

Loading

github-actions bot commented Jan 2, 2024

DifferentialOrange left a comment

DifferentialOrange Mar 15, 2024

DifferentialOrange Mar 15, 2024

DifferentialOrange Mar 15, 2024

DifferentialOrange Mar 15, 2024

DifferentialOrange Mar 15, 2024

DifferentialOrange Mar 15, 2024

DifferentialOrange Mar 15, 2024

DifferentialOrange Mar 15, 2024

DifferentialOrange Mar 15, 2024

DifferentialOrange Mar 15, 2024

DifferentialOrange Mar 15, 2024

DifferentialOrange Mar 15, 2024

	labels_serializer = serializers.basic_labels_serializer
	labels_serializer = serializers.basic_labels_serializer,

		@@ -125,6 +125,68 @@ local function set_global_labels(label_pairs)
		registry:set_labels(label_pairs)

	- Allow users to decide on labels serialization scheme themselves. Add `labels_serializer`, which provides alternative efficient labels serialization scheme.
	- Allow users to customize labels serialization with `metrics.labels_serializer` API.

add an efficient labels serializer #474

Are you sure you want to change the base?

add an efficient labels serializer #474

Conversation

ftelnov commented Nov 3, 2023 • edited Loading

github-actions bot commented Jan 2, 2024

DifferentialOrange left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ftelnov commented Nov 3, 2023 •

edited

Loading