Tarantool Feedback Daemon exporter #424

Merged: 5 commits, Jan 26, 2023
2 changes: 1 addition & 1 deletion .github/workflows/test.yml
@@ -2,7 +2,7 @@ name: Tests
on:
push:
branches:
- '*'
- '**'
paths-ignore:
- 'doc/**'
jobs:
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -9,6 +9,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Added

- Handle to clear psutils metrics
- `invoke_callbacks` option for `metrics.collect()`
- Ability to set metainfo for collectors
- Set `metainfo.default` to `true` for all collectors
from `enable_default_metrics()` and psutils collectors
- `default_only` option for `metrics.collect()`

### Fixed

24 changes: 19 additions & 5 deletions doc/monitoring/api_reference.rst
@@ -24,12 +24,13 @@ A collector represents one or more observations that change over time.
counter
~~~~~~~

.. function:: counter(name [, help])
.. function:: counter(name [, help, metainfo])

Register a new counter.

:param string name: collector name. Must be unique.
:param string help: collector description.
:param table metainfo: collector metainfo.
:return: A counter object.
:rtype: counter_obj

@@ -85,12 +86,13 @@ counter
gauge
~~~~~

.. function:: gauge(name [, help])
.. function:: gauge(name [, help, metainfo])

Register a new gauge.

:param string name: collector name. Must be unique.
:param string help: collector description.
:param table metainfo: collector metainfo.

:return: A gauge object.

@@ -127,7 +129,7 @@ gauge
histogram
~~~~~~~~~

.. function:: histogram(name [, help, buckets])
.. function:: histogram(name [, help, buckets, metainfo])

Register a new histogram.

@@ -136,6 +138,7 @@ histogram
:param table buckets: histogram buckets (an array of sorted positive numbers).
The infinity bucket (``INF``) is appended automatically.
Default: ``{.005, .01, .025, .05, .075, .1, .25, .5, .75, 1.0, 2.5, 5.0, 7.5, 10.0, INF}``.
:param table metainfo: collector metainfo.

:return: A histogram object.

@@ -184,7 +187,7 @@ histogram
summary
~~~~~~~

.. function:: summary(name [, help, objectives, params])
.. function:: summary(name [, help, objectives, params, metainfo])

Register a new summary. Quantile computation is based on the
`"Effective computation of biased quantiles over data streams" <https://ieeexplore.ieee.org/document/1410103>`_
@@ -217,6 +220,8 @@ summary
and how smooth the time window moves.
Default value: ``{max_age_time = math.huge, age_buckets_count = 1}``.

:param table metainfo: collector metainfo.

:return: A summary object.

:rtype: summary_obj
@@ -328,6 +333,7 @@ Metrics functions
* ``event_loop``

See :ref:`metrics reference <metrics-reference>` for details.
All metric collectors from the collection have ``metainfo.default = true``.

.. function:: set_global_labels(label_pairs)

@@ -347,10 +353,15 @@ Metrics functions

Note that both label names and values in ``label_pairs`` are treated as strings.

.. function:: collect()
.. function:: collect([opts])

Collect observations from each collector.

:param table opts: table of collect options:

* ``invoke_callbacks`` -- if ``true``, ``invoke_callbacks()`` is triggered before the actual collect.
* ``default_only`` -- if ``true``, observations contain only default metrics (``metainfo.default = true``).

.. class:: registry

.. method:: unregister(collector)
@@ -431,6 +442,7 @@ Metrics functions
.. function:: invoke_callbacks()

Invoke all registered callbacks. Has to be called before each ``collect()``.
(Since version **0.16.0**, you may use ``collect{invoke_callbacks = true}`` instead.)
If you're using one of the default exporters,
``invoke_callbacks()`` will be called by the exporter.

@@ -605,6 +617,8 @@ To enable CPU metrics, first register a callback function:
sum by (thread_name) (idelta(tnt_cpu_thread[$__interval]))
/ scalar(idelta(tnt_cpu_total[$__interval]) / tnt_cpu_count)

All psutils metric collectors have ``metainfo.default = true``.

To clear CPU metrics when you don't need them anymore, remove the callback and clear the collectors with a method:

.. code-block:: lua
5 changes: 4 additions & 1 deletion metrics/cartridge/failover.lua
@@ -18,7 +18,10 @@ local function update()
utils.set_counter(
'cartridge_failover_trigger_total',
'Count of Cartridge Failover triggers',
trigger_cnt
trigger_cnt,
nil,
nil,
{default = true}
)
end
end
6 changes: 4 additions & 2 deletions metrics/cartridge/issues.lua
@@ -21,13 +21,15 @@ local function update()
for _, level in ipairs(levels) do
local len = fun.iter(issues):filter(function(x) return x.level == level end):length()
collectors_list.cartridge_issues =
utils.set_gauge('cartridge_issues', 'Tarantool Cartridge issues', len, {level = level})
utils.set_gauge('cartridge_issues', 'Tarantool Cartridge issues',
len, {level = level}, nil, {default = true})
end

local global_issues_cnt = rawget(_G, '__cartridge_issues_cnt')
if global_issues_cnt ~= nil then
collectors_list.global_issues =
utils.set_gauge('cartridge_cluster_issues', 'Tarantool Cartridge cluster issues', global_issues_cnt)
utils.set_gauge('cartridge_cluster_issues', 'Tarantool Cartridge cluster issues',
global_issues_cnt, nil, nil, {default = true})
end
end

11 changes: 6 additions & 5 deletions metrics/collectors/histogram.lua
@@ -18,18 +18,19 @@ function Histogram.check_buckets(buckets)
return true
end

function Histogram:new(name, help, buckets)
local obj = Shared.new(self, name, help)
function Histogram:new(name, help, buckets, metainfo)
metainfo = table.copy(metainfo) or {}
local obj = Shared.new(self, name, help, metainfo)

obj.buckets = buckets or DEFAULT_BUCKETS
table.sort(obj.buckets)
if obj.buckets[#obj.buckets] ~= INF then
obj.buckets[#obj.buckets+1] = INF
end

obj.count_collector = Counter:new(name .. '_count', help)
obj.sum_collector = Counter:new(name .. '_sum', help)
obj.bucket_collector = Counter:new(name .. '_bucket', help)
obj.count_collector = Counter:new(name .. '_count', help, metainfo)
obj.sum_collector = Counter:new(name .. '_sum', help, metainfo)
obj.bucket_collector = Counter:new(name .. '_bucket', help, metainfo)

return obj
end
5 changes: 4 additions & 1 deletion metrics/collectors/shared.lua
@@ -24,7 +24,9 @@ function Shared:new_class(kind, method_names)
return setmetatable(class, {__index = methods})
end

function Shared:new(name, help)
function Shared:new(name, help, metainfo)
metainfo = table.copy(metainfo) or {}

if not name then
error("Name should be set for %s")
end
@@ -33,6 +35,7 @@ function Shared:new(name, help)
help = help or "",
observations = {},
label_pairs = {},
metainfo = metainfo,
}, self)
end
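
The `table.copy(metainfo) or {}` line added to `Shared:new` means each collector keeps its own shallow copy of the caller's table, so later changes to that table do not leak into the collector. A minimal sketch of the idea in plain Lua follows. `shallow_copy` and `new_collector` are hypothetical stand-ins (plain Lua has no `table.copy`), and the helper assumes, as the `or {}` fallback in the diff suggests, that a `nil` argument is passed through as `nil`.

```lua
-- Hypothetical stand-in for Tarantool's table.copy (a shallow copy).
-- Assumption: a nil argument comes back as nil, so `... or {}` yields {}.
local function shallow_copy(t)
    if type(t) ~= 'table' then
        return t
    end
    local copy = {}
    for k, v in pairs(t) do
        copy[k] = v
    end
    return copy
end

-- Hypothetical, simplified constructor mirroring the pattern in Shared:new.
local function new_collector(name, metainfo)
    metainfo = shallow_copy(metainfo) or {}
    return {name = name, metainfo = metainfo}
end

local shared = {default = true}
local a = new_collector('a', shared)
local b = new_collector('b', shared)
local c = new_collector('c')      -- no metainfo given: falls back to {}

shared.default = false            -- the caller changes its table afterwards
a.metainfo.default = nil          -- one collector changes its own copy

print(a.metainfo.default, b.metainfo.default, c.metainfo.default)
```

Because every constructor copies its argument, passing the same `metainfo` table to several collectors (as `Histogram:new` and `Summary:new` do for their `_count`, `_sum` and `_bucket` counters) does not make them share state.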

9 changes: 5 additions & 4 deletions metrics/collectors/summary.lua
@@ -6,12 +6,13 @@ local fiber = require('fiber')

local Summary = Shared:new_class('summary', {'observe_latency'})

function Summary:new(name, help, objectives, params)
function Summary:new(name, help, objectives, params, metainfo)
params = params or {}
local obj = Shared.new(self, name, help)
metainfo = table.copy(metainfo) or {}
local obj = Shared.new(self, name, help, metainfo)

obj.count_collector = Counter:new(name .. '_count', help)
obj.sum_collector = Counter:new(name .. '_sum', help)
obj.count_collector = Counter:new(name .. '_count', help, metainfo)
obj.sum_collector = Counter:new(name .. '_sum', help, metainfo)
obj.objectives = objectives
obj.max_age_time = params.max_age_time
obj.age_buckets_count = params.age_buckets_count or 1
52 changes: 38 additions & 14 deletions metrics/init.lua
@@ -41,40 +41,64 @@ local function invoke_callbacks()
return registry:invoke_callbacks()
end

local function collect()
return registry:collect()
local function get_collector_values(collector, result)
for _, obs in ipairs(collector:collect()) do
table.insert(result, obs)
end
end

local function collect(opts)
checks({invoke_callbacks = '?boolean', default_only = '?boolean'})
opts = opts or {}

if opts.invoke_callbacks then
registry:invoke_callbacks()
end

local result = {}
for _, collector in pairs(registry.collectors) do
if opts.default_only then
if collector.metainfo.default then
get_collector_values(collector, result)
end
else
get_collector_values(collector, result)
end
end

return result
end

local function clear()
registry:clear()
end

local function counter(name, help)
checks('string', '?string')
local function counter(name, help, metainfo)
checks('string', '?string', '?table')

return registry:find_or_create(Counter, name, help)
return registry:find_or_create(Counter, name, help, metainfo)
end

local function gauge(name, help)
checks('string', '?string')
local function gauge(name, help, metainfo)
checks('string', '?string', '?table')

return registry:find_or_create(Gauge, name, help)
return registry:find_or_create(Gauge, name, help, metainfo)
end

local function histogram(name, help, buckets)
checks('string', '?string', '?table')
local function histogram(name, help, buckets, metainfo)
checks('string', '?string', '?table', '?table')
if buckets ~= nil and not Histogram.check_buckets(buckets) then
error('Invalid value for buckets')
end

return registry:find_or_create(Histogram, name, help, buckets)
return registry:find_or_create(Histogram, name, help, buckets, metainfo)
end

local function summary(name, help, objectives, params)
local function summary(name, help, objectives, params, metainfo)
checks('string', '?string', '?table', {
age_buckets_count = '?number',
max_age_time = '?number',
})
}, '?table')
if objectives ~= nil and not Summary.check_quantiles(objectives) then
error('Invalid value for objectives')
end
@@ -91,7 +115,7 @@ local function summary(name, help, objectives, params)
error('Age buckets count and max age must be present only together')
end

return registry:find_or_create(Summary, name, help, objectives, params)
return registry:find_or_create(Summary, name, help, objectives, params, metainfo)
end

local function set_global_labels(label_pairs)
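
The reworked `collect(opts)` in this file boils down to an optional callback pass followed by a filter over `registry.collectors`. As a rough sketch of that behavior in plain Lua (no Tarantool or `metrics` module required), the following models it with hypothetical stand-ins: `registry`, `new_collector` and the `callbacks` list are illustrative only, and the single `if` is an assumed-equivalent flattening of the nested branch in the diff.

```lua
-- Hypothetical mock registry; the real one lives in the metrics module.
local registry = {collectors = {}, callbacks = {}}

-- Hypothetical collector that reports one fixed observation.
local function new_collector(name, value, metainfo)
    local collector = {name = name, metainfo = metainfo or {}}
    function collector:collect()
        return {{metric_name = self.name, value = value}}
    end
    registry.collectors[name] = collector
    return collector
end

local function collect(opts)
    opts = opts or {}

    if opts.invoke_callbacks then
        for _, callback in ipairs(registry.callbacks) do
            callback()
        end
    end

    local result = {}
    for _, collector in pairs(registry.collectors) do
        -- Skip a collector only when default_only is set and the
        -- collector is not marked with metainfo.default.
        if not opts.default_only or collector.metainfo.default then
            for _, obs in ipairs(collector:collect()) do
                table.insert(result, obs)
            end
        end
    end
    return result
end

local callback_runs = 0
table.insert(registry.callbacks, function()
    callback_runs = callback_runs + 1
end)

new_collector('tnt_fiber_amount', 12, {default = true})
new_collector('my_app_requests', 3)   -- user metric, no metainfo

local all = collect()
local defaults = collect({invoke_callbacks = true, default_only = true})
print(#all, #defaults, callback_runs)
```

With the real module, the corresponding call would be `metrics.collect({invoke_callbacks = true, default_only = true})`, which per the documentation change in this PR returns only observations from collectors with `metainfo.default = true`.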
9 changes: 5 additions & 4 deletions metrics/psutils/cpu.lua
@@ -14,9 +14,10 @@ local threads = {}

local function update_cpu_metrics()
collectors_list.cpu_number = utils.set_gauge('cpu_number', 'The number of processors',
psutils.get_cpu_count())
psutils.get_cpu_count(), nil, nil, {default = true})

collectors_list.cpu_time = utils.set_gauge('cpu_time', 'Host CPU time', psutils.get_cpu_time())
collectors_list.cpu_time = utils.set_gauge('cpu_time', 'Host CPU time',
psutils.get_cpu_time(), nil, nil, {default = true})

local new_threads = {}
for _, thread_info in ipairs(psutils.get_process_cpu_time()) do
@@ -29,12 +30,12 @@ local function update_cpu_metrics()
local utime_labels = table.copy(labels)
utime_labels.kind = 'user'
collectors_list.cpu_thread = utils.set_gauge('cpu_thread', 'Tarantool thread cpu time',
thread_info.utime, utime_labels)
thread_info.utime, utime_labels, nil, {default = true})

local stime_labels = table.copy(labels)
stime_labels.kind = 'system'
collectors_list.cpu_thread = utils.set_gauge('cpu_thread', 'Tarantool thread cpu time',
thread_info.stime, stime_labels)
thread_info.stime, stime_labels, nil, {default = true})

threads[thread_info.pid] = nil
new_threads[thread_info.pid] = labels
6 changes: 4 additions & 2 deletions metrics/tarantool/clock.lua
@@ -25,8 +25,10 @@ local function update_clock_metrics()
end
end

collectors_list.clock_delta = utils.set_gauge('clock_delta', 'Clock difference', min_delta * 1e-6, {delta = 'min'})
collectors_list.clock_delta = utils.set_gauge('clock_delta', 'Clock difference', max_delta * 1e-6, {delta = 'max'})
collectors_list.clock_delta = utils.set_gauge('clock_delta', 'Clock difference',
min_delta * 1e-6, {delta = 'min'}, nil, {default = true})
collectors_list.clock_delta = utils.set_gauge('clock_delta', 'Clock difference',
max_delta * 1e-6, {delta = 'max'}, nil, {default = true})
end

return {
Expand Down
6 changes: 4 additions & 2 deletions metrics/tarantool/cpu.lua
Original file line number Diff line number Diff line change
Expand Up @@ -70,8 +70,10 @@ end
local function update_info_metrics()
local cpu_time = ss_get_rusage()
if cpu_time then
collectors_list.cpu_user_time = utils.set_gauge('cpu_user_time', 'CPU user time usage', cpu_time.ru_utime)
collectors_list.cpu_system_time = utils.set_gauge('cpu_system_time', 'CPU system time usage', cpu_time.ru_stime)
collectors_list.cpu_user_time = utils.set_gauge('cpu_user_time', 'CPU user time usage',
cpu_time.ru_utime, nil, nil, {default = true})
collectors_list.cpu_system_time = utils.set_gauge('cpu_system_time', 'CPU system time usage',
cpu_time.ru_stime, nil, nil, {default = true})
end
end

9 changes: 6 additions & 3 deletions metrics/tarantool/event_loop.lua
@@ -42,9 +42,12 @@ end

local function update_info_metrics()
local ev_once_time, ev_prolog, ev_epilog = evloop_time()
collectors_list.ev_loop_time = utils.set_gauge('ev_loop_time', 'Event loop time (ms)', ev_once_time)
collectors_list.ev_prolog_time = utils.set_gauge('ev_loop_prolog_time', 'Event loop prolog time (ms)', ev_prolog)
collectors_list.ev_epilog_time = utils.set_gauge('ev_loop_epilog_time', 'Event loop epilog time (ms)', ev_epilog)
collectors_list.ev_loop_time = utils.set_gauge('ev_loop_time', 'Event loop time (ms)',
ev_once_time, nil, nil, {default = true})
collectors_list.ev_prolog_time = utils.set_gauge('ev_loop_prolog_time', 'Event loop prolog time (ms)',
ev_prolog, nil, nil, {default = true})
collectors_list.ev_epilog_time = utils.set_gauge('ev_loop_epilog_time', 'Event loop epilog time (ms)',
ev_epilog, nil, nil, {default = true})
end

return {
12 changes: 8 additions & 4 deletions metrics/tarantool/fibers.lua
@@ -17,10 +17,14 @@ local function update_fibers_metrics()
fused = fused + f.memory.used
end

collectors_list.fiber_amount = utils.set_gauge('fiber_amount', 'Amount of fibers', fibers)
collectors_list.fiber_csw = utils.set_gauge('fiber_csw', 'Fibers csw', csws)
collectors_list.fiber_memalloc = utils.set_gauge('fiber_memalloc', 'Fibers memalloc', falloc)
collectors_list.fiber_memused = utils.set_gauge('fiber_memused', 'Fibers memused', fused)
collectors_list.fiber_amount = utils.set_gauge('fiber_amount', 'Amount of fibers',
fibers, nil, nil, {default = true})
collectors_list.fiber_csw = utils.set_gauge('fiber_csw', 'Fibers csw',
csws, nil, nil, {default = true})
collectors_list.fiber_memalloc = utils.set_gauge('fiber_memalloc', 'Fibers memalloc',
falloc, nil, nil, {default = true})
collectors_list.fiber_memused = utils.set_gauge('fiber_memused', 'Fibers memused',
fused, nil, nil, {default = true})
end

return {