stats: integrate with metrics rock
If `metrics` [1] is found, you can use metrics collectors to store
statistics. `metrics >= 0.10.0` is required to use the metrics driver.
(`metrics >= 0.9.0` is required to use summary quantiles with
age buckets. `metrics >= 0.5.0, < 0.9.0` is unsupported
due to a quantile overflow bug [2]. `metrics == 0.9.0` has a bug that
does not permit creating a summary collector without quantiles [3].
In fact, a user may use `metrics >= 0.5.0`, `metrics != 0.9.0`
to use metrics without quantiles, and `metrics >= 0.9.0`
to use metrics with quantiles. But this is confusing,
so let's use a single restriction for both cases.)

The metrics are part of the global registry and can be exported together
(e.g. to Prometheus) with default tools without any additional
configuration. Disabling stats destroys the collectors.

Metrics collectors are used by default if supported. To set the driver
explicitly, call `crud.cfg{ stats = true, stats_driver = driver }`
(`'local'` or `'metrics'`). To enable quantiles, call
```
crud.cfg{
    stats = true,
    stats_driver = 'metrics',
    stats_quantiles = true,
}
```
With quantiles, `latency` statistics change to the 0.99 quantile
of request execution time (with aging). Quantile computation increases
performance overhead by up to 10% when used in statistics.
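
To illustrate why a 0.99 quantile can differ sharply from an average,
here is a minimal plain-Lua sketch. It uses an exact nearest-rank
quantile over a full sample, which is an assumption made for
illustration only: the `metrics` summary collector uses its own
streaming computation with age buckets, not this code. The helper names
`quantile` and `average` are hypothetical.

```lua
-- Exact nearest-rank quantile over a complete sample (illustration only,
-- not the algorithm used by the metrics summary collector).
local function quantile(samples, q)
    assert(#samples > 0, 'samples must not be empty')
    local sorted = {}
    for i, v in ipairs(samples) do
        sorted[i] = v
    end
    table.sort(sorted)
    local rank = math.max(1, math.ceil(q * #sorted))
    return sorted[rank]
end

-- Plain arithmetic mean of the same sample.
local function average(samples)
    local sum = 0
    for _, v in ipairs(samples) do
        sum = sum + v
    end
    return sum / #samples
end

-- Hypothetical sample: 98 fast requests (1 ms) and 2 slow ones (500 ms).
local latencies = {}
for i = 1, 98 do
    latencies[i] = 0.001
end
latencies[99] = 0.5
latencies[100] = 0.5

print('average:', average(latencies))
print('0.99 quantile:', quantile(latencies, 0.99))
```

The average stays close to the fast requests, while the quantile
reports the slow tail.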

Add a CI matrix to run tests with `metrics` installed. To get full
coverage on coveralls, #248 must be resolved.

1. https://github.com/tarantool/metrics
2. tarantool/metrics#235
3. tarantool/metrics#262

Closes #224
DifferentialOrange committed Feb 24, 2022
1 parent 27d379e commit 2966f4c
Showing 12 changed files with 1,192 additions and 136 deletions.
21 changes: 20 additions & 1 deletion .github/workflows/test_on_push.yaml
@@ -13,13 +13,21 @@ jobs:
matrix:
# We need 1.10.6 here to check that module works with
# old Tarantool versions that don't have "tuple-keydef"/"tuple-merger" support.
tarantool-version: ["1.10.6", "1.10", "2.2", "2.3", "2.4", "2.5", "2.6", "2.7"]
tarantool-version: ["1.10.6", "1.10", "2.2", "2.3", "2.4", "2.5", "2.6", "2.7", "2.8"]
metrics-version: [""]
remove-merger: [false]
include:
- tarantool-version: "1.10"
metrics-version: "0.12.0"
- tarantool-version: "2.7"
remove-merger: true
- tarantool-version: "2.8"
metrics-version: "0.1.8"
- tarantool-version: "2.8"
metrics-version: "0.10.0"
- tarantool-version: "2.8"
coveralls: true
metrics-version: "0.12.0"
fail-fast: false
runs-on: [ubuntu-latest]
steps:
@@ -47,6 +55,10 @@ jobs:
tarantool --version
./deps.sh
- name: Install metrics
if: matrix.metrics-version != ''
run: tarantoolctl rocks install metrics ${{ matrix.metrics-version }}

- name: Remove external merger if needed
if: ${{ matrix.remove-merger }}
run: rm .rocks/lib/tarantool/tuple/merger.so
@@ -71,6 +83,7 @@ jobs:
strategy:
matrix:
bundle_version: [ "1.10.11-0-gf0b0e7ecf-r422", "2.7.3-0-gdddf926c3-r422" ]
metrics-version: ["", "0.12.0"]
fail-fast: false
runs-on: [ ubuntu-latest ]
steps:
@@ -86,6 +99,12 @@ jobs:
tarantool --version
./deps.sh
- name: Install metrics
if: matrix.metrics-version != ''
run: |
source tarantool-enterprise/env.sh
tarantoolctl rocks install metrics ${{ matrix.metrics-version }}
# This server starts and listen on 8084 port that is used for tests
- name: Stop Mono server
run: sudo kill -9 $(sudo lsof -t -i tcp:8084) || true
1 change: 1 addition & 0 deletions .luacheckrc
@@ -3,3 +3,4 @@ globals = {'box', 'utf8', 'checkers', '_TARANTOOL'}
include_files = {'**/*.lua', '*.luacheckrc', '*.rockspec'}
exclude_files = {'**/*.rocks/', 'tmp/', 'tarantool-enterprise/'}
max_line_length = 120
max_comment_line_length = 150
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.

### Added
* Statistics for CRUD operations on router (#224).
* Integrate CRUD statistics with [`metrics`](https://github.com/tarantool/metrics) (#224).

### Changed

60 changes: 57 additions & 3 deletions README.md
@@ -694,11 +694,28 @@ crud.cfg{ stats = false }
crud.reset_stats()
```

If [`metrics`](https://github.com/tarantool/metrics) `0.10.0` or greater is
found, metrics collectors will be used by default to store statistics
instead of local collectors. Quantiles in metrics summary collections
are disabled by default. You can manually choose the driver and enable quantiles.
```lua
-- Use metrics collectors. (Default if metrics found).
crud.cfg{ stats = true, stats_driver = 'metrics' }

-- Use metrics collectors with 0.99 quantiles.
crud.cfg{ stats = true, stats_driver = 'metrics', stats_quantiles = true }

-- Use simple local collectors.
crud.cfg{ stats = true, stats_driver = 'local' }
```

You can use `crud.cfg` to check current stats state.
```lua
crud.cfg
---
- stats: true
- stats_quantiles: true
stats: true
stats_driver: local
...
```
Beware that iterating through `crud.cfg` with pairs is not supported yet,
@@ -750,9 +767,39 @@ and `borders` (for `min` and `max` calls).
Each operation section contains of different collectors
for success calls and error (both error throw and `nil, err`)
returns. `count` is total requests count since instance start
or stats restart. `latency` is average time of requests execution,
or stats restart. `latency` is 0.99 quantile of request execution
time if `metrics` driver used and quantiles enabled,
otherwise `latency` is total average.
`time` is the total time of requests execution.
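
As a rough sketch of how the "total average" relates to the other two
collectors, the average latency can be derived as `time / count`. The
helper name `average_latency` and the zero-count behavior are
assumptions for illustration, not the module's actual code; the sample
numbers are borrowed from the `metrics:collect()` example in this README.

```lua
-- Hypothetical helper: average latency as total time divided by
-- request count. Returning 0 for an empty collector is an assumption.
local function average_latency(count, time)
    if count == 0 then
        return 0
    end
    return time / count
end

-- Sample values borrowed from the README example output.
local latency = average_latency(221411, 10.49834896344692)
print(string.format('average latency: %.8f s', latency))
```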

In [`metrics`](https://www.tarantool.io/en/doc/latest/book/monitoring/)
registry statistics are stored as `tnt_crud_stats` metrics
with `operation`, `status` and `name` labels.
```
metrics:collect()
---
- - label_pairs:
status: ok
operation: insert
name: customers
value: 221411
metric_name: tnt_crud_stats_count
- label_pairs:
status: ok
operation: insert
name: customers
value: 10.49834896344692
metric_name: tnt_crud_stats_sum
- label_pairs:
status: ok
operation: insert
name: customers
quantile: 0.99
value: 0.00023606420935973
metric_name: tnt_crud_stats
...
```
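
The collected observations can be filtered by metric name and labels.
The sketch below runs in plain Lua on a hand-written table shaped like
the output above; in a real instance the list would presumably come from
the `metrics` module's `collect()` call. The helper name
`find_observation` is hypothetical.

```lua
-- Sample observations shaped like the output above (values copied from it).
local observations = {
    {
        metric_name = 'tnt_crud_stats_count',
        value = 221411,
        label_pairs = { status = 'ok', operation = 'insert', name = 'customers' },
    },
    {
        metric_name = 'tnt_crud_stats_sum',
        value = 10.49834896344692,
        label_pairs = { status = 'ok', operation = 'insert', name = 'customers' },
    },
    {
        metric_name = 'tnt_crud_stats',
        value = 0.00023606420935973,
        label_pairs = { status = 'ok', operation = 'insert', name = 'customers',
                        quantile = 0.99 },
    },
}

-- Hypothetical helper: return the first observation with the given
-- metric name whose labels include every pair from `labels`.
local function find_observation(obs_list, metric_name, labels)
    for _, obs in ipairs(obs_list) do
        if obs.metric_name == metric_name then
            local matches = true
            for key, expected in pairs(labels) do
                if obs.label_pairs[key] ~= expected then
                    matches = false
                    break
                end
            end
            if matches then
                return obs
            end
        end
    end
    return nil
end

local wanted = { operation = 'insert', name = 'customers', status = 'ok' }
local count = find_observation(observations, 'tnt_crud_stats_count', wanted)
print('insert count for customers:', count.value)
```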

`select` section additionally contains `details` collectors.
```lua
crud.stats('my_space').select.details
@@ -769,6 +816,10 @@ looked up on storages while collecting responses for calls (including
scrolls for multibatch requests). Details data is updated as part of
the request process, so you may get new details before `select`/`pairs`
call is finished and observed with count, latency and time collectors.
In [`metrics`](https://www.tarantool.io/en/doc/latest/book/monitoring/)
registry they are stored as `tnt_crud_map_reduces`,
`tnt_crud_tuples_fetched` and `tnt_crud_tuples_lookup` metrics
with `{ operation = 'select', name = space_name }` labels.

Since `pairs` request behavior differs from any other crud request, its
statistics collection also has specific behavior. Statistics (`select`
@@ -780,7 +831,10 @@ collector.

Statistics are preserved between package reloads. Statistics are preserved
between [Tarantool Cartridge role reloads](https://www.tarantool.io/en/doc/latest/book/cartridge/cartridge_api/modules/cartridge.roles/#reload)
if you use CRUD Cartridge roles.
if you use CRUD Cartridge roles. Beware that metrics 0.12.0 and below do not
support preserving stats between role reload
(see [tarantool/metrics#334](https://github.com/tarantool/metrics/issues/334)),
thus this feature will be unsupported for `metrics` driver.

## Cartridge roles

70 changes: 59 additions & 11 deletions crud/cfg.lua
@@ -17,11 +17,49 @@ local function set_defaults_if_empty(cfg)
cfg.stats = false
end

if cfg.stats_driver == nil then
cfg.stats_driver = stats.get_default_driver()
end

if cfg.stats_quantiles == nil then
cfg.stats_quantiles = false
end

return cfg
end

local cfg = set_defaults_if_empty(stash.get(stash.name.cfg))

local function configure_stats(cfg, opts)
if (opts.stats == nil)
and (opts.stats_driver == nil)
and (opts.stats_quantiles == nil) then
return
end

if opts.stats == nil then
opts.stats = cfg.stats
end

if opts.stats_driver == nil then
opts.stats_driver = cfg.stats_driver
end

if opts.stats_quantiles == nil then
opts.stats_quantiles = cfg.stats_quantiles
end

if opts.stats == true then
stats.enable{ driver = opts.stats_driver, quantiles = opts.stats_quantiles }
else
stats.disable()
end

rawset(cfg, 'stats', opts.stats)
rawset(cfg, 'stats_driver', opts.stats_driver)
rawset(cfg, 'stats_quantiles', opts.stats_quantiles)
end

--- Configure CRUD module.
--
-- @function __call
@@ -34,22 +72,32 @@ local cfg = set_defaults_if_empty(stash.get(stash.name.cfg))
-- Enable or disable statistics collect.
-- Statistics are observed only on router instances.
--
-- @string[opt] opts.stats_driver
-- `'local'` or `'metrics'`.
-- If `'local'`, stores statistics in local registry (some Lua tables)
-- and computes latency as overall average. `'metrics'` requires
-- `metrics >= 0.10.0` installed and stores statistics in
-- global metrics registry (integrated with exporters).
-- `'metrics'` driver supports computing latency as 0.99 quantile with aging.
-- If `'metrics'` driver is available, it is used by default,
-- otherwise `'local'` is used.
--
-- @bool[opt] opts.stats_quantiles
-- Enable or disable statistics quantiles (only for metrics driver).
-- Quantiles computations increases performance overhead up to 10%.
--
-- @return Configuration table.
--
local function __call(self, opts)
checks('table', { stats = '?boolean' })
checks('table', {
stats = '?boolean',
stats_driver = '?string',
stats_quantiles = '?boolean'
})

opts = opts or {}
opts = table.deepcopy(opts) or {}

if opts.stats ~= nil then
if opts.stats == true then
stats.enable()
else
stats.disable()
end

rawset(cfg, 'stats', opts.stats)
end
configure_stats(cfg, opts)

return self
end
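
The option merging done by `configure_stats` can be modelled in plain
Lua without Tarantool. This is a simplified sketch, not the code from
the commit: it returns a new table instead of filling `opts` in place,
and the name `merge_stats_opts` is hypothetical. It shows that any
stats option left unset falls back to the current configuration, and
that a call touching no stats option changes nothing.

```lua
local STATS_KEYS = { 'stats', 'stats_driver', 'stats_quantiles' }

-- Hypothetical model of the merge: returns nil when no stats option is
-- passed, otherwise a table where unset options come from `current`.
local function merge_stats_opts(current, opts)
    local any_set = false
    for _, key in ipairs(STATS_KEYS) do
        if opts[key] ~= nil then
            any_set = true
        end
    end
    if not any_set then
        return nil
    end

    local merged = {}
    for _, key in ipairs(STATS_KEYS) do
        -- An explicit nil check keeps `false` values from being overridden.
        if opts[key] ~= nil then
            merged[key] = opts[key]
        else
            merged[key] = current[key]
        end
    end
    return merged
end

local current = { stats = true, stats_driver = 'local', stats_quantiles = false }
local merged = merge_stats_opts(current, { stats_driver = 'metrics' })
-- stats stays enabled, the driver switches, quantiles stay disabled.
print(merged.stats, merged.stats_driver, merged.stats_quantiles)
```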
6 changes: 5 additions & 1 deletion crud/common/stash.lua
@@ -16,10 +16,14 @@ local stash = {}
-- @tfield string stats_local_registry
-- Stash for local metrics registry.
--
-- @tfield string stats_metrics_registry
-- Stash for metrics rocks statistics registry.
--
stash.name = {
cfg = '__crud_cfg',
stats_internal = '__crud_stats_internal',
stats_local_registry = '__crud_stats_local_registry'
stats_local_registry = '__crud_stats_local_registry',
stats_metrics_registry = '__crud_stats_metrics_registry'
}

--- Setup Tarantool Cartridge reload.