Export time-series metrics re: LSM state broken down by level #65276

Closed
joshimhoff opened this issue May 14, 2021 · 1 comment
Labels
A-storage Relating to our storage engine (Pebble) on-disk storage. C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) O-sre For issues SRE opened or otherwise cares about tracking. T-storage Storage Team

Comments

@joshimhoff
Collaborator

joshimhoff commented May 14, 2021

Stats like count & size make fantastic time-series metrics:

[n1,s1] 5660606 +__level_____count____size___score______in__ingest(sz_cnt)____move(sz_cnt)___write(sz_cnt)____read___r-amp___w-amp
[n1,s1] 5660606 +    WAL         1    53 M       -    32 G       -       -       -       -    33 G       -       -       -     1.0
[n1,s1] 5660606 +      0        13    22 M    3.14    33 G    39 M       3     0 B       0   4.4 G   2.6 K     0 B       3     0.1
[n1,s1] 5660606 +      1         0     0 B    0.00     0 B     0 B       0     0 B       0     0 B       0     0 B       0     0.0
[n1,s1] 5660606 +      2         4    16 M    0.96   4.4 G    26 M       1    16 M       8    20 G   7.7 K    21 G       1     4.5
[n1,s1] 5660606 +      3        10    47 M    0.59   3.1 G     0 B       0   375 M      99    28 G    11 K    29 G       1     9.1
[n1,s1] 5660606 +      4       189   119 M    0.16   2.7 G     0 B       0   550 M      95   110 G    12 K   110 G       1    40.3
[n1,s1] 5660606 +      5      1154    33 G    1.00   6.5 G    10 K       8   471 M      84   148 G   5.3 K   148 G       1    22.8
[n1,s1] 5660606 +      6      5438   280 G       -   6.7 G     0 B       0     0 B       0    40 G     701    42 G       1     5.9
[n1,s1] 5660606 +  total      6808   313 G       -    33 G    65 M      12   1.4 G     286   383 G    39 K   350 G       8    11.6
[n1,s1] 5660606 +  flush       590
[n1,s1] 5660606 +compact      3758    38 M            19 G  (size == estimated-debt, in = in-progress-bytes)
[n1,s1] 5660606 + memtbl         1    64 M
[n1,s1] 5660606 +zmemtbl         0     0 B
[n1,s1] 5660606 +   ztbl         0     0 B
[n1,s1] 5660606 + bcache     180 K   3.0 G   81.3%  (score == hit-rate)
[n1,s1] 5660606 + tcache     6.8 K   4.0 M  100.0%  (score == hit-rate)
[n1,s1] 5660606 + titers        13
[n1,s1] 5660606 + filter         -       -   91.6%  (score == utility)

My understanding is that the number of levels doesn't grow forever, so we can break the metrics down by level without worrying about cardinality limits.

It'll be much quicker to make basic sense of what is going on with such metrics. Logs are good too, but metrics can be read at a glance. It is also easier to correlate such metrics with impact-based metrics (SQL latencies or kvprober error rate) than it is to do so with logs.

@petermattis points out that Pebble exports such metrics even if CRDB doesn't export them: https://github.com/cockroachdb/pebble/blob/master/metrics.go#L125.
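To make the per-level idea concrete, here is a rough sketch of what the exported series could look like. Everything in it is an assumption for illustration: the `levelStats` struct, the `formatLevelMetrics` helper, the metric names, and the `store` / `level` labels are all hypothetical and are not Pebble's or CRDB's actual types or names. The level count of 7 is taken from the log output above (levels 0 through 6), and the sample sizes are approximations of the "22 M" and "280 G" entries.

```go
package main

import (
	"fmt"
	"strings"
)

// numLevels is an assumption based on the log output above (levels 0 through 6).
const numLevels = 7

// levelStats is a hypothetical struct for this sketch; it is not Pebble's metrics type.
type levelStats struct {
	NumFiles  int64
	SizeBytes int64
}

// formatLevelMetrics renders one line per (metric, level) pair in a
// Prometheus-style text format. The metric and label names are made up.
func formatLevelMetrics(store string, levels [numLevels]levelStats) []string {
	lines := make([]string, 0, 2*numLevels)
	for lvl, s := range levels {
		lines = append(lines,
			fmt.Sprintf("storage_lsm_level_files{store=%q,level=\"%d\"} %d", store, lvl, s.NumFiles),
			fmt.Sprintf("storage_lsm_level_size_bytes{store=%q,level=\"%d\"} %d", store, lvl, s.SizeBytes),
		)
	}
	return lines
}

func main() {
	var levels [numLevels]levelStats
	// Approximate values from the log above: L0 has 13 files (~22 MiB),
	// L6 has 5438 files (~280 GiB).
	levels[0] = levelStats{NumFiles: 13, SizeBytes: 22 << 20}
	levels[6] = levelStats{NumFiles: 5438, SizeBytes: 280 << 30}
	fmt.Println(strings.Join(formatLevelMetrics("1", levels), "\n"))
}
```

With a fixed number of levels, each per-level metric adds a constant number of series per store (7 in this sketch), which is why the breakdown shouldn't run into cardinality trouble.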

Here is an example of a metric that can be broken down by level:

image

This will complement #65277.

Jira issue: CRDB-7558

@joshimhoff joshimhoff added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-storage Relating to our storage engine (Pebble) on-disk storage. T-observability labels May 14, 2021
@joshimhoff joshimhoff changed the title Export time-series metrics re: LSM stae Export time-series metrics re: LSM state May 14, 2021
@joshimhoff joshimhoff added the O-sre For issues SRE opened or otherwise cares about tracking. label May 14, 2021
@joshimhoff joshimhoff changed the title Export time-series metrics re: LSM state Export time-series metrics re: LSM state broken down by level May 14, 2021
@exalate-issue-sync exalate-issue-sync bot added T-storage Storage Team and removed T-observability-inf labels Apr 11, 2023
@jbowens
Collaborator

jbowens commented Apr 17, 2023

Fixed by #88504.
