Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: Add new metric request_throughput #619

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

tinitiuset
Copy link

What this PR does:

This PR adds two configuration parameters server.throughput-config.slow-request-cutoff and server.throughput-config.unit, exposes a new metric slow_request_server_throughput and calculates throughput in units/s for the metric using information from header Server-Timing.

If server.throughput-config.slow-request-cutoff is 0 no throughput will be calculated.

Implemented code is really similar to this branch by @krajorama. But adds some flexibility to measure throughput based on different signals. It will default to total_samples as processed samples are easier to explain to users. Discussed with @dimitarvdimitrov.

Checklist

  • Tests updated
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@CLAassistant
Copy link

CLAassistant commented Nov 14, 2024

CLA assistant check
All committers have signed the CLA.

Copy link
Contributor

@dimitarvdimitrov dimitarvdimitrov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for opening!

The only thing I'm not sure about is the cutoff in milliseconds. I don't think that's what the original issue discussed, can you double-check?

It would also be nice to see some tests, the logic is getting a bit more complicated and shouldn't stay untested.

CHANGELOG.md Outdated Show resolved Hide resolved
server/server.go Outdated Show resolved Hide resolved
server/metrics.go Show resolved Hide resolved
middleware/instrument.go Outdated Show resolved Hide resolved
middleware/instrument.go Outdated Show resolved Hide resolved
middleware/instrument.go Outdated Show resolved Hide resolved
server/metrics.go Outdated Show resolved Hide resolved
middleware/instrument.go Outdated Show resolved Hide resolved
server/server.go Outdated Show resolved Hide resolved
server/server.go Outdated Show resolved Hide resolved
@tinitiuset tinitiuset changed the title Feature: Add new metric slow_request_server_throughput Feature: Add new metric slow_request_throughput Nov 15, 2024
middleware/instrument.go Outdated Show resolved Hide resolved
middleware/instrument.go Outdated Show resolved Hide resolved
server/server.go Outdated Show resolved Hide resolved
@tinitiuset tinitiuset changed the title Feature: Add new metric slow_request_throughput Feature: Add new metric request_throughput Nov 21, 2024
Copy link
Contributor

@dimitarvdimitrov dimitarvdimitrov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mostly LGTM, although looking at the Server-Timing spec, i think we're missing some cases in how it can be formatted.

Comment on lines +27 to +29
{"WithoutSleep", false, "unit=0, other_unit=2", false},
{"WithoutSleep", true, "", false},
{"WithoutSleep", false, "", false},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you update the names of these tests?

middleware/instrument_test.go Show resolved Hide resolved
middleware/instrument_test.go Show resolved Hide resolved
expected int64
err bool
}{
{"key0=0, key1=1", "key0", 0, false},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add keys to each struct initialization?

Suggested change
{"key0=0, key1=1", "key0", 0, false},
{header: "key0=0, key1=1", key: "key0", expected: 0, err: false},

maybe putting them on separate lines might make this more readable

Suggested change
{"key0=0, key1=1", "key0", 0, false},
{
header: "key0=0, key1=1",
key: "key0",
expected: 0,
err: false,
},

Comment on lines +143 to +145
if value != tt.expected {
t.Errorf("expected value: %d, got: %d", tt.expected, value)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we normally use testify for these kind of comparions:

Suggested change
if value != tt.expected {
t.Errorf("expected value: %d, got: %d", tt.expected, value)
}
assert.Equal(t, tt.expected, value)

it reduces the boilerplate

@@ -209,6 +216,8 @@ func (cfg *Config) RegisterFlags(f *flag.FlagSet) {
f.StringVar(&cfg.LogRequestExcludeHeadersList, "server.log-request-headers-exclude-list", "", "Comma separated list of headers to exclude from loggin. Only used if server.log-request-headers is true.")
f.BoolVar(&cfg.LogRequestAtInfoLevel, "server.log-request-at-info-level-enabled", false, "Optionally log requests at info level instead of debug level. Applies to request headers as well if server.log-request-headers is enabled.")
f.BoolVar(&cfg.ProxyProtocolEnabled, "server.proxy-protocol-enabled", false, "Enables PROXY protocol.")
f.DurationVar(&cfg.Throughput.RequestCutoff, "server.throughput.request-cutoff", 0, "Duration after which a request will be observed. For requests taking longer than this duration to finish, the throughput will be calculated. If set to 0, the throughput will not be calculated.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when reading this in the context of all the config in server.Config I am not sure what request cutoff does. Can you clarify how it interacts with the Server-Timing header, the unit, and the request_throughput histogram?

@@ -209,6 +216,8 @@ func (cfg *Config) RegisterFlags(f *flag.FlagSet) {
f.StringVar(&cfg.LogRequestExcludeHeadersList, "server.log-request-headers-exclude-list", "", "Comma separated list of headers to exclude from loggin. Only used if server.log-request-headers is true.")
f.BoolVar(&cfg.LogRequestAtInfoLevel, "server.log-request-at-info-level-enabled", false, "Optionally log requests at info level instead of debug level. Applies to request headers as well if server.log-request-headers is enabled.")
f.BoolVar(&cfg.ProxyProtocolEnabled, "server.proxy-protocol-enabled", false, "Enables PROXY protocol.")
f.DurationVar(&cfg.Throughput.RequestCutoff, "server.throughput.request-cutoff", 0, "Duration after which a request will be observed. For requests taking longer than this duration to finish, the throughput will be calculated. If set to 0, the throughput will not be calculated.")
f.StringVar(&cfg.Throughput.Unit, "server.throughput.unit", "total_samples", "Unit of the server throughput metric, for example 'processed_bytes' or 'total_samples'. Observed values will be gathered from Server-Timing header with defined key. If set, it is appended to the request_server_throughput metric name.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one more thing - can you given an example value for the header and also mention that it's expected it is an integer value? Server-Timing: total_sample=20

@@ -73,5 +75,15 @@ func NewServerMetrics(cfg Config) *Metrics {
Name: "inflight_requests",
Help: "Current number of inflight requests.",
}, []string{"method", "route"}),
RequestThroughput: reg.NewHistogramVec(prometheus.HistogramOpts{
Namespace: cfg.MetricsNamespace,
Name: "slow_request_throughput_" + cfg.Throughput.Unit,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you suggested we change this to request_throughput_samples_total, are you still making the change?

}{
{"key0=0, key1=1", "key0", 0, false},
{"key0=0, key1=1", "key1", 1, false},
{"key0=0, key1=1", "key2", 0, true},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i checked the format for Server-Timing according to mozilla https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Server-Timing

can you add a few more test cases to make sure we're handling all of them correctly? Maybe one with dur, one with a ;, one with repeated keys.

Name: "request_duration_seconds",
Help: "Time (in seconds) spent serving HTTP requests.",
Buckets: instrument.DefBuckets,
NativeHistogramBucketFactor: metricsNativeHistogramFactor,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need the native histogram config for all of these histograms? it's only test and we only export the text-based format, which doesn't include native histograms.

I'm trying to keep the function as simple as possible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants