-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
k6runner: use check context for http request #715
Conversation
internal/k6runner/k6runner.go
Outdated
// The context above carries the check timeout, which will be eventually passed to k6 by the runner at the other end | ||
// of this request. To account for network overhead, we add a second of grace time to the script timeout to form | ||
// the timeout of this request. | ||
reqCtx, cancel := context.WithTimeout(ctx, k6Timeout+time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this actually work? The parent context ctx
will timeout first, causing the child context to timeout, too, and ignore the extra second...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're right, I was thinking "if this is not timing out before it won't now" but that does not make any sense... 🤦🏻♀️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've worked around this by creating a different context entirely: 409666a
k6runner: add context to http request
* Chore(deps): Bump golang.org/x/net from 0.24.0 to 0.25.0 * Chore(deps): Bump the prometheus-go group across 1 directory with 2 updates * Chore(deps): Bump github.com/mccutchen/go-httpbin/v2 * Update to grafana-build-tools v0.11.0 (#705) * Remove adhoc and traceroute feature flags. (#707) * Chore(deps): Bump github.com/KimMachineGun/automemlimit * checks/test: make timer big enough for context cancel to be picked up * Fix: Interpolate variables into MultiHTTP request bodies (#713) * Chore(deps): Bump github.com/prometheus/prometheus (#716) * Chore(deps): Bump github.com/rs/zerolog from 1.32.0 to 1.33.0 * --- updated-dependencies: - dependency-name: kernel.org/pub/linux/libs/security/libcap/cap dependency-type: direct:production update-type: version-update:semver-patch ... * MultiHttp script should decode the payload as a string and replace the variable placeholders with the values. The result should be assigned to the request body. (#717) * Build(deps): Bump golang.org/x/net from 0.25.0 to 0.26.0 * Enable K6 by default in agent deployments (#722) * Fix: deprecate --features and warn user (#726) * k6runner: use check context for http request (#715) Signed-off-by: Anant Sharma <[email protected]>
* Chore(deps): Bump golang.org/x/net from 0.24.0 to 0.25.0 * Chore(deps): Bump the prometheus-go group across 1 directory with 2 updates * Chore(deps): Bump github.com/mccutchen/go-httpbin/v2 * Update to grafana-build-tools v0.11.0 (#705) * Remove adhoc and traceroute feature flags. (#707) * Chore(deps): Bump github.com/KimMachineGun/automemlimit * checks/test: make timer big enough for context cancel to be picked up * Fix: Interpolate variables into MultiHTTP request bodies (#713) * Chore(deps): Bump github.com/prometheus/prometheus (#716) * Chore(deps): Bump github.com/rs/zerolog from 1.32.0 to 1.33.0 * --- updated-dependencies: - dependency-name: kernel.org/pub/linux/libs/security/libcap/cap dependency-type: direct:production update-type: version-update:semver-patch ... * MultiHttp script should decode the payload as a string and replace the variable placeholders with the values. The result should be assigned to the request body. (#717) * Build(deps): Bump golang.org/x/net from 0.25.0 to 0.26.0 * Enable K6 by default in agent deployments (#722) * Fix: deprecate --features and warn user (#726) * k6runner: use check context for http request (#715) Signed-off-by: Anant Sharma <[email protected]>
This PR instructs the HTTP-backed k6 runner (Grafana Cloud) to use the context carrying the script timeout as the request context for HTTP requests made to that k6 runner. An extra second of grace time is added to account for network latency is added to the check timeout.
This fixes a potential deadlock where the agent would wait for the HTTP request to complete forever, even longer than the maximum time the script is allowed to run. This does not occur frequently, but could happen on certain scenarios such as when GC runners are abruptly terminated.
Note that this is a stopgap measure to prevent deadlocks in the agent. Ideally, we should have a separate settings for the time we wait for the runner to respond, and the time that the runner allows k6 to wait for. See details about the long-term proposal here: https://docs.google.com/document/d/1kq5xC4izHyJ9WOonS308J8sE3uDG48c6gj4ZPHsWIx4/edit.