Timeout and unit tests for agent healthcheck #2437
2646fcd
to
402c05e
Compare
Nice to have the more granular view here.
)

func TestHealthcheck_Sunny(t *testing.T) {
ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
Neat!
}

func TestHealthcheck_InvalidURL2(t *testing.T) {
// leading space in url is invalid
I know it’s supposed to be there but it still really bothers me.
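For context on why the leading space makes the URL invalid: Go's `net/url` parser finds no valid scheme when the string starts with a space, so it treats the whole string as a path and then rejects the colon in the first path segment. A minimal sketch, assuming a hypothetical `validateHealthcheckURL` helper and a placeholder URL (neither is taken from this PR):

```go
package main

import (
	"fmt"
	"net/http"
)

// validateHealthcheckURL is a hypothetical helper, not the agent's code.
// http.NewRequest parses the URL and returns an error if it is malformed.
func validateHealthcheckURL(rawURL string) error {
	_, err := http.NewRequest(http.MethodGet, rawURL, nil)
	return err
}

func main() {
	// Placeholder URL for illustration only.
	const good = "http://localhost:8080/health"

	// The leading space means no scheme is recognized, so parsing fails.
	fmt.Println("leading space:", validateHealthcheckURL(" "+good))
	fmt.Println("clean url:", validateHealthcheckURL(good))
}
```

A test along the lines of `TestHealthcheck_InvalidURL2` would then only need to assert that the healthcheck reports failure for the space-prefixed URL.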
96b9674
to
435b1f1
Compare
@@ -87,6 +87,11 @@ func StartSession(params *TelemetrySessionParams, statsEngine stats.Engine) erro
seelog.Errorf("Error: lost websocket connection with ECS Telemetry service (TCS): %v", tcsError)
params.time().Sleep(backoff.Duration())
}
select {
Unrelated to this change, but this is a fix for when TestDoStartCgroupInitHappyPath has a failure after the test goroutine has already exited.
435b1f1
to
21812bb
Compare
cad4d90
to
cbe81e3
Compare
@@ -172,6 +172,7 @@ func createVolumeTask(scope, arn, volume string, autoprovision bool) (*apitask.T
DriverOpts: map[string]string{
"device": tmpDirectory,
"o": "bind",
"type": "tmpfs",
This was added for compatibility with newer Docker versions; see docker-archive/engine@d5b271c
Summary
The Docker healthcheck default timeout is 30s. We don't currently set any HTTP timeout on our internal healthcheck, so if the request hangs we neither log anything in the agent nor explicitly return a failed healthcheck to Docker.
Adding a timeout of 25s means the request times out inside the agent codebase before Docker's own limit is reached, so we can log the error and then explicitly return the failed healthcheck error code.
Also did a little refactor to allow adding unit tests.
New tests cover the changes: yes
test output:
Description for the changelog
Log agent healthcheck timeouts.
Licensing
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.