Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

container stats: ignore context.Canceled errors #2813

Merged
merged 1 commit into from
Feb 24, 2021

Conversation

sparrc
Copy link
Contributor

@sparrc sparrc commented Feb 23, 2021

we often receive context.Canceled errors when a container exits during
docker stats collection.

there is a benign race condition where if we process the error first
before processing that the context is "Done", then we will log this
canceled error as a warning message here:

case <-container.ctx.Done():
return nil
case err := <-errC:
seelog.Warnf("Error encountered processing metrics stream from docker, this may affect cloudwatch metric accuracy: %s", err)
returnError = true

this change ignores these context.Canceled errors so that we don't log
them. This will eliminate log messages that look like this when a
container exits:

level=warn time=2020-12-25T07:51:33Z msg="Error encountered processing metrics stream from docker, this may affect cloudwatch metric accuracy: DockerGoClient: Unable to decode stats for container REDACTED: context canceled" module=container.go

Testing

Tested by running a large number of tasks with lots of container and task state changes and exits. Before this change I was seeing the log message around 20% of the time that containers exited. After the change I do not see this message.

New tests cover the changes: no

Description for the changelog

enhancement: eliminate benign docker stats "context canceled" warning messages from logs

Licensing

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@sparrc sparrc force-pushed the ignore-stats-context-canceled branch from 4505743 to 78b9029 Compare February 23, 2021 00:06
sharanyad
sharanyad previously approved these changes Feb 23, 2021
mythri-garaga
mythri-garaga previously approved these changes Feb 23, 2021
@@ -1393,6 +1393,10 @@ func (dg *dockerGoClient) Stats(ctx context.Context, id string, inactivityTimeou
stream := true
resp, err = client.ContainerStats(subCtx, id, stream)
if err != nil {
if errors.Is(err, context.Canceled) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the client is the right place to ignore this error.
Can we still return it and ignore it where this method is invoked instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i guess we can have the client wrapping this error with %w....and then we could call this at a higher level, I will try it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That works. Alternatively you could just use the plain error when it's context.Canceled

if errors.Is(err, context.Canceled) {
    errC <- err
    return
}

we often receive context.Canceled errors when a container exits during
docker stats collection.

there is a benign race condition where if we process the error first
before processing that the context is "Done", then we will log this
canceled error as a warning message here:
https://github.com/aws/amazon-ecs-agent/blob/5be7aa08bed215a557f48c16d8201ad3db59a9be/agent/stats/container.go#L118-L122

this change ignores these context.Canceled errors so that we don't log
them. This will eliminate log messages that look like this when a
container exits:

level=warn time=2020-12-25T07:51:33Z msg="Error encountered processing metrics stream from docker, this may affect cloudwatch metric accuracy: DockerGoClient: Unable to decode stats for container REDACTED: context canceled" module=container.go
@sparrc sparrc dismissed stale reviews from mythri-garaga and sharanyad via 2e52f9d February 23, 2021 01:25
@sparrc sparrc force-pushed the ignore-stats-context-canceled branch from 78b9029 to 2e52f9d Compare February 23, 2021 01:25
Copy link
Contributor

@sharanyad sharanyad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@sparrc sparrc merged commit 59b62e0 into aws:dev Feb 24, 2021
@sparrc sparrc deleted the ignore-stats-context-canceled branch February 24, 2021 18:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants