Fix a logmon goroutine and memory leak #11261
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Fix a logmon leak causing high goroutine and memory usage when a task
restarts.
Logmon
FileRotator
buffers the task stdout/stderr streams andperiodically flushing them to log files. Logmon creates a new
FileRotator for each stream for each task run. However, the
flushPeriodically
goroutine is leaked when a task restarts,holding a reference to a no-longer-needed
FileRotator
instancealong with its 64kb buffer.
The cause is that the code assumed
time.Ticker.Stop()
closes theticker channel, thereby terminating the goroutine, but the documentation
says otherwise:
The PR adds https://github.com/uber-go/goleak dependency for detecting leaked goroutines. You can see the failures in CI about leaked goroutines in https://app.circleci.com/pipelines/github/hashicorp/nomad/17716/workflows/8fde2a15-09ae-4e21-b9a5-8e5de326f3f4/jobs/175847 but succeed with the fix commit in https://app.circleci.com/pipelines/github/hashicorp/nomad/17717/workflows/58804f90-428e-4c99-91f9-8d857b6fb979/jobs/175864.
Note, that I refactored the tests as well to use testify and ensure that file and rotator resources are closed properly for each test.
Fixes #9858