Promtail: fix TargetManager.run() not exit after stop is called #5238
Great find! I'm wondering why it is not closed on the side of Prometheus.
@littlepangdi I've talked to the Prometheus maintainers. Would you mind proposing a change there? I think the channel should be closed when the goroutine is done.
@jeschkies It does make sense to close it at the source; I'll be glad to follow up on this issue there tomorrow. Also, about this PR, I think following the same
What do you think?
@littlepangdi, that's a very good question. Is there a chance we lose a message when our goroutine finishes before the manager shuts down?
This reminds me that there might be another goroutine leak in this particular scenario, if a message comes in while/after In other words,
@jeschkies Yes, it does when shutting down our goroutine via About the other goroutine leak raised by @RangerCD, I just added a loki/clients/pkg/promtail/targets/file/filetarget.go Lines 123 to 127 in 91d837e
Or should I do this in another PR? Let me know if there is anything I can do here.
I also think this is cleaner. We're not at risk of what Prometheus is doing.
LGTM on my side.
@jeschkies can you merge this, unless you really disagree here?
What this PR does / why we need it:
This PR fixes a possible goroutine leak in Promtail: `FileTargetManager.run()` will not exit after `FileTargetManager.stop()` is called.
`manager.SyncCh()` in `run()` is a read-only channel copy of `manager.syncCh`, and, as a code search shows, there is no `close(ch)` operation anywhere in the manager's lifecycle. As a result, `run()` stays stuck in its `range ch` loop.
By passing in the `context` tied to `TargetManager.cancelfunc`, which is also passed to the `manager` above to control its lifecycle, the problem is fixed. The fix has been tested in our own project.
Which issue(s) this PR fixes:
Fixes #5237
Special notes for your reviewer:
Checklist
Entry in CHANGELOG.md about the changes.