Envoy does not release healthcheck log fd on SIGUSR1 #8249
That looks like a bug.

So the only thing I see is that we loop on the flush timer event. We'll break out of the loop if there's data to be written or if an exit was requested. Seems like it should also check for reopen-requested.
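To make the comment above concrete, here is a toy model of the described loop condition. It is a sketch only, not Envoy's actual code: the class `FlushLoopModel`, its fields, and the `check_reopen_on_timer` switch are all hypothetical names invented for illustration. It shows that if the wake-up condition looks only at pending data and exit requests, a reopen request sits unhandled until the next write.

```python
from dataclasses import dataclass, field


@dataclass
class FlushLoopModel:
    """Toy model of a log flush loop (hypothetical, not Envoy's implementation)."""

    check_reopen_on_timer: bool  # True models the suggested extra check
    pending: list = field(default_factory=list)
    written: list = field(default_factory=list)
    exit_requested: bool = False
    reopen_requested: bool = False
    reopen_count: int = 0

    def on_flush_timer(self) -> bool:
        """Handle one timer tick; return True if the loop left its wait."""
        wake = bool(self.pending) or self.exit_requested
        if self.check_reopen_on_timer:
            wake = wake or self.reopen_requested
        if not wake:
            return False  # keeps waiting; a requested reopen stays unhandled
        if self.reopen_requested:
            self.reopen_count += 1  # stands in for closing and reopening the log path
            self.reopen_requested = False
        self.written.extend(self.pending)
        self.pending.clear()
        return True


def simulate(check_reopen_on_timer: bool) -> int:
    """Request a reopen on an idle log, fire the timer once, count reopens."""
    loop = FlushLoopModel(check_reopen_on_timer)
    loop.reopen_requested = True  # e.g. set when SIGUSR1 is handled
    loop.on_flush_timer()         # timer fires with no data pending
    return loop.reopen_count
```

In this model, `simulate(False)` leaves the reopen unhandled on an idle log, while `simulate(True)` performs it on the first timer tick, which is the behavior the comment asks for.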
zuercher added a commit to zuercher/envoy that referenced this issue on Sep 16, 2019:

Checks for the reopen flag when the log flush timer fires and issues the reopen even if no data is pending. This prevents Envoy from holding a file descriptor on rotated but seldom written log files until the next write.

Risk Level: low
Testing: add unit test
Docs Changes: n/a
Release Notes: n/a
Fixes: envoyproxy#8249
Signed-off-by: Stephan Zuercher <[email protected]>
mattklein123 pushed a commit that referenced this issue on Sep 19, 2019:

Checks for the reopen flag when the log flush timer fires and issues the reopen even if no data is pending. This prevents Envoy from holding a file descriptor on rotated but seldom written log files until the next write.

Risk Level: low
Testing: add unit test
Docs Changes: n/a
Release Notes: n/a
Fixes: #8249
Signed-off-by: Stephan Zuercher <[email protected]>
Title: http_health_check log file descriptor is not being released on SIGUSR1

Description:
After rotating the logs and sending a SIGUSR1 to the envoy process (technically, to the hot restarter script), we are still seeing healthcheck logs being written to the old (rotated) file. This is validated by looking at the list of open files by the envoy process:

Repro steps:
envoy --version is e349fb6/1.11.1/clean-getenvoy-930d4a5/RELEASE/BoringSSL

The relevant part of the configuration looks roughly like this:

My logrotate configuration file looks like this:

An interesting thing to note: I do get an error from logrotate when performing the log rotation:

error: error running shared postrotate script for '/var/log/envoy/*.log '

The syslog indicates that the SIGUSR1 was correctly received by Envoy, though:

Sep 16 11:43:40 some-server envoy-hot-restarter.py[8679]: [2019-09-16 11:43:40.005][8743][warning][main] [external/envoy/source/server/server.cc:473] caught SIGUSR1

The timestamps here also correlate well with the time when the other *.log.1 files were created:

Any ideas? It feels superficially similar to #4060 but I don't think we ever concluded that there was a problem with Envoy itself back then.
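
For readers unfamiliar with the symptom described above, the following self-contained sketch reproduces it outside Envoy. It assumes rename-based rotation (logrotate's default when copytruncate is not set) on a POSIX filesystem. The function `rotate_and_check` and the file name `healthcheck.log` are hypothetical, chosen only for illustration.

```python
import os
import tempfile


def rotate_and_check(reopen: bool) -> bool:
    """Return True if a write made after rotation lands in the new log file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "healthcheck.log")  # hypothetical log name
        rotated = path + ".1"

        log = open(path, "a")
        log.write("before rotation\n")
        log.flush()

        # Rename-based rotation: the open descriptor follows the file, not the path.
        os.rename(path, rotated)

        if reopen:
            # What a reopen triggered by SIGUSR1 is expected to achieve.
            log.close()
            log = open(path, "a")

        log.write("after rotation\n")
        log.flush()
        log.close()

        if not os.path.exists(path):
            return False  # nothing recreated the path; the write went to the rotated file
        with open(path) as new_log:
            return "after rotation" in new_log.read()
```

Without the reopen, the write after rotation ends up in the renamed file, which matches the report of healthcheck logs still going to the old (rotated) file. With the reopen, the write goes to a fresh file at the original path.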