Fix docker hanging when container killed #3612
@@ -49,6 +49,9 @@ metricbeat.modules:
cpu_ticks: true
----

It is strongly recommend to not run docker metricsets with a period smaller then 3 seconds. The request to the docker
"It is strongly recommended"
fixed
LGTM.
No timeout was passed to the docker client. It seems that when a container is killed, the connection to the docker daemon can hang. To interrupt such a connection, the timeout from the metricset is now passed to the client. That means that if the info for a container cannot be fetched, the request times out instead of blocking.
This change requires that the docker module is not run with a timeout of less than 3s, which indirectly means a period of at least 3s. The reason is that the HTTP request alone already waits ~2s for the response, so with a timeout of 1s all requests would time out.
Further changes:
* Containers without names are ignored, as these are containers for which the data could not be fetched.
* The default period was set to 1s instead of the documented period. This was changed.
* Added a documentation note about the minimal period.
Closes elastic#3610
The issue addressed by this PR was introduced in 5.2.1 by the fix for the memory leak. Before that fix, goroutines just piled up, but now they cause Metricbeat to hang.
This also needs to be backported to 5.2.2.