[kubeadm control plane]: etcd communication errors are being swallowed #2454
Comments
/milestone v0.3.0 |
/milestone v0.3.x |
Bumping this given that it seems the upstream grpc fix might not go in soon |
@sethp-nr Not sure if you saw this response, should we leave things as they are if the upstream fix isn't merged? |
It ended up going in as an option under a different PR after some discussion in the linked thread (culminating here: grpc/grpc-go#2031 (comment) ). There's still some work in getting it into a released version & getting etcd to be compatible with that version & picking that version up here and then we can finally replace I'm working that in fits and starts when I have time, but if someone else wanted to do it I would not get in their way. |
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
/lifecycle frozen |
The PR changes with /milestone v0.4.0 |
/help |
/priority important-longterm |
/milestone v1.0 |
/assign @killianmuldoon |
FWIW I submitted #4997 some time ago that addressed at least one occurrence of error hiding / swallowing. Not sure if this bug report is about more though. |
/close until we get more evidence that there are still other occurrences of this error after #4997 merged |
@fabriziopandini: Closing this issue.
What steps did you take and what happened:
While testing an upgrade the etcd health checks were failing repeatedly. With the code from #2451 in place I could resolve it down one level:
After some work, I found that my etcd CA secret had been regenerated, changing the private key (see: #2454). It seems that gRPC has exactly one error message when the connection is misconfigured, and that's "context deadline exceeded." I haven't yet found a way to get more information on what happened via the API, but I'm continuing to dig.
What did you expect to happen:
When I set up the same condition with `etcdctl` and `k port-forward` I got a helpful error message:
Note that the `Error: context deadline exceeded` is what came back from `clientv3.New`, and the other is a log statement being printed to stderr.
Anything else you would like to add:
I found that any error sent back from the proxy dial function was being swallowed in the same way. It also looks like we're not using the `errorStream` we set up with the API Server, so it's possible that we'd miss important information about the proxy connection.
Environment:
Kubernetes version (use `kubectl version`): a mix of v1.15 and v1.16 control plane nodes
OS (e.g. from `/etc/os-release`): ubuntu
/kind bug
/assign
/lifecycle active