io.fabric8.kubernetes.client.KubernetesClientException: not ready after 5000 MILLISECONDS #3795
Comments
Can you provide what the DSL call looks like leading up to the exec? I can reproduce similar behavior if I omit handling for stderr: with a stderr handler attached the exec works, but without one it fails with the timeout (a hedged sketch of both variants follows below). |
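A minimal sketch of the two variants being contrasted, assuming a fabric8 `KubernetesClient` instance; the namespace (`jenkins`) and pod name (`my-pod`) are hypothetical:

```java
import java.io.ByteArrayOutputStream;

import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.dsl.ExecWatch;

public class ExecVariants {

    // Variant that works: stdout and stderr are both handled before exec().
    static ExecWatch execWithStreamHandling(KubernetesClient client) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ByteArrayOutputStream err = new ByteArrayOutputStream();
        return client.pods()
                .inNamespace("jenkins")   // hypothetical namespace
                .withName("my-pod")       // hypothetical pod name
                .writingOutput(out)       // stdout handler
                .writingError(err)        // stderr handler
                .exec("sh", "-c", "echo hello");
    }

    // Variant that reproduces similar behavior: no stdout/stderr handling at all,
    // which can surface as "not ready after ... MILLISECONDS".
    static ExecWatch execWithoutStreamHandling(KubernetesClient client) {
        return client.pods()
                .inNamespace("jenkins")
                .withName("my-pod")
                .exec("sh", "-c", "echo hello");
    }
}
```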
Hi @shawkins, thanks for responding to this. How can I check that DSL call? Can you please help me with that? |
It's probably this: https://github.com/jenkinsci/kubernetes-plugin/blame/master/src/main/java/org/csanchez/jenkins/plugins/kubernetes/pipeline/ContainerExecDecorator.java#L373 That has handling for error, so it's not the cause. |
OK, got it, thanks; I never checked this. I'm using the https://github.com/jenkinsci/helm-charts Helm chart to deploy this service. How do we fix this error? It has been blocking almost 4 Jenkins servers for 3 weeks. |
I'm not sure. |
@shawkins, with custom values for jenkins.controller, the output error is: io.fabric8.kubernetes.client.KubernetesClientException: not ready after 70000 MILLISECONDS.
Nothing has worked; not sure if I'm missing anything here. I can share a few more logs. When these pods/jobs fail I always see this error, only the pod hostname differs: `Jan 24, 2022 11:57:25 AM hudson.remoting.jnlp.Main$CuiListener status` `Jan 24, 2022 11:57:25 AM hudson.remoting.jnlp.Main$CuiListener error` |
I'm also hitting this error on a regular basis after updating Jenkins and the installed plugins. Unfortunately I don't have a record of the old versions, but they were fairly old, so they would probably not be of much use either way :) |
We started receiving this error a lot after bumping the Jenkins server and the Kubernetes plugin: Jenkins 2.320 -> 2.337. Stacktrace: |
Hey @rg970197, were you able to fix it? Update: the Jenkins server is on 2.338, but I think this is pointless. |
Hey, |
@mateusvtt and @rg970197 indeed, the issue can be in a different location. So that would help us to ensure that there was no configuration change in the way the |
Hey @sunix, I'm following this issue: https://issues.jenkins.io/browse/JENKINS-67664 |
I also encountered this problem, please solve it |
Similar issues are being seen on a k8s Helm deployment of Jenkins at my company as well. We've hit a stack trace like the one below:
This started happening out of the blue after deploying a firewall rule to block some k8s ports, none of which (as far as I am aware) are pertinent to Jenkins running.
Any advice appreciated. |
I have been trying to troubleshoot this for some time. There are a few things to note:
When collecting data in impacted environments, the problem seems to happen even if only a single connection is being made from the client, so it is not necessarily related to, for example, the OkHttp thread pool managing concurrent connections. @shawkins, your comment about the stderr handling is interesting; I am curious why you thought of this. It is interesting because the Jenkins kubernetes plugin does indeed handle stdout/stderr, roughly as in the sketch below:
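A simplified sketch of that wiring, not the plugin's exact code; the namespace, pod and container names here are hypothetical:

```java
import java.io.OutputStream;

import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.dsl.ExecWatch;

public class PluginStyleExec {

    // Sketch of a ContainerExecDecorator-style exec: stdin is redirected and
    // both stdout and stderr are written to the build log stream.
    static ExecWatch startShell(KubernetesClient client, OutputStream buildLog) {
        return client.pods()
                .inNamespace("jenkins")     // hypothetical namespace
                .withName("agent-pod")      // hypothetical agent pod name
                .inContainer("maven")       // hypothetical container name
                .redirectingInput()         // stdin fed later by the launcher
                .writingOutput(buildLog)    // stdout -> build log
                .writingError(buildLog)     // stderr -> same stream
                .withTTY()
                .exec("sh");
    }
}
```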
Do you think this has an impact on the socket connection response? If yes, I am wondering if 6.0.0 refactors such as #4115 and the |
There have been some additional changes in 6, have you tried that as well? And possibly with an alternative client - JDK or Jetty? That would help determine if it's an okhttp specific problem.
It's been a while since I looked at this. My local test had no stdout/stderr handling and seemed to produce a similar exception; however, at least on 5.12/6.0 you can clearly see that it's not due to the timeout but to a bad request. In the exception chain: ... It seems that the API server wants to see that you are handling at least one of the streams before it accepts the request, which is not the case here based upon the Jenkins code.
If you are still getting a timeout in 5.12, then I still am not quite sure what is going on. It sounds like something highly related was already happening on the 5.4 kubernetes client - that would suggest to me that it's more due to an environmental issue. |
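On trying an alternative HTTP client with 6.x: a minimal sketch, assuming fabric8 6.x with the `kubernetes-httpclient-jdk` module on the classpath in place of the default OkHttp module, intended only to help rule out an OkHttp-specific cause:

```java
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientBuilder;
import io.fabric8.kubernetes.client.jdkhttp.JdkHttpClientFactory;

public class JdkHttpClientCheck {
    public static void main(String[] args) {
        // Build a client that uses the JDK HTTP client instead of OkHttp,
        // so exec/timeout behavior can be compared between the two transports.
        try (KubernetesClient client = new KubernetesClientBuilder()
                .withHttpClientFactory(new JdkHttpClientFactory())
                .build()) {
            System.out.println("Connected to: " + client.getMasterUrl());
        }
    }
}
```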
We've been troubleshooting this issue for days now and so far we have tried:
None of these have made any difference. So we tried rolling the kubernetes plugin back to
Does anyone know how we can force kubernetes:1.30.0 to accept kubernetes-client-api:5.4.1 (as it should)? This appears to be the cause of everyone's trouble, if versions of kubernetes-client-api higher than 5.4.1 are the real problem here. EDIT: it was installing the wrong version because of the order in which the Jenkins Helm chart installs dependencies; we had to explicitly state |
I was getting this error recently on a corporate server. I have no access to the Jenkins configuration because of that, but I was able to get my builds to work by using an additional CPU and a little extra memory. Probably not a solution to the issue, but definitely a workaround for me; you may want to consider this if you need to get your builds past this issue. And if this fixes the issue for many of the people experiencing it, the devs may need to consider this a resource-starvation problem. |
Hi, we are facing the same issue; our pipelines are failing with the above error. |
Does this issue still happen on the latest version? |
Update to the latest Kubernetes plugin, which introduces retry on the timeouts. |
Jenkins Kubernetes Client API is using Fabric8 Kubernetes Client v6.4.1 at the moment. Is this issue still reproducible on recent versions of Jenkins Kubernetes Plugin (> 3893.v73d36f3b_9103) ? |
I asked around, and I hear that we haven't seen this issue lately. |
Same here, I think this one can be closed :) |
@shawkins This is my version information: Jenkins 2.375.1, Kubernetes plugin 3923.v294a_d4250b_91, kubernetes-cli 1.12.0. |
@lin-ket I am facing the same issue with kubernetes plugin 3883.v4d70a_a_a_df034. |
3883.v4d70a_a_a_df034 is 2 years old. If you are running a more recent version of Kubernetes on your cluster, then you should update your Kubernetes plugin; they changed something a while back with the liveness/readiness probes that made deployments using the old kubernetes plugin fail. I think your Jenkins version is fine, though. |
Describe the bug
This is happening on my 4 Jenkins servers after upgrading Jenkins and AKS.
All of a sudden, all Jenkins agent pods started giving errors as below. A few pods are working and a few are giving errors; this happens 1-2 times out of 4-5 attempts.
AKS version: 1.20.13
Each cluster has a different Jenkins/plugin version; I can reproduce this error in all of them.
AKS-1:
kubernetes:1.30.1
kubernetes-client-api:5.10.1-171.vaa0774fb8c20
kubernetes-credentials:0.8.0
AKS-2:
kubernetes:1.31.3
kubernetes-client-api:5.11.2-182.v0f1cf4c5904e
kubernetes-credentials:0.9.0
AKS-3:
kubernetes:1.30.1
kubernetes-client-api:5.10.1-171.vaa0774fb8c20
kubernetes-credentials:0.8.0
AKS-4:
kubernetes:1.31.3
workflow-job:1145.v7f2433caa07f
workflow-aggregator:2.6
Troubleshooting steps I did:
Not sure how to resolve this error. Please let me know if I'm missing anything.
Fabric8 Kubernetes Client version
5.10.1@latest
Steps to reproduce
Creating/running a job gives this error.
Expected behavior
The pod should execute successfully.
Runtime
Kubernetes (vanilla)
Kubernetes API Server version
1.21.6
Environment
Linux
Fabric8 Kubernetes Client Logs
Additional context
No response