
Enables HTTP/2 health check #95981

Merged (3 commits) on Nov 17, 2020

Conversation

@caesarxuchao (Member) commented Oct 29, 2020

This PR enables the HTTP/2 connection health check for all the clients that call SetTransportDefaults. This includes all clients created via k8s.io/client-go. Without the health check, a broken transport layer connection can stay in the HTTP/2 connection pool for a long time, causing requests to the same host to fail. For example, a broken TCP connection can linger for 15 minutes before being closed.

The HTTP/2 health check feature exposes two parameters:

  • ReadIdleTimeout: if the HTTP/2 client has not received any frame from the connection for this amount of time, it starts sending periodic pings to the server. The period of the pings is also ReadIdleTimeout.
  • PingTimeout: if the client does not receive an ACK to the ping within this amount of time, the connection is considered broken and is closed by the client.
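
For context, these two parameters correspond to public fields on golang.org/x/net/http2.Transport. Below is a minimal sketch of how they can be wired onto a client transport; it is illustrative rather than the exact code in this PR, and the helper name configureHTTP2HealthCheck is invented for the example.

```go
package transportutil // illustrative package name

import (
	"net/http"
	"time"

	"golang.org/x/net/http2"
)

// configureHTTP2HealthCheck upgrades an HTTP/1.1 transport to also speak
// HTTP/2 and sets the two health-check knobs described above.
func configureHTTP2HealthCheck(t *http.Transport, readIdle, ping time.Duration) error {
	// ConfigureTransports registers HTTP/2 support on t and returns the
	// underlying http2.Transport so its fields can be tuned.
	t2, err := http2.ConfigureTransports(t)
	if err != nil {
		return err
	}
	// If no frame is read for readIdle, start sending pings every readIdle.
	t2.ReadIdleTimeout = readIdle
	// If a ping is not acknowledged within ping, close the connection.
	t2.PingTimeout = ping
	return nil
}
```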

For Kubernetes, I default ReadIdleTimeout to 30s, which I don't expect to cause performance issues even in large clusters, because

  • the pings are carried on existing persistent connections,
  • the pings are handled at the HTTP/2 layer and never surface to the Kubernetes layer.

I'll wait for results from pull-kubernetes-e2e-gce-large-performance to confirm.

I default PingTimeout to 15s, which is lenient enough for most network conditions.

Although I think most users won't need to tune these two parameters, I added environment variables that allow users to tune them, or even disable the health check entirely by setting HTTP2_READ_IDLE_TIMEOUT_SECONDS=0.
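
As a rough sketch of how those overrides could be resolved from the environment (the helper names below are invented for illustration, the error handling in the real code may differ, and the defaults match the values above):

```go
package transportutil

import (
	"os"
	"strconv"
	"time"
)

// timeoutFromEnv reads a number of seconds from an environment variable,
// falling back to def when the variable is unset or not an integer.
func timeoutFromEnv(key string, def time.Duration) time.Duration {
	v, ok := os.LookupEnv(key)
	if !ok {
		return def
	}
	seconds, err := strconv.Atoi(v)
	if err != nil {
		return def
	}
	return time.Duration(seconds) * time.Second
}

// healthCheckTimeouts resolves the two knobs. HTTP2_READ_IDLE_TIMEOUT_SECONDS=0
// yields a zero ReadIdleTimeout, which disables the health check in
// golang.org/x/net/http2.
func healthCheckTimeouts() (readIdle, ping time.Duration) {
	readIdle = timeoutFromEnv("HTTP2_READ_IDLE_TIMEOUT_SECONDS", 30*time.Second)
	ping = timeoutFromEnv("HTTP2_PING_TIMEOUT_SECONDS", 15*time.Second)
	return readIdle, ping
}
```

The resolved values would then be passed to something like the configureHTTP2HealthCheck sketch above.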

Fixes kubernetes/client-go#374 and #87615.

Ref. #95898, #94844

/kind bug
/sig api-machinery

HTTP/2 connection health check is enabled by default in all Kubernetes clients. The feature should work out-of-the-box. If needed, users can tune the feature via the HTTP2_READ_IDLE_TIMEOUT_SECONDS and HTTP2_PING_TIMEOUT_SECONDS environment variables. The feature is disabled if HTTP2_READ_IDLE_TIMEOUT_SECONDS is set to 0.

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesn't merit a release note. kind/bug Categorizes issue or PR as related to a bug. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. area/apiserver area/cloudprovider area/code-generation area/dependency Issues or PRs related to dependency changes area/kubectl sig/cli Categorizes an issue or PR as relevant to SIG CLI. labels Oct 29, 2020
@k8s-ci-robot k8s-ci-robot requested review from andrewsykim, andyzhangx and a team October 29, 2020 08:15
@k8s-ci-robot k8s-ci-robot added sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. sig/node Categorizes an issue or PR as relevant to SIG Node. labels Oct 29, 2020
@caesarxuchao (Member, Author) commented Oct 29, 2020

I disagree with setting ReadIdleTimeout to 1/2 * IdleTimeout in #95898. A connection is considered "idle" if there is no stream using the connection (see here), and it is considered "readIdle" if there is no frame read from the connection (see here). The two concepts are independent of each other, so we shouldn't derive one value from the other.

Update: I think the IdleTimeout in HTTP/1 behaves more like what's described in #95898: a connection is closed if there is no traffic for the IdleTimeout duration.
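
To make the distinction concrete, here is a small sketch (values and wiring are illustrative, not code from this PR): IdleConnTimeout tracks stream-level idleness of pooled connections, while ReadIdleTimeout tracks frame-level read idleness.

```go
package transportutil

import (
	"net/http"
	"time"

	"golang.org/x/net/http2"
)

func idleVersusReadIdle() error {
	t1 := &http.Transport{
		// "Idle" in the connection-pool sense: close a connection that has
		// had no requests (streams) in flight for this long.
		IdleConnTimeout: 90 * time.Second,
	}
	t2, err := http2.ConfigureTransports(t1)
	if err != nil {
		return err
	}
	// "ReadIdle" in the health-check sense: if no frame of any kind has been
	// read for this long, start sending pings, regardless of stream count.
	t2.ReadIdleTimeout = 30 * time.Second
	return nil
}
```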

@caesarxuchao (Member, Author)

@cheftako, this health check is very similar to the gRPC health check we want to add to the apiserver-network-proxy to solve kubernetes-sigs/apiserver-network-proxy#152. The performance implications are also similar.

@caesarxuchao (Member, Author)

@cheftako
I'm not too worried about the performance cost of sending the periodic pings, because

  • the pings are carried on existing persistent connections,
  • the pings are handled at the HTTP/2 layer and never surface to the Kubernetes layer,
  • a ping is only sent if there has been no traffic on a connection for the ReadIdleTimeout (30s) duration.

It would be more convincing if we could test the behavior in a large cluster. @kubernetes/sig-scalability, is there an easy way to test this PR in a large cluster?

@JensErat do you mind sharing the size of the clusters where you turned on the HTTP/2 health check feature? Thanks.

@BenTheElder (Member)

/test
[intentionally triggering list of available tests]

@k8s-ci-robot (Contributor)

@BenTheElder: The /test command needs one or more targets.
The following commands are available to trigger jobs:

  • /test pull-kubernetes-bazel-build
  • /test pull-kubernetes-bazel-test
  • /test pull-kubernetes-conformance-image-test
  • /test pull-kubernetes-conformance-kind-ipv6-parallel
  • /test pull-kubernetes-dependencies
  • /test pull-kubernetes-dependencies-canary
  • /test pull-kubernetes-e2e-ipvs-azure-dualstack
  • /test pull-kubernetes-e2e-iptables-azure-dualstack
  • /test pull-kubernetes-e2e-aws-eks-1-13-correctness
  • /test pull-kubernetes-files-remake
  • /test pull-kubernetes-e2e-gce
  • /test pull-kubernetes-e2e-gce-no-stage
  • /test pull-kubernetes-e2e-gce-kubetest2
  • /test pull-kubernetes-e2e-gce-canary
  • /test pull-kubernetes-e2e-gce-ubuntu
  • /test pull-kubernetes-e2e-gce-ubuntu-containerd
  • /test pull-kubernetes-e2e-gce-ubuntu-containerd-canary
  • /test pull-kubernetes-e2e-gce-rbe
  • /test pull-kubernetes-e2e-gce-alpha-features
  • /test pull-kubernetes-e2e-gce-device-plugin-gpu
  • /test pull-kubernetes-integration
  • /test pull-kubernetes-cross
  • /test pull-kubernetes-e2e-kind
  • /test pull-kubernetes-e2e-kind-canary
  • /test pull-kubernetes-e2e-kind-ipv6
  • /test pull-kubernetes-e2e-kind-ipv6-canary
  • /test pull-kubernetes-conformance-kind-ga-only
  • /test pull-kubernetes-conformance-kind-ga-only-parallel
  • /test pull-kubernetes-e2e-kops-aws
  • /test pull-kubernetes-bazel-build-canary
  • /test pull-kubernetes-bazel-test-canary
  • /test pull-kubernetes-bazel-test-integration-canary
  • /test pull-kubernetes-local-e2e
  • /test pull-publishing-bot-validate
  • /test pull-kubernetes-e2e-gce-network-proxy-http-connect
  • /test pull-kubernetes-e2e-gce-network-proxy-grpc
  • /test pull-kubernetes-e2e-gci-gce-autoscaling
  • /test pull-kubernetes-e2e-aks-engine-azure
  • /test pull-kubernetes-e2e-azure-disk
  • /test pull-kubernetes-e2e-azure-disk-vmss
  • /test pull-kubernetes-e2e-azure-file
  • /test pull-kubernetes-e2e-kind-dual-canary
  • /test pull-kubernetes-e2e-kind-ipvs-dual-canary
  • /test pull-kubernetes-e2e-gci-gce-ipvs
  • /test pull-kubernetes-node-e2e
  • /test pull-kubernetes-e2e-containerd-gce
  • /test pull-kubernetes-node-e2e-containerd
  • /test pull-kubernetes-node-e2e-alpha
  • /test pull-kubernetes-node-kubelet-serial-cpu-manager
  • /test pull-kubernetes-node-kubelet-serial-topology-manager
  • /test pull-kubernetes-node-kubelet-serial-hugepages
  • /test pull-kubernetes-node-crio-cgrpv2-e2e
  • /test pull-kubernetes-node-crio-e2e
  • /test pull-kubernetes-e2e-gce-100-performance
  • /test pull-kubernetes-e2e-gce-big-performance
  • /test pull-kubernetes-e2e-gce-correctness
  • /test pull-kubernetes-e2e-gce-large-performance
  • /test pull-kubernetes-kubemark-e2e-gce-big
  • /test pull-kubernetes-kubemark-e2e-gce-scale
  • /test pull-kubernetes-e2e-gce-storage-slow
  • /test pull-kubernetes-e2e-gce-storage-snapshot
  • /test pull-kubernetes-e2e-gce-storage-slow-rbe
  • /test pull-kubernetes-e2e-gce-csi-serial
  • /test pull-kubernetes-e2e-gce-iscsi
  • /test pull-kubernetes-e2e-gce-iscsi-serial
  • /test pull-kubernetes-e2e-gce-storage-disruptive
  • /test pull-kubernetes-e2e-aks-engine-azure-windows
  • /test pull-kubernetes-e2e-azure-disk-windows
  • /test pull-kubernetes-e2e-azure-file-windows
  • /test pull-kubernetes-typecheck
  • /test pull-kubernetes-verify
  • /test pull-kubernetes-e2e-windows-gce

Use /test all to run the following jobs:

  • pull-kubernetes-bazel-build
  • pull-kubernetes-bazel-test
  • pull-kubernetes-dependencies
  • pull-kubernetes-e2e-gce-ubuntu-containerd
  • pull-kubernetes-integration
  • pull-kubernetes-e2e-kind
  • pull-kubernetes-e2e-kind-ipv6
  • pull-kubernetes-conformance-kind-ga-only-parallel
  • pull-kubernetes-node-e2e
  • pull-kubernetes-e2e-gce-100-performance
  • pull-kubernetes-e2e-azure-disk-windows
  • pull-kubernetes-e2e-azure-file-windows
  • pull-kubernetes-typecheck
  • pull-kubernetes-verify

In response to this:

/test
[intentionally triggering list of available tests]

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@BenTheElder (Member)

I think https://github.com/kubernetes/test-infra/blob/b40bcac24d0ec96cc7d440ca5ece690cce7a62fe/config/jobs/kubernetes/sig-scalability/sig-scalability-presubmit-jobs.yaml#L67 is what you want; there are some optional scale presubmits.

On second thought though, even if that were the right one, we should check with someone from the scale team before using these resources 🙃

@caesarxuchao (Member, Author)

Thanks, @BenTheElder! pull-kubernetes-e2e-gce-large-performance uses 2000 nodes, so that seems to be the right test. I'll check with the scalability SIG.

@jkaniuk (Contributor) commented Oct 29, 2020

Seems relatively harmless from a scalability point of view.

I'm fine with running the 2k-node presubmit to verify, though our approval is not needed:
/test pull-kubernetes-e2e-gce-large-performance
(fingers crossed, it has been a while since I last ran this)

@tosi3k - for visibility as next oncaller

@liggitt (Member) commented Mar 18, 2021

Backporting to 1.18 is riskier, as it is using v0.0.0-20191004110552-13f9640d40b9; I'm not sure whether a backport would introduce incompatible changes.

I think we should backport to 1.18, given:

  • how unpredictable the stuck connection issue is
  • how many components it affects
  • how problematic a stuck connection is for informer clients
  • that we've successfully used these golang.org/x/sys and golang.org/x/net levels in 1.19 for O(months) now
  • that other consumers have successfully backported these changes to 1.18 and used them in production

I opened #100376 to pick this to 1.18

@deads2k (Contributor) commented Mar 24, 2021

Given the consequences when this fails, I'm in favor of backporting this change if we can get an ack (ideally from @caesarxuchao) that the backport is fairly safe. My knowledge of the golang stack at this depth is fairly limited.

Is there a particular minimum Go version required, or was it just the golang.org/x/net level?

@liggitt (Member) commented Mar 24, 2021

Given the consequences when this fails, I'm in favor of backporting this change if we can get an ack (ideally from @caesarxuchao) that the backport is fairly safe.

@caesarxuchao acked the backport PR (#100376)

Is there a particular minimum Go version required, or was it just the golang.org/x/net level?

It was contained to golang.org/x/{net,sys}

k8s-ci-robot added a commit that referenced this pull request Mar 25, 2021
Cherrypick #95981 to 1.18, Enables HTTP/2 health check
christarazi added a commit to christarazi/cilium that referenced this pull request Mar 31, 2022
Following up from cilium#19259, this
commit implements the same logic as kubelet [1] to close all idle
connections. We can rely on HTTP/2 to determine which connections are
idle (HTTP/2 pings), instead of closing all connections indiscriminately
(when the heartbeat fails) via the custom dialer in setDialer().

We can assume HTTP/2 support as it has been supported since K8s 1.20
[2]. Users can disable HTTP/2 by setting the DISABLE_HTTP2 environment
variable, however this is only really meant for very advanced users, hence
it is not documented. Kubernetes has been running with HTTP/2 as the default
for a long time (~since Dec 2020) without major issues.

CloseIdleConnectionsFor() was introduced in K8s 1.23 [3], hence this commit
cannot be backported to any earlier version of Cilium which doesn't
support K8s 1.23.

[1]: kubernetes/kubernetes#104844
[2]: kubernetes/kubernetes#95981
[3]: kubernetes/kubernetes@b9d865a

Suggested-by: André Martins <[email protected]>
Signed-off-by: Chris Tarazi <[email protected]>
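
As a rough illustration of the pattern described in that commit message, here is a sketch assuming the CloseIdleConnectionsFor helper lives in k8s.io/apimachinery/pkg/util/net as introduced in [3]; the exact package path and signature should be checked against the referenced commit.

```go
package example

import (
	"net/http"

	utilnet "k8s.io/apimachinery/pkg/util/net"
)

// onHeartbeatFailure asks the transport to drop only the connections it
// considers idle. With the HTTP/2 health check enabled, broken connections
// have already been marked dead, so they get swept up here without a custom
// dialer that indiscriminately closes every connection.
func onHeartbeatFailure(rt http.RoundTripper) {
	utilnet.CloseIdleConnectionsFor(rt)
}
```
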
pchaigno pushed a commit to cilium/cilium that referenced this pull request Apr 4, 2022
pchaigno pushed a commit to pchaigno/cilium that referenced this pull request Apr 5, 2022
[ upstream commit 365c788 ]

pchaigno pushed a commit to cilium/cilium that referenced this pull request Apr 6, 2022
[ upstream commit 365c788 ]

serathius pushed a commit to serathius/kubernetes that referenced this pull request Mar 14, 2024
The env vars were implemented in kubernetes#95981.
Effectively no impact when using the default values here.

BUG=255296578

Change-Id: I2441ef1a181a0fb202af7893c25cff5ff15ef17a
hoskeri pushed a commit to hoskeri/kubernetes that referenced this pull request Jul 23, 2024
The env vars were implemented in kubernetes#95981.
Effectively no impact when using the default values here.

BUG=255296578

Change-Id: I2441ef1a181a0fb202af7893c25cff5ff15ef17a

Remove GODEBUG=x509sha1=1 workaround

Bug: 227456358a

Revert changes made in I5165e6c2fe73e8e1b2a617ced591133228b6d275

Change-Id: I7dd1ca8f9796c542c89074703e44e6b6f4b6edab