OCPBUGS-43745: Add IdleCloseOnResponse field to IngressControllerSpec #2102
Hello @frobware! Some important instructions when contributing to openshift/api:
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: frobware The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
Force-pushed from e70b743 to 41b507e.
@frobware: This pull request references Jira Issue OCPBUGS-43745, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
Force-pushed from 41b507e to 162d02c.
/jira refresh
@frobware: This pull request references Jira Issue OCPBUGS-43745, which is invalid:
Comment In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
Force-pushed from 162d02c to 708ac88.
Add an `IdleConnectionTerminationPolicy` field to control whether HAProxy keeps idle frontend connections open during a soft stop (router reload). Allow users to prevent errors in clients or load balancers that do not properly handle connection resets.
Force-pushed from 708ac88 to 84689bf.
Pickup openshift/api#2102
% go mod edit -replace github.com/openshift/api=github.com/frobware/api@84689bf6752251547541a87d3cfb891f9c6add29
% go mod tidy
% go mod vendor
@frobware: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
I'm not sure I follow why backports would set the default to Deferred rather than Immediate. If the change in behaviour has already been in place since 4.14 (HAProxy 2.2 to 2.4), doesn't this break a change that was made a while back? What did I miss?
@@ -258,6 +258,74 @@ type IngressControllerSpec struct {
//
// +optional
HTTPCompression HTTPCompressionPolicy `json:"httpCompression,omitempty"`

// IdleConnectionTerminationPolicy maps directly to HAProxy's
Nit: godoc should start with the serialised version of the string. We do this so that `oc explain` looks correct.
// frequent reloads to prevent resource exhaustion.
//
// +optional
// +kubebuilder:default:="Immediate"
So that OpenAPI also has the default
- // +kubebuilder:default:="Immediate"
+ // +kubebuilder:default:="Immediate"
+ // +default="Immediate"
Introduce a new knob, `IdleConnectionTerminationPolicy`, in the IngressController configuration to control how idle connections are handled during router reloads.
Context
In OCPBUGS-32044, the `idle-close-on-response` option was unconditionally added to the HAProxy configuration to address issues with incoming HTTP requests failing during router reloads. This issue primarily affected Apache HTTPClient versions prior to 5.0, which do not gracefully handle connection resets. Adding the option ensured that idle connections were left open to handle one final request before being closed.
Historically, HAProxy 2.2 maintained idle connections during router reloads by default, allowing requests on those connections to complete even when routing configuration changes were applied. Starting with HAProxy 2.4, the default behaviour changed to close idle connections immediately during soft reloads.
Customers upgrading their OpenShift clusters experienced this behaviour change because of the jump from HAProxy 2.2 to 2.6, which altered the default handling of idle connections during router reloads. To accommodate existing clients dependent on the HAProxy 2.2 behaviour, `idle-close-on-response` was added unconditionally, restoring the previous OpenShift status quo.
However, unconditionally enabling `idle-close-on-response` has now led to issues (OCPBUGS-43745) with Route backend switching. When a Route switches its service backend, requests on persistent connections can continue to be routed to the previously active backend, because HAProxy handles these connections in the old process. This behaviour persists until the connection is closed, either by a new request, the expiration of the client keep-alive, or the expiration of the HAProxy `timeout http-keep-alive 300s`. While this behaviour is desirable in some cases (e.g., for clients sensitive to connection resets), it can lead to temporary inconsistencies and unexpected routing behaviour during backend switching.
This PR addresses these regressions by making the behaviour configurable through a new knob.
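As a rough illustration of the knob described above, the following Go sketch maps a policy value to the HAProxy directive it would imply. It is not the code from this PR: the type name `TerminationPolicy`, the constants, and the helpers `effectivePolicy` and `idleCloseLines` are hypothetical names chosen for the example, and treating an unset value as `Immediate` is an assumption based on the `+kubebuilder:default:="Immediate"` marker in the diff.

```go
package main

import (
	"fmt"
	"strings"
)

// TerminationPolicy is a hypothetical stand-in for the API type behind
// IdleConnectionTerminationPolicy; the real type name may differ.
type TerminationPolicy string

const (
	// PolicyImmediate closes idle connections as soon as the router reloads.
	PolicyImmediate TerminationPolicy = "Immediate"
	// PolicyDeferred keeps idle connections open for one final request.
	PolicyDeferred TerminationPolicy = "Deferred"
)

// effectivePolicy treats an unset value as Immediate (assumed default).
func effectivePolicy(p TerminationPolicy) TerminationPolicy {
	if p == "" {
		return PolicyImmediate
	}
	return p
}

// idleCloseLines returns the HAProxy configuration lines implied by the
// policy. Only Deferred emits "option idle-close-on-response".
func idleCloseLines(p TerminationPolicy) []string {
	if effectivePolicy(p) == PolicyDeferred {
		return []string{"option idle-close-on-response"}
	}
	return nil
}

func main() {
	for _, p := range []TerminationPolicy{"", PolicyImmediate, PolicyDeferred} {
		fmt.Printf("%q -> [%s]\n", p, strings.Join(idleCloseLines(p), "; "))
	}
}
```

Keeping the default in one helper means a backport that defaults to `Deferred` would only need to change `effectivePolicy` in this sketch.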
Changes
Add a new field, `IdleConnectionTerminationPolicy`, to the IngressController configuration.
Behavioural Differences
Immediate (New Default in OpenShift 4.19+):
Deferred (Default for backports to 4.14–4.18):
`timeout http-keep-alive` (300 seconds in OpenShift).
References: