Add setting for keep-alive duration for oidc back-channel #87773
In some environments, the back-channel connection can be dropped without sending a TCP RST to ES. When that happens, reusing the same connection results in a timeout error. This PR adds a new http.connection_pool_ttl setting to control how long a connection in the OIDC back-channel pool can be idle before it is closed. This allows ES to close idle connections more actively and so avoid the timeout issue. Resolves: elastic#75515
Pinging @elastic/es-security (Team:Security)
Hi @ywangd, I've created a changelog YAML for you.
@tvernum This might be worth backporting to earlier versions (including 7.17?) if it proves to work. For now, I just tagged
...ty/src/main/java/org/elasticsearch/xpack/security/authc/oidc/OpenIdConnectAuthenticator.java
Configure this setting to `-1` to let the server dictate this value using the `Keep-Alive` HTTP
response header. If the header is not set by the server, the time-to-live is infinite, meaning
that connections never expire.
I think we should reword this, but I'll come back with a suggestion later.
I reworded this paragraph and made it consistent with the new behaviour of handling the keep-alive response header.
...c/main/java/org/elasticsearch/xpack/core/security/authc/oidc/OpenIdConnectRealmSettings.java
...ty/src/main/java/org/elasticsearch/xpack/security/authc/oidc/OpenIdConnectAuthenticator.java
Co-authored-by: Tim Vernum <[email protected]>
…ecurity/authc/oidc/OpenIdConnectAuthenticator.java Co-authored-by: Tim Vernum <[email protected]>
var serverKeepAlive = DefaultConnectionKeepAliveStrategy.INSTANCE.getKeepAliveDuration(response, context);
final long actualKeepAlive;
if (serverKeepAlive == -1) {
    actualKeepAlive = userConfiguredKeepAlive;
} else if (userConfiguredKeepAlive == -1) {
    actualKeepAlive = serverKeepAlive;
} else {
    actualKeepAlive = Math.min(serverKeepAlive, userConfiguredKeepAlive);
}
LOGGER.debug("effective HTTP connection keep-alive: [{}]ms", actualKeepAlive);
return actualKeepAlive;
Tweaked this part a bit more and added debug logging.
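The resolution rule in the snippet above can be tried in isolation. The following is a minimal sketch, not the code from the PR: the class name `KeepAliveResolution`, the constant `UNSET`, and the method `effectiveKeepAlive` are hypothetical names chosen for illustration, and the 3-minute value is only used as sample input. It assumes, as the snippet does, that `-1` stands for "no value" on either side.

```java
public class KeepAliveResolution {

    /** Sentinel for "no value": the server sent no Keep-Alive header, or the setting is -1. */
    public static final long UNSET = -1L;

    /**
     * Picks the keep-alive (in milliseconds) to apply to a connection.
     * If only one side has a value, that value wins; if both do, the shorter one wins.
     */
    public static long effectiveKeepAlive(long serverKeepAliveMs, long userConfiguredMs) {
        if (serverKeepAliveMs == UNSET) {
            return userConfiguredMs;
        }
        if (userConfiguredMs == UNSET) {
            return serverKeepAliveMs;
        }
        return Math.min(serverKeepAliveMs, userConfiguredMs);
    }

    public static void main(String[] args) {
        long threeMinutes = 180_000L; // sample user-configured TTL

        System.out.println(effectiveKeepAlive(UNSET, threeMinutes));    // 180000: no header, setting applies
        System.out.println(effectiveKeepAlive(5_000L, threeMinutes));   // 5000: server asks for less
        System.out.println(effectiveKeepAlive(600_000L, threeMinutes)); // 180000: setting caps the server value
        System.out.println(effectiveKeepAlive(UNSET, UNSET));           // -1: neither side sets a limit
    }
}
```

Taking the minimum means the setting can only shorten the lifetime a server advertises, never extend it.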
Everything that's here LGTM, but we do need a plan for automated testing.
    }
    LOGGER.debug("effective HTTP connection keep-alive: [{}]ms", actualKeepAlive);
    return actualKeepAlive;
});
I think we need to come up with a test strategy for this.
We could do it with a mock HTTP Server and ensure that the connection is pulled down, but that seems like a lot of work.
Is it possible to check the value configured on the client itself?
I came up with a way to test both the configured TTL and the actual behaviour of "not reusing the same connection". Please see 058872f
Co-authored-by: Tim Vernum <[email protected]>
httpServer.createContext("/", exchange -> {
    try {
        final int currentPort = exchange.getRemoteAddress().getPort();
        // Either set the first port number, otherwise the current (2nd) port number should be different from the 1st one
        if (false == firstClientPort.compareAndSet(null, currentPort)) {
            assertThat(currentPort, not(equalTo(firstClientPort.get())));
        }
        final byte[] bytes = randomByteArrayOfLength(2);
        exchange.sendResponseHeaders(200, bytes.length);
        exchange.getResponseBody().write(bytes);
    } finally {
        exchange.close();
    }
});
This tests that two requests separated by an interval greater than the TTL (1s) use different connections.
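The handler above relies on the client's remote port to tell connections apart. As a rough, standalone illustration of that idea (not the test from the PR), the sketch below uses the JDK's built-in `com.sun.net.httpserver.HttpServer` and plain sockets in place of the OIDC back-channel client. The class `ConnectionIdentityDemo` and its helpers are hypothetical names, and a second open socket stands in for the fresh connection a client would open after the TTL expired.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class ConnectionIdentityDemo {

    /** Reads one header line (without the trailing CRLF) from the stream. */
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            if (b != '\r') {
                buffer.write(b);
            }
        }
        return buffer.toString(StandardCharsets.US_ASCII);
    }

    /** Sends one GET over the given socket and consumes the full response, leaving the socket open. */
    static void get(Socket socket, int port) throws IOException {
        String request = "GET / HTTP/1.1\r\nHost: localhost:" + port + "\r\n\r\n";
        socket.getOutputStream().write(request.getBytes(StandardCharsets.US_ASCII));
        socket.getOutputStream().flush();
        InputStream in = socket.getInputStream();
        int contentLength = 0;
        String line;
        while ((line = readLine(in)).isEmpty() == false) {
            if (line.toLowerCase().startsWith("content-length:")) {
                contentLength = Integer.parseInt(line.substring(line.indexOf(':') + 1).trim());
            }
        }
        in.readNBytes(contentLength);
    }

    /** Returns the client ports the server saw for: new connection, reused connection, second connection. */
    public static List<Integer> observedPorts() {
        List<Integer> ports = new CopyOnWriteArrayList<>();
        try {
            InetAddress loopback = InetAddress.getLoopbackAddress();
            HttpServer server = HttpServer.create(new InetSocketAddress(loopback, 0), 0);
            server.createContext("/", exchange -> {
                try {
                    // The remote port identifies the client side of the TCP connection
                    ports.add(exchange.getRemoteAddress().getPort());
                    byte[] body = "ok".getBytes(StandardCharsets.US_ASCII);
                    exchange.sendResponseHeaders(200, body.length);
                    exchange.getResponseBody().write(body);
                } finally {
                    exchange.close();
                }
            });
            server.start();
            int port = server.getAddress().getPort();
            // Both sockets stay open together, so they cannot share a local port
            try (Socket first = new Socket(loopback, port); Socket second = new Socket(loopback, port)) {
                get(first, port);  // first request on a new connection
                get(first, port);  // same connection reused: same remote port
                get(second, port); // different connection: different remote port
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return ports;
    }

    public static void main(String[] args) {
        List<Integer> ports = observedPorts();
        System.out.println("reused connection keeps its port: " + ports.get(0).equals(ports.get(1)));
        System.out.println("new connection gets another port: " + (ports.get(0).equals(ports.get(2)) == false));
    }
}
```

The port comparison only proves that a different connection was used; whether the client closed the old one because of the TTL still has to be checked separately, which is what the log expectation in the PR covers.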
appender.addExpectation(
    new MockLogAppender.PatternSeenEventExpectation(
        "log",
        logger.getName(),
        Level.DEBUG,
        ".*Connection .* can be kept alive for 1.0 seconds"
    )
);
Testing for the configured TTL in action.
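To see what the pattern in that expectation accepts, it can be exercised with plain `java.util.regex`. This is only a sketch: the class `KeepAliveLogPattern` is a hypothetical name, the sample log lines are made up for illustration rather than copied from real client output, and it assumes the expectation matches the pattern against the whole message.

```java
import java.util.regex.Pattern;

public class KeepAliveLogPattern {

    // The pattern string is taken verbatim from the expectation above
    static final Pattern EXPECTED = Pattern.compile(".*Connection .* can be kept alive for 1.0 seconds");

    /** Returns true when the whole message matches the expected pattern. */
    public static boolean matches(String message) {
        return EXPECTED.matcher(message).matches();
    }

    public static void main(String[] args) {
        // Hypothetical log lines, for illustration only
        System.out.println(matches("Connection [id: 0] can be kept alive for 1.0 seconds")); // true
        System.out.println(matches("Connection [id: 0] can be kept alive for 3.0 seconds")); // false
        // The dot in "1.0" is unescaped, so it matches any single character
        System.out.println(matches("Connection [id: 0] can be kept alive for 1x0 seconds")); // true
    }
}
```

Escaping the dot (`1\\.0`) would make the pattern stricter, at the cost of a slightly less readable string.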
🚀 |
This PR adds a new setting to enable TCP keepalive probes for the connections used by the OIDC back-channel communication. It defaults to false to keep the existing behaviour. Relates: elastic#87773
This PR adds a new setting to enable TCP keepalive probes for the connections used by the OIDC back-channel communication. It defaults to true, as TCP keepalive is generally useful for ES. Relates: #87773
…7773) In some environments, the back-channel connection can be dropped without sending a TCP RST to ES. When that happens, reusing the same connection results in a timeout error. This PR adds a new http.connection_pool_ttl setting to control how long a connection in the OIDC back-channel pool can be idle before it is closed. This allows ES to close idle connections more actively and so avoid the timeout issue. NOTE: This is a "safe" backport of elastic#87773. The key difference is that the new setting is not configured by default, which means the PR introduces *zero* behaviour change by default. Users need to actively configure the new setting to enable the new behaviour ("automatically closing idle connections"). Backport: elastic#87773
…88363)
* New setting to close idle connections in OIDC back-channel (#87773)
In some environments, the back-channel connection can be dropped without sending a TCP RST to ES. When that happens, reusing the same connection results in a timeout error. This PR adds a new http.connection_pool_ttl setting to control how long a connection in the OIDC back-channel pool can be idle before it is closed. This allows ES to close idle connections more actively and so avoid the timeout issue.
NOTE: This is a "safe" backport of #87773. The key difference is that the new setting is not configured by default, which means the PR introduces *zero* behaviour change by default. Users need to actively configure the new setting to enable the new behaviour ("automatically closing idle connections").
Backport: #87773
* Make it safe by keeping default behaviour
* tweak
* Update x-pack/plugin/security/src/test/java/org/elasticsearch/xpack/security/authc/oidc/OpenIdConnectAuthenticatorTests.java
Co-authored-by: Tim Vernum <[email protected]>
* address feedback
Co-authored-by: Tim Vernum <[email protected]>
…88412) In some environments, the back-channel connection can be dropped without sending a TCP RST to ES. When that happens, reusing the same connection results in a timeout error. This PR adds a new http.connection_pool_ttl setting to control how long a connection in the OIDC back-channel pool can be idle before it is closed. This allows ES to close idle connections more actively and so avoid the timeout issue. NOTE: This is a "safe" backport of #87773. The key difference is that the new setting is not configured by default, which means the PR introduces *zero* behaviour change by default. Users need to actively configure the new setting to enable the new behaviour ("automatically closing idle connections"). Backport: #87773
In some environments, the back-channel connection can be dropped
without sending a TCP RST to ES. When that happens, reusing the same
connection results in a timeout error.
This PR adds a new http.connection_pool_ttl setting to control how long
a connection in the OIDC back-channel pool can be idle before it is
closed. This allows ES to close idle connections more actively and so
avoid the timeout issue.
The new setting defaults to 3 minutes, which means idle connections are
closed after 3 minutes unless the server response specifies a shorter keep-alive.
Resolves: #75515