-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New setting to close idle connections in OIDC back-channel (#87773) #88363
Conversation
…7773) In some environment, the back-channel connection can be dropped without sending a TCP RST to ES. When that happens, reusing the same connection results into timeout error. This PR adds a new http.connection_pool_ttl setting to control how long a connection in the OIDC back-channel pool can be idle before it is closed. This allows ES to more actively close idle connections to avoid the timeout issue. NOTE: This is a "safe" backport of elastic#87773. The key difference here is that the new setting is by default not configured, which means the PR introduces *zero* behaviour change by default. Users need to actively configure the new setting to enable the new behaviour ("automatically closing idle connections). Backport: elastic#87773
A few things for a "safe" backport:
|
public void testKeepAliveStrategyNotConfigured() throws IllegalAccessException, URISyntaxException { | ||
final Settings.Builder settingsBuilder = getBasicRealmSettings(); | ||
if (randomBoolean()) { | ||
// Only negative time value allowed is -1, but it can take many forms, e.g. -1, -01, -1s, -001nanos etc. | ||
settingsBuilder.put( | ||
getFullSettingKey(REALM_NAME, OpenIdConnectRealmSettings.HTTP_CONNECTION_POOL_TTL), | ||
"-" | ||
+ Strings.collectionToDelimitedString(randomList(0, 10, () -> "0"), "") | ||
+ "1" | ||
+ randomFrom("", "nanos", "micros", "ms", "s", "m", "h", "d") | ||
); | ||
} | ||
final RealmConfig config = buildConfig(settingsBuilder.build(), threadContext); | ||
final Logger logger = LogManager.getLogger(OpenIdConnectAuthenticator.class); | ||
final MockLogAppender appender = new MockLogAppender(); | ||
appender.start(); | ||
Loggers.addAppender(logger, appender); | ||
Loggers.setLevel(logger, Level.DEBUG); | ||
try { | ||
appender.addExpectation( | ||
new MockLogAppender.UnseenEventExpectation( | ||
"configure keep-alive strategy", | ||
logger.getName(), | ||
Level.DEBUG, | ||
"configuring keep-alive strategy for http client used by oidc back-channel" | ||
) | ||
); | ||
authenticator = new OpenIdConnectAuthenticator(config, getOpConfig(), getDefaultRpConfig(), new SSLService(env), null); | ||
appender.assertAllExpectationsMatched(); | ||
assertThat(authenticator.getKeepAliveStrategy(), nullValue()); | ||
} finally { | ||
Loggers.removeAppender(logger, appender); | ||
appender.stop(); | ||
Loggers.setLevel(logger, (Level) null); | ||
authenticator.close(); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a new test for keeping the existing behaviour, i.e. no keep-alive strategy is configured.
...c/test/java/org/elasticsearch/xpack/security/authc/oidc/OpenIdConnectAuthenticatorTests.java
Outdated
Show resolved
Hide resolved
"configure keep-alive strategy", | ||
logger.getName(), | ||
Level.DEBUG, | ||
"configuring keep-alive strategy for http client used by oidc back-channel" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern here is that testing Unseen
is high risk. A tiny change in the message in the authenticator will mean this test no longer achieves anything.
The best suggestion I have is to extract this test into a method that takes a boolean shouldSeeLogMessage
parameter. And then call it with true
in the first test, and false
here. Then we know that changes to the log message will cause one test to fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR effectively does this already. The postive assertion for the logging message is in another test method https://github.com/elastic/elasticsearch/pull/88363/files#diff-85fbc5dd474fc40cca2ada8babe33ea07b5e0db7a432d2424534f74235a5d7ffR1184
Alternatively, I can also add another log message in main code when keep-alive strategy is null. So testing this new message will be a postive assertion. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another two things to consider:
- The test also asserts getKeepaliveStrategy returns null. So it's not completely useless if the logging message is changed.
- This is only for the patch release. So it probably won't see much change in this area if at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would be happy with dropping the logging based check entirely. I don't think it's really needed when we have the null
check. But I think in its current state it's fragile, so if we want to keep it we should make it reliable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dropped logging check as suggested. Thanks!
…ecurity/authc/oidc/OpenIdConnectAuthenticatorTests.java Co-authored-by: Tim Vernum <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@elasticmachine run elasticsearch-ci/docs |
In some environment, the back-channel connection can be dropped
without sending a TCP RST to ES. When that happens, reusing the same
connection results into timeout error.
This PR adds a new http.connection_pool_ttl setting to control how long
a connection in the OIDC back-channel pool can be idle before it is
closed. This allows ES to more actively close idle connections to avoid
the timeout issue.
NOTE: This is a "safe" backport of #87773. The key difference here is
that the new setting is by default not configured, which means the PR
introduces zero behaviour change by default. Users need to actively
configure the new setting (to a value other than -1) to enable the new
behaviour ("automatically closing idle connections).
Backport: #87773