Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New setting to close idle connections in OIDC back-channel (#87773) #88363

Merged
merged 5 commits into from
Jul 8, 2022

Conversation

ywangd
Copy link
Member

@ywangd ywangd commented Jul 8, 2022

In some environment, the back-channel connection can be dropped
without sending a TCP RST to ES. When that happens, reusing the same
connection results into timeout error.

This PR adds a new http.connection_pool_ttl setting to control how long
a connection in the OIDC back-channel pool can be idle before it is
closed. This allows ES to more actively close idle connections to avoid
the timeout issue.

NOTE: This is a "safe" backport of #87773. The key difference here is
that the new setting is by default not configured, which means the PR
introduces zero behaviour change by default. Users need to actively
configure the new setting (to a value other than -1) to enable the new
behaviour ("automatically closing idle connections).

Backport: #87773

ywangd added 2 commits July 8, 2022 10:44
…7773)

In some environment, the back-channel connection can be dropped
without sending a TCP RST to ES. When that happens, reusing the same
connection results into timeout error.

This PR adds a new http.connection_pool_ttl setting to control how long
a connection in the OIDC back-channel pool can be idle before it is
closed. This allows ES to more actively close idle connections to avoid
the timeout issue.

NOTE: This is a "safe" backport of elastic#87773. The key difference here is
that the new setting is by default not configured, which means the PR
introduces *zero* behaviour change by default. Users need to actively
configure the new setting to enable the new behaviour ("automatically
closing idle connections).

Backport: elastic#87773
@ywangd
Copy link
Member Author

ywangd commented Jul 8, 2022

A few things for a "safe" backport:

  1. Default of the new setting is now -1
  2. No behaviour change if either the setting is not configured or configured to be something like -1.
  3. Removed docs since the behavior is temporary on the patch
  4. Changed PR title and description as well as the change log summary.

Comment on lines 1055 to 1091
public void testKeepAliveStrategyNotConfigured() throws IllegalAccessException, URISyntaxException {
final Settings.Builder settingsBuilder = getBasicRealmSettings();
if (randomBoolean()) {
// Only negative time value allowed is -1, but it can take many forms, e.g. -1, -01, -1s, -001nanos etc.
settingsBuilder.put(
getFullSettingKey(REALM_NAME, OpenIdConnectRealmSettings.HTTP_CONNECTION_POOL_TTL),
"-"
+ Strings.collectionToDelimitedString(randomList(0, 10, () -> "0"), "")
+ "1"
+ randomFrom("", "nanos", "micros", "ms", "s", "m", "h", "d")
);
}
final RealmConfig config = buildConfig(settingsBuilder.build(), threadContext);
final Logger logger = LogManager.getLogger(OpenIdConnectAuthenticator.class);
final MockLogAppender appender = new MockLogAppender();
appender.start();
Loggers.addAppender(logger, appender);
Loggers.setLevel(logger, Level.DEBUG);
try {
appender.addExpectation(
new MockLogAppender.UnseenEventExpectation(
"configure keep-alive strategy",
logger.getName(),
Level.DEBUG,
"configuring keep-alive strategy for http client used by oidc back-channel"
)
);
authenticator = new OpenIdConnectAuthenticator(config, getOpConfig(), getDefaultRpConfig(), new SSLService(env), null);
appender.assertAllExpectationsMatched();
assertThat(authenticator.getKeepAliveStrategy(), nullValue());
} finally {
Loggers.removeAppender(logger, appender);
appender.stop();
Loggers.setLevel(logger, (Level) null);
authenticator.close();
}
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a new test for keeping the existing behaviour, i.e. no keep-alive strategy is configured.

"configure keep-alive strategy",
logger.getName(),
Level.DEBUG,
"configuring keep-alive strategy for http client used by oidc back-channel"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern here is that testing Unseen is high risk. A tiny change in the message in the authenticator will mean this test no longer achieves anything.

The best suggestion I have is to extract this test into a method that takes a boolean shouldSeeLogMessage parameter. And then call it with true in the first test, and false here. Then we know that changes to the log message will cause one test to fail.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR effectively does this already. The postive assertion for the logging message is in another test method https://github.com/elastic/elasticsearch/pull/88363/files#diff-85fbc5dd474fc40cca2ada8babe33ea07b5e0db7a432d2424534f74235a5d7ffR1184

Alternatively, I can also add another log message in main code when keep-alive strategy is null. So testing this new message will be a postive assertion. What do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another two things to consider:

  1. The test also asserts getKeepaliveStrategy returns null. So it's not completely useless if the logging message is changed.
  2. This is only for the patch release. So it probably won't see much change in this area if at all.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be happy with dropping the logging based check entirely. I don't think it's really needed when we have the null check. But I think in its current state it's fragile, so if we want to keep it we should make it reliable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dropped logging check as suggested. Thanks!

ywangd and others added 2 commits July 8, 2022 12:32
…ecurity/authc/oidc/OpenIdConnectAuthenticatorTests.java

Co-authored-by: Tim Vernum <[email protected]>
@ywangd ywangd requested a review from tvernum July 8, 2022 03:58
Copy link
Contributor

@tvernum tvernum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ywangd ywangd added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Jul 8, 2022
@ywangd
Copy link
Member Author

ywangd commented Jul 8, 2022

@elasticmachine run elasticsearch-ci/docs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) backport v8.3.3
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants