SearchGroupsResolverInMemoryTests leaking threads #80305
Comments
Pinging @elastic/es-security (Team:Security)
This one has me stumped so far. I suspect it's a real issue (though maybe only in test code), but I cannot make sense of it. I cannot work out which of the test's steps is failing, and since I can't reproduce it, I'm having trouble getting any diagnostics on it.
Update: I successfully reproduced the error after 100 attempts on a Debian 10 CI Worker.
Update: After 3000 executions it failed. It appears to be some form of locking contention (possibly a deadlock, but I haven't gotten deep enough to know). I've added more logging and will keep trying.
Had a PR build fail with this leak error today.
Another failure: https://gradle-enterprise.elastic.co/s/3wjqo4lrylpqc
I think I've tracked it down and raised an upstream issue, pingidentity/ldapsdk#120 (it looks like a real, though very infrequent, thread leak in the LDAP SDK). I'll see if there's something we can do locally to avoid this until we can get a fix from upstream.
The LDAP SDK has a race condition where closing a connection while an async search is still executing could lead to a Timer thread being orphaned. See: pingidentity/ldapsdk#120. This commit changes SearchGroupsResolverInMemoryTests so that it waits for the pending async search to complete (or time out) before returning. This ensures that when we close the connection, the Timer thread is cancelled (and stays cancelled). Resolves: #80305
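For readers unfamiliar with the pattern, here is a minimal sketch of "wait for the pending async operation (or time out), then close". It is not the actual change made to SearchGroupsResolverInMemoryTests and does not use the LDAP SDK API. `FakeConnection` and `awaitThenClose` are hypothetical names, and a plain `CompletableFuture` stands in for the pending async search.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class AwaitBeforeClose {

    /** Hypothetical stand-in for a connection that owns background resources. */
    static final class FakeConnection implements AutoCloseable {
        private volatile boolean closed = false;

        @Override
        public void close() {
            closed = true;
        }

        boolean isClosed() {
            return closed;
        }
    }

    /**
     * Waits for the pending operation to finish (successfully or not), or for
     * the timeout to elapse, and only then closes the connection.
     *
     * @return true if the operation finished before the timeout, false otherwise
     */
    static boolean awaitThenClose(CompletableFuture<?> pending, FakeConnection connection, long timeoutMillis) {
        boolean finished;
        try {
            pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
            finished = true;
        } catch (ExecutionException | CancellationException e) {
            // The operation ended (with a failure or a cancellation), so nothing is still in flight.
            finished = true;
        } catch (TimeoutException e) {
            // Still running after the timeout: we close anyway, but report it.
            finished = false;
        } catch (InterruptedException e) {
            // Restore the interrupt flag for callers and report "not finished".
            Thread.currentThread().interrupt();
            finished = false;
        } finally {
            // The close always happens after the wait, never concurrently with it.
            connection.close();
        }
        return finished;
    }

    public static void main(String[] args) {
        FakeConnection connection = new FakeConnection();
        CompletableFuture<String> search = CompletableFuture.supplyAsync(() -> "result");
        boolean finished = awaitThenClose(search, connection, 1_000);
        System.out.println("finished=" + finished + ", closed=" + connection.isClosed());
    }
}
```

The point of the ordering is that the close happens only after the operation has either ended or been given a bounded chance to end, so cleanup does not race with work that is still starting up.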
There was another failure for this test:
Another failure from today:
I don't think there's any way to reliably work around the upstream issue. I'm going to have to mute the test until there's a new release of the SDK.
The new release contains fixes for leaking threads (see elastic#80305) and bias in round robin server sets, both of which are relevant to Elasticsearch security. Resolves: elastic#80305
I believe this is still failing in 8.0
we just need to merge #81581, I kicked the CI tires again
Sorry about the delayed backport. Log4j, and all that jazz.
The new release contains fixes for leaking threads (see #80305) and bias in round robin server sets, both of which are relevant to Elasticsearch security. Resolves: #80305 Co-authored-by: Elastic Machine
Resolved by #81581
Build scan: sample scan
Repro line: None given
Reproduces locally?: n/a, see above
Applicable branches: 8.0, master
Failure history: build-stats - looks like failures with this error go back to Oct. 18 at least.
Failure excerpt: