You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
In load tests performed on the most recent snapshot of the OpenSearch main branch, we sporadically encountered this exception:
2024-08-14T09:10:30,251][ERROR][o.o.s.f.SecurityFilter ] [os-cluster-data-0] Unexpected exception java.util.ConcurrentModificationException
java.util.ConcurrentModificationException: null
at java.base/java.util.HashMap$HashIterator.nextNode(HashMap.java:1605) ~[?:?]
at java.base/java.util.HashMap$KeyIterator.next(HashMap.java:1628) ~[?:?]
at java.base/java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1078) ~[?:?]
at java.base/java.util.AbstractCollection.addAll(AbstractCollection.java:337) ~[?:?]
at java.base/java.util.HashSet.<init>(HashSet.java:121) ~[?:?]
at org.opensearch.security.securityconf.ConfigModelV7$RoleMappingHolder.map(ConfigModelV7.java:1201) ~[opensearch-security-3.0.0.0.jar:3.0.0.0]
at org.opensearch.security.securityconf.ConfigModelV7.mapSecurityRoles(ConfigModelV7.java:1265) ~[opensearch-security-3.0.0.0.jar:3.0.0.0]
at org.opensearch.security.privileges.PrivilegesEvaluator.mapRoles(PrivilegesEvaluator.java:563) ~[opensearch-security-3.0.0.0.jar:3.0.0.0]
at org.opensearch.security.privileges.PrivilegesEvaluator.createContext(PrivilegesEvaluator.java:236) ~[opensearch-security-3.0.0.0.jar:3.0.0.0]
at org.opensearch.security.filter.SecurityFilter.apply0(SecurityFilter.java:382) [opensearch-security-3.0.0.0.jar:3.0.0.0]
So, it seems that the set returned by user.getSecurityRoles() was concurrently modified while the call new HashSet<>() was iterating through it to create a copy.
This is first surprising because the User class wraps the securityRoles HashSet in Collections.synchronizedSet():
That means that iteration through synchronized collection elements is not automatically synchronized.
However, the new HashSet<>() constructor under the hood creates an iterator on the securityRoles HashSet without further synchronization. Thus, there is the chance that it stumbles over a concurrent modification.
A quick fix for this would be not to use the copy constructor of HashSet but to manually add the roles to the new HashSet one-by-one in a synchronized block. However, this will cause a performance penalty, as the object will be locked for a significantly longer time and during that time not available for concurrent access.
How can one reproduce the bug?
Very difficult to reliably reproduce as this is a race conditions. Appears in load tests after a while.
What is the expected behavior?
No ConcurrentModificationException
What is your host/environment?
OS: Linux
Version: master snapshot
Plugins: security plugin
The text was updated successfully, but these errors were encountered:
nibix
added
bug
Something isn't working
untriaged
Require the attention of the repository maintainers and may need to be prioritized
labels
Aug 14, 2024
In the end, a fundamental object like User should really be immutable. Then, the need for all the synchronization will vanish, as no concurrent modifications can occur.
This has many advantages:
Faster due to no restrictions on concurrent access
More reliable due to the impossibility of race conditions
Safer because erroneous modifications of the User object are prevented by the compiler
Allows further optimization for serialization, such as caching/pre-calculating the serialized object instead of re-building every time it is needed.
Possibly, we can make such a change in the context of #3870
[Triage] Hi @nibix, thanks for filing this issue. This seems similar to the previous ticket: #4441. Since there is a lot of open discussion on the previous issue, I am going to close this ticket but feel free to contribute your findings to the previous issue.
stephen-crawford
added
triaged
Issues labeled as 'Triaged' have been reviewed and are deemed actionable.
and removed
untriaged
Require the attention of the repository maintainers and may need to be prioritized
labels
Aug 19, 2024
What is the bug?
In load tests performed on the most recent snapshot of the OpenSearch main branch, we sporadically encountered this exception:
Line 1201 in
ConfigModelV7
is this:security/src/main/java/org/opensearch/security/securityconf/ConfigModelV7.java
Line 1201 in a6fb2d4
So, it seems that the set returned by
user.getSecurityRoles()
was concurrently modified while the callnew HashSet<>()
was iterating through it to create a copy.This is first surprising because the
User
class wraps thesecurityRoles
HashSet inCollections.synchronizedSet()
:security/src/main/java/org/opensearch/security/user/User.java
Line 76 in a6fb2d4
However, the docs for
Collections.synchronizedSet()
contain a note which often gets overlooked:https://github.com/openjdk/jdk/blob/38bd8a36704a962f0ad1052fd2ec150a61663256/src/java.base/share/classes/java/util/Collections.java#L2382-L2394
That means that iteration through synchronized collection elements is not automatically synchronized.
However, the
new HashSet<>()
constructor under the hood creates an iterator on thesecurityRoles
HashSet without further synchronization. Thus, there is the chance that it stumbles over a concurrent modification.A quick fix for this would be not to use the copy constructor of
HashSet
but to manually add the roles to the new HashSet one-by-one in a synchronized block. However, this will cause a performance penalty, as the object will be locked for a significantly longer time and during that time not available for concurrent access.How can one reproduce the bug?
Very difficult to reliably reproduce as this is a race conditions. Appears in load tests after a while.
What is the expected behavior?
No ConcurrentModificationException
What is your host/environment?
The text was updated successfully, but these errors were encountered: