[BUG] DLS performance has regressed with new serialization format #3776
Comments
I've tracked the problem to this serialization code, which seems to take ~1ms in 2.3 but ~13ms in 2.11:
security/src/main/java/org/opensearch/security/configuration/DlsFlsValveImpl.java, lines 455 to 461 in 6b8a3e4
Is there a JDK version difference between 2.3 and 2.11 in the test?
It uses the default JDK on your machine; to specify, set
Previous testing wasn't properly switching between using only the custom and only the JDK (de)serializers, which led to incomplete results. It's clear that the custom serialization process is causing the issue, with a 981ms response time compared to 63ms.
Test methodology
Modified Ran with
Custom Serialization Enabled Results
JDK Serialization Enabled Results
@parasjain1 - can you look into whether this setting can be disabled or fixed? I think this would justify a patch release - happy to help drive that process.
Test carried out for a map with duplicate values: https://gist.github.com/mgodwan/52801a25c13a9b0a00c6de0e7cbb0a73
The serialized data is smaller with JDK serialization because it uses ObjectOutputStream, which does not write duplicate objects again but only writes handles to them. I agree that it would help to have a config to control the serialization format on the fly, while also checking whether we can get a similar optimization in the OpenSearch core streams we use. I can create an issue for that.
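The handle behaviour described above can be checked with a small experiment. This is a minimal sketch, not the code from the linked gist: the class name `JdkHandleDemo` and its helper methods are made up for illustration. It relies on the fact that `ObjectOutputStream` tracks objects by identity, so a value that appears repeatedly as the same instance is written once and afterwards referenced by a handle, while equal but separately allocated copies are written in full every time.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical demo class; not part of the security plugin.
final class JdkHandleDemo {

    /** Serializes the value with plain JDK serialization and returns the byte count. */
    static int jdkSerializedSize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    /** Every key maps to a set holding the SAME query String instance. */
    static Map<String, Set<String>> sharedValues(String query, int keys) {
        Map<String, Set<String>> map = new HashMap<>();
        for (int i = 0; i < keys; i++) {
            Set<String> queries = new HashSet<>();
            queries.add(query); // same object each time, so later writes become handles
            map.put("index-" + i, queries);
        }
        return map;
    }

    /** Every key maps to a set holding an equal but DISTINCT copy of the query. */
    static Map<String, Set<String>> copiedValues(String query, int keys) {
        Map<String, Set<String>> map = new HashMap<>();
        for (int i = 0; i < keys; i++) {
            Set<String> queries = new HashSet<>();
            queries.add(new String(query)); // new object, so it is written in full
            map.put("index-" + i, queries);
        }
        return map;
    }

    public static void main(String[] args) {
        String query = "{\"term\":{\"field\":\"" + "x".repeat(1000) + "\"}}";
        System.out.println("shared instances: " + jdkSerializedSize(sharedValues(query, 100)) + " bytes");
        System.out.println("distinct copies:  " + jdkSerializedSize(copiedValues(query, 100)) + " bytes");
    }
}
```

Running `main` prints both sizes; the map built from shared instances should come out considerably smaller, which matches the observation about duplicate values.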
Would it make sense to update the i.e.
I had an idea to de-dupe the strings before serializing the objects, and I ran a few serialization benchmarks with this optimization for the Map<String, Set> object that holds the DLS/FLS queries. The results showed that although it brought a major improvement, it was still a lot slower than JDK serialization. Sharing results below -
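One way the de-duplication idea could look is a string table: each distinct string is written once, and the map itself is written as indexes into that table. The sketch below is only an illustration under that assumption. The class `StringTableCodec` and its byte layout are hypothetical and are neither the benchmarked code nor the plugin's custom format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical codec; the layout is an assumption made for this example.
final class StringTableCodec {

    static byte[] encode(Map<String, Set<String>> queries) {
        // Assign an index to every distinct string (keys and values alike).
        Map<String, Integer> table = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> entry : queries.entrySet()) {
            table.putIfAbsent(entry.getKey(), table.size());
            for (String query : entry.getValue()) {
                table.putIfAbsent(query, table.size());
            }
        }
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            // Section 1: the string table, each string written exactly once.
            out.writeInt(table.size());
            for (String s : table.keySet()) {
                byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
                out.writeInt(utf8.length);
                out.write(utf8);
            }
            // Section 2: the map, expressed as indexes into the table.
            out.writeInt(queries.size());
            for (Map.Entry<String, Set<String>> entry : queries.entrySet()) {
                out.writeInt(table.get(entry.getKey()));
                out.writeInt(entry.getValue().size());
                for (String query : entry.getValue()) {
                    out.writeInt(table.get(query));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Map<String, Set<String>> decode(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String[] table = new String[in.readInt()];
            for (int i = 0; i < table.length; i++) {
                byte[] utf8 = new byte[in.readInt()];
                in.readFully(utf8);
                table[i] = new String(utf8, StandardCharsets.UTF_8);
            }
            int entries = in.readInt();
            Map<String, Set<String>> result = new HashMap<>();
            for (int i = 0; i < entries; i++) {
                String key = table[in.readInt()];
                int size = in.readInt();
                Set<String> values = new HashSet<>();
                for (int j = 0; j < size; j++) {
                    values.add(table[in.readInt()]);
                }
                result.put(key, values);
            }
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With many indexes sharing one long query, the encoded form stores the query text once plus a four-byte index per use. The extra pass needed to build the table is one plausible place for cost to remain, though this sketch makes no claim about where the benchmarked time went.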
In response to @cwperks's suggestion to use JDK serde in case of I've concluded that we can introduce a way to enable / disable custom serialization. The idea is to allow users with limited or no DLS/FLS use cases to continue leveraging the faster custom serialization protocol in other scenarios. Below is the approach that I'll be implementing. This assumes that the changes will be part of a patch release version.
Have raised a PR - #3789
[Triage] Marking this as triaged since it is being addressed by Paras' PR.
It's not a quick fix, but I just wanted to point out that #3870 would fix this, as the concept proposes a new DLS/FLS implementation which no longer relies on the serialization of DLS/FLS configurations.
I confirm that the DLS performance problem still exists in version 2.16. We had to roll back to 2.10. This is a blocker for our production scenarios: after the upgrade, the OpenSearch cluster becomes completely unusable once non-administrator users start using the application.
We are able to reproduce the problem locally on a vanilla OpenSearch Docker setup. Using 100 indexes (clones of the sample ecommerce data with 2K documents), we confirmed with JMeter the same behavior that is described in this bug.
Version | Admin Rights (Call / Sec) | Limited Rights via DLS (Call / Sec)
dls filter : POST path /sample_data_ecommerce/_search?request_cache=false
@pmarjou22 Thanks for posting. Could you create a new issue in this repo with these details? It looks like there might be a second performance-impacting issue: the drop from 2.10 -> 2.11 is explained by #2802, but the drop from 2.13 -> 2.14 might be another issue. Could you help us determine the root cause by running jprofiler on the OpenSearch v2.16+ node that processes the request?
What is the bug?
Some users of OpenSearch have seen a performance decrease associated with DLS queries. Larger DLS queries (more characters) have a larger impact.
How can one reproduce the bug?
git clone https://github.com/peternied/security.git
cd security
git checkout dls-perf
./gradlew integrationTest --tests org.opensearch.security.DlsTests.testDlsLargerQueryScenarios -x jacocoTestReport
./build/reports/tests/integrationTest/classes/org.opensearch.security.DlsTests.html
What is the expected behavior?
After the DLS_ONLY_LONG_VALUE is added, the AVG should not jump up so much.
Do you have any additional context?
You can collect measures from the 2.3 build by running
git checkout dls-perf
from that branch to collect numbers from that version.
This issue was introduced in the following PR, which did improve serialization in many scenarios but seems to be impacting DLS queries as stored in headers.