Introduced writing layer, getting rid of writing logic that uses an absolute path in the filesystem. #2241

Merged

Conversation


@0ctopus13prime (Contributor) commented Oct 29, 2024

Description

This PR introduces an abstract writing layer into the native engines (NMSLIB, Faiss).
With it, the native engines rely on a write interface to do their IO instead of depending directly on the File API.
Thanks to this layer, a different kind of underlying storage can be used to save a vector index; the default currently in use is FSDirectory, which saves the index in the local file system.
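
For illustration, a minimal sketch of the shape such a write interface can take. The class names below are hypothetical, not the ones this PR introduces; the sketch only shows how an engine can write through an abstraction with a filesystem default:

#include <cstddef>
#include <cstdio>
#include <stdexcept>

// Hypothetical write interface: engines append bytes through it and never
// touch the File API directly, so the underlying storage can be swapped out.
class VectorIndexWriter {
 public:
  virtual ~VectorIndexWriter() = default;
  virtual void writeBytes(const char *buf, std::size_t len) = 0;
};

// Hypothetical default implementation: plain local-filesystem storage,
// analogous to the FSDirectory-backed default described above.
class FileSystemWriter final : public VectorIndexWriter {
 public:
  explicit FileSystemWriter(const char *path) : file_(std::fopen(path, "wb")) {
    if (file_ == nullptr) throw std::runtime_error("failed to open file");
  }
  ~FileSystemWriter() override {
    if (file_ != nullptr) std::fclose(file_);
  }
  void writeBytes(const char *buf, std::size_t len) override {
    if (std::fwrite(buf, 1, len, file_) != len) throw std::runtime_error("short write");
  }
 private:
  std::FILE *file_;
};

An engine that serializes through such an interface does not need to know whether the bytes land on local disk or somewhere else.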

Related Issues

RFC : #2033

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@0ctopus13prime (Contributor, Author)

Testing results for 10M 768-dimensional vectors are coming shortly.

@jmazanec15 (Member) left a comment


Finished first pass - overall looks pretty good - have a few questions.

Review threads on: jni/include/faiss_stream_support.h, jni/include/jni_util.h, jni/include/native_engines_stream_support.h, jni/src/faiss_index_service.cpp, jni/src/faiss_wrapper.cpp, jni/src/nmslib_wrapper.cpp
@navneet1v (Collaborator) left a comment


Please add doc comments on all public functions and classes, in both the C++ and Java code.

Review threads on: CHANGELOG.md, jni/include/native_engines_stream_support.h, jni/include/faiss_stream_support.h, jni/src/nmslib_wrapper.cpp, jni/src/faiss_wrapper.cpp
@0ctopus13prime force-pushed the writing-layer-dev-1 branch 2 times, most recently from b0435ee to 16be6d0, on October 31, 2024 at 16:36.
@jmazanec15 (Member)

Rolling BWC is expected to fail due to the pending version release - ignore.

@0ctopus13prime can you sign off the commit? The DCO check seems to be failing.
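
For a missing sign-off, the usual fix is to amend the commit and force-push (sketch; assumes the PR branch is writing-layer-dev-1 and only the latest commit needs the trailer):

# Add a Signed-off-by trailer to the latest commit, keeping its message
git commit --amend --signoff --no-edit
# Update the PR branch on GitHub
git push --force-with-lease origin writing-layer-dev-1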

Review threads on: jni/include/faiss_stream_support.h, jni/include/jni_util.h, jni/include/native_engines_stream_support.h, jni/include/nmslib_stream_support.h
Commit: …bsolute path in the filesystem.
Signed-off-by: Dooyong Kim <[email protected]>
@0ctopus13prime (Contributor, Author)

@Vikasht34
Hi Vikash, I added a nullptr check for each parameter. Please take a look when you're free.

  explicit FaissOpenSearchIOReader(NativeEngineIndexInputMediator *_mediator)
      : faiss::IOReader(),
        // Fail fast on a null mediator.
        mediator(knn_jni::util::ParameterCheck::require_non_null(_mediator, "mediator")) {
    name = "FaissOpenSearchIOReader";
  }



#include <stdexcept>
#include <string>

struct ParameterCheck {
  template<typename PtrType>
  static PtrType *require_non_null(PtrType *ptr, const char *parameter_name) {
    if (ptr == nullptr) {
      throw std::invalid_argument(std::string("Parameter [") + parameter_name + "] should not be null.");
    }
    return ptr;
  }
};
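
A quick usage sketch of the check above (illustrative only, assuming the two definitions above are in scope):

#include <stdexcept>

// Constructing the reader with a null mediator now fails fast with a
// descriptive message instead of crashing later inside Faiss.
void example() {
  NativeEngineIndexInputMediator *mediator = nullptr;
  try {
    FaissOpenSearchIOReader reader{mediator};  // throws std::invalid_argument
  } catch (const std::invalid_argument &e) {
    // e.what() == "Parameter [mediator] should not be null."
  }
}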

@0ctopus13prime (Contributor, Author)

Benchmark Environment

  • OpenSearch Version : 2.18 (backporting this change to 2.xx branch)
  • Primary shards : 3
  • No replica
  • Data node : r7gd.4xlarge (16 vCPU, 128 GiB memory)
  • Storage : SSD, EBS, 5000 IOPS
  • JVM Heap : -Xmx32g -Xms32g
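
For reference, heap flags like these normally live in OpenSearch's config/jvm.options (sketch):

-Xms32g
-Xmx32g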

@0ctopus13prime (Contributor, Author)

@navneet1v
Will paste this to the RFC issue as well!

Faiss benchmark results conclusion

From the results, total cumulative indexing time is expected to increase by up to 2% (67.68 min → 68.99 min), which means bulk indexing throughput can decrease by about 1.5% (7,861 → 7,745 docs/s).
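
For reference, the Diff column in the tables below is the relative change against the baseline; for example, for cumulative indexing time:

\[
\text{Diff} = \frac{\text{candidate} - \text{baseline}}{\text{baseline}} = \frac{68.9868 - 67.6766}{67.6766} \approx 0.0194 \;(+1.9\%)
\]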

Faiss benchmark details

| Metric | Task | Baseline | Candidate | Diff (ratio) | Unit |
|---|---|---|---|---|---|
| Cumulative indexing time of primary shards | | 67.6766 | 68.9868 | 0.01935972 | min |
| Min cumulative indexing time across primary shards | | 0 | 0 | 0 | min |
| Median cumulative indexing time across primary shards | | 10.92455 | 11.3572 | 0.039603462 | min |
| Max cumulative indexing time across primary shards | | 22.9782 | 23.4515 | 0.020597784 | min |
| Cumulative indexing throttle time of primary shards | | 0 | 0 | 0 | min |
| Min cumulative indexing throttle time across primary shards | | 0 | 0 | 0 | min |
| Median cumulative indexing throttle time across primary shards | | 0 | 0 | 0 | min |
| Max cumulative indexing throttle time across primary shards | | 0 | 0 | 0 | min |
| Cumulative merge time of primary shards | | 455.27 | 463.033 | 0.01705142 | min |
| Cumulative merge count of primary shards | | 628 | 631 | 0.00477707 | |
| Min cumulative merge time across primary shards | | 0 | 0 | 0 | min |
| Median cumulative merge time across primary shards | | 74.92695 | 75.6642 | 0.009839584 | min |
| Max cumulative merge time across primary shards | | 154.315 | 158.499 | 0.027113372 | min |
| Cumulative merge throttle time of primary shards | | 25.78985 | 22.7165 | -0.119168975 | min |
| Min cumulative merge throttle time across primary shards | | 0 | 0 | 0 | min |
| Median cumulative merge throttle time across primary shards | | 3.979025 | 3.18752 | -0.198919333 | min |
| Max cumulative merge throttle time across primary shards | | 9.53849 | 8.29422 | -0.130447272 | min |
| Cumulative refresh time of primary shards | | 0.8357415 | 0.8457 | 0.011915766 | min |
| Cumulative refresh count of primary shards | | 475.5 | 473 | -0.005257624 | |
| Min cumulative refresh time across primary shards | | 0 | 0 | 0 | min |
| Median cumulative refresh time across primary shards | | 0.134925 | 0.132867 | -0.015252918 | min |
| Max cumulative refresh time across primary shards | | 0.2874665 | 0.302833 | 0.053454924 | min |
| Cumulative flush time of primary shards | | 15.55735 | 15.5443 | -0.000838832 | min |
| Cumulative flush count of primary shards | | 350 | 349 | -0.002857143 | |
| Min cumulative flush time across primary shards | | 0 | 0 | 0 | min |
| Median cumulative flush time across primary shards | | 2.5186 | 2.5161 | -0.000992615 | min |
| Max cumulative flush time across primary shards | | 5.39728 | 5.36828 | -0.005373077 | min |
| Total Young Gen GC time | | 1.469 | 1.706 | 0.161334241 | s |
| Total Young Gen GC count | | 160 | 174 | 0.0875 | |
| Total Old Gen GC time | | 0 | 0 | 0 | s |
| Total Old Gen GC count | | 0 | 0 | 0 | |
| Store size | | 340.746 | 340.744 | -5.86947E-06 | GB |
| Translog size | | 1.43983E-06 | 1.20E-06 | -0.168176688 | GB |
| Heap used for segments | | 0 | 0 | 0 | MB |
| Heap used for doc values | | 0 | 0 | 0 | MB |
| Heap used for terms | | 0 | 0 | 0 | MB |
| Heap used for norms | | 0 | 0 | 0 | MB |
| Heap used for points | | 0 | 0 | 0 | MB |
| Heap used for stored fields | | 0 | 0 | 0 | MB |
| Segment count | | 5.5 | 5 | -0.090909091 | |
| Min Throughput | custom-vector-bulk | 6584.245 | 5835.71 | -0.113685776 | docs/s |
| Mean Throughput | custom-vector-bulk | 7861.24 | 7744.98 | -0.014789015 | docs/s |
| Median Throughput | custom-vector-bulk | 7596.51 | 7518.42 | -0.010279721 | docs/s |
| Max Throughput | custom-vector-bulk | 10233.015 | 9612.79 | -0.060610192 | docs/s |
| 50th percentile latency | custom-vector-bulk | 72.44355 | 73.3195 | 0.012091484 | ms |
| 90th percentile latency | custom-vector-bulk | 151.5965 | 155.167 | 0.023552655 | ms |
| 99th percentile latency | custom-vector-bulk | 260.645 | 257.376 | -0.012541963 | ms |
| 99.9th percentile latency | custom-vector-bulk | 378.015 | 381.875 | 0.010211235 | ms |
| 99.99th percentile latency | custom-vector-bulk | 608.632 | 932.898 | 0.532778428 | ms |
| 100th percentile latency | custom-vector-bulk | 1080.8535 | 3556.3 | 2.290270143 | ms |
| 50th percentile service time | custom-vector-bulk | 72.44355 | 73.3195 | 0.012091484 | ms |
| 90th percentile service time | custom-vector-bulk | 151.5965 | 155.167 | 0.023552655 | ms |
| 99th percentile service time | custom-vector-bulk | 260.645 | 257.376 | -0.012541963 | ms |
| 99.9th percentile service time | custom-vector-bulk | 378.015 | 381.875 | 0.010211235 | ms |
| 99.99th percentile service time | custom-vector-bulk | 608.632 | 932.898 | 0.532778428 | ms |
| 100th percentile service time | custom-vector-bulk | 1080.8535 | 3556.3 | 2.290270143 | ms |
| error rate | custom-vector-bulk | 0 | 0 | 0 | % |
| Min Throughput | force-merge-segments | 0 | 0 | 0 | ops/s |
| Mean Throughput | force-merge-segments | 0 | 0 | 0 | ops/s |
| Median Throughput | force-merge-segments | 0 | 0 | 0 | ops/s |
| Max Throughput | force-merge-segments | 0 | 0 | 0 | ops/s |
| 100th percentile latency | force-merge-segments | 11015600 | 1.13E+07 | 0.026789281 | ms |
| 100th percentile service time | force-merge-segments | 11015600 | 1.13E+07 | 0.026789281 | ms |
| error rate | force-merge-segments | 0 | 0 | 0 | % |
| Min Throughput | warmup-indices | 0.02 | 0.02 | 0 | ops/s |
| Mean Throughput | warmup-indices | 0.02 | 0.02 | 0 | ops/s |
| Median Throughput | warmup-indices | 0.02 | 0.02 | 0 | ops/s |
| Max Throughput | warmup-indices | 0.02 | 0.02 | 0 | ops/s |
| 100th percentile latency | warmup-indices | 46760.8 | 54720 | 0.170210946 | ms |
| 100th percentile service time | warmup-indices | 46760.8 | 54720 | 0.170210946 | ms |
| error rate | warmup-indices | 0 | 0 | 0 | % |
| Min Throughput | prod-queries | 2.08 | 1.61 | -0.225961538 | ops/s |
| Mean Throughput | prod-queries | 8.765 | 10.09 | 0.151169424 | ops/s |
| Median Throughput | prod-queries | 8.765 | 4.43 | -0.494580719 | ops/s |
| Max Throughput | prod-queries | 15.445 | 24.24 | 0.569439948 | ops/s |
| 50th percentile latency | prod-queries | 9.361765 | 10.2061 | 0.090189724 | ms |
| 90th percentile latency | prod-queries | 11.2586 | 12.3034 | 0.092800171 | ms |
| 99th percentile latency | prod-queries | 482.748 | 557.695 | 0.155250773 | ms |
| 100th percentile latency | prod-queries | 503.6475 | 621.016 | 0.233036995 | ms |
| 50th percentile service time | prod-queries | 9.361765 | 10.2061 | 0.090189724 | ms |
| 90th percentile service time | prod-queries | 11.2586 | 12.3034 | 0.092800171 | ms |
| 99th percentile service time | prod-queries | 482.748 | 557.695 | 0.155250773 | ms |
| 100th percentile service time | prod-queries | 503.6475 | 621.016 | 0.233036995 | ms |
| error rate | prod-queries | 0 | 0 | 0 | % |
| Mean recall@k | prod-queries | 0.34 | 0.35 | 0.029411765 | |
| Mean recall@1 | prod-queries | 0.495 | 0.4 | -0.191919192 | |

@0ctopus13prime (Contributor, Author)

NMSLIB benchmark conclusion

Unlike Faiss, a slight improvement in indexing-related metrics is expected for NMSLIB.
Cumulative indexing time decreased about 5.5% (70.27 min → 66.41 min), and mean bulk indexing throughput increased about 1.6% (7,797 docs/s → 7,922 docs/s).

NMSLIB benchmark details

| Metric | Task | Baseline | Candidate | Diff (ratio) | Unit |
|---|---|---|---|---|---|
| Cumulative indexing time of primary shards | | 70.2665 | 66.4061 | -0.054939409 | min |
| Min cumulative indexing time across primary shards | | 0 | 0 | 0 | min |
| Median cumulative indexing time across primary shards | | 11.1663 | 10.4298 | -0.06595739 | min |
| Max cumulative indexing time across primary shards | | 24.5895 | 23.3949 | -0.048581712 | min |
| Cumulative indexing throttle time of primary shards | | 0 | 0 | 0 | min |
| Min cumulative indexing throttle time across primary shards | | 0 | 0 | 0 | min |
| Median cumulative indexing throttle time across primary shards | | 0 | 0 | 0 | min |
| Max cumulative indexing throttle time across primary shards | | 0 | 0 | 0 | min |
| Cumulative merge time of primary shards | | 547.03 | 546.824 | -0.000376579 | min |
| Cumulative merge count of primary shards | | 650 | 617 | -0.050769231 | |
| Min cumulative merge time across primary shards | | 0 | 0 | 0 | min |
| Median cumulative merge time across primary shards | | 88.9109 | 89.2281 | 0.003567617 | min |
| Max cumulative merge time across primary shards | | 188.095 | 186.289 | -0.009601531 | min |
| Cumulative merge throttle time of primary shards | | 18.9961 | 20.5987 | 0.084364685 | min |
| Min cumulative merge throttle time across primary shards | | 0 | 0 | 0 | min |
| Median cumulative merge throttle time across primary shards | | 2.98061 | 3.06296 | 0.027628573 | min |
| Max cumulative merge throttle time across primary shards | | 6.81933 | 7.71148 | 0.130826635 | min |
| Cumulative refresh time of primary shards | | 0.785367 | 0.735133 | -0.063962453 | min |
| Cumulative refresh count of primary shards | | 484 | 477 | -0.01446281 | |
| Min cumulative refresh time across primary shards | | 0 | 0 | 0 | min |
| Median cumulative refresh time across primary shards | | 0.127833 | 0.110533 | -0.135332817 | min |
| Max cumulative refresh time across primary shards | | 0.273 | 0.276367 | 0.012333333 | min |
| Cumulative flush time of primary shards | | 15.2446 | 14.0613 | -0.077620928 | min |
| Cumulative flush count of primary shards | | 363 | 355 | -0.022038567 | |
| Min cumulative flush time across primary shards | | 0 | 0 | 0 | min |
| Median cumulative flush time across primary shards | | 2.37743 | 2.25187 | -0.052813332 | min |
| Max cumulative flush time across primary shards | | 5.28568 | 4.94943 | -0.063615278 | min |
| Total Young Gen GC time | | 2.127 | 2.052 | -0.035260931 | s |
| Total Young Gen GC count | | 179 | 176 | -0.016759777 | |
| Total Old Gen GC time | | 0 | 0 | 0 | s |
| Total Old Gen GC count | | 0 | 0 | 0 | |
| Store size | | 340.825 | 340.825 | 0 | GB |
| Translog size | | 1.20E-06 | 1.20E-06 | 0 | GB |
| Heap used for segments | | 0 | 0 | 0 | MB |
| Heap used for doc values | | 0 | 0 | 0 | MB |
| Heap used for terms | | 0 | 0 | 0 | MB |
| Heap used for norms | | 0 | 0 | 0 | MB |
| Heap used for points | | 0 | 0 | 0 | MB |
| Heap used for stored fields | | 0 | 0 | 0 | MB |
| Segment count | | 5 | 5 | 0 | |
| Min Throughput | custom-vector-bulk | 4198.96 | 5499.58 | 0.309748128 | docs/s |
| Mean Throughput | custom-vector-bulk | 7797.42 | 7921.59 | 0.015924498 | docs/s |
| Median Throughput | custom-vector-bulk | 7560.44 | 7625.18 | 0.008562994 | docs/s |
| Max Throughput | custom-vector-bulk | 9657.7 | 9963.47 | 0.031660747 | docs/s |
| 50th percentile latency | custom-vector-bulk | 68.3138 | 68.0887 | -0.003295088 | ms |
| 90th percentile latency | custom-vector-bulk | 154.12 | 153.953 | -0.001083571 | ms |
| 99th percentile latency | custom-vector-bulk | 258.562 | 258.702 | 0.000541456 | ms |
| 99.9th percentile latency | custom-vector-bulk | 370.548 | 387.22 | 0.044992821 | ms |
| 99.99th percentile latency | custom-vector-bulk | 506.498 | 582.752 | 0.150551434 | ms |
| 100th percentile latency | custom-vector-bulk | 2082.6 | 1156.62 | -0.444626909 | ms |
| 50th percentile service time | custom-vector-bulk | 68.3138 | 68.0887 | -0.003295088 | ms |
| 90th percentile service time | custom-vector-bulk | 154.12 | 153.953 | -0.001083571 | ms |
| 99th percentile service time | custom-vector-bulk | 258.562 | 258.702 | 0.000541456 | ms |
| 99.9th percentile service time | custom-vector-bulk | 370.548 | 387.22 | 0.044992821 | ms |
| 99.99th percentile service time | custom-vector-bulk | 506.498 | 582.752 | 0.150551434 | ms |
| 100th percentile service time | custom-vector-bulk | 2082.6 | 1156.62 | -0.444626909 | ms |
| error rate | custom-vector-bulk | 0 | 0 | 0 | % |
| Min Throughput | force-merge-segments | 0 | 0 | 0 | ops/s |
| Mean Throughput | force-merge-segments | 0 | 0 | 0 | ops/s |
| Median Throughput | force-merge-segments | 0 | 0 | 0 | ops/s |
| Max Throughput | force-merge-segments | 0 | 0 | 0 | ops/s |
| 100th percentile latency | force-merge-segments | 1.50E+07 | 1.47E+07 | -0.024653036 | ms |
| 100th percentile service time | force-merge-segments | 1.50E+07 | 1.47E+07 | -0.024653036 | ms |
| error rate | force-merge-segments | 0 | 0 | 0 | % |
| Min Throughput | warmup-indices | 0.02 | 0.02 | 0 | ops/s |
| Mean Throughput | warmup-indices | 0.02 | 0.02 | 0 | ops/s |
| Median Throughput | warmup-indices | 0.02 | 0.02 | 0 | ops/s |
| Max Throughput | warmup-indices | 0.02 | 0.02 | 0 | ops/s |
| 100th percentile latency | warmup-indices | 47770.6 | 48938.4 | 0.024445998 | ms |
| 100th percentile service time | warmup-indices | 47770.6 | 48938.4 | 0.024445998 | ms |
| error rate | warmup-indices | 0 | 0 | 0 | % |
| Min Throughput | prod-queries | 1.68 | 1.75 | 0.041666667 | ops/s |
| Mean Throughput | prod-queries | 7.31 | 10.71 | 0.465116279 | ops/s |
| Median Throughput | prod-queries | 7.31 | 2.39 | -0.673050616 | ops/s |
| Max Throughput | prod-queries | 12.93 | 27.99 | 1.164733179 | ops/s |
| 50th percentile latency | prod-queries | 10.2375 | 9.81192 | -0.041570696 | ms |
| 90th percentile latency | prod-queries | 12.6484 | 12.481 | -0.013234876 | ms |
| 99th percentile latency | prod-queries | 491.371 | 515.637 | 0.049384274 | ms |
| 100th percentile latency | prod-queries | 592.629 | 569.577 | -0.03889786 | ms |
| 50th percentile service time | prod-queries | 10.2375 | 9.81192 | -0.041570696 | ms |
| 90th percentile service time | prod-queries | 12.6484 | 12.481 | -0.013234876 | ms |
| 99th percentile service time | prod-queries | 491.371 | 515.637 | 0.049384274 | ms |
| 100th percentile service time | prod-queries | 592.629 | 569.577 | -0.03889786 | ms |
| error rate | prod-queries | 0 | 0 | 0 | % |
| Mean recall@k | prod-queries | 0.51 | 0.53 | 0.039215686 | |
| Mean recall@1 | prod-queries | 0.7 | 0.79 | 0.128571429 | |

@0ctopus13prime (Contributor, Author)

@navneet1v
Hi Navneet!
Could you review the benchmark results? You can find the benchmark environment in the comment above.
Overall, a ~2% increase in indexing time and a ~1.5% decrease in bulk indexing throughput are expected for Faiss, while NMSLIB metrics are expected to remain about the same, with a slight improvement in indexing.

Thank you!

@navneet1v navneet1v merged commit 64bae92 into opensearch-project:main Nov 5, 2024
30 of 32 checks passed
@navneet1v added the backport 2.x, v2.19.0, Enhancements, and indexing-improvements labels on Nov 5, 2024.
@opensearch-trigger-bot (Contributor)

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.x 2.x
# Navigate to the new working tree
cd .worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-2241-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 64bae9243209a629b19a5253223d7c5b1a5ce5c5
# Push it to GitHub
git push --set-upstream origin backport/backport-2241-to-2.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-2241-to-2.x.
