fix: update diskann config for config check #896

Merged
merged 1 commit into from
Oct 16, 2024

foxspy (Collaborator) commented Oct 16, 2024

issue: #795
Removed pq_code_budget_gb/search_cache_budget_gb from DiskANN and replaced them with pq_code_budget_gb_ratio/search_cache_budget_gb_ratio. Using a fixed ratio instead of dynamically filled GB values reduces the parameter-filling logic.
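
A minimal sketch of the idea, not the Knowhere implementation: with ratio parameters, a GB budget can be derived from the raw vector data size instead of being supplied by the caller. The helper name `budget_gb_from_ratio` and the numeric values are illustrative assumptions; `vec_field_size_gb` is the parameter named later in this thread, and the assumption that each ratio is applied to it is inferred from that comment.

```python
def budget_gb_from_ratio(vec_field_size_gb: float, ratio: float) -> float:
    """Derive a GB budget as a fixed fraction of the raw vector data size.

    Hypothetical helper for illustration only.
    """
    if vec_field_size_gb <= 0:
        raise ValueError("vec_field_size_gb must be positive")
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return vec_field_size_gb * ratio


# Illustrative values only, not the defaults shipped with this PR.
params = {
    "vec_field_size_gb": 8.0,
    "pq_code_budget_gb_ratio": 0.125,
    "search_cache_budget_gb_ratio": 0.25,
}

pq_code_budget_gb = budget_gb_from_ratio(
    params["vec_field_size_gb"], params["pq_code_budget_gb_ratio"]
)
search_cache_budget_gb = budget_gb_from_ratio(
    params["vec_field_size_gb"], params["search_cache_budget_gb_ratio"]
)
print(pq_code_budget_gb, search_cache_budget_gb)
```

With a ratio, the caller no longer has to know the data size in advance to fill in a GB value.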

foxspy (Collaborator, Author) commented Oct 16, 2024

/kind improvement

codecov bot commented Oct 16, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 79.56%. Comparing base (3c46f4c) to head (9d455e0).
Report is 223 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff            @@
##           main     #896       +/-   ##
=========================================
+ Coverage      0   79.56%   +79.56%     
=========================================
  Files         0       81       +81     
  Lines         0     6270     +6270     
=========================================
+ Hits          0     4989     +4989     
- Misses        0     1281     +1281     

see 81 files with indirect coverage changes

liliu-z (Collaborator) left a comment

Any E2E change?

src/index/diskann/diskann_config.h (2 resolved review threads)
foxspy (Collaborator, Author) commented Oct 16, 2024

> Any E2E change?

Yes, Milvus will remove the calculation logic for the pq_code_budget_gb and search_cache_budget_gb parameters. Any resource-related logic will be bound to vec_field_size_gb. Another reason for this PR is that parameter checks would fail without pq_code_budget_gb/search_cache_budget_gb, so default values were added.
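
The failure mode described here can be sketched as follows. This is an illustrative model of a config check, not Knowhere's actual checker: `check_strict`, `check_with_defaults`, the `DEFAULTS` table, and the default values in it are all hypothetical.

```python
# Hypothetical defaults for illustration; the values in diskann_config.h may differ.
DEFAULTS = {
    "pq_code_budget_gb_ratio": 0.125,
    "search_cache_budget_gb_ratio": 0.1,
}


def check_strict(user_params: dict, required: list) -> dict:
    """Reject the config if any required parameter is missing."""
    missing = [name for name in required if name not in user_params]
    if missing:
        raise KeyError("missing required parameters: " + ", ".join(missing))
    return dict(user_params)


def check_with_defaults(user_params: dict, defaults: dict) -> dict:
    """Fill missing parameters from defaults, then validate the values."""
    merged = {**defaults, **user_params}
    for name in defaults:
        value = merged[name]
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or value <= 0:
            raise ValueError(f"invalid value for {name}: {value!r}")
    return merged


user_params = {"vec_field_size_gb": 8.0}  # the caller no longer supplies budgets

try:
    check_strict(user_params, list(DEFAULTS))
    strict_ok = True
except KeyError:
    strict_ok = False

checked = check_with_defaults(user_params, DEFAULTS)
print(strict_ok, checked)
```

In this model, the strict check rejects a config that omits the budget parameters, while the check with defaults accepts it and still honors any value the caller does provide.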

foxspy (Collaborator, Author) commented Oct 16, 2024

> > Any E2E change?
>
> Yes, Milvus will remove the calculation logic for the pq_code_budget_gb and search_cache_budget_gb parameters. Any resource-related logic will be bound to vec_field_size_gb. Another reason for this PR is that parameter checks would fail without pq_code_budget_gb/search_cache_budget_gb, so default values were added.

This might be another issue. The parameter check during createIndex does not include runtime parameters, and there may be some differences from the build parameters, though they should be minimal. It seems that index parameters should avoid carrying runtime-specific information as much as possible and be more direct (e.g., specifying DiskANN PQ parameters directly rather than constraining them through GB budgets). However, managing and retracting parameters that are already exposed can be tricky.

alexanderguzhva (Collaborator) commented

@foxspy would you consider adding something like 'inverse ratio', maybe, because our typical ratios are 1/8, 1/12, 1/16?

foxspy (Collaborator, Author) commented Oct 16, 2024

> @foxspy would you consider adding something like 'inverse ratio', maybe, because our typical ratios are 1/8, 1/12, 1/16?

Do you mean defining it as an int to ensure precision? But it’s not always possible to guarantee that, for example, in cases like 2/3. Generally, for float parameters, slight precision errors are acceptable.
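
To make the trade-off in this exchange concrete, here is a small sketch of an integer "inverse ratio". The helper names are hypothetical, and the data size assumes 1,000,000 float32 vectors of dimension 1536, chosen only for illustration.

```python
from fractions import Fraction
from typing import Optional


def budget_bytes_from_inverse_ratio(data_bytes: int, inverse_ratio: int) -> int:
    """Exact integer budget: data_bytes / inverse_ratio, rounded down."""
    if inverse_ratio <= 0:
        raise ValueError("inverse_ratio must be a positive integer")
    return data_bytes // inverse_ratio


def as_inverse_ratio(ratio: Fraction) -> Optional[int]:
    """Return n if ratio equals 1/n for a positive integer n, else None."""
    if ratio > 0 and ratio.numerator == 1:
        return ratio.denominator
    return None


data_bytes = 1_000_000 * 1536 * 4  # assumed: float32 vectors, 4 bytes per value

# The typical ratios 1/8, 1/12, 1/16 map to exact integer budgets.
budgets = {n: budget_bytes_from_inverse_ratio(data_bytes, n) for n in (8, 12, 16)}
print(budgets)

# A ratio such as 2/3 has no integer inverse, which is the limitation raised above.
print(as_inverse_ratio(Fraction(1, 12)), as_inverse_ratio(Fraction(2, 3)))
```

The integer form is exact for ratios of the shape 1/n, but it cannot express a ratio like 2/3, which a float parameter can approximate.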

alexanderguzhva (Collaborator) commented Oct 16, 2024

> > @foxspy would you consider adding something like 'inverse ratio', maybe, because our typical ratios are 1/8, 1/12, 1/16?
>
> Do you mean defining it as an int to ensure precision? But it’s not always possible to guarantee that, for example, in cases like 2/3. Generally, for float parameters, slight precision errors are acceptable.

I remember having a funny experience setting budgets with four or five digits after the decimal point to ensure that PQ gets exactly the size of 768 for 1536-dim data, avoiding PQ 767 or PQ 769.
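
The arithmetic behind this anecdote can be sketched as follows. The sketch assumes that the number of PQ bytes per vector is the floor of the budget in bytes divided by the number of points, clamped to the range 1 to dim, that "GB" means 2**30 bytes, and that a ratio is applied to the raw float32 data size. The function names and the dataset size of 1,000,000 vectors are hypothetical.

```python
import math

GIB = 1024 ** 3  # assumption: GB budgets are interpreted as 2**30 bytes


def pq_size_from_budget_gb(budget_gb: float, num_points: int, dim: int) -> int:
    """PQ bytes per vector under an absolute GB budget (assumed rule)."""
    size = math.floor(budget_gb * GIB / num_points)
    return max(1, min(size, dim))


def pq_size_from_ratio(ratio: float, num_points: int, dim: int,
                       bytes_per_value: int = 4) -> int:
    """PQ bytes per vector when the budget is a fraction of the raw data size."""
    data_bytes = num_points * dim * bytes_per_value
    size = math.floor(data_bytes * ratio / num_points)
    return max(1, min(size, dim))


num_points, dim = 1_000_000, 1536

# The GB budget must be given to four decimal places to land exactly on 768.
for budget_gb in (0.72, 0.715, 0.7153):
    print(budget_gb, pq_size_from_budget_gb(budget_gb, num_points, dim))

# A ratio of 1/8 reaches 768 regardless of the number of points.
print(pq_size_from_ratio(0.125, num_points, dim))
```

Under these assumptions the three GB budgets give 773, 767, and 768, while the ratio 1/8 gives 768 directly.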

liliu-z (Collaborator) left a comment

/lgtm

sre-ci-robot (Collaborator) commented

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: foxspy, liliu-z

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sre-ci-robot sre-ci-robot merged commit 538e416 into zilliztech:main Oct 16, 2024
14 checks passed