fix: Fix dynamic release partition may fail search/query request #35919

weiliu1031
Contributor

issue: #33550
Cause: a race can occur between removing a partition in the target manager and syncing the segment list to the delegator. When it happens, some segments may be released in the delegator while the same segments are also synced to the delegator. The delegator then becomes unserviceable because it lacks necessary segments, and search/query fails.

This PR makes sure that all write access to target_manager is executed serially, which avoids the race.
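
To illustrate the idea of serializing writes, here is a minimal sketch, not the code in this PR. It assumes a simplified target store that maps partition IDs to segment IDs; the names `serialTargets`, `UpdateTarget`, `ReleasePartition`, and `Segments` are hypothetical and are not taken from the Milvus code base. Every write is queued to a single goroutine, so a partition release and a target update can never interleave, and readers always see a state produced by whole writes.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// request is one queued write plus a channel that is closed once it has run.
type request struct {
	apply func(targets map[int64][]int64)
	done  chan struct{}
}

// serialTargets is a hypothetical stand-in for a target manager.
// It maps partition ID -> segment IDs.
type serialTargets struct {
	mu      sync.RWMutex
	targets map[int64][]int64
	reqs    chan request
	stopped chan struct{}
}

func newSerialTargets() *serialTargets {
	s := &serialTargets{
		targets: make(map[int64][]int64),
		reqs:    make(chan request),
		stopped: make(chan struct{}),
	}
	go s.loop()
	return s
}

// loop is the only goroutine that mutates targets, so writes never interleave.
func (s *serialTargets) loop() {
	defer close(s.stopped)
	for r := range s.reqs {
		s.mu.Lock()
		r.apply(s.targets)
		s.mu.Unlock()
		close(r.done)
	}
}

// write queues a mutation and blocks until the loop has applied it.
func (s *serialTargets) write(apply func(map[int64][]int64)) {
	r := request{apply: apply, done: make(chan struct{})}
	s.reqs <- r
	<-r.done
}

// UpdateTarget replaces the segment list of one partition.
func (s *serialTargets) UpdateTarget(partitionID int64, segmentIDs []int64) {
	s.write(func(t map[int64][]int64) {
		t[partitionID] = append([]int64(nil), segmentIDs...)
	})
}

// ReleasePartition removes a partition and all of its segments.
func (s *serialTargets) ReleasePartition(partitionID int64) {
	s.write(func(t map[int64][]int64) { delete(t, partitionID) })
}

// Segments returns a sorted snapshot, e.g. the list to sync to a delegator.
func (s *serialTargets) Segments() []int64 {
	s.mu.RLock()
	defer s.mu.RUnlock()
	var out []int64
	for _, segs := range s.targets {
		out = append(out, segs...)
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

// Close stops the write loop; no writes may be issued afterwards.
func (s *serialTargets) Close() {
	close(s.reqs)
	<-s.stopped
}

func main() {
	s := newSerialTargets()
	defer s.Close()
	s.UpdateTarget(1, []int64{10, 11})
	s.UpdateTarget(2, []int64{20})

	// A release and an update arrive concurrently; the loop runs them one by one.
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); s.ReleasePartition(1) }()
	go func() { defer wg.Done(); s.UpdateTarget(2, []int64{20, 21}) }()
	wg.Wait()

	fmt.Println(s.Segments()) // prints [20 21]
}
```

The single writer goroutine is one possible way to get serial writes; a plain mutex around every write path would be another. The queue makes the ordering explicit, which is why it is used in this sketch.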

@sre-ci-robot sre-ci-robot requested review from sunby and yah01 September 3, 2024 03:58
@sre-ci-robot sre-ci-robot added the size/L Denotes a PR that changes 100-499 lines. label Sep 3, 2024
@mergify mergify bot added dco-passed DCO check passed. kind/bug Issues or changes related a bug labels Sep 3, 2024
Contributor

mergify bot commented Sep 3, 2024

@weiliu1031 E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

@weiliu1031
Contributor Author

/run-cpu-e2e

Contributor

mergify bot commented Sep 3, 2024

@weiliu1031 E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

@weiliu1031 weiliu1031 force-pushed the fix_search_failed_during_drop_partition branch from 7ee41a8 to 61ae20d Compare September 3, 2024 08:08
@weiliu1031
Contributor Author

rerun ut

Cause: a race can occur between removing a partition in the target
manager and syncing the segment list to the delegator. When it happens,
some segments may be released in the delegator while the same segments
are also synced to the delegator. The delegator then becomes
unserviceable because it lacks necessary segments, and search/query fails.

This PR makes sure that all write access to target_manager is
executed serially, which avoids the race.

Signed-off-by: Wei Liu <[email protected]>
@weiliu1031 weiliu1031 force-pushed the fix_search_failed_during_drop_partition branch from 61ae20d to 5fb929c Compare September 4, 2024 07:10
@mergify mergify bot added the ci-passed label Sep 4, 2024

codecov bot commented Sep 4, 2024

Codecov Report

Attention: Patch coverage is 81.53846% with 12 lines in your changes missing coverage. Please review.

Project coverage is 72.61%. Comparing base (ea36d13) to head (5fb929c).
Report is 18 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| internal/querycoordv2/observers/target_observer.go | 85.00% | 7 Missing and 2 partials ⚠️ |
| ...rnal/querycoordv2/observers/collection_observer.go | 33.33% | 2 Missing ⚠️ |
| internal/querycoordv2/job/undo.go | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #35919      +/-   ##
==========================================
- Coverage   81.57%   72.61%   -8.96%     
==========================================
  Files        1265     1265              
  Lines      150790   150500     -290     
==========================================
- Hits       123002   109293   -13709     
- Misses      22896    36339   +13443     
+ Partials     4892     4868      -24     

| Files with missing lines | Coverage Δ |
|---|---|
| internal/querycoordv2/job/job_release.go | 80.89% <100.00%> (-0.42%) ⬇️ |
| internal/querycoordv2/job/undo.go | 75.86% <0.00%> (+2.52%) ⬆️ |
| ...rnal/querycoordv2/observers/collection_observer.go | 87.50% <33.33%> (ø) |
| internal/querycoordv2/observers/target_observer.go | 82.60% <85.00%> (+1.01%) ⬆️ |

... and 254 files with indirect coverage changes

Contributor

@XuanYang-cn XuanYang-cn left a comment

/lgtm

@congqixia
Contributor

/approve

@sre-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: congqixia, weiliu1031, XuanYang-cn

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sre-ci-robot sre-ci-robot merged commit 75676fb into milvus-io:master Sep 5, 2024
15 of 16 checks passed
weiliu1031 added a commit to weiliu1031/milvus that referenced this pull request Sep 5, 2024
…vus-io#35919)

issue: milvus-io#33550
sre-ci-robot pushed a commit that referenced this pull request Sep 6, 2024
) (#36019)

issue: #33550
pr: #35919
chyezh pushed a commit to chyezh/milvus that referenced this pull request Sep 11, 2024
…vus-io#35919)

issue: milvus-io#33550
Labels
approved ci-passed dco-passed DCO check passed. kind/bug Issues or changes related a bug lgtm size/L Denotes a PR that changes 100-499 lines.
4 participants