fix: Search/Query may fail while updating delegator cache #37320
Because initializing the query node client is too heavy, we removed updateShardClient from the leader mutex, which caused many more concurrent corner cases. This PR delays the query node client's init operation until `getClient` is called, then uses the leader mutex to protect the shard client update and avoid concurrency issues. Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## 2.5 #37320 +/- ##
==========================================
- Coverage 83.15% 80.84% -2.31%
==========================================
Files 1029 1321 +292
Lines 157321 183102 +25781
==========================================
+ Hits 130819 148030 +17211
- Misses 21332 29865 +8533
- Partials 5170 5207 +37
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
PR #37116 lets the proxy retry getting the shard leader when an error occurs. As a result, a search/query on an unloaded collection keeps retrying until the ctx is done. This PR adds an error type check to skip the retry on ErrCollectionLoaded. Signed-off-by: Wei Liu <wei.liu@zilliz.com>
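To illustrate the error type check described in this commit, here is a minimal sketch of a retry loop that stops early on one specific error. It is not the Milvus implementation: `retryGetShardLeaders`, `simulate`, and `errTransient` are hypothetical names, and `ErrCollectionLoaded` is a local stand-in named after the error in the commit message rather than the real Milvus error value.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// ErrCollectionLoaded is a local stand-in named after the error mentioned in
// the commit message; it is NOT imported from Milvus (assumption).
var ErrCollectionLoaded = errors.New("stand-in for ErrCollectionLoaded")

// errTransient models a failure that is worth retrying (hypothetical).
var errTransient = errors.New("transient shard leader error")

// retryGetShardLeaders calls fn up to attempts times. It stops early when fn
// succeeds, when ctx is done, or when the error matches ErrCollectionLoaded.
// It returns how many times fn was actually called.
func retryGetShardLeaders(ctx context.Context, attempts int, fn func() error) (int, error) {
	var lastErr error
	for calls := 0; calls < attempts; calls++ {
		if err := ctx.Err(); err != nil {
			return calls, err
		}
		lastErr = fn()
		if lastErr == nil {
			return calls + 1, nil
		}
		// errors.Is also matches wrapped errors, so the check still works
		// when a caller adds context with %w.
		if errors.Is(lastErr, ErrCollectionLoaded) {
			return calls + 1, lastErr
		}
	}
	return attempts, fmt.Errorf("gave up after %d attempts: %w", attempts, lastErr)
}

// simulate runs the loop against a function that always fails with cause,
// wrapped the way a caller might wrap it, and reports the number of calls.
func simulate(cause error, attempts int) (int, error) {
	return retryGetShardLeaders(context.Background(), attempts, func() error {
		return fmt.Errorf("get shard leaders: %w", cause)
	})
}
```

In this sketch, `simulate(ErrCollectionLoaded, 5)` makes a single call and returns the error, while `simulate(errTransient, 5)` uses all five attempts before giving up.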
issue: #37115
pr: #37116
Because initializing the query node client is too heavy, we removed updateShardClient from the leader mutex, which caused many more concurrent corner cases.
This PR delays the query node client's init operation until `getClient` is called, then uses the leader mutex to protect the shard client update and avoid concurrency issues.
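The approach described above could look roughly like the following sketch. It is not the actual Milvus code: `shardClientMgr`, `lazyShardClient`, `queryNodeClient`, `updateShardLeaders`, and the `creator` callback are hypothetical names chosen for illustration. The assumption is that registering a shard client under the leader mutex is cheap because the expensive connection is only created on the first `getClient` call.

```go
package main

import (
	"errors"
	"sync"
)

// queryNodeClient is a hypothetical stand-in for a real query node client.
type queryNodeClient struct {
	addr string
}

// lazyShardClient defers the expensive client creation until getClient.
type lazyShardClient struct {
	mu      sync.Mutex
	addr    string
	client  *queryNodeClient
	closed  bool
	creator func(addr string) (*queryNodeClient, error)
}

func (c *lazyShardClient) getClient() (*queryNodeClient, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.closed {
		return nil, errors.New("shard client closed")
	}
	if c.client == nil {
		// First use: pay the initialization cost here, not during update.
		cli, err := c.creator(c.addr)
		if err != nil {
			return nil, err
		}
		c.client = cli
	}
	return c.client, nil
}

func (c *lazyShardClient) close() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.closed = true
	c.client = nil
}

// shardClientMgr keeps one lazy client per node, guarded by a leader mutex.
type shardClientMgr struct {
	leaderMut sync.RWMutex
	clients   map[int64]*lazyShardClient
	creator   func(addr string) (*queryNodeClient, error)
}

func newShardClientMgr(creator func(string) (*queryNodeClient, error)) *shardClientMgr {
	return &shardClientMgr{clients: map[int64]*lazyShardClient{}, creator: creator}
}

// updateShardLeaders runs entirely under the leader mutex. Only cheap
// bookkeeping happens here, so holding the lock does not block for long.
func (m *shardClientMgr) updateShardLeaders(leaders map[int64]string) {
	m.leaderMut.Lock()
	defer m.leaderMut.Unlock()
	for id, addr := range leaders {
		if _, ok := m.clients[id]; !ok {
			m.clients[id] = &lazyShardClient{addr: addr, creator: m.creator}
		}
	}
	for id, c := range m.clients {
		if _, ok := leaders[id]; !ok {
			c.close()
			delete(m.clients, id)
		}
	}
}

// getClient looks up the node under a read lock, then initializes lazily.
func (m *shardClientMgr) getClient(nodeID int64) (*queryNodeClient, error) {
	m.leaderMut.RLock()
	c, ok := m.clients[nodeID]
	m.leaderMut.RUnlock()
	if !ok {
		return nil, errors.New("no shard client for node")
	}
	return c.getClient()
}
```

One design choice in this sketch: each lazy client has its own mutex, so a slow connection setup for one node does not hold the leader mutex and does not block lookups for other nodes.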