fix: Fix slow dist handle and slow observe #38566

Open
wants to merge 19 commits into
base: master

Conversation

bigsheeper
Contributor

@bigsheeper bigsheeper commented Dec 18, 2024

  1. Provide partition-level and channel-level indexing in the collection target.
  2. Make SegmentAction not wait for distribution.
  3. Remove scheduler and target manager mutex.
  4. Optimize logging to reduce CPU overhead.

issue: #37630
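
For readers unfamiliar with item 1, here is a minimal sketch of what partition-level and channel-level indexing could look like. It is not the code from this PR: `SegmentInfo`, `CollectionTarget`, `NewCollectionTarget`, `SegmentsByPartition`, and `SegmentsByChannel` are hypothetical names chosen for illustration, and the real target in `internal/querycoordv2/meta/target.go` carries more state. The idea is to build secondary maps once when the target is constructed, so that lookups by partition or channel do not need to scan every segment.

```go
package main

import "fmt"

// SegmentInfo is a hypothetical, trimmed-down stand-in for the segment
// metadata kept in a collection target.
type SegmentInfo struct {
	ID          int64
	PartitionID int64
	Channel     string
}

// CollectionTarget (hypothetical) keeps the flat segment map plus two
// secondary indexes that are built once at construction time.
type CollectionTarget struct {
	segments    map[int64]*SegmentInfo
	byPartition map[int64][]*SegmentInfo
	byChannel   map[string][]*SegmentInfo
}

// NewCollectionTarget builds the primary map and both indexes in one pass.
func NewCollectionTarget(segments []*SegmentInfo) *CollectionTarget {
	t := &CollectionTarget{
		segments:    make(map[int64]*SegmentInfo, len(segments)),
		byPartition: make(map[int64][]*SegmentInfo),
		byChannel:   make(map[string][]*SegmentInfo),
	}
	for _, s := range segments {
		t.segments[s.ID] = s
		t.byPartition[s.PartitionID] = append(t.byPartition[s.PartitionID], s)
		t.byChannel[s.Channel] = append(t.byChannel[s.Channel], s)
	}
	return t
}

// SegmentsByPartition is a map lookup instead of a scan over all segments.
func (t *CollectionTarget) SegmentsByPartition(partitionID int64) []*SegmentInfo {
	return t.byPartition[partitionID]
}

// SegmentsByChannel is a map lookup instead of a scan over all segments.
func (t *CollectionTarget) SegmentsByChannel(channel string) []*SegmentInfo {
	return t.byChannel[channel]
}

func main() {
	target := NewCollectionTarget([]*SegmentInfo{
		{ID: 1, PartitionID: 10, Channel: "ch-0"},
		{ID: 2, PartitionID: 10, Channel: "ch-1"},
		{ID: 3, PartitionID: 11, Channel: "ch-0"},
	})
	fmt.Println(len(target.SegmentsByPartition(10)))
	fmt.Println(len(target.SegmentsByChannel("ch-0")))
}
```

In this sketch the indexes are never mutated after construction, so concurrent readers would not need a lock around lookups. Whether the PR relies on the same property is an assumption, not something stated in this thread.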

@sre-ci-robot sre-ci-robot added the size/L Denotes a PR that changes 100-499 lines. label Dec 18, 2024
@mergify mergify bot added dco-passed DCO check passed. kind/bug Issues or changes related a bug labels Dec 18, 2024
@bigsheeper
Contributor Author

slow dist handling:
[image: kZ2dj7hDZj]

slow observation:
[image: 7h7mK3zFij]

czs007 pushed a commit that referenced this pull request Dec 18, 2024
1. Provide partition-level indexing in the collection target.
2. Make SegmentAction not wait for distribution.
3. Optimize logging to reduce CPU overhead.

issue: #37630

pr: #38566

---------

Signed-off-by: bigsheeper <[email protected]>

codecov bot commented Dec 18, 2024

Codecov Report

Attention: Patch coverage is 90.75630% with 22 lines in your changes missing coverage. Please review.

Project coverage is 82.94%. Comparing base (d7623ab) to head (66e7e05).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| internal/querycoordv2/task/scheduler.go | 87.21% | 16 Missing and 1 partial ⚠️ |
| ...rnal/querycoordv2/observers/collection_observer.go | 83.33% | 4 Missing ⚠️ |
| internal/querycoordv2/meta/target.go | 98.24% | 0 Missing and 1 partial ⚠️ |
Additional details and impacted files


@@             Coverage Diff             @@
##           master   #38566       +/-   ##
===========================================
+ Coverage   69.54%   82.94%   +13.39%     
===========================================
  Files         296     1092      +796     
  Lines       26536   169780   +143244     
===========================================
+ Hits        18455   140819   +122364     
- Misses       8081    23374    +15293     
- Partials        0     5587     +5587     
| Components | Coverage Δ |
|---|---|
| Client | 79.12% <ø> (∅) |
| Core | ∅ <ø> (∅) |
| Go | 83.09% <90.75%> (∅) |

| Files with missing lines | Coverage Δ |
|---|---|
| internal/querycoordv2/dist/dist_handler.go | 96.13% <100.00%> (ø) |
| internal/querycoordv2/meta/target_manager.go | 88.19% <100.00%> (ø) |
| internal/querycoordv2/task/action.go | 96.66% <100.00%> (ø) |
| internal/querycoordv2/utils/util.go | 85.54% <100.00%> (ø) |
| internal/querynodev2/services.go | 86.79% <100.00%> (ø) |
| pkg/metrics/querycoord_metrics.go | 100.00% <ø> (ø) |
| internal/querycoordv2/meta/target.go | 92.94% <98.24%> (ø) |
| ...rnal/querycoordv2/observers/collection_observer.go | 86.52% <83.33%> (ø) |
| internal/querycoordv2/task/scheduler.go | 87.41% <87.21%> (ø) |

... and 1379 files with indirect coverage changes

jaime0815 pushed a commit that referenced this pull request Dec 19, 2024
Print observe, dist handling, and schedule times.

issue: #37630

pr: #38566

Signed-off-by: bigsheeper <[email protected]>
Contributor

mergify bot commented Dec 19, 2024

@bigsheeper E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

Signed-off-by: bigsheeper <[email protected]>
Signed-off-by: bigsheeper <[email protected]>
czs007 pushed a commit that referenced this pull request Dec 26, 2024
Contributor

mergify bot commented Dec 26, 2024

@bigsheeper go-sdk check failed, comment rerun go-sdk can trigger the job again.

Contributor

mergify bot commented Dec 26, 2024

@bigsheeper cpp-unit-test check failed, comment rerun cpp-unit-test can trigger the job again.

Contributor

mergify bot commented Dec 27, 2024

@bigsheeper E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

Contributor

mergify bot commented Dec 27, 2024

@bigsheeper go-sdk check failed, comment rerun go-sdk can trigger the job again.

Contributor

mergify bot commented Dec 27, 2024

@bigsheeper cpp-unit-test check failed, comment rerun cpp-unit-test can trigger the job again.

Contributor

mergify bot commented Dec 27, 2024

@bigsheeper E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

@bigsheeper
Contributor Author

rerun go-sdk

Contributor

mergify bot commented Dec 31, 2024

@bigsheeper go-sdk check failed, comment rerun go-sdk can trigger the job again.

Contributor

mergify bot commented Jan 1, 2025

@bigsheeper E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

Signed-off-by: bigsheeper <[email protected]>
Contributor

mergify bot commented Jan 1, 2025

@bigsheeper go-sdk check failed, comment rerun go-sdk can trigger the job again.

czs007 pushed a commit that referenced this pull request Jan 2, 2025
…ger (#38956)

pr: #38566

This is just for testing; I'll remove the global mutex later.

Signed-off-by: bigsheeper <[email protected]>
Contributor

mergify bot commented Jan 2, 2025

@bigsheeper go-sdk check failed, comment rerun go-sdk can trigger the job again.

@sre-ci-robot sre-ci-robot added size/XL Denotes a PR that changes 500-999 lines. and removed size/L Denotes a PR that changes 100-499 lines. labels Jan 3, 2025
@sre-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: bigsheeper
To complete the pull request process, please assign wxyucs after the PR has been reviewed.
You can assign the PR to them by writing /assign @wxyucs in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Contributor

mergify bot commented Jan 3, 2025

@bigsheeper go-sdk check failed, comment rerun go-sdk can trigger the job again.

Contributor

mergify bot commented Jan 3, 2025

@bigsheeper E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

czs007 pushed a commit that referenced this pull request Jan 3, 2025
supplement to pr: #38566

Signed-off-by: bigsheeper <[email protected]>
Contributor

mergify bot commented Jan 3, 2025

@bigsheeper go-sdk check failed, comment rerun go-sdk can trigger the job again.

Contributor

mergify bot commented Jan 3, 2025

@bigsheeper E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.
