
[m3msg] Increase default writer initial backoff #3820

Merged 5 commits into master on Oct 7, 2021

Conversation

@Antanukas (Collaborator) commented on Oct 6, 2021

What this PR does / why we need it:

Increases initial retry backoff from 1s to 5s for all M3Msg message writers.
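The default only applies when no explicit value is configured. A minimal sketch of overriding it programmatically, assuming the setter names on m3's x/retry Options interface (treat the exact API surface as an assumption; see src/x/retry for the authoritative definitions):

```go
package main

import (
	"time"

	"github.com/m3db/m3/src/x/retry"
)

func main() {
	// Assumption: these setter names follow m3's x/retry Options interface;
	// check src/x/retry for the authoritative definitions.
	opts := retry.NewOptions().
		SetInitialBackoff(5 * time.Second). // the new default this PR proposes
		SetMaxBackoff(30 * time.Second)     // illustrative cap, not a project default
	retrier := retry.NewRetrier(opts)
	_ = retrier
}
```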

Special notes for your reviewer:

Increasing the initial backoff to 5s reduced M3Msg retries to almost zero across all of our workloads. Under normal conditions retries should be close to zero. On certain workloads we also saw a reduction in M3 Aggregator CPU usage.

The main motivation for changing the default is that, with the current setting, every cluster experiences more retries than necessary.
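To make the reasoning concrete, here is a self-contained sketch comparing when retries would fire for a 1s versus a 5s initial backoff. The growth factor of 2 is an assumed illustrative value, and real retriers typically add jitter and a cap, omitted here:

```go
package main

import (
	"fmt"
	"time"
)

// backoffSchedule returns the delays before each of the first n retries
// under plain exponential backoff.
func backoffSchedule(initial time.Duration, factor float64, n int) []time.Duration {
	delays := make([]time.Duration, 0, n)
	d := initial
	for i := 0; i < n; i++ {
		delays = append(delays, d)
		d = time.Duration(float64(d) * factor)
	}
	return delays
}

func main() {
	// With a 1s initial backoff the first retry fires while a slow but
	// healthy ack may still be in flight, producing needless resends.
	fmt.Println("1s initial:", backoffSchedule(time.Second, 2, 4))
	// With 5s, normal-case acks arrive before the first retry, which is
	// consistent with retry counts dropping to almost zero.
	fmt.Println("5s initial:", backoffSchedule(5*time.Second, 2, 4))
}
```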

The following metrics can be used to check whether retries are happening (a sketch of how such counters are typically emitted follows the list):

coordinator_downsampler_remote_aggregator_client_message_retry
m3aggregator_aggregator_client_message_retry
m3aggregator_aggregator_flush_handler_message_retry
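For orientation only, counter names like these follow the usual uber-go/tally scoping pattern used across M3; the scope and subscope names in this sketch are illustrative, not the exact ones in the codebase:

```go
package main

import (
	"fmt"

	"github.com/uber-go/tally"
)

func main() {
	// A test scope stands in for the real metrics scope here.
	scope := tally.NewTestScope("m3aggregator", nil)
	retries := scope.SubScope("aggregator_client").Counter("message_retry")
	retries.Inc(1) // bumped each time a message write is retried

	// Snapshot and print what was recorded.
	for name, c := range scope.Snapshot().Counters() {
		fmt.Println(name, c.Value())
	}
}
```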

Does this PR introduce a user-facing and/or backwards incompatible change?:

NONE

Does this PR require updating code package or user-facing documentation?:

NONE

codecov bot commented on Oct 6, 2021

Codecov Report

Merging #3820 (29f95c5) into master (1c72e40) will decrease coverage by 0.3%.
The diff coverage is 100.0%.

❗ Current head 29f95c5 differs from pull request most recent head 734c27c. Consider uploading reports for the commit 734c27c to get more accurate results


@@           Coverage Diff            @@
##           master   #3820     +/-   ##
========================================
- Coverage    57.1%   56.8%   -0.4%     
========================================
  Files         552     552             
  Lines       63177   63081     -96     
========================================
- Hits        36115   35860    -255     
- Misses      23876   24020    +144     
- Partials     3186    3201     +15     
Flag        Coverage Δ
aggregator  63.4% <ø> (-0.3%) ⬇️
cluster     ∅ <ø> (∅)
collector   58.4% <ø> (ø)
dbnode      60.4% <ø> (-0.4%) ⬇️
m3em        46.4% <ø> (ø)
metrics     19.7% <ø> (ø)
msg         74.4% <100.0%> (-0.1%) ⬇️

Flags with carried forward coverage won't be shown.


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 6da0a4a...734c27c

@Antanukas Antanukas changed the title [m3coordinator] [m3aggregator] Default M3Msg writer initial backoff [m3coordinator] [m3aggregator] Increase M3Msg writer initial backoff Oct 6, 2021
@Antanukas Antanukas changed the title [m3coordinator] [m3aggregator] Increase M3Msg writer initial backoff [m3msg] Increase M3Msg writer initial backoff Oct 6, 2021
@Antanukas Antanukas changed the title [m3msg] Increase M3Msg writer initial backoff [m3msg] Increase default writer initial backoff Oct 6, 2021
@Antanukas Antanukas marked this pull request as ready for review October 6, 2021 05:59
@@ -29,6 +29,7 @@ import (
 // Configuration configures options for retry attempts.
 type Configuration struct {
 	// Initial retry backoff.
+	// Defaults to 5s if unspecified.
A collaborator commented on the added line:

Nit: no need for this comment, eventually it would go out of sync with the actual value.
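One way to address the nit, sketched below with a hypothetical constant name (the field name and yaml tag are likewise assumed from the diff context), is to give the default a single source of truth so the doc comment cannot drift from the value:

```go
package retry

import "time"

// defaultInitialBackoff is a hypothetical single source of truth for the
// default value.
const defaultInitialBackoff = 5 * time.Second

// Configuration configures options for retry attempts.
type Configuration struct {
	// InitialBackoff is the initial retry backoff.
	// Defaults to defaultInitialBackoff if unspecified.
	InitialBackoff time.Duration `yaml:"initialBackoff"`
}
```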

@Antanukas Antanukas enabled auto-merge (squash) October 7, 2021 07:24
@Antanukas Antanukas merged commit b11cf74 into master Oct 7, 2021
@Antanukas Antanukas deleted the antanas/m3msg-default-initial-backoff branch October 7, 2021 07:41
@vdarulis (Collaborator) commented on Oct 7, 2021

m3msg producer retry options are one of the most confusing pieces of config around, thanks for the PR!
