[ServiceBus] add delay between infinite retry cycles #20316

Merged

jeremymeng
Member

Errors that are not retryable will be re-thrown out of the normal retry() call.
However, we don't have any delay before restarting the retry cycle.

This PR adds a delay before continuing the infinite retry cycles.
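A minimal sketch of the idea, not the code in this PR: an outer loop restarts a full retry cycle whenever the cycle rejects, and waits before restarting. The names `retryForeverSketch`, `runRetryCycle`, `delayBetweenCyclesInMs`, `onCycleFailed`, and the injectable `delay` parameter are all hypothetical and chosen only for illustration.

```typescript
// Hypothetical names throughout; this is not the @azure/service-bus implementation.
type Delay = (ms: number) => Promise<void>;

// Default wait based on setTimeout.
const sleep: Delay = (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function retryForeverSketch<T>(
  runRetryCycle: () => Promise<T>, // one full retry cycle; rejects when it gives up
  delayBetweenCyclesInMs: number,
  onCycleFailed: (err: unknown, cycle: number) => void = () => {},
  delay: Delay = sleep
): Promise<T> {
  let numRetryCycles = 0;
  while (true) {
    ++numRetryCycles;
    try {
      return await runRetryCycle();
    } catch (err) {
      onCycleFailed(err, numRetryCycles);
      // Without this wait, an error thrown out of the cycle would restart it immediately.
      await delay(delayBetweenCyclesInMs);
    }
  }
}
```

Passing the delay function as a parameter is only a convenience for the sketch: a test can substitute a no-op wait and record the requested durations instead of sleeping.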

Packages impacted by this PR

@azure/service-bus

Issues associated with this PR

#19883

Describe the problem that is addressed by this PR

Errors that are not retryable will be re-thrown out of the normal retry() call.
However, we don't have any delay before restarting the retry cycle.

What are the possible designs available to address the problem? If there are more than one possible design, why was the one in this PR chosen?

Are there test cases added in this PR?

To be added

Checklists

  • Added a changelog (if necessary)

@@ -293,6 +318,22 @@ export async function retryForever<T>(
  retryFn: typeof retry = retry
): Promise<T> {
  let numRetryCycles = 0;
  const config = args.retryConfig;
Member Author

I duplicated the code from core-amqp's retry because it is only used internally and is not really worth exposing for reuse.

delayInMs,
config.operationType
);
await delay<void>(delayInMs, config.abortSignal, "Retry cycle has been cancelled by the user.");
Member

changelog?

Member

@HarshaNalluru HarshaNalluru left a comment

Looks good.

@HarshaNalluru
Member

Can we check the resource usage before and after the changes just for sanity?

Perhaps the perf test under sdk/servicebus/service-bus/test/perf-js-libs/service-bus-v7/receive.ts can be used for testing while #18859 is still being worked on.

@jeremymeng
Member Author

> Can we check the resource usage before and after the changes just for sanity?

Does the test cover the retry case?

@HarshaNalluru
Member

HarshaNalluru commented Feb 10, 2022

No, it doesn't. Might have to tweak the test.

Edit: or maybe the sdk/servicebus/service-bus/samples-dev/receiveMessagesStreaming.ts sample is a better one to test with in this case.

Also fixed the issue where `maxRetryDelayInMs` was not applied in Fixed retry mode.
@@ -153,7 +153,7 @@ function calculateDelay(
     return Math.min(incrementDelta, maxRetryDelayInMs);
   }
 
-  return retryDelayInMs;
+  return Math.min(retryDelayInMs, maxRetryDelayInMs);
Member Author

I found that we didn't limit the delay in Fixed mode. I think we should respect maxRetryDelayInMs in Fixed mode too.

Member Author

This got me thinking. Should we allow specifying a delay larger than the default max, if maxRetryDelayInMs is not specified?

Member Author

I got it now. Capping doesn't make sense for Fixed mode because the delay is not changing. Will remove.
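To make the outcome of this thread concrete, here is a simplified sketch of a delay calculation in which only the exponential branch is capped by `maxRetryDelayInMs`, while the fixed branch returns the configured delay unchanged. The function name `calculateDelaySketch`, the string-based `RetryModeSketch` type, and the exact growth formula are assumptions for illustration; the real helper may differ, for example by adding random jitter.

```typescript
// Hypothetical, simplified sketch; not the helper used by the SDK.
type RetryModeSketch = "Fixed" | "Exponential";

function calculateDelaySketch(
  attemptCount: number,
  retryDelayInMs: number,
  maxRetryDelayInMs: number,
  mode: RetryModeSketch
): number {
  if (mode === "Exponential") {
    // Assumed growth: retryDelayInMs * (2^attemptCount - 1), capped by the maximum.
    const incrementDelta = retryDelayInMs * (Math.pow(2, attemptCount) - 1);
    return Math.min(incrementDelta, maxRetryDelayInMs);
  }
  // Fixed mode: the delay never grows, so no cap is applied.
  return retryDelayInMs;
}
```

Under these assumptions, a fixed delay configured above the maximum is used as given, which matches the reasoning that the cap exists to bound a growing delay.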

@@ -278,6 +278,9 @@ describe("shared receiver code", () => {
},
connectionId: "id",
operationType: RetryOperationType.connection,
retryOptions: {
maxRetryDelayInMs: 2000,
Member Author

The default 30s delay is too high, so I reduced it.

@jeremymeng jeremymeng marked this pull request as ready for review February 10, 2022 21:51
@check-enforcer

This pull request is protected by Check Enforcer.

What is Check Enforcer?

Check Enforcer helps ensure all pull requests are covered by at least one check-run (typically an Azure Pipeline). When all check-runs associated with this pull request pass then Check Enforcer itself will pass.

Why am I getting this message?

You are getting this message because Check Enforcer did not detect any check-runs being associated with this pull request within five minutes. This may indicate that your pull request is not covered by any pipelines and so Check Enforcer is correctly blocking the pull request being merged.

What should I do now?

If the check-enforcer check-run is not passing and all other check-runs associated with this PR are passing (excluding license-cla) then you could try telling Check Enforcer to evaluate your pull request again. You can do this by adding a comment to this pull request as follows:
/check-enforcer evaluate
Typically evaluation only takes a few seconds. If you know that your pull request is not covered by a pipeline and this is expected, you can override Check Enforcer using the following command:
/check-enforcer override
Note that using the override command triggers alerts so that follow-up investigations can occur (PRs still need to be approved as normal).

What if I am onboarding a new service?

Often, new services do not have validation pipelines associated with them, in order to bootstrap pipelines for a new service, you can issue the following command as a pull request comment:
/azp run prepare-pipelines
This will run a pipeline that analyzes the source tree and creates the pipelines necessary to build and validate your pull request. Once the pipeline has been created you can trigger the pipeline using the following comment:
/azp run js - [service] - ci

@@ -279,6 +282,28 @@ export interface RetryForeverArgs<T> {
  logPrefix: string;
}

/**
 * Calculates delay between retries, in milliseconds.
 * @internal
Member

nit: no need to tag it as internal if it is not exported.

Member

@deyaaeldeen deyaaeldeen left a comment

The change makes sense.

const fakeRetry = async <T>(): Promise<T> => {
  ++numRetryCalls;

  if (numRetryCalls < 4) {
Member

nit: could the constant 3 be factored into a variable everywhere so there are no magic numbers?

Comment on lines +350 to +354
assert.deepEqual(errorMessages, [
"Attempt 1: Force another call of retry<>",
"Attempt 2: Force another call of retry<>",
"Attempt 3: Force another call of retry<>",
]);
Member

nit:

Suggested change
- assert.deepEqual(errorMessages, [
-   "Attempt 1: Force another call of retry<>",
-   "Attempt 2: Force another call of retry<>",
-   "Attempt 3: Force another call of retry<>",
- ]);
+ assert.deepEqual(errorMessages, Array(errorCount).fill("Attempt 1: Force another call of retry<>"));

Member

Or not, because the messages have indices in them. Please disregard this.

Member Author

Right, the counter increments.
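One way to address both nits at once is to name the failure count and derive the expected messages from it, so the attempt index stays in each message. The following is a synchronous simplification with hypothetical names (`errorCount`, `expectedMessages`, a non-async `fakeRetry`), meant as a sketch of the idea rather than the test code in this PR.

```typescript
// Hypothetical names; a simplified, synchronous sketch of the test idea.
const errorCount = 3;
let numRetryCalls = 0;
const errorMessages: string[] = [];

// Stand-in for retry(): throws for the first errorCount calls, then succeeds.
function fakeRetry(): string {
  ++numRetryCalls;
  if (numRetryCalls <= errorCount) {
    throw new Error(`Attempt ${numRetryCalls}: Force another call of retry<>`);
  }
  return "done";
}

// Keep calling until fakeRetry succeeds, recording each error message.
let outcome: string | undefined;
while (outcome === undefined) {
  try {
    outcome = fakeRetry();
  } catch (err) {
    errorMessages.push((err as Error).message);
  }
}

// Build the expected list from errorCount so the indices match each attempt.
const expectedMessages = Array.from(
  { length: errorCount },
  (_, i) => `Attempt ${i + 1}: Force another call of retry<>`
);
```

Comparing `errorMessages` against `expectedMessages` with a deep-equality assertion then needs no hard-coded 3 or 4.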

@jeremymeng jeremymeng merged commit d5e4636 into Azure:main Feb 12, 2022
@jeremymeng jeremymeng deleted the sb/delay-between-reconnecting-attemp branch February 12, 2022 02:11
4 participants