
refactor: update retry function #6451

Merged: 7 commits merged into ChainSafe:unstable on Feb 21, 2024

Conversation

FaybianB
Contributor

Motivation

Update retry function.

Description

Updates the retry function to follow the more common pattern used in other libraries such as retry and retry-request: with retries=1, the function is called once and, on failure, retried once.

Closes #6447
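
To make the described semantics concrete, here is a minimal sketch of the pattern (illustrative only, not the exact implementation in this PR): `retries` counts retries after the initial attempt, so `retries: 1` allows up to two calls in total.

```ts
// Minimal sketch of the retry pattern described above; names and defaults are illustrative.
async function retrySketch<T>(
  fn: (attempt: number) => Promise<T>,
  opts: {retries?: number; retryDelay?: number} = {}
): Promise<T> {
  const {retries = 5, retryDelay = 0} = opts;
  let lastError: unknown;
  // attempt 0 is the initial call; attempts 1..retries are the retries
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn(attempt);
    } catch (e) {
      lastError = e;
      // don't sleep after the final attempt; just fall through and rethrow below
      if (attempt < retries && retryDelay > 0) {
        await new Promise((resolve) => setTimeout(resolve, retryDelay));
      }
    }
  }
  throw lastError;
}

// retries: 1 -> one initial call plus at most one retry, i.e. up to 2 calls in total
// await retrySketch(() => fetchSomething(), {retries: 1, retryDelay: 200});
```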

@FaybianB FaybianB requested a review from a team as a code owner February 19, 2024 01:28
@CLAassistant

CLAassistant commented Feb 19, 2024

CLA assistant check
All committers have signed the CLA.

@nflaig nflaig (Member) left a comment

@FaybianB Thanks for looking into this. The change looks good to me, but it would be good to check all callers of this function and see if we need to reduce the retries value they pass by 1. I would expect it doesn't matter much for most operations, but getting an overview would be great.

@FaybianB FaybianB marked this pull request as draft February 19, 2024 15:33
@FaybianB
Contributor Author

@nflaig Here are all the call sites I've found for the retry function. A lot of the supplied retries values are 20 and 60. Do you think any of these should be reduced by 1?

```ts
return retry(
  async (_attempt) => {
    return this.requestWithBodyWithFallbacks<T>(opts, getBody);
  },
  {retries: opts.retryAttempts, retryDelay: 200}
);

await retry(() => keymanagerClient.listRemoteKeys(), {retryDelay: 500, retries: 20});

await retry(() => client.node.getHealth().then((res) => ApiError.assert(res)), {retryDelay: 1000, retries: 60});

await retry(
  async () => {
    const head = await client.beacon.getBlockHeader("head");
    ApiError.assert(head);
    if (head.response.data.header.message.slot < 1) throw Error("pre-genesis");
  },
  {retryDelay: 1000, retries: 20}
);

await retry(
  async () => {
    const res = await client.beacon.getStateValidator("head", pubkey);
    ApiError.assert(res);
    if (res.response.data.status !== "active_exiting") {
      throw Error("Validator not exiting");
    } else {
      // eslint-disable-next-line no-console
      console.log(`Confirmed validator ${pubkey} = ${res.response.data.status}`);
    }
  },
  {retryDelay: 1000, retries: 20}
);

expect(await retry(fn, opts)).toEqual(result);

await retry(
  async () => {
    const head = await client.beacon.getBlockHeader("head");
    ApiError.assert(head);
    if (head.response.data.header.message.slot < 1) throw Error("pre-genesis");
  },
  {retryDelay: 1000, retries: 20}
);

await retry(
  async () => {
    const res = await client.beacon.getStateValidator("head", pubkey);
    ApiError.assert(res);
    if (res.response.data.status !== "active_exiting") {
      throw Error("Validator not exiting");
    } else {
      // eslint-disable-next-line no-console
      console.log(`Confirmed validator ${pubkey} = ${res.response.data.status}`);
    }
  },
  {retryDelay: 1000, retries: 20}
);

await retry(
  async () => {
    const head = await beaconClient.getBlockHeader("head");
    ApiError.assert(head);
    if (head.response.data.header.message.slot < 1) throw Error("pre-genesis");
  },
  {retryDelay: 1000, retries: 20}
);

await retry(
  async () => {
    const res = await beaconClient.getStateValidator("head", pubkeyToExit);
    ApiError.assert(res);
    if (res.response.data.status !== "active_exiting") {
      throw Error("Validator not exiting");
    } else {
      // eslint-disable-next-line no-console
      console.log(`Confirmed validator ${pubkeyToExit} = ${res.response.data.status}`);
    }
  },
  {retryDelay: 1000, retries: 20}
);

await retry(
  async () => {
    const head = await client.beacon.getBlockHeader("head");
    ApiError.assert(head);
    if (head.response.data.header.message.slot < 1) throw Error("pre-genesis");
  },
  {retryDelay: 1000, retries: 60}
);

await retry(
  // async (bail) => {
  async () => {
    const {data, headers} = await axios({
      method: "get",
      url,
      responseType: "stream",
      timeout: 30 * 60 * 1000,
    });
    const totalSize = headers["content-length"] as string;
    log(`Downloading ${url} - ${totalSize} bytes`);
    // extract tar into output directory
    await promisify(stream.pipeline)(data, extractTar({cwd: outputDir}));
    log(`Downloaded ${url}`);
  },
  {
    retries: 3,
    onRetry: (e, attempt) => {
      log(`Download attempt ${attempt} for ${url} failed: ${e.message}`);
    },
  }
);
```

@FaybianB FaybianB marked this pull request as ready for review February 20, 2024 03:24
@nflaig
Member

nflaig commented Feb 20, 2024

Thanks @FaybianB for looking into it.

We should be fine to keep the 20/60 values as is, but we might want to update this one:

```ts
const res = await this.api.submitBlindedBlock(signedBlindedBlock, {retryAttempts: 3});
```

I think 3 requests in total should be sufficient here, and that way we keep the previous behavior.
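
As a hedged illustration of that arithmetic (hypothetical helpers, not part of Lodestar): under the old behavior retryAttempts counted total attempts, while under the new behavior it counts retries after the first request, so keeping "3 requests in total" corresponds to passing 2.

```ts
// Illustrative only: how total request counts map between the old and new retries semantics.
function totalRequestsOld(retries: number): number {
  return retries; // old behavior: the value was the total number of attempts
}

function totalRequestsNew(retries: number): number {
  return 1 + retries; // new behavior: one initial attempt plus `retries` retries
}

console.log(totalRequestsOld(3)); // 3 requests in total
console.log(totalRequestsNew(2)); // also 3 requests in total under the new semantics
```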

@nflaig
Member

nflaig commented Feb 20, 2024

Just noticed we use an external dependency in the download tests instead of our own retry implementation. Good opportunity to clean up this dependency:

import retry from "async-retry";

"async-retry": "^1.3.3",

Edit: Noticed this requires more changes, I will push a PR for this separately (see #6458)

@nflaig
Member

nflaig commented Feb 20, 2024

@FaybianB I noticed a few more cases: we use the retry function in our http clients, which are configured via retryAttempts, so we need to double-check these as well.

E.g. the default value here should probably be changed to 2,

and we need to make sure the default value here is set to 0 (meaning no retries by default):

```ts
retries: opts?.retryAttempts ?? this.opts?.retryAttempts ?? 1,
```

(Or, maybe more elegant, we could avoid wrapping the call in the retry function at all if opts?.retryAttempts ?? this.opts?.retryAttempts is not set / undefined.)
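
A minimal sketch of that last idea (assuming the retry utility exported from the utils package; the helper itself is hypothetical): only wrap the request in retry() when a retry count was actually configured.

```ts
import {retry} from "@lodestar/utils";

// Hypothetical helper, illustrative only: skip the retry wrapper entirely when the caller
// did not configure retryAttempts, so "no retries by default" falls out naturally.
async function requestWithOptionalRetries<T>(
  request: () => Promise<T>,
  retryAttempts?: number,
  retryDelay = 200
): Promise<T> {
  if (retryAttempts === undefined) {
    // no retries configured: issue the request once, without going through retry()
    return request();
  }
  return retry(() => request(), {retries: retryAttempts, retryDelay});
}
```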

@FaybianB FaybianB marked this pull request as draft February 20, 2024 20:20
@FaybianB FaybianB marked this pull request as ready for review February 20, 2024 22:31
@FaybianB
Contributor Author

@nflaig I've updated the other areas referring to retryAttempts. I've also added a check in the retry function that detects whether we're on the last attempt and, if so, bypasses the shouldRetry logic:

https://github.com/ChainSafe/lodestar/pull/6451/files#diff-7425121b21e5d20801cc486ce948c1ceb550d27819e3fd95177b894c20114e4cR39-R43

I thought this was a cleaner solution than updating the retry tests to subtract 1 from the retryCount:

```ts
expect(retryCount).toBeWithMessage(retryAttempts, "ENOTFOUND should be retried before failing");
```

Do you agree or do you think I should update the tests instead?
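
A hedged sketch of that check (illustrative names, not the exact code in the linked diff): on the final attempt, the error is rethrown immediately without consulting shouldRetry or sleeping one more time.

```ts
// Illustrative sketch only, not the PR's exact code.
async function retryWithShouldRetry<T>(
  fn: () => Promise<T>,
  opts: {retries: number; retryDelay: number; shouldRetry?: (err: Error) => boolean}
): Promise<T> {
  let lastError = new Error("no attempts were made");
  for (let attempt = 0; attempt <= opts.retries; attempt++) {
    try {
      return await fn();
    } catch (e) {
      lastError = e as Error;
      const isLastAttempt = attempt >= opts.retries;
      if (isLastAttempt || (opts.shouldRetry && !opts.shouldRetry(lastError))) {
        // last attempt (shouldRetry is bypassed) or non-retryable error: rethrow without sleeping
        throw lastError;
      }
      await new Promise((resolve) => setTimeout(resolve, opts.retryDelay));
    }
  }
  throw lastError;
}
```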


codecov bot commented Feb 21, 2024

Codecov Report

Merging #6451 (ca7dfde) into unstable (0a12b1b) will increase coverage by 0.00%.
Report is 1 commit behind head on unstable.
The diff coverage is 87.50%.

Additional details and impacted files
@@            Coverage Diff            @@
##           unstable    #6451   +/-   ##
=========================================
  Coverage     61.63%   61.63%           
=========================================
  Files           553      553           
  Lines         57974    57998   +24     
  Branches       1834     1837    +3     
=========================================
+ Hits          35734    35749   +15     
- Misses        22203    22210    +7     
- Partials         37       39    +2     

```diff
@@ -85,7 +85,7 @@ export async function downloadGenericSpecTests<TestNames extends string>(
     log(`Downloaded ${url}`);
   },
   {
-    retries: 2,
+    retries: 3,
```
@nflaig nflaig (Member) commented on Feb 21, 2024

This used the async-retry library, so retries is kept as is to not change the previous behavior.
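
For reference, a hedged note on why this call site is unaffected: async-retry already treats retries as the number of retries after the initial attempt (so retries: 3 allows up to 4 calls). The download helper below is a stand-in, not the actual Lodestar code.

```ts
import retry from "async-retry";

// Hypothetical stand-in for the real download-and-extract step; illustrative only.
async function downloadAndExtract(url: string): Promise<void> {
  // ... fetch the archive from `url` and extract it into the output directory ...
}

// async-retry's `retries` already means "retries after the first attempt",
// which is why the value here keeps its previous meaning and is left at 3.
await retry(async () => downloadAndExtract("https://example.org/tests.tar.gz"), {retries: 3});
```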

@nflaig
Member

nflaig commented Feb 21, 2024

Do you agree or do you think I should update the tests instead?

Good catch! This makes sense to me. I noticed there is another case where checking if we are at the max attempt is important, because otherwise we would also sleep one more time before throwing the last error, which is not ideal.

I thought this was a cleaner solution than updating the retry tests to subtract 1 from the retryCount:

The tests were incorrect: incrementing retryCount in the last iteration is wrong, as no further retry happens at that point. I also noticed the tests use retryCount to track the request count; I pushed some updates to the tests to fix this and make them pass.
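
A hedged sketch of the counting the fixed tests rely on (assuming vitest-style assertions and the retry utility from the utils package; the failing function is illustrative): the total number of calls is one initial attempt plus retries retries, so the retry count is one less than the call count.

```ts
import {expect} from "vitest";
import {retry} from "@lodestar/utils";

// Illustrative test sketch, not the exact test from the PR.
let callCount = 0;
const alwaysFails = async (): Promise<never> => {
  callCount++;
  throw new Error("ENOTFOUND");
};

await expect(retry(alwaysFails, {retries: 3, retryDelay: 0})).rejects.toThrow("ENOTFOUND");
// 1 initial attempt + 3 retries = 4 calls; the number of retries is callCount - 1.
expect(callCount).toBe(4);
```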

@nflaig nflaig (Member) left a comment

LGTM, thanks @FaybianB for the thoroughness!

@nflaig nflaig merged commit 53f8f99 into ChainSafe:unstable Feb 21, 2024
18 of 20 checks passed
@wemeetagain
Member

🎉 This PR is included in v1.17.0 🎉
