Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NPMAuthenticate fails to retry successfully #15913

Closed
mastrzyz opened this issue Feb 11, 2022 · 4 comments
Closed

NPMAuthenticate fails to retry successfully #15913

mastrzyz opened this issue Feb 11, 2022 · 4 comments

Comments

@mastrzyz
Copy link
Contributor

mastrzyz commented Feb 11, 2022

Required Information

Entering this information will route you directly to the right team and expedite traction.

Question, Bug, or Feature?
Type: Bug

Enter Task Name: NpmAuthenticate

Issue Description

The NPM Authenticate task leverages live services and can be very flaky for us , so we used the retryOnTaskFailure setting in the main task to try and mitigate this.

We encounter the following error when we have a retry :

Adding authentication to the .npmrc file at D:\a\_work\3\s\*********\.npmrc
##[debug]SAVE_NPMRC_PATH=D:\a\_work\3\npmAuthenticate\FyuhLH
##[debug]SAVE_NPMRC_PATH=D:\a\_work\3\npmAuthenticate\FyuhLH
##[debug]NPM_AUTHENTICATE_TEMP_DIRECTORY=D:\a\_work\3\npmAuthenticate
##[debug]NPM_AUTHENTICATE_TEMP_DIRECTORY=D:\a\_work\3\npmAuthenticate
##[debug]rm -rf D:\a\_work\3\npmAuthenticate
##[debug]task result: Failed
##[error]Error: ENOENT: no such file or directory, open 'D:\a\_work\3\npmAuthenticate\FyuhLH\index.json'
##[debug]Processed: ##vso[task.issue type=error;]Error: ENOENT: no such file or directory, open 'D:\a\_work\3\npmAuthenticate\FyuhLH\index.json'
##[debug]Processed: ##vso[task.complete result=Failed;]Error: ENOENT: no such file or directory, open 'D:\a\_work\3\npmAuthenticate\FyuhLH\index.json'
##[warning]RetryHelper encountered task failure, will retry (attempt #: 2 out of 4) after 4000 ms

This looks to be due to the fact that after the first attempt to authenticate we timed out and the following catch code is run :

main().catch(error => {
    if(tl.getVariable("NPM_AUTHENTICATE_TEMP_DIRECTORY")) {
        tl.rmRF(tl.getVariable("NPM_AUTHENTICATE_TEMP_DIRECTORY"));
    } 
    tl.setResult(tl.TaskResult.Failed, error);
});

This wipes the directory NPM_AUTHENTICATE_TEMP_DIRECTORY but not the variables.

thus on retry the following code gets executed :

    if (tl.getVariable("SAVE_NPMRC_PATH")) {
         saveNpmrcPath = tl.getVariable("SAVE_NPMRC_PATH");
    }
    else {
        let tempPath = tl.getVariable('Agent.BuildDirectory') || tl.getVariable('Agent.TempDirectory');
        tempPath = path.join(tempPath, 'npmAuthenticate');
        tl.mkdirP(tempPath);
        saveNpmrcPath = fs.mkdtempSync(tempPath + path.sep); 
        tl.setVariable("SAVE_NPMRC_PATH", saveNpmrcPath, false);
        tl.setVariable("NPM_AUTHENTICATE_TEMP_DIRECTORY", tempPath, false);
    }

And since SAVE_NPMRC_PATH was previously set, the code think the directory should be there, but it is not.

Simple solution would be to wipe this variable alongside wiping the root directory.

main().catch(error => {
    if(tl.getVariable("NPM_AUTHENTICATE_TEMP_DIRECTORY")) {
        tl.rmRF(tl.getVariable("NPM_AUTHENTICATE_TEMP_DIRECTORY"));
        // Clear the variales after we rm-rf the main root directory
        tl.setVariable("SAVE_NPMRC_PATH", "", false);
        tl.setVariable("NPM_AUTHENTICATE_TEMP_DIRECTORY", "", false);
    } 
    tl.setResult(tl.TaskResult.Failed, error);
});

Does that seem good? am I missing something important here.

I think this looks good, can you create a PR for this?

Yeah -> #15914

@aasim
Copy link
Member

aasim commented Feb 11, 2022

I think this looks good, can you create a PR for this?

@zhenghao104
Copy link
Contributor

For testing, you can prob manually throw an exception somewhere in the middle for just one time, and check if retries work properly, and without retry, it should fail with the variable issue.

@mastrzyz
Copy link
Contributor Author

I tried the following code to test out my changes :

Commented out my 'fix' and called main immediately after the failure to fake a retry attempt

main().catch(error => {
if (tl.getVariable("NPM_AUTHENTICATE_TEMP_DIRECTORY")) {
tl.rmRF(tl.getVariable("NPM_AUTHENTICATE_TEMP_DIRECTORY"));
// Clear the variales after we rm-rf the main root directory
// tl.setVariable("SAVE_NPMRC_PATH", "", false);
// tl.setVariable("NPM_AUTHENTICATE_TEMP_DIRECTORY", "", false);
}
main()
tl.setResult(tl.TaskResult.Failed, error);
});
Result ->
image

Uncommented my 'fix'

main().catch(error => {
if (tl.getVariable("NPM_AUTHENTICATE_TEMP_DIRECTORY")) {
tl.rmRF(tl.getVariable("NPM_AUTHENTICATE_TEMP_DIRECTORY"));
// Clear the variales after we rm-rf the main root directory
tl.setVariable("SAVE_NPMRC_PATH", "", false);
tl.setVariable("NPM_AUTHENTICATE_TEMP_DIRECTORY", "", false);
}
main()
tl.setResult(tl.TaskResult.Failed, error);
});

No more error!

I tried something like this to repro the exact error in the pipeline we had previously

@github-actions
Copy link

This issue is stale because it has been open for 180 days with no activity. Remove the stale label or comment on the issue otherwise this will be closed in 5 days

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants