[Fleet] Add retry when retrieving agent version #186459

Merged

Conversation

nchaulet
Member

Summary

Resolve #184681

The tests that run standalone agents sometimes fail when retrieving the agent version from the artifacts API.
This PR should fix that by adding retries and a fallback value.

@nchaulet nchaulet added bug Fixes for quality problems that affect the customer experience release_note:skip Skip the PR/issue when compiling release notes Team:Fleet Team label for Observability Data Collection Fleet team labels Jun 19, 2024
@nchaulet nchaulet self-assigned this Jun 19, 2024
@nchaulet nchaulet requested a review from a team as a code owner June 19, 2024 12:22
@elasticmachine
Contributor

Pinging @elastic/fleet (Team:Fleet)

@obltmachine

🤖 GitHub comments

Just comment with:

  • /oblt-deploy : Deploy a Kibana instance using the Observability test environments.
  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

Contributor

@criamico criamico left a comment

LGTM 🚢 Thanks for fixing it!


 export async function getLatestVersion(): Promise<string> {
-  const response: any = await axios('https://artifacts-api.elastic.co/v1/versions');
-  return last(response.data.versions as string[]) || '8.1.0-SNAPSHOT';
+  return pRetry(() => axios('https://artifacts-api.elastic.co/v1/versions'), {
Member

@cmacknz cmacknz Jun 19, 2024

The artifacts-api is not stable, don't use it. We had to move away from it in the agent team.

The latest snapshot for each version is available from https://snapshots.elastic.co which does not use the artifacts API.

Examples:

The version to use will be the current version of Kibana in a branch for the next release.

Those will give you a build ID that you can use to build the artifact download URL. For example, 8.15.0-SNAPSHOT gives you "build_id" : "8.15.0-72200925" as the latest build ID, which is used as shown below to get the Linux x86_64 .tar.gz artifact directly.

https://snapshots.elastic.co/8.15.0-72200925/downloads/beats/elastic-agent/elastic-agent-8.15.0-SNAPSHOT-linux-x86_64.tar.gz
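
As a rough illustration of how the URL above is assembled, here is a minimal sketch. It assumes the build ID has already been retrieved, since the exact endpoint that returns it is not shown in this comment. The function names, the `platform` parameter, and the idea that the snapshot version is simply the Kibana version plus `-SNAPSHOT` are assumptions made for illustration, not existing Kibana or Elastic Agent code.

```typescript
// Hypothetical helper: derive the snapshot version from a Kibana version.
// Assumption: the snapshot version is the Kibana version plus "-SNAPSHOT".
function toSnapshotVersion(kibanaVersion: string): string {
  return kibanaVersion.endsWith('-SNAPSHOT') ? kibanaVersion : `${kibanaVersion}-SNAPSHOT`;
}

// Hypothetical helper: build the elastic-agent download URL from a build ID.
// The URL layout mirrors the single example given in this comment; other
// platforms or archive formats may not follow the same pattern.
function buildAgentArtifactUrl(
  buildId: string,
  snapshotVersion: string,
  platform: string = 'linux-x86_64'
): string {
  return (
    `https://snapshots.elastic.co/${buildId}/downloads/beats/elastic-agent/` +
    `elastic-agent-${snapshotVersion}-${platform}.tar.gz`
  );
}

// Usage with the values quoted in this comment.
const exampleUrl = buildAgentArtifactUrl('8.15.0-72200925', toSnapshotVersion('8.15.0'));
console.log(exampleUrl);
```

Keeping the build ID as an input means the sketch does not depend on how, or from which endpoint, the ID is fetched.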

Member Author

@cmacknz It looks like a bunch of Kibana tests use the artifacts API. Do you think that even with retries (and a fallback value) it's not good enough?
I am worried that if we use https://snapshots.elastic.co/ we will have issues when a new branch is created for a new version and the elastic-agent SNAPSHOT is not yet available.

Contributor

I am worried that if we use https://snapshots.elastic.co/ we will have issues when a new branch is created for a new version and the elastic-agent SNAPSHOT is not yet available.

Then we might have to pin the exact version indeed. Retrying might help but stability is not guaranteed.

Member Author

Then we might have to pin the exact version indeed. Retrying might help but stability is not guaranteed.

I implemented this with retries and a fallback to 8.15.0-SNAPSHOT if all retries fail.
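
For readers following along, here is a minimal sketch of the retry-plus-fallback idea, not the code merged in this PR. To stay self-contained it uses a hand-rolled loop instead of pRetry and takes the fetch as a parameter instead of calling axios. `VersionFetcher`, `getLatestVersionWithRetry`, `retries`, and `delayMs` are hypothetical names chosen for illustration.

```typescript
// Hypothetical type: any function that resolves to the list of known versions.
type VersionFetcher = () => Promise<string[]>;

// Fallback value mentioned in this thread.
const FALLBACK_VERSION = '8.15.0-SNAPSHOT';

// Try the fetcher up to `retries + 1` times; if every attempt throws,
// return the fallback instead of failing the caller.
async function getLatestVersionWithRetry(
  fetchVersions: VersionFetcher,
  retries: number = 3,
  delayMs: number = 1000
): Promise<string> {
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      const versions = await fetchVersions();
      // An empty list also resolves to the fallback value.
      return versions[versions.length - 1] || FALLBACK_VERSION;
    } catch {
      // Ignore the error and fall through to the next attempt.
    }
    if (attempt < retries && delayMs > 0) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return FALLBACK_VERSION;
}
```

Injecting the fetcher keeps the retry logic testable without network access. In real code the fetcher would wrap whatever HTTP client the test suite already uses.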

Member

You can certainly start with a retry and see how it works. If it continues to be flaky, the solution is as described above: don't use the artifacts-api.

@kibana-ci
Collaborator

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] FTR Configs #58 / security APIs - OIDC (Authorization Code Flow) OpenID Connect authentication finishing handshake should succeed if both the OpenID Connect response and the cookie are provided

Metrics [docs]

✅ unchanged

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @nchaulet

@nchaulet nchaulet merged commit 39ac3e1 into elastic:main Jun 19, 2024
32 checks passed
@kibanamachine kibanamachine added v8.15.0 backport:skip This commit does not require backporting labels Jun 19, 2024
Labels
backport:skip This commit does not require backporting bug Fixes for quality problems that affect the customer experience release_note:skip Skip the PR/issue when compiling release notes Team:Fleet Team label for Observability Data Collection Fleet team v8.15.0
8 participants