-
Notifications
You must be signed in to change notification settings - Fork 117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix downloading agent manifest from upstream #461
Fix downloading agent manifest from upstream #461
Conversation
…m 7.x instead of using a local static file (elastic#459)" This reverts commit 6531362.
…m 7.x instead of using a local static file (elastic#459)" This reverts commit 6531362.
💚 Build Succeeded
Expand to view the summary
Build stats
Test stats 🧪
Trends 🧪 |
/test |
defer resp.Body.Close() | ||
logger.Debugf("status code when downloading elastic-agent-managed-kubernetes.yaml is %d", resp.StatusCode) | ||
if resp.StatusCode != 200 { | ||
return nil, errors.Wrapf(err, "downloading failed due to status code %d", resp.StatusCode) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe print also the body here?
return elasticAgentManagedYaml, nil | ||
} | ||
err = fmt.Errorf("bytes downloaded should be more than 2000 but where: %d", len(elasticAgentManagedYaml)) | ||
logger.Debugf("failed because of %s", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you check these debugf calls (for example with an invalid URL)?
} | ||
err = fmt.Errorf("bytes downloaded should be more than 2000 but where: %d", len(elasticAgentManagedYaml)) | ||
logger.Debugf("failed because of %s", err) | ||
logger.Debugf("file downloaded is %s", string(elasticAgentManagedYaml)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I discourage printing the doc here. If you really want to check if the content is right, maybe print just MD5?
err = fmt.Errorf("bytes downloaded should be more than 2000 but where: %d", len(elasticAgentManagedYaml)) | ||
logger.Debugf("failed because of %s", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you combine these two lines together
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need the err variable to be set here in order to be returned later
@@ -10,7 +10,7 @@ cleanup() { | |||
|
|||
# Dump kubectl details | |||
kubectl describe pods --all-namespaces > build/kubectl-dump.txt | |||
kubectl logs -l app=kind-fleet-agent-clusterscope -n kube-system >> build/kubectl-dump.txt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please grep the codebase here if there are not references to kind-fleet-agent-clusterscope
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are only references in tests and docs of kubernetes test package
I update the error handling and error messages. It is tested with scenarios where the url is invalid and the bytes are less than expected. The errors are the ones supposed to be. |
This is unrelated to this PR, but maybe worth defining in the beats repo:
As you can see, the code doesn't work for created resources as it does for kube-state-metrics:
I was wondering if this is something we can improve in Beats. |
You mean that the part of checking the readiness of the pods is missing? |
Yes. As you see here, we're using helm internals (same logic) to wait for resources. I'm wondering what's missing that it doesn't wait for it. Is it just lack of healthcheck or deployment? Please compare it with: https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Ingest-manager/pipelines/integrations/branches/master/runs/647/nodes/304/steps/5473/log/?start=0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest it sounds like a blocker for this, as with remote YAML elastic-package doesn't wait for agent deployment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove also this file. I understand that it won't be used anymore.
….com:MichaelKatsoulis/elastic-package into fix_downloading_agent_manifest_from_upstream
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Please merge it if CI is happy.
This PR tries to fix bug created with #452
In this PR a retry operation is added when downloading elastic agent manifest from upstream. If the status code of the response is not 200 or the received bytes of the file are less than 2000, the code will retry for 5 times to fetch the file before failing.
This way short network issues can be faced