Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use internal "cache" module to cache libvirt and baremetal images #2595

Closed
wants to merge 1 commit into from

Conversation

Fedosin
Copy link
Contributor

@Fedosin Fedosin commented Oct 30, 2019

No description provided.

@openshift-ci-robot openshift-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Oct 30, 2019
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Fedosin
To complete the pull request process, please assign sdodson
You can assign the PR to them by writing /assign @sdodson in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Fedosin
Copy link
Contributor Author

Fedosin commented Oct 30, 2019

/test e2e-libvirt

@stbenjam
Copy link
Member

/label platform/baremetal

@openshift-ci-robot openshift-ci-robot added the platform/baremetal IPI bare metal hosts platform label Oct 30, 2019
@stbenjam
Copy link
Member

(To run Metal3 CI)

@Fedosin
Copy link
Contributor Author

Fedosin commented Oct 30, 2019

/hold
We need to merge the main pr first

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 30, 2019
// then the downloaded data checksum will be compared with the provided value.
func DownloadFile(baseURL string, dataType string) (string, error) {
// Send a request
resp, err := http.Get(baseURL)
Copy link
Member

@stbenjam stbenjam Oct 30, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to avoid the HTTP query at all? If we know the SHA, could we look to see if it's already in the cache before making any network connection out? Thinking of some specific scenarios with a disconnected install where we might want to prepopulate the image. This looks like it downloads the whole image each time before populating the cache (i.e. the cache doesn't get used across multiple installer runs)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a check if a URL starts with "file://" we don't download anything and just reuse the existing file, like it happens now: https://github.com/openshift/installer/blob/master/pkg/tfvars/libvirt/cache.go#L32-L34

Also, we actually don't load any image data and immediately return the path if we found a file with ETag name in the cache. We just make a request and validate the response headers to find ETag there, that's all we do.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is any outbound query. For many bare metal environments we have a requirement that there can’t be any outbound internet connectivity at all. It’s possible to override the registry for containers, and for the image used by masters (which is different than bootstrap) we have a solution. The only one remaining is we don’t have a good solution that handles the qemu image for bootstrap. That is always an internet url in rhcos.json. Since we know the SHA and that’s definitive anyway, it makes sense to just see if we have a file named that already instead of using ETags at all.

The idea being we populate the cache manually in a disconnected environment.

Copy link
Member

@stbenjam stbenjam Oct 30, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe my idea isn’t all that well thought out, I just saw this PR and thought it would be an easy change to just use the SHA all the time. Anyway, I don’t want to hijack your PR and I can always make a proposal separately to handle this case maybe by explicitly overriding the rhcos baseURI

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, I got you now... you're gonna laugh, but what you're saying was in my original patch! I calculated a filename corresponding to the url's md5 hash. If it helps you, I can bring that solution back.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

give me 10 minutes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stbenjam done! now you can calculate image filename in the cache by using echo -n <image_url> | md5sum

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome thanks so much!

@metal3ci
Copy link

Build FAILURE, see build http://10.8.144.11:8080/job/dev-tools/1282/

@Fedosin Fedosin force-pushed the image_cache branch 2 times, most recently from 899a57a to ea5296c Compare October 30, 2019 21:43
@metal3ci
Copy link

Build FAILURE, see build http://10.8.144.11:8080/job/dev-tools/1283/

@metal3ci
Copy link

Build FAILURE, see build http://10.8.144.11:8080/job/dev-tools/1284/

@metal3ci
Copy link

Build FAILURE, see build http://10.8.144.11:8080/job/dev-tools/1285/

@stbenjam
Copy link
Member

/unlabel platform/baremetal

@stbenjam
Copy link
Member

Our CI is broken at the moment. Sorry for all the comments from @metal3ci

@Fedosin
Copy link
Contributor Author

Fedosin commented Oct 31, 2019

/test e2e-openstack

@Fedosin
Copy link
Contributor Author

Fedosin commented Nov 6, 2019

/label platform/libvirt

@openshift-ci-robot
Copy link
Contributor

@Fedosin: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
ci/prow/e2e-aws-scaleup-rhel7 33d35cf link /test e2e-aws-scaleup-rhel7

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@metal3ci
Copy link

metal3ci commented Nov 6, 2019

Build FAILURE, see build http://10.8.144.11:8080/job/dev-tools/1296/

@abhinavdahiya
Copy link
Contributor

will be fixed with #2657

@openshift-ci-robot
Copy link
Contributor

@abhinavdahiya: Closed this PR.

In response to this:

will be fixed with #2657

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@Fedosin Fedosin deleted the image_cache branch March 16, 2020 12:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. platform/baremetal IPI bare metal hosts platform platform/libvirt size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants