
CI fails with License not available to perform the operation #1723

Closed
mariolenz opened this issue May 2, 2023 · 29 comments

@mariolenz
Collaborator

I'm seeing CI failures again due to License not available to perform the operation, for example here.

@alinabuzachis I think you were able to fix this last time by re-creating the vCenter template. Do you think you could help again, or tell me who I could ping about this?

related: #1051

@alinabuzachis
Contributor

@mariolenz sure, let me have a look and will let you know!

@jillr
Contributor

jillr commented May 3, 2023

Hi @mariolenz - I will work on refreshing the image this week (we have another CI issue in a different collection that we need to resolve first, then this will be the next thing I take on).

@jillr
Contributor

jillr commented May 3, 2023

This is WIP but I'm having some difficulties. The image build process seems to hang, and I can't get a VNC console on the nested ESXi image I'm building to see what the trouble is. I'm going to leave it running overnight and see if it completes. Apologies for the delay.

@jillr
Contributor

jillr commented May 4, 2023

I've uploaded a new esxi image (after some manual intervention that I think was successful) but have been unable to get the VCSA image built yet. This is unfortunately a somewhat complex and partially manual process to refresh the images.

@jillr
Contributor

jillr commented May 6, 2023

I think I have both images successfully created. The VCSA image will take some time to upload however (it's about 7GB). Sometime over the weekend we should be able to retry jobs.

@mariolenz
Collaborator Author

@jillr FYI #1705 still fails with License not available to perform the operation.

cc @alinabuzachis

@jillr
Contributor

jillr commented May 8, 2023

Looks like I missed an old esxi image when cleaning up Swift. It seems like 1705 is running now.

@mariolenz
Collaborator Author

Looks better to me now. Thanks @jillr!

@mariolenz
Collaborator Author

@jillr Do you have an idea why #1705 is still failing with License not available to perform the operation? This CI run looked OK to me, but later rechecks fail.

#1727 didn't fail the CI, which makes it even weirder :-/

@jillr
Contributor

jillr commented May 9, 2023

@mariolenz I'm not sure. I double-checked that there is only one image in the image store for each of esxi and VCSA, and the esxi image is the one that was uploaded 2023-05-06T00:34:31Z. Are any other PRs failing the license check, or just this one?

I can try reuploading the image (I still have the file locally), or rebuilding it again, but I'll wait to make any further changes to hear if you think that's a good idea.
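A check like that could be scripted instead of done by hand. A minimal sketch of the duplicate check; the image-name prefixes and the assumption that each prefix should match exactly one object in the store are illustrative, not the actual store layout:

```python
def find_duplicates(object_names, prefixes=("esxi", "vcsa")):
    """Return a mapping of prefix -> matching object names,
    restricted to prefixes that match more than one object.

    An empty result means at most one image per prefix, i.e.
    no stale leftovers in the store."""
    dupes = {}
    for prefix in prefixes:
        matches = [name for name in object_names
                   if name.lower().startswith(prefix)]
        if len(matches) > 1:
            dupes[prefix] = matches
    return dupes
```

The name list itself could come from the object-store listing (e.g. `openstack object list` against the relevant container).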

@mariolenz mariolenz reopened this May 10, 2023
@mariolenz
Collaborator Author

@jillr I don't understand what's going on at all. As I've said, #1727 didn't fail the CI. So it looks like the images you've updated are OK. On the other hand:

Are any other PRs failing the license or just this one?

I thought that maybe there's something wrong with #1705 and created #1729, but the CI fails with the same error.

Then I thought that the vcenter_license integration tests might break the CI somehow, so I disabled them in #1730... but this didn't help, either :-(

@jillr
Contributor

jillr commented May 10, 2023

I've reuploaded the esxi image and will do a recheck, if that has the same failure I'll try rebuilding the image.

@jillr
Contributor

jillr commented May 10, 2023

That failed again so I suppose we'll need to try rebuilding the images, though I don't understand why it would have a bad license all of a sudden.

@mariolenz
Collaborator Author

That failed again so I suppose we'll need to try rebuilding the images

@jillr Well, everything else I've tried (rechecking, closing / reopening the PR, creating a new PR and disabling the vcenter_license integration tests) didn't help. I'm running out of ideas... so maybe this is the only thing left to try 🤷

though I don't understand why it would have a bad license all of a sudden.

Me neither. This CI run looked OK to me (ansible-test-cloud-integration-vcenter7_2esxi-stable214 succeeded!) and #1727 didn't fail the CI, either. This is really puzzling.

@jillr
Contributor

jillr commented May 11, 2023

@mariolenz I uploaded the rebuilt esxi image yesterday, and the vcsa image uploaded overnight. I just deleted the old vcsa image (there were two images in the image store for the last several hours). We can try again now that there's only one vcsa image, but if that doesn't work I have to admit I don't know nearly enough about VMware to troubleshoot beyond that.

I could copy our build process to a gist in case you can identify anything that might be causing the trouble; I'm not immediately seeing where the license gets regenerated, though, with just grep and reading through all the involved repos.

@jillr
Contributor

jillr commented May 11, 2023

@mariolenz
Collaborator Author

What vCenter and ESXi versions did you use? Maybe there's a mismatch that's causing problems. Anyway, I think it's about time we tested against 8.0 or 8.0U1. I guess you're still using 7.0U3...?

BTW: Thanks for sharing the build process for the images! Do you think we could use an Ansible playbook to automate (some) things in order to make it easier for you? We have https://github.com/ansible-collections/community.vmware/tree/main/tools for a reason ;-)

@jillr
Contributor

jillr commented May 11, 2023

Yes - images are regenerated from the same source images every time and those are VMware-VCSA-all-7.0.3-20395099.iso and VMware-VMvisor-Installer-7.0U3f-20036589.x86_64.iso.

We should plan to have a larger conversation about the future of the CI / version support / etc. What would be easiest for you - matrix/irc chat, email thread, a call, something else? Are there other collection maintainers who should be included?

@ihumster
Collaborator

ansible-test-cloud-integration-vcenter7_1esxi-stable214_2_of_2 is still failing in #1722.

@mariolenz
Collaborator Author

We should plan to have a larger conversation about the future of the CI / version support / etc. What would be easiest for you - matrix/irc chat, email thread, a call, something else? Are there other collection maintainers who should be included?

Sorry for the late reply, but I'm still trying to find out why the CI fails. For example, here ansible-test-cloud-integration-vcenter7_2esxi-stable214 succeeded. And here everything except ansible-test-cloud-integration-vcenter7_1esxi-stable214_2_of_2 succeeded. This doesn't make any sense if it's a problem with the vCenter / ESXi images. I don't understand this yet...

Anyway, I think I'm more or less the only one maintaining this collection at the moment. @Tomorrow9 and @sky-joker worked on this in the past, but I didn't see much maintaining from them recently. But maybe they're still interested and would like to comment.

And then there are @Nina2244 and @p-fruck. They've opened a lot of PRs during the last months, so I think their opinion about the future of the CI / version support / etc. might be important, too.

@Nina2244
Contributor

@mariolenz yes I'm interested in the future of the CI and version support for newer versions.

@p-fruck
Contributor

p-fruck commented May 16, 2023

I'd also be interested, especially since I would like to keep testing against 7.0U3 (also, I'm interested in the actual CI/CD setup and resources). Should we just move the discussion to the Matrix channel, or do you think a call is easier?

@jillr
Contributor

jillr commented May 16, 2023

I think that @alinabuzachis found the issue - the images are mirrored to a second Swift region that was not documented in our process. I've swapped both images in the ca-ymq-1 region and am rechecking on #1722. 🤞
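Mirror drift like this could be caught early by diffing the object listings of the two regions. A rough sketch under the assumption that the name lists for each region have already been fetched (e.g. via `openstack object list` with the appropriate region selected); the container layout is not the actual one:

```python
def region_drift(primary_names, mirror_names):
    """Object names present in one region's listing but not the other.

    Both results empty means the regions are in sync."""
    primary, mirror = set(primary_names), set(mirror_names)
    return {
        "only_primary": sorted(primary - mirror),
        "only_mirror": sorted(mirror - primary),
    }
```

Running this after every image swap would have flagged the undocumented second region as soon as its listing stopped matching the primary one.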

@mariolenz
Collaborator Author

@alinabuzachis @jillr Cool! Maybe we're lucky and it finally works :-)

@mariolenz
Collaborator Author

I'd be also interested, especially since I would like to keep testing for 7.0U3

@p-fruck I think the integration tests use up a lot of resources, so I'm not sure if we can run them for both 7.0U3 and 8.something. I agree that it would be a great thing to test both 7.0U3 and 8.0U1... but the resources are sponsored by Red Hat (I think) and I don't know if they would agree to this.

also, I am interested in the actual CI/CD setup and resources

I don't know very much about this myself, at least not enough. There are two blog posts explaining a bit:

But they are quite old and I'm not sure if they are still relevant. There might have been some changes, or the procedures described might be basically the same 🤷 Anyway, maybe it helps to understand things a bit more.

I also lack knowledge about the Zuul CI pipeline itself. I only understand some very basic stuff, not the whole system. Maybe you should have a look at my PRs in ansible-zuul-jobs. They don't explain anything, but I hope they give you a bit of a feeling for how the CI jobs are defined.

Should we just move the discussion to the matrix channel or do you think a call is easier?

Personally, I would prefer to discuss this asynchronously (that is, not a call). I think this would make it easier for interested parties to join the discussion.

We could use the Matrix room or create an issue in this repo. @jillr @Nina2244 what would you prefer?

@mariolenz
Collaborator Author

I think that @alinabuzachis found the issue - the images are mirrored to a second Swift region that was not documented in our process. I've swapped both images in the ca-ymq-1 region and am rechecking on #1722. 🤞

This CI run failed, too. But I don't see why, Zuul looks stuck at Fetching info... for me at the moment.

I've rechecked #1705 which doesn't introduce new code. It just changes the DOCUMENTATION block in two modules. Maybe that's easier to test with.

@p-fruck
Contributor

p-fruck commented May 16, 2023

Seems to work for #1705 🎉

@mariolenz
Collaborator Author

Thanks @alinabuzachis and @jillr! The CI seems to work again now 😃

@jillr
Contributor

jillr commented May 17, 2023

Thanks everyone. I'm going to be out for a few days but I will start a new issue next week to discuss the CI.
