Test aarch64 distributions in CI #54238

Closed
mark-vieira opened this issue Mar 25, 2020 · 17 comments
Assignees
Labels
:Delivery/Build Build or test infrastructure Team:Delivery Meta label for Delivery team

Comments

@mark-vieira
Contributor

mark-vieira commented Mar 25, 2020

After #53914 has been merged we want to follow up and add the requisite testing now that we have at least one ARM CI worker.

@mark-vieira mark-vieira added the :Delivery/Build Build or test infrastructure label Mar 25, 2020
@mark-vieira mark-vieira self-assigned this Mar 25, 2020
@rjernst rjernst added the Team:Core/Infra Meta label for core/infra team label May 4, 2020
@mark-vieira mark-vieira added Team:Delivery Meta label for Delivery team and removed Team:Core/Infra Meta label for core/infra team labels Nov 11, 2020
@mark-vieira
Contributor Author

I've added a job for this, which for now is only triggered manually.

https://elasticsearch-ci.elastic.co/view/Elasticsearch%20master/job/elastic+elasticsearch+master+arm/

Looks like the build is having trouble resolving BWC artifacts though. @breskeby do you mind looking into this? Perhaps we simply aren't taking the distribution architecture into consideration when wiring things up here?

https://gradle-enterprise.elastic.co/s/gjomj2ipdrmkq

@mark-vieira
Contributor Author

I think most of the build issues have been sorted out now. The build is now actually executing tests, and I've encountered the first failure, which may still be down to environment configuration.

#68936

@mark-vieira
Contributor Author

We got a green build on master, but it seems we still have some artifact resolution issues with respect to BWC testing on 7.x.

https://gradle-enterprise.elastic.co/s/ut32yilfyxcp4

@breskeby do you mind taking a look here? I know we did some work to restrict the BWC versions we test against; perhaps that's not working correctly in 7.x. I suspect the same issue then exists in the 7.12 branch.

@breskeby
Contributor

I'll check

@breskeby
Contributor

There was a backport pending #69330

@breskeby
Contributor

And another one #69351

@mark-vieira
Contributor Author

Ok here's the latest failure: https://gradle-enterprise.elastic.co/s/ed4o3vzcpvdns/failure#1

Something with the docker test fixture isn't working right on arm. I suspect we'll need to actually inspect the fixture container log files to see what's exploding here.

@mark-vieira
Contributor Author

Interestingly when running on my M1 mac I get a different issue: https://gradle-enterprise.elastic.co/s/l4f52osn5zzum/console-log?task=:test:fixtures:krb5kdc-fixture:composeUp

This might be oddness with Docker on arm on osx but if I run an ubuntu:14.04 image via docker run I have no problem running that command.

@mark-vieira
Contributor Author

Ok, I opened #69583 to address the issue I ran into on my apple silicon mac. I'm hoping maybe it fixes this on Linux too. Strangely I didn't see an error when building on Linux so it might very well be something completely unrelated 🤷

@mark-vieira
Contributor Author

Ok, got past those issues on 7.x and ran into a related test failure. I've opened #69640 to address it.

@mark-vieira
Contributor Author

Opened and merged #69743 to fix an incompatible test fixture on arm and backported #69164 to 7.x and 7.12 in a way that didn't cause OOMEs on Java 8. Let's see what breaks next.

@mark-vieira
Contributor Author

I think our remaining problems are related to the inconsistent docker situation across workers. I've opened an infra issue for this: https://github.com/elastic/infra/issues/27195. That said, I was able to install docker and compose manually on an ARM Ubuntu 18.04 machine and successfully run tests that rely on Docker-based fixtures, so I suspect the issue is environmental and specific to the setup on our workers.

@mark-vieira
Contributor Author

We've had green builds for all active branches. There were a number of issues that were blocking the ARM packer builds but those have been cleared up so we should be able to start pressing forward with integrating this testing in our normal periodic matrix. I'll begin that work soon.

@mark-vieira
Contributor Author

We're nearly there!

I've sorted out the issues running our packer_cache.sh script on ARM workers; however, there seems to be some other issue causing the ARM worker image builds to fail. That is blocking https://github.com/elastic/infra/pull/27671, which adds dynamic worker support for elasticsearch-ci. Once that is merged we can add the ARM builds to our normal periodic pipeline.

@mark-vieira
Contributor Author

Ok, builds are now configured to use the new immutable workers and we've encountered some new errors :)

I'm working on sorting these out.

@mark-vieira
Contributor Author

Ok, we're dealing with an incompatibility with the busybox binary on CentOS #71138.

@mark-vieira
Contributor Author

ARM is now part of the normal CI testing matrix. We'll address specific failures in new issues.
