Request: ARM releases #1443

Closed
gclawes opened this issue Feb 24, 2020 · 34 comments · Fixed by #1843
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@gclawes
Contributor

gclawes commented Feb 24, 2020

What would you like to be added: Official ARM releases. This was requested previously in #801; it looks like the problems there should have been resolved by moving to Go modules (#801 (comment)).

Why is this needed: For use on non-x86 platforms such as Raspberry Pi.

@gclawes gclawes added the kind/feature Categorizes issue or PR as related to a new feature. label Feb 24, 2020
@gclawes
Contributor Author

gclawes commented Feb 28, 2020

I've confirmed that building on arm works without modification to the Dockerfile/Makefile. I was able to build an arm64 image: registry.gitlab.com/gclawes/arm-ports/exteranl-dns:v0.6.0

@MacLeodMike

I'm getting failures when building on arm (not arm64). First error:

go test -v -race sigs.k8s.io/external-dns sigs.k8s.io/external-dns/controller sigs.k8s.io/external-dns/endpoint sigs.k8s.io/external-dns/internal/testutils sigs.k8s.io/external-dns/pkg/apis/externaldns sigs.k8s.io/external-dns/pkg/apis/externaldns/validation sigs.k8s.io/external-dns/pkg/k8sutils/async sigs.k8s.io/external-dns/pkg/tlsutils sigs.k8s.io/external-dns/plan sigs.k8s.io/external-dns/provider sigs.k8s.io/external-dns/registry sigs.k8s.io/external-dns/source
go test: -race is only supported on linux/amd64, linux/ppc64le, linux/arm64, freebsd/amd64, netbsd/amd64, darwin/amd64 and windows/amd64
make: *** [Makefile:41: test] Error 2

If I bypass that by removing -race from the test, I still error out in later tests:

# sigs.k8s.io/external-dns/source [sigs.k8s.io/external-dns/source.test]
source/source_test.go:61:72: constant 4294967296 overflows int
FAIL sigs.k8s.io/external-dns/source [build failed]

Unfortunately, Raspbian is 32-bit and is still the default OS for the Raspberry Pi.

@alexellis

+1 for this

I would really appreciate having an arm and arm64 build - just like we have for many of the other Kubernetes components. Even the KinD team are looking to add support - https://groups.google.com/forum/#!topic/kubernetes-sig-architecture/eHBFgfd6Qxg

If the comment above is a blocker, then perhaps the race detection should only be run on the main build?
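
One way to picture that suggestion is a small helper that adds `-race` only on platforms where the race detector is available. This is a rough sketch, not the project's build logic: the platform list is copied from the `go test` error message quoted earlier and may differ in other Go versions, and `raceSupported` and `testArgs` are hypothetical names.

```go
package main

import (
	"fmt"
	"runtime"
)

// racePlatforms mirrors the list printed in the go test error message quoted
// earlier in this thread; other Go versions may support a different set.
var racePlatforms = map[string]bool{
	"linux/amd64":   true,
	"linux/ppc64le": true,
	"linux/arm64":   true,
	"freebsd/amd64": true,
	"netbsd/amd64":  true,
	"darwin/amd64":  true,
	"windows/amd64": true,
}

// raceSupported reports whether the race detector is listed for the platform.
func raceSupported(goos, goarch string) bool {
	return racePlatforms[goos+"/"+goarch]
}

// testArgs builds the go test arguments, adding -race only where supported.
func testArgs(goos, goarch string) []string {
	args := []string{"test", "-v"}
	if raceSupported(goos, goarch) {
		args = append(args, "-race")
	}
	return append(args, "./...")
}

func main() {
	// Arguments for the machine running this program.
	fmt.Println(testArgs(runtime.GOOS, runtime.GOARCH))
	// Arguments for 32-bit ARM, which is absent from the list above.
	fmt.Println(testArgs("linux", "arm"))
}
```

The same decision could equally be made in the Makefile; the point is only that race detection can stay on for amd64 and arm64 while being skipped for 32-bit arm.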

@Raffo
Contributor

Raffo commented Jun 18, 2020

From what I hear, arm64 is easy while arm is kind of involved. If someone is willing to dig into the details of the problems with the arm one, I can deal with the infrastructure bits to release such an image.

@MacLeodMike

If I remove the make test line from the Dockerfile, the image builds and runs fine on arm. It's just the tests that don't run on arm.

@Raffo
Contributor

Raffo commented Jun 19, 2020

@MacLeodMike thanks for the context. I'll have a look at what we can do in terms of publishing official images for arm. It's uncharted territory for me, but it sounds like a great addition.

@MacLeodMike

Thanks @Raffo . I'm not much of a dev, but I'm happy to test any fixes you come up with.

@Raffo
Contributor

Raffo commented Jun 28, 2020

I did a bit of research. For future reference, this is what looks to be needed:

  • Understand if we can push a different tag (i.e. arm-$VERSION) to the GCR repo
  • Figure out if we can make test work on arm architecture
  • Change the cloud builder config to separate the steps from the Dockerfile
  • Add a Makefile rule to build arm
  • Add the rule as a separate step in cloud builder

@gclawes
Contributor Author

gclawes commented Jun 28, 2020

GCR supports Manifest Lists (https://cloud.google.com/container-registry/docs/image-formats), so separate tags for ARM architectures shouldn't be necessary.

@Raffo
Contributor

Raffo commented Jun 28, 2020

@gclawes do you have any reference on how those work? I'm not familiar with this docker feature.

@gclawes
Contributor Author

gclawes commented Jun 28, 2020

Docker images can be built for multiple architectures using the same tag. Under the hood, the tag points to a manifest list that references one image manifest per architecture.

The simplest way to do this is to build on machines of each architecture (x86, Raspberry Pi, AWS Graviton, etc.) and push the same tag, or to use Docker's cross-building capability: https://www.docker.com/blog/multi-arch-build-and-images-the-simple-way/
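
To illustrate how a single tag can serve several architectures, here is a minimal sketch of how a client might pick an image from a manifest list. It models only a few fields of the Docker manifest list JSON. The digests in `exampleList` are placeholders rather than real images, the type and function names are hypothetical, and real clients match platforms more leniently than the exact comparison used here (for example when a `variant` is present on one side only).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// platform and manifestList model only the fields needed for this sketch.
type platform struct {
	Architecture string `json:"architecture"`
	OS           string `json:"os"`
	Variant      string `json:"variant,omitempty"`
}

type manifestEntry struct {
	Digest   string   `json:"digest"`
	Platform platform `json:"platform"`
}

type manifestList struct {
	Manifests []manifestEntry `json:"manifests"`
}

// selectDigest returns the digest of the entry whose platform equals want.
func selectDigest(raw []byte, want platform) (string, error) {
	var list manifestList
	if err := json.Unmarshal(raw, &list); err != nil {
		return "", err
	}
	for _, m := range list.Manifests {
		if m.Platform == want {
			return m.Digest, nil
		}
	}
	return "", fmt.Errorf("no matching manifest for %s/%s", want.OS, want.Architecture)
}

// exampleList is made-up data: the digests are placeholders, not real images.
const exampleList = `{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
  "manifests": [
    {"digest": "sha256:aaaa", "platform": {"architecture": "amd64", "os": "linux"}},
    {"digest": "sha256:bbbb", "platform": {"architecture": "arm64", "os": "linux"}}
  ]
}`

func main() {
	wanted := []platform{
		{Architecture: "arm64", OS: "linux"},
		{Architecture: "arm", OS: "linux", Variant: "v7"},
	}
	for _, p := range wanted {
		digest, err := selectDigest([]byte(exampleList), p)
		fmt.Println(p, digest, err)
	}
}
```

A list with only amd64 and arm64 entries has nothing to offer a 32-bit ARM client, so the second lookup returns an error instead of a digest.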

@timtorChen
Contributor

I ran a few tests with GitHub Actions on a duplicated repo: https://github.com/timtorChen/external-dns-action-test/action

I found an unpredictable error (sometimes it passes, sometimes it fails) when the action runs the tests specifically on arm64. In buildx #1, I separated the amd64 and arm64 builds into two jobs and it passed. It did not pass when the two platforms were combined in one job in build #2. Buildx #3 runs the build only for arm64, but it failed ...

The test on arm64 always fails here

2020-07-19T08:58:03.8320563Z #11 1110. === RUN   TestServiceSource/Endpoints/compatibility_annotated_services_with_tmpl._compatibility_takes_precedence
2020-07-19T08:58:03.8320781Z #11 1110. === RUN   TestServiceSource/Endpoints/not_annotated_services_with_unknown_tmpl_field_should_not_return_anything
2020-07-19T08:58:03.8320971Z #11 1110. unexpected fault address 0xfffffffffe67b000
2020-07-19T08:58:03.8321135Z #11 1110. fatal error: fault
2020-07-19T08:58:03.8321368Z #11 1110. [signal SIGSEGV: segmentation violation code=0x1 addr=0xfffffffffe67b000 pc=0x9b88c]

Occasionally, the error log shows an out-of-memory error (sorry, I rebuilt the job and did not keep the full log).
I have never written Go before, so I'm not sure whether the tests deplete the memory (the GitHub Actions VM offers a 2-core CPU and 7 GB of RAM).

@xunholy

xunholy commented Jul 20, 2020

ARM64 Image is being built here: https://github.com/raspbernetes/multi-arch-images

Can be pulled here: https://hub.docker.com/r/raspbernetes/external-dns

This image will, of course, be kept in parity with upstream until upstream support is provided.

@seanmalloy
Member

Looks like the cluster-proportional-autoscaler just added multi-arch image support. See the linked pull requests from issue kubernetes-sigs/cluster-proportional-autoscaler#89 to see how this project made it work.

@seanmalloy
Member

/help

@k8s-ci-robot
Contributor

@seanmalloy:
This request has been marked as needing help from a contributor.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

/help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Aug 14, 2020
@excavador

Any updates?

@seanmalloy
Member

@excavador I'm not aware of any updates. I think we need a volunteer who would be willing to put in the effort to make this a reality.

I'm willing to help provide guidance and answer any questions.

@Raffo Raffo self-assigned this Sep 26, 2020
@Raffo
Contributor

Raffo commented Sep 26, 2020

I've assigned this to myself and will tackle it during October. The first stable ARM releases will probably be out with ExternalDNS v0.7.5.

@Raffo Raffo removed the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Oct 22, 2020
@Raffo
Contributor

Raffo commented Oct 22, 2020

The first working multi-arch image can be tested: gcr.io/k8s-staging-external-dns/external-dns:v20201021-v0.7.4-40-g764b9446

I tested and verified that it boots on a Graviton-based cluster on AWS, but I would love to get more feedback if someone has something to share.

Currently supported architectures: amd64 and arm64.

@alexellis

If you can add armhf I am sure there are folks in the openfaas community who would enjoy using this with inlets

@Raffo
Contributor

Raffo commented Oct 22, 2020

@alexellis I can give this a shot, but I think armhf would require a bit of refactoring being essentially a 32 bit architecture. If it's complicated, we could postpone it to a later release.

@braxtone

Thanks for putting this together! Similar to @alexellis's question, I tried deploying this on my Raspberry Pi 4 cluster running k3s and it failed to pull due to the lack of support for arm/v7. Do you think you could add that architecture to your list of supported ones? That's my main use case for this issue, well, that and getting off the random externalDNS image I'm currently on.

Direct pulling via docker on a Raspberry Pi 4:

$ docker pull gcr.io/k8s-staging-external-dns/external-dns:v20201021-v0.7.4-40-g764b9446
v20201021-v0.7.4-40-g764b9446: Pulling from k8s-staging-external-dns/external-dns
no matching manifest for linux/arm/v7 in the manifest list entries

Failed k3s deployment:

$ kubectl describe pod/armternal-dns-d745f6685-2h57s
...
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  75s                default-scheduler  Successfully assigned reverse/armternal-dns-d745f6685-2h57s to rpi4-03
  Normal   Pulling    37s (x3 over 74s)  kubelet            Pulling image "gcr.io/k8s-staging-external-dns/external-dns:v20201021-v0.7.4-40-g764b9446"
  Warning  Failed     36s (x3 over 74s)  kubelet            Error: ErrImagePull
  Warning  Failed     36s (x3 over 74s)  kubelet            Failed to pull image "gcr.io/k8s-staging-external-dns/external-dns:v20201021-v0.7.4-40-g764b9446": rpc error: code = NotFound desc = failed to pull and unpack image "gcr.io/k8s-staging-external-dns/external-dns:v20201021-v0.7.4-40-g764b9446": failed to unpack image on snapshotter overlayfs: no match for platform in manifest sha256:d25fe88968c160e4215a7e913ff55937a17bb585fde084f31fcf3649a72f47aa: not found
  Warning  Failed     11s (x4 over 73s)  kubelet            Error: ImagePullBackOff
  Normal   BackOff    11s (x4 over 73s)  kubelet            Back-off pulling image "gcr.io/k8s-staging-external-dns/external-dns:v20201021-v0.7.4-40-g764b9446"

@Raffo
Contributor

Raffo commented Oct 22, 2020

@braxtone that's expected. I don't know how many people need this on raspberry pis, so I can't really optimize for what makes sense for which part of the user base, but I can promise I'll give a shot at armv7 architectures and see if I can add those too.

@alexellis

You know two of us now, I can find many more for you if that helps? Generally when supporting arm64, projects tend to add armv7 also.

Happy to help if you need assistance. You'll see multiarch examples in openfaas and Inlets too.

@Raffo
Contributor

Raffo commented Oct 22, 2020

@alexellis yeah, my comment was not trying to dismiss you or other Raspberry Pi users. I mostly meant to explain that I don't have a good understanding of where users run ExternalDNS, and I don't use arm devices myself.

I'll share an update as soon as I have one.

@xunholy

xunholy commented Oct 22, 2020

@Raffo if you view my comment further up in the chain you will notice we've been building External DNS for a long time now and keeping it in parity with the upstream version whilst supporting multi platform architectures without issues.

I've been using these images in my RPi cluster without issues, so I doubt this would require code-based refactoring.

I hope this support is added, as in my opinion it's fairly trivial given that it has essentially been community-tested for many months without issues.

@Raffo
Contributor

Raffo commented Oct 22, 2020

@xunholy I also think it should be trivial in this case. Unfortunately, we can't directly copy from your project because of the constraints of the CI system that we must use.

@Raffo
Contributor

Raffo commented Oct 22, 2020

From a quick test, #1838 should do the trick.

@Raffo
Contributor

Raffo commented Oct 23, 2020

arm32v7 images are built, please check if gcr.io/k8s-staging-external-dns/external-dns:v20201023-v0.7.4-44-gc148505b works fine for you.

@braxtone

Just tried it out and it worked great for me. Thanks!

@Raffo
Contributor

Raffo commented Oct 24, 2020

Perfect. I aim to make an official release at the beginning of November, and all releases from now on will support all architectures. I will add docs for users and close this issue once I have them ready.

@patoarvizu

It was kind of hard to find where the new ARM images should be pulled from until I came across this issue (it's not in the FAQ and gcr.io doesn't have the most intuitive search).

I know 0.7.5 isn't officially released yet so maybe that's why it's not documented, but I just wanted to point this out in case it wasn't obvious. Thanks!

@Raffo
Contributor

Raffo commented Dec 21, 2020

@patoarvizu this is great feedback, I will add it to the release and README.

@Raffo Raffo mentioned this issue Jan 29, 2021