node_exporter 1.5.0 release SIGKILL immediately on macOS M1/arm64 #2539
Comments
We haven't updated the osx build tools in a while. I guess we need to look into that.
Are you sure this isn't System Integrity Protection? I'm not a mac expert, but this looks like what I saw when that hit me in the past.
I'm not; how would I verify?
I can confirm this issue; I tried 1.5.0, 1.4.1 and 1.3.1. All behave as described.
(I'm on Monterey.)
Yeah, likely a code signing issue. I'm not a mac expert and don't know whether it's only on M1 or depends on the macOS version, but unsigned binaries get SIGKILLed, e.g.: nodejs/node#40827 (comment)
I'm having the same issue. Do you know a way to solve it?
@pboiseau Did you try the link above?
Is this the same issue as #2217? There is something node exporter specific going on here. Comparing with the statsd exporter (released both before and after node exporter 1.5.0):
@discordianfish I've tried but I have this error
Unfortunately I have no idea how this mac code signing works. If someone has a suggestion for what we can do better or differently, let me know.
I'm only a little familiar; it's a PITA.
In both cases, the guidance is to be extremely cautious with one's Apple Developer certs. If they leak, other people can sign their own binaries with them, leading to all sorts of mischief.
I found a solution. Here is an Ansible script I wrote to sign the binary on Apple M1 so that the process is no longer SIGKILLed. You need an Apple developer account and a Developer ID Application certificate.

- name: Get developer ID application certificate
  ansible.builtin.shell: |
    echo {{ apple_developer_certificate }} > developerID_application.txt
    base64 -d -i developerID_application.txt -o developerID_application.cer
    rm developerID_application.txt

- name: Check if certificate already exist in keychain
  ansible.builtin.shell: |
    security find-certificate -c "Developer ID Application: <Name Of Your Certificate (XXXXXXXXX)>" -a -p /Library/Keychains/System.keychain
  failed_when: is_certificate.rc != 0 and is_certificate.rc != 44
  register: is_certificate

- name: Import developer ID application certificate
  ansible.builtin.shell: security import developerID_application.cer -k /Library/Keychains/System.keychain
  when: is_certificate.rc != 0

- name: Sign node_exporter binary
  become: true
  ansible.builtin.shell: "codesign -s - {{ node_exporter_binary_install_dir }}/node_exporter"

- name: Verify node_exporter signature
  ansible.builtin.shell: "codesign -vvvv {{ node_exporter_binary_install_dir }}/node_exporter"
  register: result
  failed_when: result.rc == 1
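For anyone not using Ansible, the last two tasks above (sign, then verify) can be sketched as a plain shell function. This is only a sketch: the function name `sign_and_verify` and its return codes are made up for illustration, and it uses the same `codesign -s -` (ad-hoc signature) and `codesign -vvvv` invocations as the playbook. It assumes `codesign` is on the PATH, which is only the case on macOS.

```shell
#!/bin/sh
# Hypothetical helper (not part of node_exporter or its tooling):
# ad-hoc sign a binary, then verify the signature.
# Returns 1 if the file is missing, 2 if codesign is unavailable,
# otherwise the exit status of the codesign calls.
sign_and_verify() {
  bin="$1"
  if [ ! -f "$bin" ]; then
    echo "no such file: $bin" >&2
    return 1
  fi
  if ! command -v codesign >/dev/null 2>&1; then
    echo "codesign not found (it ships with macOS only)" >&2
    return 2
  fi
  # "-s -" requests an ad-hoc signature, as in the playbook above.
  # Assumption: a binary that already carries a signature may also need --force.
  codesign -s - "$bin" && codesign -vvvv "$bin"
}

# Example (path is a placeholder):
#   sign_and_verify /usr/local/bin/node_exporter
```

An ad-hoc signature only gets the binary running on the local machine; it does not replace signing with a Developer ID certificate for distribution.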
It would be nice if we could sign the releases as well. Dunno what that would involve; we'd probably need to get some key signed by Apple. If someone is familiar with the process, let me know! Even better if someone wants to submit a PR to add this to CI.
I just ran into this problem as well when switching to macOS 13. One existing workaround is to use the Homebrew version (https://formulae.brew.sh/formula/node_exporter#default), which is correctly signed.
There is a write-up here: Automatic Code-signing and Notarization for macOS apps using GitHub Actions.
There are two routes that can be taken:
And I'd recommend we use Fastlane for automating this process. It saves lots of headaches.
The way I see it, route 2 is easier to take and we can try some builds like that; if it works, then let's use it? We can also add the test suggested above:
I'd be happy to contribute a PR for this. Please let me know what you think.
@gitperr sounds reasonable, so if you want to submit a PR that'd be great!
@discordianfish Thanks for the response. I set up my dev environment today on my Mac (arm) and tried to build node_exporter. So far my findings:
So, I'm now looking at CircleCI to add a code signing step to the build pipeline. But I'm a bit confused: where exactly would it fit in? I made something like this, but I'm not sure how I can test it:
Thanks for any pointers.
Alright, I figured out some of the steps and got the pipelines running on commits (see my open MR #2833). I got some ad-hoc code-signed builds out, but they were made on Intel Macs and did not work on arm. Now I'm blocked by the resource class: https://app.circleci.com/pipelines/github/prometheus/node_exporter/3817/workflows/18a9d70c-2317-4d49-b20f-e94fd82cf02a/jobs/19923/steps It seems the node exporter CircleCI plan does not support M1 Mac use. Is it possible to change the plan for that? It seems CircleCI will soon stop supporting Intel Macs anyway. Also, M1 Macs are capable of compiling for the amd64 architecture, so we won't lose anything.
Do we plan on adding an M1 or M2 Mac runner for this? Then I think it would be very easy to finalize this fix.
I think we need to update the xcode stuff in our golang-builder Docker image. It's been a very long time and the update process is really annoying/tricky due to Apple's licensing.
I created a PR for updating the xcode stuff in the golang-builder Docker image:
Attempt to sign the node exporter darwin build
This should hopefully fix the SIGKILL issue on OSX machines, e.g. in: prometheus#2539
Signed-off-by: Alper Polat <[email protected]>

Change the docker flags to correct ones
Signed-off-by: Alper Polat <[email protected]>

Fix errors in running the rcodesign from golang-builder
Signed-off-by: Alper Polat <[email protected]>

Use pwd instead
Readlink does not work to get the proper path, pwd might do it. As promu seems to be copying the binaries based on working directory.
Signed-off-by: Alper Polat <[email protected]>

Try to run at the same job to see if it helps
So far I am unable to find the binary's location with either pwd or readlink. I'm suspecting that the binary is not on this specific host that is running the rcodesign.
Signed-off-by: Alper Polat <[email protected]>

Try to debug what files are in the current working directory
Signed-off-by: Alper Polat <[email protected]>

Print working directory as well
Signed-off-by: Alper Polat <[email protected]>

Add quote wrapping
Signed-off-by: Alper Polat <[email protected]>

Try to debug more
Signed-off-by: Alper Polat <[email protected]>

Nothing seems to be in .build directory here
Signed-off-by: Alper Polat <[email protected]>

Remove some of debug commands
Seems like the build does not get produced because of the CircleCI node index that gets passed into `--parallelism-thread`.
Signed-off-by: Alper Polat <[email protected]>

Add a separate sign stage for code signing
Separate stage might be useful so that we have all of the builds that end up in `.build` here, and sign the one(s) that we want. First one being implemented here is darwin-arm64.
Signed-off-by: Alper Polat <[email protected]>

Run only if darwin-arm64 was built
Earlier I tried to add a separate stage for signing, but seems like that was a bad idea because the pipeline file has to exist in `master` for that so we can run the tests properly. Checking with if might be one of the simpler and better ideas...
Signed-off-by: Alper Polat <[email protected]>

Add forgotten quote
Fixing basic syntax error
Signed-off-by: Alper Polat <[email protected]>
Bump golang-builder version (prometheus#2908)
Signed-off-by: Alper Polat <[email protected]>

exec_bsd: Fix labels for vm.stats.sys.v_syscall sysctl (prometheus#2895)
Signed-off-by: David O'Rourke <[email protected]>

chore:remove constant from function (prometheus#2884)
Signed-off-by: tyltr <[email protected]>

build(deps): bump github.com/prometheus/common from 0.45.0 to 0.46.0 (prometheus#2910)
Bumps [github.com/prometheus/common](https://github.com/prometheus/common) from 0.45.0 to 0.46.0.
- [Release notes](https://github.com/prometheus/common/releases)
- [Commits](prometheus/common@v0.45.0...v0.46.0)
---
updated-dependencies:
- dependency-name: github.com/prometheus/common
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

build(deps): bump github.com/jsimonetti/rtnetlink from 1.4.0 to 1.4.1 (prometheus#2909)
Bumps [github.com/jsimonetti/rtnetlink](https://github.com/jsimonetti/rtnetlink) from 1.4.0 to 1.4.1.
- [Release notes](https://github.com/jsimonetti/rtnetlink/releases)
- [Commits](jsimonetti/rtnetlink@v1.4.0...v1.4.1)
---
updated-dependencies:
- dependency-name: github.com/jsimonetti/rtnetlink
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

fix hwmon nil ptr (prometheus#2873)
* fix hwmon nil ptr
syslink maybe lost in some cases.
Signed-off-by: TaoGe <[email protected]>

Fix hwmon error capture (prometheus#2915)
Fix golangci-lint "ineffectual assignment" by correctly capturing any errors within the hwmon gathering loop.
Signed-off-by: Ben Kochie <[email protected]>

Update common Prometheus files (prometheus#2917)
Signed-off-by: prombot <[email protected]>

Use promu to code sign
The functionality being replaced here is going to be built into `promu` with prometheus/promu#284
So pipelines should use it instead.
Signed-off-by: Alper Polat <[email protected]>

Use Promu 0.17.0
Signed-off-by: Alper Polat <[email protected]>

Introduce one error first
We want to re-trigger the pipeline. But, the circleCI interface does not allow re-runs. So, going to introduce a dummy error, take it back and re-trigger the pipeline like that.
Signed-off-by: Alper Polat <[email protected]>

Set version to correct one
Signed-off-by: Alper Polat <[email protected]>
Host operating system: output of uname -a
node_exporter version: output of node_exporter --version
No output. SIGKILL. See below.
node_exporter command line flags
--help
node_exporter log output
No output. SIGKILL. See below.
Are you running node_exporter in Docker?
No.
What did you do that produced an error?
Ran the binary with --help
What did you expect to see?
The --help usage output.
What did you see instead?
The process was SIGKILL'd.
Full trace
Other notes
I ran the same sequence on an AWS amd64 mac without error. This only happens on M1 macs (my laptop; AWS-hosted).
This happened for 1.4.0 as well but I saw the fresh release and figured I'd try it before reporting.
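For anyone trying to reproduce this, one way to confirm that the process really died from SIGKILL (rather than exiting on its own) is to check the exit status in the shell. The helper below is a minimal sketch; the name `killed_by_sigkill` is made up for illustration, and it assumes a shell that reports signal deaths as 128 plus the signal number (bash, dash and zsh do; SIGKILL is signal 9, hence 137).

```shell
#!/bin/sh
# Hypothetical helper: run a command and succeed only if it was
# terminated by SIGKILL. Output of the command is discarded.
killed_by_sigkill() {
  "$@" >/dev/null 2>&1
  status=$?
  # Assumption: the shell encodes "killed by signal N" as 128 + N.
  [ "$status" -eq 137 ]
}

# Example (binary path is a placeholder):
#   if killed_by_sigkill ./node_exporter --help; then
#     echo "node_exporter was SIGKILLed"
#   fi
```

A status of 137 only says the process was killed with signal 9; it does not say why, so it is consistent with the code signing theory above without proving it.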