Better healthcheck correctness #1056
I'm fairly certain this will have test failures, but I'm having difficulty running them locally for some reason.
1. ensure the find command succeeds
2. ensure the find command produces output
3. ensure the find command produces output that matches the project_dir
4. ensure the ldd-like command finds at least one affirmatively good library

Signed-off-by: Lamont Granquist <[email protected]>
The linuxism of -regextype has broken healthchecks on Solaris and FreeBSD since 2016.

This applies the filters to all the distros and simplifies the code. Support for regexps on filename suffixes has been dropped in favor of simplicity and verbosity.

Signed-off-by: Lamont Granquist <[email protected]>
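As a rough illustration of filtering in plain Ruby instead of relying on `find -regextype`, here is a minimal sketch. The names `IGNORED_SUFFIXES`, `IGNORED_SUBSTRINGS` and `filter_candidates`, and the example list entries, are hypothetical; they are not taken from the Omnibus source.

```ruby
# Hypothetical ignore lists for illustration only; the real lists used by
# the healthcheck are not shown in this thread.
IGNORED_SUFFIXES = %w{.rb .py .txt .json}.freeze
IGNORED_SUBSTRINGS = %w{/share/doc/ /build_info/}.freeze

# Drop every path that ends with an ignored suffix or contains an ignored
# substring. Plain string matching, no regexps, so it behaves the same on
# every platform regardless of which `find` is installed.
def filter_candidates(paths)
  paths.reject do |path|
    path.end_with?(*IGNORED_SUFFIXES) ||
      IGNORED_SUBSTRINGS.any? { |fragment| path.include?(fragment) }
  end
end

files = [
  "/opt/project/bin/tool",
  "/opt/project/lib/helper.rb",
  "/opt/project/share/doc/README",
]
puts filter_candidates(files)
```

Spelling out literal suffixes and substrings is more verbose than a regexp, but it avoids depending on any platform-specific `find` option.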
Result of a lot of hacking on CI to make it turn green

Signed-off-by: Lamont Granquist <[email protected]>
# feed the list of files to the "ldd" command
#
# this command will typically fail if the last file isn't a valid lib/binary which happens often
I believe I may have spotted a regression, or I'm just hitting a non-deterministic issue more frequently with these code changes.
The scenario is: I've just upgraded to a newer version of Omnibus, and it looks like the ldd command here can break halfway through with this current pattern. It took a while to track down because the status code here is ignored, so the error was being swallowed.
Example: reading 3 files, where the 2nd ELF file causes an error:
[9] pry(#<Omnibus::HealthCheck>)> puts ldd_output = shellout(ldd_command, input: "/opt/project/file1\n/opt/project/file2\n/opt/project/file3").stdout
/opt/project/file1:
not a dynamic executable
/opt/project/file2:
=> nil
We get a partial result in the current implementation, but if you inspect the process status you can see that it failed: ldd exited with code 135, which xargs reports as exit status 123:
[11] pry(#<Omnibus::HealthCheck>)> puts ldd_output = shellout(ldd_command, input: "/opt/project/file1\n/opt/project/file2\n/opt/project/file3").result
NoMethodError: undefined method `result' for <Mixlib::ShellOut#1230: command: 'xargs ldd' process_status: #<Process::Status: pid 46708 exit 123> stdout: '/opt/project/file1:
not a dynamic executable
/opt/project/file2:' stderr: 'ldd: exited with unknown exit code (135)' child_pid: 46708 environment: {} timeout: 7200 user: group: working_dir: >:Mixlib::ShellOut
from (pry):11:in `block in read_shared_libs'
I actually thought there was a bug in shellout(...).stdout, since it looked like partial reads of stdout.
For my current setup, this new code path skips multiple healthchecks as a result of exiting earlier than expected.
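One possible way to keep a single bad file from truncating the rest of the output is to run the command once per file and record each exit status. The sketch below uses Ruby's standard `Open3` rather than `Mixlib::ShellOut`, and the helper names `run_per_file` and `partition_results` are hypothetical; this is not what the PR does.

```ruby
require "open3"

# Run `command` once per path so that a crash on one file cannot cut off
# the output for the files that follow it. Returns one hash per path.
# (Hypothetical helper, not part of Omnibus.)
def run_per_file(command, paths)
  paths.map do |path|
    stdout, stderr, status = Open3.capture3(*command, path)
    {
      path: path,
      stdout: stdout,
      stderr: stderr,
      exitstatus: status.exitstatus, # nil if the child was killed by a signal
      success: status.success? == true,
    }
  end
end

# Split results into [succeeded, failed] so a caller can report failures
# instead of silently dropping them.
def partition_results(results)
  results.partition { |result| result[:success] }
end

# Example usage (assumes an `ldd` binary on PATH and a `files` array):
#   ok, failed = partition_results(run_per_file(%w{ldd}, files))
#   failed.each do |r|
#     warn "ldd failed on #{r[:path]} (exit #{r[:exitstatus]}): #{r[:stderr]}"
#   end
```

Note that ldd also exits non-zero for files that are simply not dynamic executables, so a real check would still need to tell expected failures apart from crashes; this sketch only makes the status visible. Spawning one process per file is also slower than a single `xargs` call.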
Initial pull request: #1137
closes #1051
Fixes healthchecks on AIX, Solaris and FreeBSD.
Extracts the Solaris healthcheck routine out to its own method.
This reworks the filtering introduced in 9551f72 to do it in pure Ruby, but deliberately drops the complexity of the regexps. The earlier filtering relied on the Linux-only -regextype option to find, which was responsible for breaking health checks on FreeBSD.
There are now some internal consistency checks which ensure that the find command actually produces some output and that there is at least one file in the install_path. They also ensure that the "ldd" (or "otool", etc.) routine finds at least one affirmatively good library. These should at least catch breakages that result in silently failing NOPs.
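The consistency checks described above could be sketched roughly as follows. The class `HealthCheckSanityError`, the method `sanity_check!` and its keyword arguments are hypothetical names chosen for illustration; the actual implementation in the PR may be structured quite differently.

```ruby
# Hypothetical error type for illustration; not an Omnibus class.
class HealthCheckSanityError < StandardError; end

# Fail loudly if the healthcheck inputs look like a silent no-op.
#   find_ok            - whether the find command exited successfully
#   found_files        - array of paths printed by find
#   install_dir        - the project's install path
#   good_library_count - number of libraries positively verified by the
#                        ldd-like command
def sanity_check!(find_ok:, found_files:, install_dir:, good_library_count:)
  raise HealthCheckSanityError, "find command failed" unless find_ok
  raise HealthCheckSanityError, "find command produced no output" if found_files.empty?

  unless found_files.any? { |file| file.start_with?(install_dir) }
    raise HealthCheckSanityError, "no files found under #{install_dir}"
  end

  unless good_library_count > 0
    raise HealthCheckSanityError, "no affirmatively good library found"
  end

  true
end

# Example call with made-up values that pass every check.
sanity_check!(
  find_ok: true,
  found_files: ["/opt/project/bin/tool"],
  install_dir: "/opt/project",
  good_library_count: 1
)
```

Raising on each condition means a platform where the find or ldd step quietly does nothing fails the build instead of passing a healthcheck that checked nothing.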