Use builder image's objcopy when building with quarkus.native.container-build=true and quarkus.native.debug.enabled=true #13963
Conversation
Force-pushed from c7e95e5 to 4d66b88
A couple of comments...
core/deployment/src/main/java/io/quarkus/deployment/pkg/steps/NativeImageBuildStep.java (outdated; review thread resolved)
    objcopy(containerCommand, outputDir, "--only-keep-debug", executableName, symbols);
    objcopy(containerCommand, outputDir, String.format("--add-gnu-debuglink=%s", symbols), executableName);
}
// Strip debug symbols regardless, because the underlying JDK might contain them
objcopy(containerCommand, outputDir, "--strip-debug", executableName);
Wouldn't this spin up a container 3 times? Ideally, we'd stitch them together so that we can get away with a one-time container invocation, i.e. `podman run ... --entrypoint=/bin/bash <container> -c 'objcopy --only-keep-debug ...; objcopy --add-gnu-debuglink=...; objcopy --strip-debug ...'` or `podman run ... --entrypoint=/bin/bash <container> -c 'objcopy --strip-debug ...'`, depending on the `-g` option.
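For illustration, a rough sketch of what such a combined invocation could look like; the executable name app-runner and the mounted host directory are placeholders rather than the values the build step actually uses, and the builder image is just an example:

$ podman run --rm -v "$PWD":/project -w /project --entrypoint /bin/bash \
    quay.io/quarkus/ubi-quarkus-native-image:21.0.0-java11 \
    -c 'objcopy --only-keep-debug app-runner app-runner.debug; objcopy --add-gnu-debuglink=app-runner.debug app-runner; objcopy --strip-debug app-runner'

The first objcopy extracts the debug info into app-runner.debug, the second records a debuglink to that file in the executable, and the third strips the debug sections from the executable itself.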
It will, but that's fairly fast in comparison to the native-image build step. Keeping it this way means we don't have to maintain two different code paths for container and host builds, which keeps things simple and more readable.
I haven't tested this myself, but apparently running containers on non-Linux isn't as quick. Perhaps the Mac and Windows use cases of building with the container should be tested.
AFAIK Mac is almost as fast, and Docker on Windows using WSL2 as the backend should also be pretty fast. I can do some testing on Windows, but unfortunately not on Mac.
This may be an optimization to do as a later step (once we know this actually is a problem?)
FTR: I tested this on two laptops with similar specs.
I tested the "combined" (single docker container for all 3 processes) vs the "split" (1 container per process) approach and the results were:
|  | Windows | Linux |
| --- | --- | --- |
| Combined | ~16s | ~0.8s |
| Split | 3.5 + 13 + 6.5 = ~23s | 0.5 + 0.5 + 0.6 = ~1.1s |
Although the benefit of using the "combined" approach is pretty clear, the overhead of the "split" approach (~7s) is still relatively low given that the image build took ~260s.
Force-pushed from 6b8f277 to eaf9955
@gsmet can you please review?
@zakkak can you fix the conflict? Frankly, I can't backport it as is. We can't refactor everything in a PR we want to backport...
Force-pushed from eaf9955 to ac8e60e
Done
I understand. I should have actually added this PR to 1.11.0.Final in the first place.
As I said I'm uncomfortable backporting something that rewrites a whole class. So either we find a finer-grained approach or we don't backport it. The current approach is not safe for a backport. I don't want to break something else to fix an issue. Now if someone really wants this, they will have to review this VERY closely to be sure everything is safe.
@zakkak can you please rebase?
This needs a rebase but we need a decision on
We have concluded that this does not need to be backported.
I would prefer #15233 to go in first.
OK, then I will take #15233 out of draft so it can get a proper review. Furthermore, I'll add a test for it.
Ouch... I just saw #14635... Rebasing my PR onto that one won't be fun :(
I feel you; if it eases your pain, it will be more or less the same for this PR ;)
Yeah, we are in the same boat :)
Turned out to be easier than I thought :)
Force-pushed from ac8e60e to efe041c
I wish I could say the same :) This is now ready for review. I tested it locally with all three configurations and GraalVM CE 21.0:
@jonathan-meier FYI (It would also be great if you could give this a spin on your "remote" docker setup)
@@ -13,6 +15,7 @@
public class NativeImageBuildRemoteContainerRunner extends NativeImageBuildContainerRunner {

    private static final Logger log = Logger.getLogger(NativeImageBuildRemoteContainerRunner.class);
    private static final String CONTAINER_BUILD_VOLUME_NAME = "quarkus-native-builder-image-project-volume";
Ideally I would prefer a dynamically generated unique volume name, but I am not sure it's worth the effort.
We could also do something like `fixed-name-YYYY-MM-DD-HHMM` to avoid potential conflicts with existing volumes.
WDYT?
It's straightforward to let Docker generate the unique volume id; I quickly tested it locally:
final String[] createVolumeCommand = new String[] { containerRuntime.getExecutableName(), "volume", "create" };
volumeId = runCommandAndReadOutput(createVolumeCommand, "Failed to create temp volume.");
Using this as a first step in `preBuild` and replacing all `CONTAINER_BUILD_VOLUME_NAME` with `volumeId`.
Unfortunately it's not that simple. If you manually create the volume before starting the container, the volume (when mounted) is owned by `root:root` and `native-image` fails to write to it in subsequent steps.
Creating the volume explicitly
$ podman volume create test-volume
test-volume
$ podman create --name temp -v test-volume:/project quay.io/quarkus/ubi-quarkus-native-image:21.0.0-java11
bfa2406d5fa0a003af99c91f077ed989006eb2441004135773429c425f9ce997
$ podman run --rm --entrypoint /bin/bash -it -v test-volume:/project quay.io/quarkus/ubi-quarkus-native-image:21.0.0-java11 -c "ls -la"
total 0
drwxr-xr-x. 2 root root 6 Mar 3 11:40 .
drwxr-xr-x. 19 root root 6 Mar 3 11:41 ..
Creating the volume implicitly
$ podman create --name temp -v test-volume:/project quay.io/quarkus/ubi-quarkus-native-image:21.0.0-java11
e30ffb76aa320448a3e645d7030c4dcf76b0efb83b68e3bc58a2e655c683d1de
$ podman run --rm --entrypoint /bin/bash -it -v test-volume:/project quay.io/quarkus/ubi-quarkus-native-image:21.0.0-java11 -c "ls -la"
total 0
drwxr-xr-x. 2 quarkus quarkus 6 Mar 3 11:42 .
drwxr-xr-x. 19 root root 6 Mar 3 11:42 ..
I am not a containers expert though so I'll be happy to be corrected :)
Interesting! I checked the following:
- docker 19.03.13 on Ubuntu 20.10
- docker 20.10.2 on Ubuntu 20.10 in WSL
- podman 2.0.6 on Ubuntu 20.10
- podman 1.6.4 on CentOS 7.9
- podman 3.0.1 on CentOS 7.9
In all but the last case, the `/project` folder was owned by the `quarkus` user even when creating the volume explicitly. So something seems to have changed for podman 3. I'm guessing you're running on podman 3 as well?
Indeed I am using podman 3.0.1 on Fedora 33.
Thanks for the additional info @jonathan-meier, I created containers/podman#9608.
Works like a charm! I tested various combinations of local/remote builds with debug enabled/disabled both natively on Windows and inside WSL.
Thank you @jonathan-meier, I really appreciate it :)
Force-pushed from efe041c to 57328b0
@zakkak @jonathan-meier I rebased this PR and also added an additional commit to fix (again) the confusion between native image name and executable name. Could you both have a look?
Thanks @gsmet. It looks like the issue I am working around in this PR (see #13963 (comment)), which appears only in podman 3.0.x, is fixed in podman 3.1.0 (released two days ago), so I was thinking about waiting for it to land in Fedora and reworking the PR to avoid the workaround. WDYT?
If the workaround doesn't break Podman 3.1, I would get it in anyway. People don't always keep things updated.
OK, I added a comment to explain that we follow this approach due to the Podman 3.0.x issue. It looks good to me (at least the Linux part). It would be nice if someone could confirm that everything works as expected on Windows as well.
Looks good to me too! I again tested various combinations of containerized local/remote docker builds with debug enabled/disabled, both natively on Windows and inside WSL. Please note that I did not test direct local builds on Windows producing a Windows binary (i.e. without docker, as I don't have GraalVM installed on Windows). I did not do any podman tests either.
This patch creates a volume for /project where we can copy files to/from using `docker cp`. The volume is created in the prebuild step, then used in the build step, and finally removed in the postbuild step. The new approach allows us to execute more than one command using `docker run --rm` on the same files in /project.
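For readers following the commit message above, a rough sketch of that flow at the CLI level; the helper-container, input and executable names are illustrative placeholders (the real code uses the configured container runtime and generated names, and creates the volume implicitly through a container to avoid the podman 3.0.x ownership issue discussed earlier):

$ docker create --name build-helper -v quarkus-native-builder-image-project-volume:/project \
    quay.io/quarkus/ubi-quarkus-native-image:21.0.0-java11          # pre-build: implicitly creates the /project volume
$ docker cp <build inputs> build-helper:/project                    # copy the project files into the volume
$ docker run --rm -v quarkus-native-builder-image-project-volume:/project \
    quay.io/quarkus/ubi-quarkus-native-image:21.0.0-java11 <args>   # build step(s): native-image, objcopy, ... reuse the same /project
$ docker cp build-helper:/project/app-runner <output dir>           # copy the produced executable (and .debug file) back out
$ docker rm build-helper                                            # post-build cleanup
$ docker volume rm quarkus-native-builder-image-project-volume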
Force-pushed from ae176ad to 8008ceb
@gsmet I rebased this (cleanly). I think it's good to go (after the CI completes).
Failing Jobs - Building 8008ceb
Full information is available in the Build summary check run.
Test Failures: ⚙️ JVM Tests - JDK 8: 📦 integration-tests/kafka ✖
PR quarkusio#13963 mistakenly stopped stripping debug information from the native executable when debug info generation is enabled, resulting in both the native executable and the corresponding `.debug` file containing the debug information. This also results in the native executable being larger when `-Dquarkus.native.debug.enabled` is used:

```
$ du -hs quarkus-integration-test-main-999-SNAPSHOT-runner-nodebug
108M    quarkus-integration-test-main-999-SNAPSHOT-runner-nodebug
$ du -hs quarkus-integration-test-main-999-SNAPSHOT-runner-debug
182M    quarkus-integration-test-main-999-SNAPSHOT-runner-debug
$ du -hs quarkus-integration-test-main-999-SNAPSHOT-runner-debug.debug
75M     quarkus-integration-test-main-999-SNAPSHOT-runner-debug.debug
```

Inspecting `quarkus-integration-test-main-999-SNAPSHOT-runner-debug` and `quarkus-integration-test-main-999-SNAPSHOT-runner-debug.debug` with `readelf --debug-dump=info` we observe that both files contain debug information, while only the latter should.
(cherry picked from commit f69d98d)
Closes #13754
Closes #13856