-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updates to docker-maven-plugin configuration so M1 mac build runs (almost) clean with -Dstart-containers
#25648
Conversation
@@ -114,6 +114,8 @@ | |||
</plugins> | |||
</build> | |||
</profile> | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no diff ? remove from commit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thank you!
@@ -168,29 +185,28 @@ | |||
<date>default</date> | |||
<color>cyan</color> | |||
</log> | |||
<!-- Speed things up a bit by not actually flushing writes to disk --> | |||
Speed things up a bit by not actually flushing writes to disk --> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did you mean to remove this comment start?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not. :)
I think my IDE is trolling me, since at one point I was running that test successfully. Will fix.
-Dstart-containers
-Dstart-containers
Changes looks sensible to me but @Sanne better off verifying - and I don't have a M1 :/ I did notice lots of non meaningfull whitespace changes - not sure if our formatter being bad or just your IDE being configured differently :) either way nothing major but the line count/diff would be a lot less without them. |
I noticed that too about the whitespace and was annoyed by it. I thought I'd followed our IDE setup instructions accurately and I was getting a lot of line wrapping, so I made some more updates to my settings beyond what was in It might be worth spending an hour or two with a bunch of us in a room with a new starter, seeing if we've got compatible settings + docs across a few IDEs. I think XML is probably hardest to get right and consistent because the tooling support is most fragmented. |
looks great! I'll need to run it on my workstation at home though - sorry not possible today. @yrodiere are you ok with the Elasticsearch version changes? |
TL;DR: We'd better use Elasticsearch 7.16, and I'll submit another PR to centralize the version used in every integration tests, because currently it's a bit messy :)
7.11 is affected by at least one bug that is relevant to Hibernate Search: elastic/elasticsearch#53127 While I don't think we'll experience that bug in Quarkus integration tests, I'd rather upgrade directly to 7.16, which is the version used in Hibernate Search 6.1 integration tests [see below, I'll send a PR to upgrade]. However, be aware that the version bumps in this PR are apparently in I'm surprised the build passes without bumping the version used for Hibernate Search integration tests, though?
Please note that, at least when it comes to Elasticsearch you updated the configuration of (some) integration tests, not dev services.
Other extensions rely on Maven properties defined in build-parent, and I think logging-gelf should do the same:
Elasticsearch dev services use a version hardcoded in their configuration class: Line 32 in 31c2513
That last part is indeed at odds with the more sophisticated solutions we have for relational datasources, which relies on a properties file using interpolation to inject Maven properties: Line 72 in 36f8e56
quarkus/extensions/devservices/postgresql/src/main/resources/postgresql-devservice.properties Line 1 in b7fad49
... but it's consistent with Kafka dev services, for example: Line 48 in 10fb97a
So to sum up, it's a mess :) I'll send a PR to make the IT configuration more consistent, and upgrade to 7.16; feel free to merge this PR first and I'll handle the conflicts. I also added a comment to #25486 so that we eventually solve the problem of Elasticsearch dev services.
That's probably related to |
The library sends TCP packet directly, so it is not tied to any Elasticsearch versio. Feel free to change the version and refactor it to share a common version with the other Elasticsearch based extensions, it predates the integration test of Elasticsearch that's why it didn't use the common version from the parent. The port 5000 is from Logstash, I don't remember, as we use |
Thanks! |
Failing Jobs - Building 6ba6826
Full information is available in the Build summary check run. Failures⚙️ Gradle Tests - JDK 11 #- Failing: integration-tests/gradle
📦 integration-tests/gradle✖
⚙️ Gradle Tests - JDK 11 Windows #- Failing: integration-tests/gradle
📦 integration-tests/gradle✖
⚙️ JVM Tests - JDK 11 #- Failing: integration-tests/rest-client
📦 integration-tests/rest-client✖
⚙️ JVM Tests - JDK 11 Windows #- Failing: integration-tests/rest-client
📦 integration-tests/rest-client✖
⚙️ JVM Tests - JDK 17 #- Failing: integration-tests/rest-client
📦 integration-tests/rest-client✖
⚙️ Native Tests - Data2 #- Failing: integration-tests/jpa-db2
📦 integration-tests/jpa-db2✖ ⚙️ Native Tests - Data7 #- Failing: integration-tests/hibernate-reactive-db2 integration-tests/reactive-db2-client
📦 integration-tests/hibernate-reactive-db2✖ 📦 integration-tests/reactive-db2-client✖ ⚙️ Native Tests - HTTP #- Failing: integration-tests/rest-client
📦 integration-tests/rest-client✖
|
Here you go: #25663 |
The rest client failures can be addressed by rebasing on |
It's a bit big, and I'm struggling to follow the diffs because of the included merge commits. It's also conflicting now with @holly-cummins you think you could rebase it, and perhaps split it in smaller PRs ? Many tests failed: it might be useful to narrow it down, and also merge things in smaller bites. |
at least one of the errors seems related; found this in section "Data7"
Perhaps keep DB2 changes to the side, we can try merging the other goodies first? |
FWIW you can probably skip the part about Elasticsearch completely, now: #25663 was merged and it includes an upgrade to Elasticsearch 7.16, so more recent than what you did here and what you need for M1. |
Will do. I synced to main just before creating the PR, but that created its own problems because of the merge commit. (I normally prefer to rebase so I'm not sure why I did a merge commit this time.) And because it's such a big PR it gets merge conflicts quickly. :( |
OK, first smaller PR in for @Sanne, with the Maria DB changes: #25805 |
@holly-cummins should we close this one? |
Not quite yet! One more sub-PR to go (DB2), which I'm testing locally now. And then I need to validate elasticsearch, and then do a final run through to make sure I didn't miss anything. This is what's gone in:
|
Investigation continues. Now, the DB2 tests pass without any extra specifying the image name (perhaps I updated podman and that made a difference?). This is good news, because as @Sanne points out, there is no maven property called However, the gelf tests are failing even with @yrodiere 's changes, so I need to figure out why. |
Remember - rather than wasting too much time on DB2 I'd be ok to move all related tests to a different repository. |
Noted re. DB2 moving out of the mainline test suite, @Sanne. I reckon even if we move them to a separate repo (or guard them some other way, like with a -Dslow profile), we'd want them to pass on M1. However, my fix was pretty ugly, since it had to explicitly set an architecture that the container provider should be setting for us. And since I now can't make the DB2 tests fail, I agree we should leave it. I have a feeling that maybe things are working locally now because of some caching (a bit like testcontainers/testcontainers-java#5275 but for architectures?). If so, the DB2 M1 issues may rear their head again. But if things fail in an M1 Actions build, we can revisit this fix. And apart from the DB2-mystery-non-fix, and one more volume access modifier I had to add, and #25230, this PR has now been dissected and absorbed. So I'll close. |
Yes of course. |
Fixes #25428. This is a big PR, sorry! I kept fixing-one-more-thing before doing the PR. I've left it unsquashed, partly because the history for each individual file is pretty clean and descriptive, and partly because I made the mistake of merging upstream into my long-running branch before squashing and so there's big merge commits in which are complicated to squash around.
Here's what's changed:
DB2
We may want to do something similar in container-build-jib-with-db2 and also the dev services but I have not touched those.
MariaDb
I had a problem with tcp-ping based readiness checks for mariadb. Only one test used it, so we can switch to using an
sqladmin ping
check. Several tests do a ping inside a<postStart>
in the<wait>
. However, a<postStart>
inside a<wait>
isn’t a true readiness check. The plugin will always wait for the<
time>and then run the
`. A better (but more verbose) option is to use a container health check.This was only strictly necessary where the tests were using a tcp ping to check for readiness (which did not work on podman), but I think it’s an improvement elsewhere.
This sped up the tests quite a bit for me since it will cut the
<time>
wait short if the database is ready before the timeout pops. I saw a factor of two improvement on tests which used mariadb, but on slower hardware the effect may not be so noticeable.Be aware when testing healthcheck scripts, caching of named images can cause apparently non-deterministic behaviour. Change the name if making changes.
I also found that trying to extract common code to parent poms caused tests not to be able to connect to the containers. I don’t understand the reason - perhaps some crucial fabric8 state ends up in a target directory at the parent level?
MSSQL
This one was challenging. The MS SQL image segfaults on ARM. The best option seemed to be the widely recommended workaround of using Azure Edge Server. The two products are not functionally interchangeable, but there is enough similarity to make it worth running this way.
Note: this change will also affect dev services if built locally, but not from the official releases unless they are built on M1. I think that’s kind of weird but kind of ok, but I’m open to objections if people think that's too inconsistent.
MYSQL
Like DB2, I got a 404 if I tried to pull the image without explicitly specifying the architecture. However, just specifying the architecture wasn’t enough for the container to start. It would work on the command line with podman, but not in fabric8 with podman.
Following https://www.emmanuelgautier.com/blog/mysql-docker-arm-m1 and https://stackoverflow.com/questions/65456814/docker-apple-silicon-m1-preview-mysql-no-matching-manifest-for-linux-arm64-v8 I instead changed the image name to one with better behaviour on arm.
Oracle
I hit fabric8io/docker-maven-plugin#1369 with the log readiness check. Based on the discussion, I wonder if it’s a timing issue which my machine exposes by being fast. (I saw similar issues with
<log>
experimenting with the mariadb readiness check.)ERRO[20873] accept tcp [::]:1521: use of closed network connection
[ERROR] DOCKER> IO Error while requesting logs: org.apache.http.ConnectionClosedException: Premature end of chunk coded message body: closing chunk expected Thread-6
The image we use has a healthcheck configured, so we can just use that with fabric8.
I also hit a more serious problem, which is that I could not get the database to start on M1. It seemed to be the same issue as https://stackoverflow.com/questions/68605011/oracle-12c-docker-setup-on-apple-m1. Others there reported success with qemu, which podman is using under the covers. I tried with podman 4.0.3 and podman 4.1, without success.
Reluctantly, I disabled the tests. We should re-evaluate with later versions of the Oracle database in the future.
Elasticsearch
According to https://stackoverflow.com/questions/65962810/m1-mac-issue-bringing-up-elasticsearch-cannot-run-jdk-bin-java, we need a version bump to get M1 support, up to 7.10.2. I had to go up two minor increments to 7.11.0 on the logstash version to get arm64 working.
I notice we don’t have a centralised elasticsearch version, unlike other dev services. Is this something we want to change?
I also hit a comedy problem, which I'm noting here in case it helps others. I got a port conflict on port 5000. The process running on this port turns out to be an AirPlay server. You can deactivate it in System Preferences › Sharing, and unchecking AirPlay Receiver.
Container Image Invoker
I did not look at #25230, so those tests still fail with podman on M1.
Other issues
I had to restart my podman machine to resolve some unexplained failures. I guess it had got tired. This could affect runs of the whole test suite if it runs end to end.