3.15 aarch64 test plan updates
* Updated QUARKUS-3007 to be more in line with what happened in 3.8
* Created followup test plan for OCP on aarch leftovers after 3.8 in
  QUARKUS-3446 test plan
* Created followup test plan for bare metal coverage in 3.15 in
  QUARKUS-4942 test plan
mjurc committed Oct 2, 2024
1 parent 759293d commit 129ea35
Showing 3 changed files with 107 additions and 33 deletions.
53 changes: 20 additions & 33 deletions QUARKUS-3007.md
@@ -7,34 +7,29 @@ for x86-64 architecture.

For the purposes of planning, the following tasks therefore need to be executed (ordered by priority):
* Same test coverage and certification matrix on OCP on aarch64 for both JVM and native mode as on OCP on x86-64 (see
https://issues.redhat.com/browse/QUARKUS-3446)
[QUARKUS-3446](QUARKUS-3446.md))
* Certification of RHBQ native mode on OCP on aarch64
* Certification of RHBQ in both JVM and native mode in the bare metal scenarios on aarch64
* Certification of RHBQ in both JVM and native mode in the bare metal scenarios on aarch64
(see [QUARKUS-4942](QUARKUS-4942.md))

## Scope of the testing

### General
* Catch-up on the test coverage on OCP to be same for aarch64 as it is for x86-64
* Enabling Serverless tests in `openshift-arm` profile once RH Serverless supports aarch64
* Running test services that do not yet support aarch64 in a side OpenShift cluster running on x86-64
* This affects anywhere from a few to tens of tests across multiple modules
* Test coverage on OCP should be the same for aarch64 as it is for x86-64
* Tests disabled in `aarch64` profile due to missing test services on that architecture should be re-enabled
* Certification of RHBQ native mode on OCP on aarch64
* The full native OpenShift coverage will have to be run in native mode with the Mandrel native builder container
* Certification of both the JVM and native modes in bare metal scenarios will require all the aspects to be tested
* we support two JDK versions for JVM mode and one JDK version for native mode per release stream
* start-stop TS with RHBQ code starters
* start-stop special characters TS
* bare metal test suite
* Quarkus Quickstarts
* bare metal test suite and Quarkus Quickstarts are priority
* `startstop-ts-code-quarkus`, `startstop-ts-special-chars` are nice to have

### Impact on test suites and testing automation
* Catch-up on the test coverage on OCP to be same for aarch64 as it is for x86-64
* This means that the aarch64 OCP testing pipeline would require an additional x86-64 cluster to be deployed in AWS
and test services be prepared before the test execution by the test pipeline. The other part of this is that the
tests that currently use test services not supported on aarch64 would be moved to standalone test suite module, only
executed in `openshift-arm` profile
* The other aspect is testing on EUS OpenShift. A job executing the OpenShift interoperability modules on a matrix of
JDK version and OpenShift version should be implemented.
executed in `aarch64` profile
* We will also need to implement periodic runs of OpenShift test suite in JVM mode with Quarkus built from main branch
on aarch64.
* Certification of RHBQ native mode on OCP on aarch64
@@ -52,10 +47,8 @@ For the purposes of planning, the following tasks therefore need to be executed
* start-stop special characters TS (`startstop-ts-special-chars`)
* bare metal test suite
* Quarkus Quickstarts
* We will need the upstream CI to include aarch64 as a regular platform.
* If this is not implemented, we are at risk of catching bugs far too late in the release (with our engineering and
candidate releases being produced only after the upstream release, we would have no way of knowing that upstream stays
aarch64 certified).
* We will need to run the aarch64 coverage in our periodic builds, because running all the coverage planned in the
previous bullets for the first time on candidate releases is too late in the schedule to catch bugs.

Note: Once we have the jobs for product testing for OCP on aarch64 in both JVM and native mode, we will have three jobs
(in JVM mode, for JDK 17 and 21, and in native mode, for whichever version Mandrel supports), and each of them will be
@@ -65,34 +58,28 @@ execution. We should consider moving the jobs to the same pipeline, managing its
### Impact on resources
Product testing:
* Catch-up on the test coverage on OCP to be same for aarch64 as it is for x86-64
* One additional OCP cluster for test services in AWS: 6x xlarge x86-64 machines in AWS
* EUS OCP in AWS: 6x xlarge Graviton machines, 4x large executors in Jenkins for each cell of JDK/OCP version matrix
* One OCP cluster for test services in AWS: 6x xlarge x86-64 machines in AWS
* Certification of RHBQ native mode on OCP on aarch64
* medium executor for orchestration
* 8x xlarge executors added from Beaker, 2 hours execution time
* 9x xlarge executors added from Beaker, 2 hours execution time
* ideally, we will be using the same clusters as the other OCP jobs do, but we need at least
* 1x test cluster - 6x xlarge Graviton machines in AWS
* 1x test service cluster - 6x xlarge x86-64 machines in AWS
* Certification of both the JVM and native modes in bare metal scenarios on aarch64
* `startstop-ts-code-quarkus` - 1x large aarch64 machine from Beaker per tested JDK, minutes to tens of minutes
execution time
* `startstop-ts-special-chars` - 1x large aarch64 machine from Beaker per tested JDK, minutes to tens of minutes
execution time
* bare metal test suite
* JVM mode - 8x large aarch64 machine from Beaker per tested JDK, 1-2 hours execution time
* native mode - 8x xlarge aarch64 machine from Beaker, 2 hours execution time
* JVM mode - 9x large aarch64 machine from Beaker per tested JDK, 1-2 hours execution time
* native mode - 9x xlarge aarch64 machine from Beaker, 2 hours execution time
* Quarkus Quickstarts
* JVM mode - 1x large aarch64 machine from Beaker per tested JDK, 1 hour execution time
* native mode - 1x xlarge aarch64 machine from Beaker, 2 hours execution time
* native mode - 1x xlarge aarch64 machine from Beaker, 5 hours execution time
* Estimation of increase in time requirements
* For machine time, parallelism in execution should be established. The spin up of beaker machines does not take that
long in trials, but we don't have any significantly large sample size yet to say how it will look in long term.
* For people/investigation time, this depends on how different JDK and test services act on aarch64. So far,
significant functional bugs were not present on aarch64. All the issues historically were related to either missing
test services or missing productized dependencies. For native applications, we have no baseline to base our
estimates upon.
* Beaker and OCP provisioning run in parallel and complete within 2 hours.
* Test jobs execute in parallel with x86-64 coverage on different machine quota.
* Running in parallel with x86-64 coverage, the pipeline should finish within 10 hours.

## Other considerations
* We should enable our other functional and structural coverage, but quickstarts and bare metal test suite are priority.
* `startstop-ts-code-quarkus`, `startstop-ts-special-chars`
* We need to consider how to handle the sprawling matrix. Currently, the matrix has the following axes:
* mode (JVM, native)
* JDK (JDK17, JDK21)
35 changes: 35 additions & 0 deletions QUARKUS-3446.md
@@ -0,0 +1,35 @@
# QUARKUS-3446: Enabling disabled OCP coverage on aarch64

Jira: https://issues.redhat.com/browse/QUARKUS-3446

This issue is about enabling the OpenShift tests that were disabled on aarch64 in the RHBQ 3.8 release due to missing
test services. It follows up on the OCP targets planned in [QUARKUS-3007](QUARKUS-3007.md) that were missed in RHBQ 3.8.

## Scope of the testing

### General
* OpenShift Serverless coverage
* Red Hat AMQ Streams coverage
* MongoDB coverage
* Red Hat SSO coverage
* Note that this coverage can only be re-enabled after RHBQ 3.15 release, as RH SSO is layered on RHBQ 3.15.

### Impact on test suites and testing automation
* Re-enabling tests disabled with the following issues as reason:
* https://github.com/quarkus-qe/quarkus-test-suite/issues/1142
* https://github.com/quarkus-qe/quarkus-test-suite/issues/1144
* https://github.com/quarkus-qe/quarkus-test-suite/issues/1146
* https://github.com/quarkus-qe/quarkus-test-suite/issues/1147
* https://github.com/quarkus-qe/quarkus-test-suite/issues/1145 (after 3.15)

### Impact on resources
Product testing:
* No impact on resources. The tests will run on the same infrastructure as the existing OCP coverage on aarch64.
* Execution time increases on the order of tens of minutes, as the tests are spread across the matrix and run in parallel.

## References
* Feature story: [QUARKUS-3446 Enabling disabled OCP coverage on aarch64](https://issues.redhat.com/browse/QUARKUS-3446)
* QE decomposition: [Ensure same coverage for JVM on OCP on aarch64 as on x86-64](https://issues.redhat.com/browse/QQE-258)

## Contacts
* Tester: Michal Jurč <[email protected]>
52 changes: 52 additions & 0 deletions QUARKUS-4942.md
@@ -0,0 +1,52 @@
# QUARKUS-4942: Enabling Quarkus test coverage running on baremetal aarch64

Jira: https://issues.redhat.com/browse/QUARKUS-4942

This issue is about running the same coverage on aarch64 that we run in bare metal testing for RHBQ on x86-64. It
follows up on the bare metal targets planned in [QUARKUS-3007](QUARKUS-3007.md) that were deprioritised in RHBQ 3.8.

## Scope of the testing

### General
* Certification of both the JVM and native modes in bare metal scenarios will require all the aspects to be tested
* we support two JDK versions for JVM mode and one JDK version for native mode per release stream
* bare metal test suite and Quarkus Quickstarts are priority
* `startstop-ts-code-quarkus`, `startstop-ts-special-chars` are nice to have, but they are not priority for 3.15

### Impact on test suites and testing automation
* Certification of both the JVM and native modes in bare metal scenarios on aarch64
* product aarch64 testing pipeline should be extended with:
* provisioning `large` label machines for running JVM mode coverage
* bare metal test suite, JVM and native mode
* Quarkus Quickstarts, JVM and native mode
* `startstop-ts-code-quarkus`, `startstop-ts-special-chars` are nice to have, but they are not priority for 3.15

### Impact on resources
* Certification of both the JVM and native modes in bare metal scenarios on aarch64
* bare metal test suite
* JVM mode - 9x large aarch64 machine from Beaker per tested JDK, 1-2 hours execution time
* native mode - 9x xlarge aarch64 machine from Beaker, 2 hours execution time
* Quarkus Quickstarts
* JVM mode - 1x large aarch64 machine from Beaker per tested JDK, 1 hour execution time
* native mode - 1x xlarge aarch64 machine from Beaker, 5 hours execution time
* Estimation of increase in time requirements
* Beaker provisioning should complete within 2 hours.
* Test jobs execute in parallel with x86-64 coverage on different machine quota.
* Running in parallel with x86-64 coverage, the bare metal jobs should finish within 10 hours.
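
The machine counts above can be totalled with a short script. This is only a sketch for sanity-checking the Beaker request. It assumes two tested JDKs in JVM mode (per the scope section), takes the upper bound of the 1-2 hour range, and assumes all jobs hold their machines at the same time. The `JOBS` table and the `totals` helper are illustrative names, not part of any pipeline.

```python
# Figures copied from the "Impact on resources" list:
# job name -> (machines, Beaker size label, hours, repeated per tested JDK?)
JOBS = {
    "bare metal TS, JVM": (9, "large", 2, True),    # upper bound of "1-2 hours"
    "bare metal TS, native": (9, "xlarge", 2, False),
    "Quickstarts, JVM": (1, "large", 1, True),
    "Quickstarts, native": (1, "xlarge", 5, False),
}

# Assumption: two JDK versions are tested in JVM mode per release stream.
TESTED_JDKS = 2


def totals(jobs=JOBS, jdks=TESTED_JDKS):
    """Return (peak machines per size label, total machine-hours)."""
    machines = {}
    machine_hours = 0
    for count, size, hours, per_jdk in jobs.values():
        runs = jdks if per_jdk else 1
        machines[size] = machines.get(size, 0) + count * runs
        machine_hours += count * runs * hours
    return machines, machine_hours


if __name__ == "__main__":
    peak, hours = totals()
    print("peak machines:", peak)
    print("machine-hours:", hours)
```

If the JVM jobs for the two JDKs run one after another instead of concurrently, the peak `large` count halves while the machine-hours stay the same.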

## Other considerations
* We should enable our other functional and structural coverage, but quickstarts and bare metal test suite are priority.
* `startstop-ts-code-quarkus`, `startstop-ts-special-chars`
* We need to consider how to handle the sprawling matrix. Currently, the matrix has the following axes:
* mode (JVM, native)
* JDK (JDK17, JDK21)
* container engine (Docker, Podman)
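
To get a feel for how quickly this matrix grows, the cells can be enumerated as below. This is a minimal sketch: the axis values come from the list above, while the choice of `JDK21` as the single native-mode JDK is an assumption made for illustration (the plan only says native mode uses one JDK version, whichever Mandrel supports).

```python
from itertools import product

MODES = ("JVM", "native")
JDKS = ("JDK17", "JDK21")
ENGINES = ("Docker", "Podman")

# Assumption: native mode is tested on one JDK only; the actual version
# depends on what Mandrel supports for the release stream.
NATIVE_JDK = "JDK21"


def matrix_cells(native_jdk=NATIVE_JDK):
    """List the (mode, JDK, container engine) combinations that need a job."""
    cells = []
    for mode, jdk, engine in product(MODES, JDKS, ENGINES):
        if mode == "native" and jdk != native_jdk:
            continue  # native mode runs on a single JDK
        cells.append((mode, jdk, engine))
    return cells


if __name__ == "__main__":
    for cell in matrix_cells():
        print(cell)
```

Each additional axis multiplies the number of cells, which is why pruning combinations (as done here for native mode) matters more than the size of any single axis.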

## References
* Feature story: [QUARKUS-4942 Enabling Quarkus test coverage running on baremetal aarch64](https://issues.redhat.com/browse/QUARKUS-4942)
* QE decomposition:
* [Ensure support for RHBQ in native mode on RHEL 8](https://issues.redhat.com/browse/QQE-433)
* [Ensure support for RHBQ in JVM mode on RHEL 8](https://issues.redhat.com/browse/QQE-437)

## Contacts
* Tester: Michal Jurč <[email protected]>
