
ci: remove clang workflow, use main twister workflow to build using llvm #82354

Status: Open · wants to merge 3 commits into main

Conversation

@nashif (Member) commented Nov 29, 2024

  • twister: add an option to force a toolchain variant for a platform
  • ci: twister: build native_sim with llvm
  • ci: clang: remove clang workflow

Using --force-platform-toolchain native_sim/native:llvm will force
building native_sim tests with llvm while everything else is built with
whatever variant was set in the environment.

This option can be provided multiple times for multiple platforms and
toolchain variants.

Signed-off-by: Anas Nashif <[email protected]>
Use twister to selectively build with llvm on some platforms. This
removes duplication with the clang workflow, which is being removed in a
follow-up commit.

Signed-off-by: Anas Nashif <[email protected]>
We build with clang in the main twister workflow now.

Signed-off-by: Anas Nashif <[email protected]>
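
For illustration, a minimal sketch of invoking the new option described in the first commit (hedged: the second platform name and the --integration flag are illustrative assumptions, not taken from this PR; the option can be given multiple times per the commit message):

export ZEPHYR_TOOLCHAIN_VARIANT=zephyr
# native_sim builds with llvm; everything else keeps the variant above
./scripts/twister --integration \
    --force-platform-toolchain native_sim/native:llvm \
    --force-platform-toolchain native_sim/native/64:llvm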
@@ -149,11 +150,19 @@ jobs:
export ZEPHYR_BASE=${PWD}
export ZEPHYR_TOOLCHAIN_VARIANT=zephyr
python3 ./scripts/ci/test_plan.py -c origin/${BASE_REF}.. --pull-request --no-detailed-test-id
./scripts/twister --subset ${{matrix.subset}}/${{ strategy.job-total }} --load-tests testplan.json ${TWISTER_COMMON} ${PR_OPTIONS}
./scripts/twister \
A Member commented:

So do I understand correctly that:

  • the PR CI would now always run with clang for native_sim[32] and never with gcc?
  • the main and weekly CI jobs would still run with gcc?

If so, some thoughts, and my main concerns with this approach:

  • With a different compiler in PRs than in main, I expect trouble with breakages being discovered in main instead of in PRs, and users confused about "why did the PR CI not catch this?".
  • I expect the majority of users will keep using gcc with native_sim as well, so we would be reducing needed coverage in PRs.
  • Using a different compiler in CI for native_sim, native_sim//64 and the bsim targets will also cause confusion.

@nashif (Member Author) replied:

So do I understand correctly that:

  • the PR CI would now always run with clang for native_sim[32] and never with gcc?
  • the main and weekly CI jobs would still run with gcc?

In PRs, you get the gcc coverage through different variants, i.e. the 64-bit and nrf_bsim variants. Does it really matter which toolchain you built with? The goal is to build the simulator and run tests using it; it is never about the simulator itself.

You still get gcc coverage through the main branch and other workflows deploying native_sim for testing, e.g. bluetooth.

--subset ${{matrix.subset}}/${{ strategy.job-total }} \
--load-tests testplan.json \
${TWISTER_COMMON} \
${PR_OPTIONS}
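
Reading the hunk as a whole, the updated step presumably ends up looking something like this (a hedged reconstruction; whether the new flag is passed here directly or folded into ${TWISTER_COMMON} is an assumption based on the commit messages):

export ZEPHYR_BASE=${PWD}
export ZEPHYR_TOOLCHAIN_VARIANT=zephyr
python3 ./scripts/ci/test_plan.py -c origin/${BASE_REF}.. --pull-request --no-detailed-test-id
./scripts/twister \
    --subset ${{matrix.subset}}/${{ strategy.job-total }} \
    --load-tests testplan.json \
    --force-platform-toolchain native_sim/native:llvm \
    ${TWISTER_COMMON} \
    ${PR_OPTIONS}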
A Member commented:

Twister logs (printouts) do not tell which toolchain was used. I expect users will be confused if an error occurs with only clang or only gcc and we tell them neither that it was built with that compiler nor that they need to set that option to reproduce the issue.
(This issue existed before, but at least clang was previously only used in the clang job, which was a hint. Mixing them in the same workflow now makes it worse.)

Note that the testplan.json and the reports' "testsuites" do not indicate the toolchain either (it is in the "environment", though).

@nashif (Member Author) replied:

Note that the testplan.json and the reports' "testsuites" do not indicate the toolchain either (it is in the "environment", though).

Well, right now it says the toolchain is zephyr, although for native_sim it is using the host toolchain and not the Zephyr SDK at all.

@stephanosio (Member) left a comment:

I do not think completely removing the Clang workflow is going to work for us in the long run.

With the upcoming Zephyr SDK LLVM toolchain, we will have more platforms to test (ARM, RISC-V and potentially more) with Clang -- it would be better to keep these "Clang tests" in a separate workflow for clarity reasons.

As previously discussed, we should look into reducing the set of tests run by the Clang workflow to bare minimum (i.e. "smoke tests") so as to not overload the CI infrastructure.

@nashif (Member Author) commented Dec 2, 2024:

I do not think completely removing the Clang workflow is going to work for us in the long run.

With the upcoming Zephyr SDK LLVM toolchain, we will have more platforms to test (ARM, RISC-V and potentially more) with Clang -- it would be better to keep these "Clang tests" in a separate workflow for clarity reasons.

As previously discussed, we should look into reducing the set of tests run by the Clang workflow to bare minimum (i.e. "smoke tests") so as to not overload the CI infrastructure.

This is not how we should be testing a toolchain. The clang workflow right now is not a toolchain/SDK testing workflow; it just replicates what we do in the main workflow by building on platforms with a different toolchain.

For testing clang or toolchains in general, we need something dedicated that does not duplicate building / running the same tests.

So, for the upcoming SDK LLVM toolchain we need to come up with a plan and a new way to test. What we have in the clang workflow right now is not suitable and does not scale.

@nashif (Member Author) commented Dec 2, 2024:

As previously discussed, we should look into reducing the set of tests run by the Clang workflow to bare minimum (i.e. "smoke tests") so as to not overload the CI infrastructure.

Sure, this could be done in a new workflow to verify toolchains, or some significant changes to the current workflow should be made now. I do not think the approach we have with clang is suitable anymore: it just builds everything for a single platform with clang and is getting in the way with no major value right now. At least one hour of CI is spent building/running the same tests we build/run in the main twister workflow, for every change, and the queue is often quite long.

@nashif (Member Author) commented Dec 3, 2024:

OK, this made me think of a more productive way to test additional toolchains. Similar to how we have test/platform combos (basically how we test things right now), we could also add support in twister for test/toolchain combos and configurations. This means we could have a test built with multiple toolchains the same way we do it with multiple platforms, something like integration_toolchains.

Doing this, we would build a few targeted tests with multiple toolchains, giving us better control over what to build and when, without trying to boil the ocean.
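
A hedged sketch of what this could look like in a testcase.yaml, assuming integration_toolchains mirrors the existing integration_platforms key (the key is only proposed here and does not exist yet; the test name and values are illustrative):

tests:
  kernel.common:
    integration_platforms:
      - native_sim
    # proposed key, analogous to integration_platforms (hypothetical):
    integration_toolchains:
      - zephyr
      - llvm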

@aescolar (Member) commented Dec 3, 2024:

..we could have a test built with multiple toolchains the same way we do it with multiple platforms, something like integration_toolchains.

This sounds quite reasonable to me.

About:

This is not how we should be testing a toolchain. The clang workflow right now is not a toolchain/SDK testing workflow; it just replicates what we do in the main workflow by building on platforms with a different toolchain.

Just a thought: if all we try to do is test the toolchain itself, then yes, building a few relevant tests/apps with it would cover it.

But, unfortunately, there is another side to it: all the code we deliver "should" also build clean with the relevant toolchains. If we want to be sure it does, the "trivial" solution is to build all combinations. A pragmatic approach could be to assume that the likelihood of toolchain-specific issues in the code is very small, so we defer that combination to the weekly run, or even just leave it to users to find and report.

In any case, as we are exponentially increasing the number of combinations, I'm wondering if (not only for toolchains) we should start having another level of test filters, so we filter both for PRs and for weekly. What about something like:

  • Today's 'integration_*' for PRs
  • A new 'integration_weekly_*' for weekly (i.e. the combinations we cover to some degree in our CI)
  • The 'allow/exclude' filter for what a test does not support at all

So we'd have 4 tiers of combinations for each test (see the sketch after this list):

  1. What we test in PRs
  2. What we test in weekly
  3. What is supposed to work but is just best effort, which the community would need to find
  4. What we know does not work
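
A hedged sketch of how these tiers could be expressed in a testcase.yaml (integration_weekly_platforms is a hypothetical key following the proposal above; the platform names are illustrative):

tests:
  sample.tiers:
    integration_platforms:          # tier 1: covered in PR CI
      - native_sim
    integration_weekly_platforms:   # tier 2: proposed weekly-only coverage (hypothetical key)
      - qemu_cortex_m3
    platform_exclude:               # tier 4: known not to work
      - some_unsupported_board
    # tier 3 is everything else: supposed to work, best effort, found by the community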
