tests: twister: add quit-on-failure option #77246

Open
wants to merge 1 commit into
base: main

hakehuang
Collaborator

In CI, we may want to stop as soon as any failure occurs in order to save time, so this PR adds a --quit-on-failure option that makes twister quit the test run on the first failure.
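
As a rough sketch of how such a flag could be wired up (assuming an argparse-based parser; the parser name and help text below are illustrative and not taken from this PR), a `store_true` option maps `--quit-on-failure` to the `quit_on_failure` attribute that the diff hunks later in this conversation check:

```python
import argparse

# Hypothetical stand-alone parser; twister's real parser defines many more options.
parser = argparse.ArgumentParser(prog="twister-sketch")
parser.add_argument(
    "--quit-on-failure",
    action="store_true",
    help="Stop the run as soon as any test instance fails or errors (illustrative text).",
)

# argparse turns the dashes in the option name into underscores on the namespace.
enabled = parser.parse_args(["--quit-on-failure"])
default = parser.parse_args([])
print(enabled.quit_on_failure, default.quit_on_failure)
```

Because the flag defaults to off, existing invocations would keep their current behaviour under this assumption.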

ubieda
ubieda previously approved these changes Oct 17, 2024
@nashif
Member

nashif commented Oct 18, 2024

Tried this after making hello world fail:

i9:zephyr(enable_quit_on_failure): ./scripts/twister -T  samples/hello_world -v --quit-on-failure
Renaming output directory to /home/nashif/zephyrproject/zephyr/twister-out.8
INFO    - Using Ninja..
INFO    - Zephyr version: v3.7.0-1080-g9001ec479ced
INFO    - Using 'zephyr' toolchain.
INFO    - Selecting default platforms per test case
INFO    - Building initial testsuite list...
INFO    - Writing JSON report /home/nashif/zephyrproject/zephyr/twister-out/testplan.json
INFO    - JOBS: 20
INFO    - Adding tasks to the queue...
INFO    - Added initial list of jobs to queue
ERROR   - General exception: 'AutoProxy[LifoQueue]' object has no attribute 'mutex'
ERROR   - General exception: 'AutoProxy[LifoQueue]' object has no attribute 'mutex'
ERROR   - General exception: 'AutoProxy[LifoQueue]' object has no attribute 'mutex'
ERROR   - General exception: 'AutoProxy[LifoQueue]' object has no attribute 'mutex'
ERROR   - General exception: 'AutoProxy[LifoQueue]' object has no attribute 'mutex'
ERROR   - General exception: 'AutoProxy[LifoQueue]' object has no attribute 'mutex'
ERROR   - General exception: 'AutoProxy[LifoQueue]' object has no attribute 'mutex'
ERROR   - General exception: 'AutoProxy[LifoQueue]' object has no attribute 'mutex'
ERROR   - General exception: 'AutoProxy[LifoQueue]' object has no attribute 'mutex'
ERROR   - General exception: 'AutoProxy[LifoQueue]' object has no attribute 'mutex'
ERROR   - General exception: 'AutoProxy[LifoQueue]' object has no attribute 'mutex'
ERROR   - General exception: 'AutoProxy[LifoQueue]' object has no attribute 'mutex'
ERROR   - General exception: 'AutoProxy[LifoQueue]' object has no attribute 'mutex'
ERROR   - Process 1204301 failed, aborting execution

Member

@nashif nashif left a comment

General exception when using this option.

@ubieda ubieda dismissed their stale review October 18, 2024 14:56

Didn't test properly. Dismissing approval.

@hakehuang hakehuang marked this pull request as draft October 18, 2024 15:21
@hakehuang
Collaborator Author

General exception when using this option..

So we need to quit in an orderly way instead of just quitting immediately, right? My initial idea was that the runner simply quits if there is any failure. Let me think about a better way. Thanks for checking this.

@@ -1341,6 +1341,11 @@ def pipeline_mgr(self, pipeline, done_queue, lock, results):
    pb = ProjectBuilder(instance, self.env, self.jobserver)
    pb.duts = self.duts
    pb.process(pipeline, done_queue, task, lock, results)
    if self.env.options.quit_on_failure:
        if pb.instance.status in [TwisterStatus.FAIL, TwisterStatus.ERROR]:
            with pipeline.mutex:

Collaborator

Pipeline is a LifoQueue, to clear it you should use something like:

    @staticmethod
    def clear_pipeline(pipeline):
        while not pipeline.empty():
            try:
                pipeline.get_nowait()
            except queue.Empty:
                break

and just call self.clear_pipeline(pipeline) without any mutex.
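
The suggestion above can be tried in isolation. This is a minimal sketch using a plain `queue.LifoQueue` rather than the manager proxy twister works with; the returned count and the task names are additions for illustration only. It relies solely on the public `get_nowait()` method, so it does not need the queue's internal `mutex`:

```python
import queue


def clear_pipeline(pipeline):
    """Drain every pending item from a queue-like object and return how many were dropped."""
    drained = 0
    while True:
        try:
            pipeline.get_nowait()
        except queue.Empty:
            # Nothing left to take: the pipeline is now empty.
            break
        drained += 1
    return drained


# Illustrative tasks; real twister tasks are dictionaries describing build/run steps.
pipeline = queue.LifoQueue()
for task in ("build", "run", "report"):
    pipeline.put(task)

drained = clear_pipeline(pipeline)
print(drained, pipeline.empty())
```

Catching `queue.Empty` instead of trusting `empty()` alone matters when several workers take from the same queue, since another worker may remove the last item between the check and the `get_nowait()` call.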

Collaborator

In the current solution you break test execution right after receiving an error/failure, without waiting for it to be reported properly. You could add a variable to the ProjectBuilder class (e.g. report_ready) and set it in the process method, in the elif op == "report" branch.
Then check:

                            if all([
                                self.env.options.quit_on_failure,
                                pb.report_ready,
                                pb.instance.status in [TwisterStatus.FAIL, TwisterStatus.ERROR]
                            ]):
                                self.clear_pipeline(pipeline)
                                break

With that change you will also find the error reason in twister.json.
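
To make the ordering concrete, here is a self-contained sketch of that check. `Status`, `FakeBuilder`, `drain`, and `run` are hypothetical stand-ins written for this example, not twister classes; only the shape of the `all([...])` condition follows the suggestion above:

```python
import queue
from enum import Enum


class Status(Enum):
    """Hypothetical stand-in for twister's TwisterStatus."""
    PASS = "passed"
    FAIL = "failed"
    ERROR = "error"


class FakeBuilder:
    """Hypothetical stand-in for ProjectBuilder; tracks only what the check needs."""

    def __init__(self, name, status):
        self.name = name
        self.status = status
        self.report_ready = False

    def process(self, op):
        # Flip the flag only once the "report" step has run.
        if op == "report":
            self.report_ready = True


def drain(pipeline):
    """Drop all remaining tasks using only the public queue API."""
    while True:
        try:
            pipeline.get_nowait()
        except queue.Empty:
            break


def run(pipeline, quit_on_failure):
    """Process tasks until the queue is empty or a reported failure stops the run."""
    processed = []
    while True:
        try:
            op, builder = pipeline.get_nowait()
        except queue.Empty:
            break
        builder.process(op)
        processed.append((op, builder.name))
        if all([
            quit_on_failure,
            builder.report_ready,
            builder.status in (Status.FAIL, Status.ERROR),
        ]):
            drain(pipeline)
            break
    return processed


def make_pipeline():
    failing = FakeBuilder("failing", Status.ERROR)
    passing = FakeBuilder("passing", Status.PASS)
    pipeline = queue.LifoQueue()
    # LIFO order: the last task put is the first one taken.
    for task in [("report", passing), ("build", passing),
                 ("report", failing), ("build", failing)]:
        pipeline.put(task)
    return pipeline


stopped_early = run(make_pipeline(), quit_on_failure=True)
ran_everything = run(make_pipeline(), quit_on_failure=False)
print(stopped_early)
print(len(ran_everything))
```

In this sketch the failing instance still gets its "report" step before the remaining tasks are dropped, which is the point of gating on `report_ready`.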

Collaborator Author

Let me try. Thanks a lot.

Collaborator Author

Now I made hello world fail; below is the full log with this PR:

ubuntu@ubuntu-OptiPlex-7050:/home/shared/disk/zephyr_project/zephyr_test/zephyr$ scripts/twister -vvv  -T samples/hello_world/ --build-only --quit-on-failure
INFO    - Using Ninja..
INFO    - Zephyr version: v3.7.0-5335-g2e3c8e40154c
INFO    - Using 'zephyr' toolchain.
INFO    - Selecting default platforms per test case
INFO    - Building initial testsuite list...
INFO    - Writing JSON report /home/shared/disk/zephyr_project/zephyr_test/zephyr/twister-out/testplan.json
INFO    - JOBS: 16
INFO    - Adding tasks to the queue...
INFO    - Added initial list of jobs to queue
INFO    - 16/41 qemu_cortex_r5/zynqmp_rpu samples/hello_world/sample.basic.helloworld         ERROR Build failure (build)
INFO    -                                     sample.basic.helloworld                                                     BLOCKED      Build failure
ERROR   - see: /home/shared/disk/zephyr_project/zephyr_test/zephyr/twister-out/qemu_cortex_r5_zynqmp_rpu/samples/hello_world/sample.basic.helloworld/build.log
INFO    - 17/41 qemu_cortex_a9/xc7z007s   samples/hello_world/sample.basic.helloworld         ERROR Build failure (build)
INFO    -                                     sample.basic.helloworld                                                     BLOCKED      Build failure
ERROR   - see: /home/shared/disk/zephyr_project/zephyr_test/zephyr/twister-out/qemu_cortex_a9_xc7z007s/samples/hello_world/sample.basic.helloworld/build.log
INFO    - 18/41 qemu_x86_64/atom          samples/hello_world/sample.basic.helloworld         ERROR Build failure (build)
INFO    -                                     sample.basic.helloworld                                                     BLOCKED      Build failure
ERROR   - see: /home/shared/disk/zephyr_project/zephyr_test/zephyr/twister-out/qemu_x86_64_atom/samples/hello_world/sample.basic.helloworld/build.log
INFO    - 19/41 qemu_x86/atom             samples/hello_world/sample.basic.helloworld         ERROR Build failure (build)
INFO    -                                     sample.basic.helloworld                                                     BLOCKED      Build failure
ERROR   - see: /home/shared/disk/zephyr_project/zephyr_test/zephyr/twister-out/qemu_x86_atom/samples/hello_world/sample.basic.helloworld/build.log
INFO    - 20/41 qemu_arc/qemu_arc_hs5x    samples/hello_world/sample.basic.helloworld         ERROR Build failure (build)
INFO    -                                     sample.basic.helloworld                                                     BLOCKED      Build failure
ERROR   - see: /home/shared/disk/zephyr_project/zephyr_test/zephyr/twister-out/qemu_arc_qemu_arc_hs5x/samples/hello_world/sample.basic.helloworld/build.log
INFO    - 21/41 qemu_arc/qemu_arc_hs6x    samples/hello_world/sample.basic.helloworld         ERROR Build failure (build)
INFO    -                                     sample.basic.helloworld                                                     BLOCKED      Build failure
ERROR   - see: /home/shared/disk/zephyr_project/zephyr_test/zephyr/twister-out/qemu_arc_qemu_arc_hs6x/samples/hello_world/sample.basic.helloworld/build.log
INFO    - 22/41 qemu_arc/qemu_arc_hs/xip  samples/hello_world/sample.basic.helloworld         ERROR Build failure (build)
INFO    -                                     sample.basic.helloworld                                                     BLOCKED      Build failure
ERROR   - see: /home/shared/disk/zephyr_project/zephyr_test/zephyr/twister-out/qemu_arc_qemu_arc_hs_xip/samples/hello_world/sample.basic.helloworld/build.log
INFO    - 23/41 qemu_riscv32/qemu_virt_riscv32 samples/hello_world/sample.basic.helloworld         ERROR Build failure (build)
INFO    -                                     sample.basic.helloworld                                                     BLOCKED      Build failure
ERROR   - see: /home/shared/disk/zephyr_project/zephyr_test/zephyr/twister-out/qemu_riscv32_qemu_virt_riscv32/samples/hello_world/sample.basic.helloworld/build.log
INFO    - 24/41 qemu_xtensa/dc233c/mmu    samples/hello_world/sample.basic.helloworld         ERROR Build failure (build)
INFO    -                                     sample.basic.helloworld                                                     BLOCKED      Build failure
ERROR   - see: /home/shared/disk/zephyr_project/zephyr_test/zephyr/twister-out/qemu_xtensa_dc233c_mmu/samples/hello_world/sample.basic.helloworld/build.log
INFO    - 25/41 qemu_nios2/qemu_nios2     samples/hello_world/sample.basic.helloworld         ERROR Build failure (build)
INFO    -                                     sample.basic.helloworld                                                     BLOCKED      Build failure
ERROR   - see: /home/shared/disk/zephyr_project/zephyr_test/zephyr/twister-out/qemu_nios2_qemu_nios2/samples/hello_world/sample.basic.helloworld/build.log
INFO    - 26/41 qemu_riscv64/qemu_virt_riscv64/smp samples/hello_world/sample.basic.helloworld         ERROR Build failure (build)
INFO    -                                     sample.basic.helloworld                                                     BLOCKED      Build failure
ERROR   - see: /home/shared/disk/zephyr_project/zephyr_test/zephyr/twister-out/qemu_riscv64_qemu_virt_riscv64_smp/samples/hello_world/sample.basic.helloworld/build.log
INFO    - 27/41 qemu_riscv32/qemu_virt_riscv32/smp samples/hello_world/sample.basic.helloworld         ERROR Build failure (build)
INFO    -                                     sample.basic.helloworld                                                     BLOCKED      Build failure
ERROR   - see: /home/shared/disk/zephyr_project/zephyr_test/zephyr/twister-out/qemu_riscv32_qemu_virt_riscv32_smp/samples/hello_world/sample.basic.helloworld/build.log
INFO    - 28/41 qemu_riscv32e/qemu_virt_riscv32e samples/hello_world/sample.basic.helloworld         ERROR Build failure (build)
INFO    -                                     sample.basic.helloworld                                                     BLOCKED      Build failure
ERROR   - see: /home/shared/disk/zephyr_project/zephyr_test/zephyr/twister-out/qemu_riscv32e_qemu_virt_riscv32e/samples/hello_world/sample.basic.helloworld/build.log
INFO    - 29/41 qemu_riscv64/qemu_virt_riscv64 samples/hello_world/sample.basic.helloworld         ERROR Build failure (build)
INFO    -                                     sample.basic.helloworld                                                     BLOCKED      Build failure
ERROR   - see: /home/shared/disk/zephyr_project/zephyr_test/zephyr/twister-out/qemu_riscv64_qemu_virt_riscv64/samples/hello_world/sample.basic.helloworld/build.log
INFO    - 30/41 qemu_xtensa/dc233c        samples/hello_world/sample.basic.helloworld         ERROR Build failure (build)
INFO    -                                     sample.basic.helloworld                                                     BLOCKED      Build failure
ERROR   - see: /home/shared/disk/zephyr_project/zephyr_test/zephyr/twister-out/qemu_xtensa_dc233c/samples/hello_world/sample.basic.helloworld/build.log
INFO    - 31/41 qemu_leon3/leon3          samples/hello_world/sample.basic.helloworld         ERROR Build failure (build)
INFO    -                                     sample.basic.helloworld                                                     BLOCKED      Build failure
ERROR   - see: /home/shared/disk/zephyr_project/zephyr_test/zephyr/twister-out/qemu_leon3_leon3/samples/hello_world/sample.basic.helloworld/build.log

INFO    - 1 test scenarios (41 test instances) selected, 15 configurations skipped (15 by static filter, 0 at runtime).
--------------------------------
Total test suites: 41
Total test cases: 31
Executed test cases: 16
Skipped test cases: 15
Completed test suites: 31
Passing test suites: 0
Built only test suites: 0
Failing test suites: 0
Skipped test suites: 15
Skipped test suites (runtime): 0
Skipped test suites (filter): 15
Errors: 16
--------------------------------
INFO    - 0 of 41 test configurations passed (0.00%), 0 built (not run), 0 failed, 16 errored, 15 skipped with 0 warnings in 40.26 seconds
INFO    - 0 test configurations executed on platforms, 26 test configurations were only built.
INFO    - Saving reports...
INFO    - Writing JSON report /home/shared/disk/zephyr_project/zephyr_test/zephyr/twister-out/twister.json
INFO    - Writing xunit report /home/shared/disk/zephyr_project/zephyr_test/zephyr/twister-out/twister.xml...
INFO    - Writing xunit report /home/shared/disk/zephyr_project/zephyr_test/zephyr/twister-out/twister_report.xml...
INFO    - -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
INFO    - The following issues were found (showing the top 10 items):
INFO    - 1) samples/hello_world/sample.basic.helloworld on native_sim/native None (Unknown)
INFO    - 2) samples/hello_world/sample.basic.helloworld on mps2/an385 None (Unknown)
INFO    - 3) samples/hello_world/sample.basic.helloworld on mps2/an521/cpu0 None (Unknown)
INFO    - 4) samples/hello_world/sample.basic.helloworld on qemu_cortex_m0/nrf51822 None (Unknown)
INFO    - 5) samples/hello_world/sample.basic.helloworld on qemu_malta/qemu_malta None (Unknown)
INFO    - 6) samples/hello_world/sample.basic.helloworld on qemu_malta/qemu_malta/be None (Unknown)
INFO    - 7) samples/hello_world/sample.basic.helloworld on qemu_cortex_a53/qemu_cortex_a53 None (Unknown)
INFO    - 8) samples/hello_world/sample.basic.helloworld on qemu_cortex_a53/qemu_cortex_a53/smp None (Unknown)
INFO    - 9) samples/hello_world/sample.basic.helloworld on qemu_arc/qemu_arc_em None (Unknown)
INFO    - 10) samples/hello_world/sample.basic.helloworld on qemu_arc/qemu_arc_hs None (Unknown)
INFO    -
INFO    - To rerun the tests, call twister using the following commandline:
INFO    - west twister -p <PLATFORM> -s <TEST ID>, for example:
INFO    -
INFO    - west twister -p qemu_arc/qemu_arc_hs -s samples/hello_world/sample.basic.helloworld
INFO    - or with west:
INFO    - west build -p -b qemu_arc/qemu_arc_hs samples/hello_world -T sample.basic.helloworld
INFO    - -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
INFO    - Run completed

@hakehuang hakehuang force-pushed the enable_quit_on_failure branch 2 times, most recently from 765941e to 6d4a837 Compare October 25, 2024 11:41
@hakehuang hakehuang marked this pull request as ready for review October 25, 2024 11:42
@hakehuang
Collaborator Author

General exception when using this option..

@nashif now the flow is clean, please check. Thanks

Collaborator

@LukaszMrugala LukaszMrugala left a comment

Instances skipped this way seem to have no status in the twister.json. On one hand, it makes them easy to find, on the other - they should have a status. SKIP, maybe.
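
A quick way to list such instances is sketched below. It assumes the report is a JSON object with a top-level "testsuites" list whose entries carry "name", "platform", and an optional "status" key; the sample data is hand-written for illustration and is not real twister output:

```python
def instances_without_status(report):
    """Return (name, platform) pairs for report entries that carry no status."""
    return [
        (suite.get("name"), suite.get("platform"))
        for suite in report.get("testsuites", [])
        if not suite.get("status")
    ]


# Hand-written sample data; in practice the dict would come from json.load() on the report file.
sample_report = {
    "testsuites": [
        {"name": "sample.basic.helloworld", "platform": "qemu_x86/atom", "status": "error"},
        {"name": "sample.basic.helloworld", "platform": "native_sim/native"},
    ]
}

missing = instances_without_status(sample_report)
print(missing)
```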

@nashif
Member

nashif commented Nov 20, 2024

@hakehuang it would be nice to print something at the end saying that twister aborted because of an error. That way you know something is wrong and do not assume it was a complete run.

@hakehuang
Collaborator Author

hakehuang commented Nov 20, 2024

Instances skipped this way seem to have no status in the twister.json. On one hand, it makes them easy to find, on the other - they should have a status. SKIP, maybe.

@LukaszMrugala for instances dropped this way, having no status may be better: they were never run, and marking them as skipped could mix them up with real skips and confuse readers. I also added a line at the end of the run, as @nashif recommended. Would this be OK?

@@ -241,6 +241,8 @@ def main(options, default_options):
        or (tplan.warnings and options.warnings_as_errors)
        or (options.coverage and not coverage_completed)
    ):
        if env.options.quit_on_failure:
            logger.info("twister aborted because of a failure/error")
Member

Now it says 'Run completed' above and then 'twister aborted...'. That is confusing: a run is either completed or aborted.

Collaborator Author

I see, updated.
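
The updated code is not shown in this conversation. One way to keep the two messages mutually exclusive is sketched below, with a hypothetical helper name and message wording chosen for illustration:

```python
def final_message(run_failed, quit_on_failure):
    """Pick exactly one closing line so a run is never reported as both completed and aborted."""
    if run_failed and quit_on_failure:
        return "Run aborted because of a failure/error"
    return "Run completed"


aborted = final_message(run_failed=True, quit_on_failure=True)
completed = final_message(run_failed=True, quit_on_failure=False)
print(aborted)
print(completed)
```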

nashif
nashif previously approved these changes Nov 26, 2024
ubieda
ubieda previously approved these changes Nov 26, 2024
in CI, we may need to quit if there is any failure
to save time, so add this --quit-on-failure so that
any failure will quit the test.

Signed-off-by: Hake Huang <[email protected]>
@hakehuang
Collaborator Author

@nashif, @ubieda, I just rebased onto the latest code base to make CI pass. Please approve again, thanks.
