[CI] Ensure that running threads started as dependencies of a test ar… #22715

Merged
merged 1 commit into from
Sep 19, 2022

Conversation

vivien-apple
Contributor

…e freed, even if the process has never been started

Issue Being Resolved

Running the test suite locally with an iteration count > 100 never succeeds on my laptop:

./scripts/run_in_build_env.sh \
                  "./scripts/tests/run_test_suite.py \
                     --chip-tool ./out/darwin-x64-darwin-framework-tool-no-ble-asan-clang/darwin-framework-tool \
                     --target Test_TC_RH_1_1 \
                     run \
                     --iterations 300 \
                     --test-timeout-seconds 120 \
                     --all-clusters-app ./out/darwin-x64-all-clusters-no-ble-asan-clang/chip-all-clusters-app \
                     --lock-app ./out/darwin-x64-lock-no-ble-asan-clang/chip-lock-app \
                     --ota-provider-app ./out/darwin-x64-ota-provider-no-ble-asan-clang/chip-ota-provider-app \
                     --ota-requestor-app ./out/darwin-x64-ota-requestor-no-ble-asan-clang/chip-ota-requestor-app \
                     --tv-app ./out/darwin-x64-tv-app-no-ble-asan-clang/chip-tv-app \
                  "

At some point the tests start to get slower, and some output is dropped at random.
I have tracked that down to a regression from #17727.

To summarize, #17727 changed the behaviour of the test suite so that all the applications declared as targets are passed into the test suite as dependencies of the current test. This is fine by itself; the issue is that the code waiting for a process to be stopped just loops forever if one of the dependencies has not been started, which happens for most of the dependencies.

As a result, we may end up with more than a thousand threads just sleeping (time.sleep(0.1)), until the situation stops being tractable for the underlying hardware.
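The thread build-up described above can be reproduced in isolation. The following is only a minimal sketch, not the test runner's actual code: `DependencyWatcher` and `leaked_threads` are hypothetical names, and the threads are marked as daemons purely so that the example itself can exit.

```python
import threading
import time


class DependencyWatcher:
    """Hypothetical stand-in for a per-dependency wait thread."""

    def __init__(self):
        self.process = None  # stays None when the app is never started
        self.done = False
        # daemon=True only so this demo can exit; it is an assumption of the sketch.
        self.thread = threading.Thread(target=self._wait_for_exit, daemon=True)
        self.thread.start()

    def _wait_for_exit(self):
        # The exit condition requires a started process that has terminated,
        # so a watcher whose process is None polls forever.
        while not self.done:
            if self.process is not None and self.process.poll() is not None:
                self.done = True
            time.sleep(0.1)


def leaked_threads(iterations, dependencies_per_test):
    """Create one watcher per dependency per iteration and count live threads."""
    before = threading.active_count()
    watchers = [
        DependencyWatcher() for _ in range(iterations * dependencies_per_test)
    ]
    return threading.active_count() - before, watchers
```

With a few hundred iterations and several never-started dependencies per test, the count returned by `leaked_threads` grows linearly, which is consistent with the thousand-plus sleeping threads mentioned above.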

I suspect this is the root cause of the intermittent failure that happens on Darwin for Test_RH_2_1, as Test_RH_2_1 is the test that is at the limit of what the CI hardware can support. Why does it happen more on Darwin than on Linux? Again, that may depend on the hardware running the test suite. And why does it happen more for darwin-framework-tool than for chip-tool itself? I suspect the tests running under darwin-framework-tool are more "chatty", which may result in some stdout buffers being completely filled while the test is running, so we are more likely to miss the "interesting" part.

So all of this is just theory at the moment; I guess CI runs will prove me right or wrong...
If my CI run is green for the Darwin bits on the first run, I will tag this one as a hot fix, as the current redness of the CI is a pain.

Change overview

  • Ensure that the Python threads for non-running processes that have been tentatively killed are stopped.
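One way to implement a change of this kind is to give the wait loop an explicit stop signal that is set when the dependency is killed, whether or not a process was ever started. The sketch below is an illustration under that assumption, not the code merged in this PR; `StoppableWatcher`, its `kill` method and the use of `threading.Event` are all hypothetical choices.

```python
import threading


class StoppableWatcher:
    """Hypothetical wait thread that can be released without a process."""

    def __init__(self):
        self.process = None  # stays None when the app is never started
        self._stop = threading.Event()
        self.thread = threading.Thread(target=self._wait_for_exit)
        self.thread.start()

    def _wait_for_exit(self):
        while not self._stop.is_set():
            if self.process is not None and self.process.poll() is not None:
                return
            # Event.wait returns early once the stop signal is set,
            # unlike a plain time.sleep(0.1).
            self._stop.wait(0.1)

    def kill(self):
        """Kill the process if there is one, then always release the thread."""
        if self.process is not None:
            self.process.kill()
        self._stop.set()
        self.thread.join(timeout=1.0)
        return not self.thread.is_alive()
```

Calling `kill()` on a watcher whose process was never started now ends its thread instead of leaving it polling, so the number of live threads no longer grows with the iteration count.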

@github-actions

github-actions bot commented Sep 19, 2022

PR #22715: Size comparison from 989ad8e to 72bc3ae

Increases (3 builds for bl602, cc13x2_26x2, psoc6)
platform     target        config               section       989ad8e   72bc3ae   change  % change
bl602        lighting-app  bl602+rpc            .text         1096262   1096264   2       0.0
cc13x2_26x2  lock-ftd      LP_CC2652R7          (read/write)  170600    170608    8       0.0
psoc6        lock          cy8ckit_062s2_43012  .debug_info   22395757  22395758  1       0.0

Decreases (3 builds for bl702, cc13x2_26x2, psoc6)
platform     target        config               section       989ad8e   72bc3ae   change  % change
bl702        lighting-app  bl702                .debug_info   37894568  37894567  -1      -0.0
bl702        lighting-app  bl702                .text         956776    956774    -2      -0.0
cc13x2_26x2  lock-ftd      LP_CC2652R7          (read only)   678103    678095    -8      -0.0
cc13x2_26x2  lock-ftd      LP_CC2652R7          .text         600336    600328    -8      -0.0
psoc6        light         cy8ckit_062s2_43012  .debug_info   22016437  22016436  -1      -0.0

Full report (37 builds for bl602, bl702, cc13x2_26x2, cyw30739, efr32, esp32, k32w, linux, mbed, nrfconnect, psoc6, qpg, telink)
platform target config section 989ad8e 72bc3ae change % change
bl602 lighting-app bl602 (read/write) 1383062 1383062 0 0.0
.bss 89337 89337 0 0.0
.data 9816 9816 0 0.0
.text 1064914 1064914 0 0.0
bl602+rpc (read/write) 1428274 1428274 0 0.0
.bss 96769 96769 0 0.0
.data 10200 10200 0 0.0
.text 1096262 1096264 2 0.0
bl702 lighting-app bl702 (read only) 3262 3262 0 0.0
(read/write) 1188011 1188011 0 0.0
.bleromro 6296 6296 0 0.0
.bleromrw 124 124 0 0.0
.boot2 688 688 0 0.0
.bss 66958 66958 0 0.0
.bss_psram 29696 29696 0 0.0
.comment 48 48 0 0.0
.data 4280 4280 0 0.0
.debug_abbrev 1506715 1506715 0 0.0
.debug_aranges 133088 133088 0 0.0
.debug_frame 486412 486412 0 0.0
.debug_info 37894568 37894567 -1 -0.0
.debug_line 5252100 5252100 0 0.0
.debug_loc 3361966 3361966 0 0.0
.debug_ranges 359712 359712 0 0.0
.debug_str 3455772 3455772 0 0.0
.hbn 509 509 0 0.0
.hbn_noinit 260 260 0 0.0
.init 342 342 0 0.0
.init_array 144 144 0 0.0
.psram 0 0 0 0.0
.riscv.attributes 47 47 0 0.0
.rodata 116488 116488 0 0.0
.rsvd 3188 3188 0 0.0
.shstrtab 293 293 0 0.0
.stack 2048 2048 0 0.0
.strtab 564895 564895 0 0.0
.symtab 171616 171616 0 0.0
.tcm_data 36 36 0 0.0
.tcmcode 3262 3262 0 0.0
.text 0 0 0 0.0
956776 956774 -2 -0.0
bl702+rpc (read only) 3262 3262 0 0.0
(read/write) 1283939 1283939 0 0.0
.bleromro 6296 6296 0 0.0
.bleromrw 124 124 0 0.0
.boot2 688 688 0 0.0
.bss 75006 75006 0 0.0
.bss_psram 29936 29936 0 0.0
.comment 48 48 0 0.0
.data 4800 4800 0 0.0
.debug_abbrev 1644294 1644294 0 0.0
.debug_aranges 140592 140592 0 0.0
.debug_frame 511788 511788 0 0.0
.debug_info 41801170 41801170 0 0.0
.debug_line 5626639 5626639 0 0.0
.debug_loc 3554673 3554673 0 0.0
.debug_ranges 382168 382168 0 0.0
.debug_str 3851739 3851739 0 0.0
.hbn 509 509 0 0.0
.hbn_noinit 260 260 0 0.0
.init 342 342 0 0.0
.init_array 160 160 0 0.0
.psram 0 0 0 0.0
.riscv.attributes 47 47 0 0.0
.rodata 129896 129896 0 0.0
.rsvd 3188 3188 0 0.0
.shstrtab 293 293 0 0.0
.stack 2048 2048 0 0.0
.strtab 624068 624068 0 0.0
.symtab 189424 189424 0 0.0
.tcm_data 36 36 0 0.0
.tcmcode 3262 3262 0 0.0
.text 0 0 0 0.0
1030476 1030476 0 0.0
cc13x2_26x2 all-clusters-app LP_CC2652R7 (read only) 676571 676571 0 0.0
(read/write) 174964 174964 0 0.0
.bss 81228 81228 0 0.0
.data 3380 3380 0 0.0
.rodata 89603 89603 0 0.0
.text 586656 586656 0 0.0
all-clusters-minimal-app LP_CC2652R7 (read only) 640811 640811 0 0.0
(read/write) 157996 157996 0 0.0
.bss 80500 80500 0 0.0
.data 3380 3380 0 0.0
.rodata 78739 78739 0 0.0
.text 561752 561752 0 0.0
lock-ftd LP_CC2652R7 (read only) 678103 678095 -8 -0.0
(read/write) 170600 170608 8 0.0
.bss 78484 78484 0 0.0
.data 3304 3304 0 0.0
.rodata 77287 77287 0 0.0
.text 600336 600328 -8 -0.0
lock-mtd LP_CC2652R7 (read only) 661923 661923 0 0.0
(read/write) 182468 182468 0 0.0
.bss 74172 74172 0 0.0
.data 3304 3304 0 0.0
.rodata 103123 103123 0 0.0
.text 558320 558320 0 0.0
pump-app LP_CC2652R7 (read only) 687259 687259 0 0.0
(read/write) 162148 162148 0 0.0
.bss 78420 78420 0 0.0
.data 3296 3296 0 0.0
.rodata 90507 90507 0 0.0
.text 596268 596268 0 0.0
pump-controller-app LP_CC2652R7 (read only) 671759 671759 0 0.0
(read/write) 177760 177760 0 0.0
.bss 78532 78532 0 0.0
.data 3292 3292 0 0.0
.rodata 86063 86063 0 0.0
.text 585216 585216 0 0.0
shell LP_CC2652R7 (read only) 667590 667590 0 0.0
(read/write) 186256 186256 0 0.0
.bss 83540 83540 0 0.0
.data 3376 3376 0 0.0
.rodata 86318 86318 0 0.0
.text 580956 580956 0 0.0
cyw30739 light cyw930739m2evb_01 (read/write) 587314 587314 0 0.0
.app_xip_area 463972 463972 0 0.0
.bss 65776 65776 0 0.0
.data 744 744 0 0.0
.rodata 0 0 0 0.0
.text 112 112 0 0.0
lock cyw930739m2evb_01 (read/write) 594370 594370 0 0.0
.app_xip_area 465700 465700 0 0.0
.bss 71096 71096 0 0.0
.data 752 752 0 0.0
.rodata 0 0 0 0.0
.text 112 112 0 0.0
ota-requestor-no-progress-logging cyw930739m2evb_01 (read/write) 543306 543306 0 0.0
.app_xip_area 424988 424988 0 0.0
.bss 60784 60784 0 0.0
.data 716 716 0 0.0
.rodata 0 0 0 0.0
.text 112 112 0 0.0
efr32 lighting-app BRD4161A (read/write) 1110288 1110288 0 0.0
.bss 136332 136332 0 0.0
.data 2072 2072 0 0.0
.text 971864 971864 0 0.0
BRD4161A+rpc (read/write) 973428 973428 0 0.0
.bss 150844 150844 0 0.0
.data 2252 2252 0 0.0
.text 820312 820312 0 0.0
BRD4161A+rs911x (read/write) 1003552 1003552 0 0.0
.bss 169168 169168 0 0.0
.data 2064 2064 0 0.0
.text 832300 832300 0 0.0
lock-app BRD4161A+wf200 (read/write) 1151292 1151292 0 0.0
.bss 152248 152248 0 0.0
.data 2072 2072 0 0.0
.text 996952 996952 0 0.0
window-app BRD4161A (read/write) 1102336 1102336 0 0.0
.bss 137780 137780 0 0.0
.data 2096 2096 0 0.0
.text 962440 962440 0 0.0
esp32 all-clusters-app c3devkit (read only) 1222910 1222910 0 0.0
(read/write) 1788046 1788046 0 0.0
.dram0.bss 76944 76944 0 0.0
.dram0.data 13840 13840 0 0.0
.flash.rodata 257616 257616 0 0.0
.flash.text 1222910 1222910 0 0.0
.iram0.text 65204 65204 0 0.0
m5stack (read only) 1232975 1232975 0 0.0
(read/write) 563940 563940 0 0.0
.dram0.bss 82304 82304 0 0.0
.dram0.data 34296 34296 0 0.0
.flash.rodata 314672 314672 0 0.0
.flash.text 1227591 1227591 0 0.0
.iram0.text 123939 123939 0 0.0
k32w light k32w0+release (read/write) 649868 649868 0 0.0
.bss 70712 70712 0 0.0
.data 2068 2068 0 0.0
.text 574360 574360 0 0.0
lock k32w0+release (read/write) 706824 706824 0 0.0
.bss 71160 71160 0 0.0
.data 2076 2076 0 0.0
.text 630860 630860 0 0.0
linux chip-tool-ipv6only arm64 (read only) 10353156 10353156 0 0.0
(read/write) 706241 706241 0 0.0
.bss 33937 33937 0 0.0
.data 2864 2864 0 0.0
.data.rel.ro 650560 650560 0 0.0
.dynamic 560 560 0 0.0
.got 13904 13904 0 0.0
.init 24 24 0 0.0
.init_array 200 200 0 0.0
.rodata 503860 503860 0 0.0
.text 8195348 8195348 0 0.0
thermostat-no-ble arm64 (read only) 2386908 2386908 0 0.0
(read/write) 143617 143617 0 0.0
.bss 55345 55345 0 0.0
.data 1912 1912 0 0.0
.data.rel.ro 77208 77208 0 0.0
.dynamic 560 560 0 0.0
.got 5184 5184 0 0.0
.init 24 24 0 0.0
.init_array 432 432 0 0.0
.rodata 143652 143652 0 0.0
.text 2000992 2000992 0 0.0
mbed lock-app CY8CPROTO_062_4343W+release (read only) 6224 6224 0 0.0
(read/write) 2455576 2455576 0 0.0
.bss 215044 215044 0 0.0
.data 5872 5872 0 0.0
.text 1418220 1418220 0 0.0
nrfconnect all-clusters-app nrf52840dk_nrf52840 (read/write) 1182035 1182035 0 0.0
bss 143633 143633 0 0.0
rodata 144196 144196 0 0.0
text 815256 815256 0 0.0
all-clusters-minimal-app nrf52840dk_nrf52840 (read/write) 1160687 1160687 0 0.0
bss 142860 142860 0 0.0
rodata 135768 135768 0 0.0
text 803124 803124 0 0.0
psoc6 all-clusters cy8ckit_062s2_43012 (read only) 841968 841968 0 0.0
(read/write) 1743884 1743884 0 0.0
.ARM.attributes 46 46 0 0.0
.ARM.exidx 8 8 0 0.0
.bss 188712 188712 0 0.0
.comment 204 204 0 0.0
.copy.table 24 24 0 0.0
.cy_m0p_image 6216 6216 0 0.0
.cy_sharedmem 8 8 0 0.0
.data 2664 2664 0 0.0
.debug_abbrev 1229301 1229301 0 0.0
.debug_aranges 111800 111800 0 0.0
.debug_frame 373268 373268 0 0.0
.debug_info 26815460 26815460 0 0.0
.debug_line 3667850 3667850 0 0.0
.debug_loc 3580116 3580116 0 0.0
.debug_ranges 339904 339904 0 0.0
.debug_str 3439416 3439416 0 0.0
.heap 841968 841968 0 0.0
.noinit 148 148 0 0.0
.ramVectors 736 736 0 0.0
.shstrtab 288 288 0 0.0
.stab 156 156 0 0.0
.stabstr 335 335 0 0.0
.stack_dummy 4096 4096 0 0.0
.strtab 569356 569356 0 0.0
.symtab 421168 421168 0 0.0
.text 1544120 1544120 0 0.0
.zero.table 8 8 0 0.0
text 0 0 0 0.0
all-clusters-minimal cy8ckit_062s2_43012 (read only) 842704 842704 0 0.0
(read/write) 1686492 1686492 0 0.0
.ARM.attributes 46 46 0 0.0
.ARM.exidx 8 8 0 0.0
.bss 187976 187976 0 0.0
.comment 204 204 0 0.0
.copy.table 24 24 0 0.0
.cy_m0p_image 6216 6216 0 0.0
.cy_sharedmem 8 8 0 0.0
.data 2664 2664 0 0.0
.debug_abbrev 1221100 1221100 0 0.0
.debug_aranges 111272 111272 0 0.0
.debug_frame 376348 376348 0 0.0
.debug_info 26552241 26552241 0 0.0
.debug_line 3688566 3688566 0 0.0
.debug_loc 3567753 3567753 0 0.0
.debug_ranges 338520 338520 0 0.0
.debug_str 3428429 3428429 0 0.0
.heap 842704 842704 0 0.0
.noinit 148 148 0 0.0
.ramVectors 736 736 0 0.0
.shstrtab 288 288 0 0.0
.stab 156 156 0 0.0
.stabstr 335 335 0 0.0
.stack_dummy 4096 4096 0 0.0
.strtab 533445 533445 0 0.0
.symtab 407600 407600 0 0.0
.text 1487464 1487464 0 0.0
.zero.table 0 0 0 0.0
8 8 0 0.0
light cy8ckit_062s2_43012 (read only) 850896 850896 0 0.0
(read/write) 1605044 1605044 0 0.0
.ARM.attributes 46 46 0 0.0
.ARM.exidx 8 8 0 0.0
.bss 179992 179992 0 0.0
.comment 204 204 0 0.0
.copy.table 24 24 0 0.0
.cy_m0p_image 6216 6216 0 0.0
.cy_sharedmem 8 8 0 0.0
.data 2456 2456 0 0.0
.debug_abbrev 1055156 1055156 0 0.0
.debug_aranges 103480 103480 0 0.0
.debug_frame 346676 346676 0 0.0
.debug_info 22016437 22016436 -1 -0.0
.debug_line 3258486 3258486 0 0.0
.debug_loc 3265860 3265860 0 0.0
.debug_ranges 303848 303848 0 0.0
.debug_str 3233961 3233961 0 0.0
.heap 850896 850896 0 0.0
.noinit 148 148 0 0.0
.ramVectors 736 736 0 0.0
.shstrtab 288 288 0 0.0
.stab 156 156 0 0.0
.stabstr 335 335 0 0.0
.stack_dummy 4096 4096 0 0.0
.strtab 469822 469822 0 0.0
.symtab 376048 376048 0 0.0
.text 1414208 1414208 0 0.0
.zero.table 0 0 0 0.0
8 8 0 0.0
lock cy8ckit_062s2_43012 (read only) 845864 845864 0 0.0
(read/write) 1642668 1642668 0 0.0
.ARM.attributes 46 46 0 0.0
.ARM.exidx 8 8 0 0.0
.bss 185008 185008 0 0.0
.comment 204 204 0 0.0
.copy.table 24 24 0 0.0
.cy_m0p_image 6216 6216 0 0.0
.cy_sharedmem 8 8 0 0.0
.data 2472 2472 0 0.0
.debug_abbrev 1062575 1062575 0 0.0
.debug_aranges 104152 104152 0 0.0
.debug_frame 349500 349500 0 0.0
.debug_info 22395757 22395758 1 0.0
.debug_line 3267178 3267178 0 0.0
.debug_loc 3305688 3305688 0 0.0
.debug_ranges 307192 307192 0 0.0
.debug_str 3261416 3261416 0 0.0
.heap 845864 845864 0 0.0
.noinit 148 148 0 0.0
.ramVectors 736 736 0 0.0
.shstrtab 288 288 0 0.0
.stab 156 156 0 0.0
.stabstr 335 335 0 0.0
.stack_dummy 4096 4096 0 0.0
.strtab 476025 476025 0 0.0
.symtab 379232 379232 0 0.0
.text 1446800 1446800 0 0.0
.zero.table 0 0 0 0.0
8 8 0 0.0
qpg lighting-app qpg6105+debug (read/write) 1130780 1130780 0 0.0
.bss 106112 106112 0 0.0
.data 1028 1028 0 0.0
.text 577876 577876 0 0.0
lock-app qpg6105+debug (read/write) 1101760 1101760 0 0.0
.bss 102344 102344 0 0.0
.data 1032 1032 0 0.0
.text 548860 548860 0 0.0
telink light-switch-app tlsr9518adk80d (read/write) 813660 813660 0 0.0
bss 71372 71372 0 0.0
noinit 43488 43488 0 0.0
text 574558 574558 0 0.0
lighting-app tlsr9518adk80d (read/write) 835616 835616 0 0.0
bss 72228 72228 0 0.0
noinit 43488 43488 0 0.0
text 592718 592718 0 0.0
ota-requestor-app tlsr9518adk80d (read/write) 843716 843716 0 0.0
bss 73136 73136 0 0.0
noinit 43488 43488 0 0.0
text 598960 598960 0 0.0

@vivien-apple
Contributor Author

I have relaunched Build Darwin - Build iOS Debug once, as it was timing out. This is completely unrelated to these changes, since they alter the Python test runner, which is definitely not used in Build Darwin - Build iOS Debug.

The tests of interest here are:
Darwin Tests / Test Suites - Darwin (no-ble-asan-clang)
Darwin Tests / Test Suites - Darwin (no-ble-tsan-clang)

@vivien-apple
Contributor Author

Those tests are green on first try:
Darwin Tests / Test Suites - Darwin (no-ble-asan-clang) (chip-tool)
Darwin Tests / Test Suites - Darwin (no-ble-asan-clang) (darwin-framework-tool)
Darwin Tests / Test Suites - Darwin (no-ble-tsan-clang)

I will relaunch them to see if they stay green on a second attempt.

@vivien-apple
Contributor Author

Again, the Darwin / Build Darwin error is unrelated, and the interesting tests (Darwin Tests / Test Suites - Darwin *) are green. I will re-run them a third time, and in parallel I will investigate the Darwin / Build Darwin failure.

Attached is the log of the Darwin / Build Darwin failure.

2022-09-19T11:29:48.3082520Z Test Case '-[MTRXPCProtocolTests testReadClusterStateCacheSuccess]' started.
2022-09-19T11:29:48.3083000Z 2022-09-19 11:29:48.304452+0000 xctest[5116:477757] XPC listener accepting connection
2022-09-19T11:29:48.3083430Z 2022-09-19 11:29:48.304936+0000 xctest[5116:477602] Subscribe attribute cache called
2022-09-19T11:29:48.3083900Z 2022-09-19 11:29:48.305292+0000 xctest[5116:477602] Attribute cache subscription succeeded for device 9876543210
2022-09-19T11:29:48.3084380Z 2022-09-19 11:29:48.305423+0000 xctest[5116:477602] Subscribe completion called with error: (null)
2022-09-19T11:29:48.3084820Z 2022-09-19 11:29:48.305529+0000 xctest[5116:477755] XPC connection disconnected
2022-09-19T11:29:48.3085240Z 2022-09-19 11:29:48.305837+0000 xctest[5116:477757] XPC listener accepting connection
2022-09-19T11:29:53.1864770Z /Users/runner/work/connectedhomeip/connectedhomeip/src/darwin/Framework/CHIPTests/MTRXPCProtocolTests.m:2290: error: -[MTRXPCProtocolTests testReadClusterStateCacheSuccess] : *** -[__NSPlaceholderArray initWithObjects:count:]: attempt to insert nil object from objects[2] (NSInvalidArgumentException)
2022-09-19T11:29:53.1957260Z Test Case '-[MTRXPCProtocolTests testReadClusterStateCacheSuccess]' failed (4.884 seconds).

@vivien-apple
Contributor Author

Third time in a row that Darwin Tests / Test Suites - Darwin * are green. Let's see what happens on ToT.
