Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] thread network provisioning crash in cirque test #28352

Open
yunhanw-google opened this issue Jul 28, 2023 · 1 comment
Open

[BUG] thread network provisioning crash in cirque test #28352

yunhanw-google opened this issue Jul 28, 2023 · 1 comment
Labels
bug Something isn't working linux needs triage

Comments

@yunhanw-google
Copy link
Contributor

Reproduction steps

Recently we disable the cirque test when upgrading docker image location in #27879, cirque test is used to gate the quality for network provisioning cluster, when re-enabling, we observe thread border router crash in network scanning.

Controller side
DEBUG:chip.EM:Flushed pending ack for MessageCounter:97987663 on exchange 34853i
INFO:NetworkCommissioning:Received response: NetworkCommissioning.Commands.ScanNetworksResponse(networkingStatus=<NetworkCommissioningStatusEnum.kUnknownError: 12>, debugText=None, wiFiScanResults=None, threadScanResults=[])
2023-07-28 03:49:57 5edf5df264cd PythonMatterControllerTEST[4359] CRITICAL Testfail: Failed to finish network commissioning
CRITICAL:PythonMatterControllerTEST:Testfail: Failed to finish network commissioning.

device side:

otbr-agent[55]: 00:03:19.760 [I] Mac-----------: Sending Beacon Request

otbr-agent[55]: 00:03:20.062 [I] Mac-----------: Sending Beacon Request

[1690516197.466393][60:64] CHIP:DL: Failed to perform finish Thread network scan: Method “io.openthread.BorderRouter.Scan” returned type “(a(tstayqqynyybb))”, but expected “(a(tstayqqyyyybb))”
[1690516197.466535][60:60] CHIP:DMG: Command handler moving to [ Preparing]
[1690516197.466618][60:60] CHIP:DMG: Command handler moving to [AddingComm]
[1690516197.466665][60:60] CHIP:DMG: Command handler moving to [AddedComma]
[1690516197.466674][60:60] CHIP:DMG: Decreasing reference count for CommandHandler, remaining 0
[1690516197.466752][60:60] CHIP:EM: <<< [E:34853r S:45418 M:97987663 (Ack:221481980)] (S) Msg TX to 1:000000000001B669 [18AD] --- Type 0001:09 (IM:InvokeCommandResponse)
[1690516197.466788][60:60] CHIP:IN: (S) Sending msg 97987663 on secure session with LSID: 45418
[1690516197.466865][60:60] CHIP:DMG: Command handler moving to [CommandSen]
[1690516197.466879][60:60] CHIP:DMG: Command handler moving to [AwaitingDe]

Thread 2 "gmain-matter" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff71dd700 (LWP 64)]
0x00007ffff79813f0 in g_bit_lock () from /lib/x86_64-linux-gnu/libglib-2.0.so.0

Thread 5 (Thread 0x7fffe7fff700 (LWP 67)):
#0 0x00007ffff7fb1b34 in open64 () at /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00005555555a777e in NamedPipeCommands::EventCommandListenerTask(void*) (arg=0x555555897260 <(anonymous namespace)::sChipNamedPipeCommands>) at ../../examples/platform/linux/NamedPipeCommands.cpp:75
#2 0x00007ffff7fa6609 in start_thread () at /lib/x86_64-linux-gnu/libpthread.so.0
#3 0x00007ffff7557133 in clone () at /lib/x86_64-linux-gnu/libc.so.6

Thread 4 (Thread 0x7ffff51db700 (LWP 66)):
#0 0x00007ffff754a99f in poll () at /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007ffff79af36e in () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#2 0x00007ffff79af6f3 in g_main_loop_run () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#3 0x00007ffff7c05f8a in () at /lib/x86_64-linux-gnu/libgio-2.0.so.0
#4 0x00007ffff79d8ae1 in () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#5 0x00007ffff7fa6609 in start_thread () at /lib/x86_64-linux-gnu/libpthread.so.0
#6 0x00007ffff7557133 in clone () at /lib/x86_64-linux-gnu/libc.so.6

Thread 3 (Thread 0x7ffff61dc700 (LWP 65)):
#0 0x00007ffff754a99f in poll () at /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007ffff79af36e in () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#2 0x00007ffff79af4a3 in g_main_context_iteration () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#3 0x00007ffff79af4f1 in () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#4 0x00007ffff79d8ae1 in () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#5 0x00007ffff7fa6609 in start_thread () at /lib/x86_64-linux-gnu/libpthread.so.0
#6 0x00007ffff7557133 in clone () at /lib/x86_64-linux-gnu/libc.so.6

Thread 2 (Thread 0x7ffff71dd700 (LWP 64)):
#0 0x00007ffff79813f0 in g_bit_lock () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#1 0x00007ffff79ef2dd in g_variant_n_children () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#2 0x000055555577a01d in chip::DeviceLayer::ThreadStackManagerImpl::_OnNetworkScanFinished(_GAsyncResult*) (this=0x55555589e660 chip::DeviceLayer::ThreadStackManagerImpl::sInstance, res=0x7ffff0022830) at ../../src/platform/Linux/ThreadStackManagerImpl.cpp:672
#3 0x0000555555779c45 in chip::DeviceLayer::ThreadStackManagerImpl::_OnNetworkScanFinished(_GObject*, _GAsyncResult*, void*) (source_object=0x7ffff000e8b0, res=0x7ffff0022830, user_data=0x55555589e660 chip::DeviceLayer::ThreadStackManagerImpl::sInstance) at ../../src/platform/Linux/ThreadStackManagerImpl.cpp:644
#4 0x00007ffff7b9bf49 in () at /lib/x86_64-linux-gnu/libgio-2.0.so.0
#5 0x00007ffff7b9cb3d in () at /lib/x86_64-linux-gnu/libgio-2.0.so.0
#6 0x00007ffff7c0239d in () at /lib/x86_64-linux-gnu/libgio-2.0.so.0
#7 0x00007ffff7b9bf49 in () at /lib/x86_64-linux-gnu/libgio-2.0.so.0
#8 0x00007ffff7b9cb3d in () at /lib/x86_64-linux-gnu/libgio-2.0.so.0
#9 0x00007ffff7bf73a4 in () at /lib/x86_64-linux-gnu/libgio-2.0.so.0
#10 0x00007ffff7b9bf49 in () at /lib/x86_64-linux-gnu/libgio-2.0.so.0
#11 0x00007ffff7b9bf8d in () at /lib/x86_64-linux-gnu/libgio-2.0.so.0
#12 0x00007ffff79af04e in g_main_context_dispatch () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#13 0x00007ffff79af400 in () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#14 0x00007ffff79af6f3 in g_main_loop_run () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#15 0x0000555555769e16 in chip::DeviceLayer::(anonymous namespace)::GLibMainLoopThread(void*) (userData=0x5555558c2de0) at ../../src/platform/Linux/PlatformManagerImpl.cpp:65
#16 0x00007ffff79d8ae1 in () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#17 0x00007ffff7fa6609 in start_thread () at /lib/x86_64-linux-gnu/libpthread.so.0
#18 0x00007ffff7557133 in clone () at /lib/x86_64-linux-gnu/libc.so.6

Bug prevalence

always

GitHub hash of the SDK that was being used

none

Platform

other

Platform Version(s)

No response

Anything else?

No response

@JuliaPageMorseMicro
Copy link

I saw a crash in a similar location due to read() returning zero into readBytes and so readBuf[-1] being written. Ensure nothing is done if readBytes is 0.

$ git diff examples/platform/linux/NamedPipeCommands.cpp
diff --git a/examples/platform/linux/NamedPipeCommands.cpp b/examples/platform/linux/NamedPipeCommands.cpp
index 9ec28b0121..ac2bf31aa8 100644
--- a/examples/platform/linux/NamedPipeCommands.cpp
+++ b/examples/platform/linux/NamedPipeCommands.cpp
@@ -80,11 +80,14 @@ void * NamedPipeCommands::EventCommandListenerTask(void * arg)
         }
 
         ssize_t readBytes      = read(fd, readbuf, kChipEventCmdBufSize);
-        readbuf[readBytes - 1] = '\0';
-        ChipLogProgress(NotSpecified, "Received payload: \"%s\"", readbuf);
+        if (readBytes > 0)
+        {
+            readbuf[readBytes - 1] = '\0';
+            ChipLogProgress(NotSpecified, "Received payload: \"%s\"", readbuf);
 
-        // Process the received command request from event fifo
-        self->mDelegate->OnEventCommandReceived(readbuf);
+            // Process the received command request from event fifo
+            self->mDelegate->OnEventCommandReceived(readbuf);
+        }
 
         close(fd);
     }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working linux needs triage
Projects
None yet
Development

No branches or pull requests

2 participants