M5 multi fabric test, random crash seen #21684

Closed
vinay-apple opened this issue Aug 5, 2022 · 5 comments · Fixed by #21759

Comments

@vinay-apple

M5 multi fabric test, random crash seen

SHA: 63ad9f7

Steps:

  1. Pair M5 in Chai test app and also add it to Apple Home app.
  2. While running some toggle tests, saw a random crash on the M5.

Guru Meditation Error: Core 0 panic'ed (LoadProhibited). Exception was unhandled.

Core 0 register dump:
PC : 0x400fe779 PS : 0x00060c30 A0 : 0x800fe8ac A1 : 0x3ffe9030
0x400fe779: chip::DeviceLayer::DeviceInfoProviderImpl::GetUserLabelLength(unsigned short, unsigned int&) at /Users/vganji/MATTER-iOS/connectedhomeip/examples/all-clusters-app/esp32/build/../third_party/connectedhomeip/examples/providers/DeviceInfoProviderImpl.cpp:132

A2 : 0x00000000 A3 : 0x00000000 A4 : 0x3ffe9080 A5 : 0x00000004
A6 : 0x3ffe9060 A7 : 0x3ffe8f60 A8 : 0x800fe76f A9 : 0x3ffe9020
A10 : 0x3ffe9030 A11 : 0x00000000 A12 : 0x00000028 A13 : 0x3ffe9058
A14 : 0x3ffe8f20 A15 : 0x00000002 SAR : 0x00000016 EXCCAUSE: 0x0000001c
EXCVADDR: 0x00000000 LBEG : 0x4000c46c LEND : 0x4000c477 LCOUNT : 0x00000000

Backtrace:0x400fe776:0x3ffe9030 0x400fe8a9:0x3ffe9080 0x400fe8cd:0x3ffe90b0 0x400e0420:0x3ffe90d0 0x400db73a:0x3ffe9130 0x400dcfa5:0x3ffe91e0 0x400dd0f4:0x3ffe9200 0x400dd356:0x3ffe92f0 0x400dd535:0x3ffe9420 0x400dd598:0x3ffe9460 0x4013b939:0x3ffe9480 0x4013b949:0x3ffe94a0 0x401407d5:0x3ffe94c0 0x40140aa6:0x3ffe94e0 0x40140acd:0x3ffe9550 0x40095631:0x3ffe9570
0x400fe776: chip::DeviceLayer::DeviceInfoProviderImpl::GetUserLabelLength(unsigned short, unsigned int&) at /Users/vganji/MATTER-iOS/connectedhomeip/examples/all-clusters-app/esp32/build/../third_party/connectedhomeip/examples/providers/DeviceInfoProviderImpl.cpp:128

0x400fe8a9: chip::DeviceLayer::DeviceInfoProviderImpl::UserLabelIteratorImpl::UserLabelIteratorImpl(chip::DeviceLayer::DeviceInfoProviderImpl&, unsigned short) at /Users/vganji/MATTER-iOS/connectedhomeip/examples/all-clusters-app/esp32/build/../third_party/connectedhomeip/examples/providers/DeviceInfoProviderImpl.cpp:169

0x400fe8cd: chip::DeviceLayer::DeviceInfoProviderImpl::UserLabelIteratorImpl* chip::Platform::New<chip::DeviceLayer::DeviceInfoProviderImpl::UserLabelIteratorImpl, chip::DeviceLayer::DeviceInfoProviderImpl&, unsigned short&>(chip::DeviceLayer::DeviceInfoProviderImpl&, unsigned short&) at /Users/vganji/MATTER-iOS/connectedhomeip/src/lib/support/CHIPMem.h:148
(inlined by) chip::DeviceLayer::DeviceInfoProviderImpl::IterateUserLabel(unsigned short) at /Users/vganji/MATTER-iOS/connectedhomeip/examples/all-clusters-app/esp32/build/../third_party/connectedhomeip/examples/providers/DeviceInfoProviderImpl.cpp:161

0x400e0420: (anonymous namespace)::UserLabelAttrAccess::Read(chip::app::ConcreteReadAttributePath const&, chip::app::AttributeValueEncoder&) at /Users/vganji/MATTER-iOS/connectedhomeip/examples/all-clusters-app/esp32/build/../third_party/connectedhomeip/src/app/clusters/user-label-server/user-label-server.cpp:91
(inlined by) Read at /Users/vganji/MATTER-iOS/connectedhomeip/examples/all-clusters-app/esp32/build/../third_party/connectedhomeip/src/app/clusters/user-label-server/user-label-server.cpp:166
(inlined by) Read at /Users/vganji/MATTER-iOS/connectedhomeip/examples/all-clusters-app/esp32/build/../third_party/connectedhomeip/src/app/clusters/user-label-server/user-label-server.cpp:159

0x400db73a: chip::app::ReadSingleClusterData(chip::Access::SubjectDescriptor const&, bool, chip::app::ConcreteReadAttributePath const&, chip::app::AttributeReportIBs::Builder&, chip::app::AttributeValueEncoder::AttributeEncodeState*) at /Users/vganji/MATTER-iOS/connectedhomeip/examples/all-clusters-app/esp32/build/../third_party/connectedhomeip/src/app/util/ember-compatibility-functions.cpp:476
(inlined by) chip::app::ReadSingleClusterData(chip::Access::SubjectDescriptor const&, bool, chip::app::ConcreteReadAttributePath const&, chip::app::AttributeReportIBs::Builder&, chip::app::AttributeValueEncoder::AttributeEncodeState*) at /Users/vganji/MATTER-iOS/connectedhomeip/examples/all-clusters-app/esp32/build/../third_party/connectedhomeip/src/app/util/ember-compatibility-functions.cpp:579

0x400dcfa5: chip::app::reporting::Engine::RetrieveClusterData(chip::Access::SubjectDescriptor const&, bool, chip::app::AttributeReportIBs::Builder&, chip::app::ConcreteReadAttributePath const&, chip::app::AttributeValueEncoder::AttributeEncodeState*) at /Users/vganji/MATTER-iOS/connectedhomeip/examples/all-clusters-app/esp32/build/../third_party/connectedhomeip/src/app/reporting/Engine.cpp:82

0x400dd0f4: chip::app::reporting::Engine::BuildSingleReportDataAttributeReportIBs(chip::app::ReportDataMessage::Builder&, chip::app::ReadHandler*, bool*, bool*) at /Users/vganji/MATTER-iOS/connectedhomeip/examples/all-clusters-app/esp32/build/../third_party/connectedhomeip/src/app/reporting/Engine.cpp:181

0x400dd356: chip::app::reporting::Engine::BuildAndSendSingleReportData(chip::app::ReadHandler*) at /Users/vganji/MATTER-iOS/connectedhomeip/examples/all-clusters-app/esp32/build/../third_party/connectedhomeip/src/app/reporting/Engine.cpp:510

0x400dd535: chip::app::reporting::Engine::Run() at /Users/vganji/MATTER-iOS/connectedhomeip/examples/all-clusters-app/esp32/build/../third_party/connectedhomeip/src/app/reporting/Engine.cpp:629

0x400dd598: chip::app::reporting::Engine::Run(chip::System::Layer*, void*) at /Users/vganji/MATTER-iOS/connectedhomeip/examples/all-clusters-app/esp32/build/../third_party/connectedhomeip/src/app/reporting/Engine.cpp:582

0x4013b939: chip::System::TimerData::Callback::Invoke() const at /Users/vganji/MATTER-iOS/connectedhomeip/examples/all-clusters-app/esp32/build/esp-idf/chip/../../../../../../config/esp32/third_party/connectedhomeip/src/system/SystemTimer.h:61
(inlined by) chip::System::TimerPool<chip::System::TimerList::Node>::Invoke(chip::System::TimerList::Node*) at /Users/vganji/MATTER-iOS/connectedhomeip/examples/all-clusters-app/esp32/build/esp-idf/chip/../../../../../../config/esp32/third_party/connectedhomeip/src/system/SystemTimer.h:224

0x4013b949: chip::LambdaBridge::Initialize<chip::System::LayerImplFreeRTOS::ScheduleWork(void (*)(chip::System::Layer*, void*), void*)::{lambda()#1}>(chip::System::LayerImplFreeRTOS::ScheduleWork(void (*)(chip::System::Layer*, void*), void*)::{lambda()#1} const&)::{lambda(std::aligned_storage<16u, 4u>::type const&)#1}::_FUN(std::aligned_storage<16u, 4u>::type const) at /Users/vganji/MATTER-iOS/connectedhomeip/examples/all-clusters-app/esp32/build/esp-idf/chip/../../../../../../config/esp32/third_party/connectedhomeip/src/system/SystemLayerImplFreeRTOS.cpp:94
(inlined by) operator() at /Users/vganji/MATTER-iOS/connectedhomeip/examples/all-clusters-app/esp32/build/esp-idf/chip/../../../../../../config/esp32/third_party/connectedhomeip/src/lib/support/LambdaBridge.h:39
(inlined by) _FUN at /Users/vganji/MATTER-iOS/connectedhomeip/examples/all-clusters-app/esp32/build/esp-idf/chip/../../../../../../config/esp32/third_party/connectedhomeip/src/lib/support/LambdaBridge.h:39

0x401407d5: chip::DeviceLayer::Internal::GenericPlatformManagerImpl<chip::DeviceLayer::PlatformManagerImpl>::_DispatchEvent(chip::DeviceLayer::ChipDeviceEvent const*) at /Users/vganji/MATTER-iOS/connectedhomeip/examples/all-clusters-app/esp32/build/esp-idf/chip/../../../../../../config/esp32/third_party/connectedhomeip/src/include/platform/internal/GenericPlatformManagerImpl.ipp:252

0x40140aa6: chip::DeviceLayer::PlatformManager::DispatchEvent(chip::DeviceLayer::ChipDeviceEvent const*) at /Users/vganji/MATTER-iOS/connectedhomeip/examples/all-clusters-app/esp32/build/esp-idf/chip/../../../../../../config/esp32/third_party/connectedhomeip/src/include/platform/PlatformManager.h:437
(inlined by) chip::DeviceLayer::Internal::GenericPlatformManagerImpl_FreeRTOS<chip::DeviceLayer::PlatformManagerImpl>::_RunEventLoop() at /Users/vganji/MATTER-iOS/connectedhomeip/examples/all-clusters-app/esp32/build/esp-idf/chip/../../../../../../config/esp32/third_party/connectedhomeip/src/include/platform/internal/GenericPlatformManagerImpl_FreeRTOS.ipp:213

0x40140acd: chip::DeviceLayer::PlatformManager::RunEventLoop() at /Users/vganji/MATTER-iOS/connectedhomeip/examples/all-clusters-app/esp32/build/esp-idf/chip/../../../../../../config/esp32/third_party/connectedhomeip/src/include/platform/PlatformManager.h:362
(inlined by) chip::DeviceLayer::Internal::GenericPlatformManagerImpl_FreeRTOS<chip::DeviceLayer::PlatformManagerImpl>::EventLoopTaskMain(void*) at /Users/vganji/MATTER-iOS/connectedhomeip/examples/all-clusters-app/esp32/build/esp-idf/chip/../../../../../../config/esp32/third_party/connectedhomeip/src/include/platform/internal/GenericPlatformManagerImpl_FreeRTOS.ipp:238

0x40095631: vPortTaskWrapper at /Users/vganji/tools/esp-idf/components/freertos/port/xtensa/port.c:131
63ad9f780899b41db2544f44b34ced525c54384c-M5-crashed-Mutli-fabric.txt

@bzbarsky-apple
Contributor

0x400fe779: chip::DeviceLayer::DeviceInfoProviderImpl::GetUserLabelLength(unsigned short, unsigned int&) at /Users/vganji/MATTER-iOS/connectedhomeip/examples/all-clusters-app/esp32/build/../third_party/connectedhomeip/examples/providers/DeviceInfoProviderImpl.cpp:132

That's:

    return mStorage->SyncGetKeyValue(keyAlloc.UserLabelLengthKey(endpoint), &val, len);

I tried doing chip-tool userlabel read label-list 1 1 against the esp32 all-clusters app a few times, and once I got this crash. At that point mStorage was null...
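
For illustration only, here is a minimal sketch of that failure mode and of a defensive guard. It is not the Matter code: `Status`, `FakeStorage`, `FakeProvider`, and the `label-len/` key format are hypothetical stand-ins, assumed just to keep the example self-contained. The guard turns the null dereference into an error return; it would not address why the pointer is null in the first place.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-ins for illustration only; these are not Matter types.
enum class Status { kOk, kNoStorage, kNotFound };

class FakeStorage {
public:
    void Set(const std::string& key, std::size_t value) { mValues[key] = value; }

    // Returns true and fills `value` if the key exists.
    bool Get(const std::string& key, std::size_t& value) const {
        auto it = mValues.find(key);
        if (it == mValues.end()) return false;
        value = it->second;
        return true;
    }

private:
    std::map<std::string, std::size_t> mValues;
};

class FakeProvider {
public:
    void SetStorageDelegate(FakeStorage* storage) { mStorage = storage; }

    // Guarded lookup: reports an error instead of dereferencing a null pointer.
    Status GetUserLabelLength(std::uint16_t endpoint, std::size_t& length) const {
        if (mStorage == nullptr) return Status::kNoStorage;
        // Made-up key format, assumed only for this sketch.
        const std::string key = "label-len/" + std::to_string(endpoint);
        return mStorage->Get(key, length) ? Status::kOk : Status::kNotFound;
    }

private:
    FakeStorage* mStorage = nullptr; // stays null if SetStorageDelegate is never called
};
```

With no storage attached, `GetUserLabelLength` returns `Status::kNoStorage`; once `SetStorageDelegate` has been called, it returns whatever the storage holds for that endpoint.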

@bzbarsky-apple
Contributor

This was broken by #21109.

After that PR we have the following things going on in the esp-32 all-clusters-app, in order:

  1. app_main calls deviceMgr.Init(&EchoCallbacks);. This starts the Matter event loop.
  2. StartAppTask(). This starts the background thread running AppTaskMain.
  3. app_main schedules InitServer to run on the Matter thread.

Now the Matter event loop and AppTaskMain are racing. If the Matter event loop wins the race, it will do Server::Init, which will do the setting up of the storage delegate on whatever the value of the device info provider is at that point (which is null, so nothing will happen). Then AppTaskMain will call AppTask::Init, which calls chip::DeviceLayer::SetDeviceInfoProvider(&gExampleDeviceInfoProvider);

Unlike other platforms, there is no direct set of the storage delegate on gExampleDeviceInfoProvider. And in any case, doing that set racily with the server init on the other thread is just broken....
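
The two possible outcomes of that race can be replayed deterministically. The sketch below is a simplified model under stated assumptions, not the real classes: `Storage`, `Provider`, `App`, and `RunOrder` are hypothetical names, `ServerInit` stands in for Server::Init, and `AppTaskInit` stands in for AppTask::Init. No real threads are used; each call to `RunOrder` simply plays one interleaving.

```cpp
// Hypothetical stand-ins; not the real Matter classes.
struct Storage {};

struct Provider {
    Storage* storage = nullptr;
};

struct App {
    Provider* registeredProvider = nullptr; // models the global device info provider
    Storage serverStorage;
    Provider exampleProvider;

    // Models Server::Init: wires storage into whichever provider is registered right now.
    void ServerInit() {
        if (registeredProvider != nullptr) {
            registeredProvider->storage = &serverStorage;
        }
    }

    // Models AppTask::Init: registers the provider but never touches its storage.
    void AppTaskInit() { registeredProvider = &exampleProvider; }

    bool ProviderHasStorage() const {
        return registeredProvider != nullptr && registeredProvider->storage != nullptr;
    }
};

// Replays one interleaving and reports whether the provider ended up with storage.
bool RunOrder(bool serverInitFirst) {
    App app;
    if (serverInitFirst) {
        app.ServerInit();  // provider is still null, so nothing is wired
        app.AppTaskInit(); // provider registered too late
    } else {
        app.AppTaskInit();
        app.ServerInit();  // provider is present, storage gets wired
    }
    return app.ProviderHasStorage();
}
```

In this model `RunOrder(true)` leaves the provider without storage, which corresponds to the null mStorage seen in the crash, while `RunOrder(false)` wires it up.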

@yufengwangca Please fix? I expect the other esp32 example apps have the same problem.

@yufengwangca
Contributor

We need to make sure that AppTaskMain -> AppTask::Init -> chip::DeviceLayer::SetDeviceInfoProvider(&gExampleDeviceInfoProvider) runs before Server::Init runs in the Matter event loop.

@shubhamdp
Contributor

@yufengwangca could 0855da3 be a probable fix for this?

@yufengwangca
Contributor

> @yufengwangca can 0855da3 be the probable fix for this?

I don't think calling SetDeviceInfoProvider before starting the app main thread would completely fix this issue, since the chip main thread has already started by then.

The root cause of this issue is that there is no guarantee SetDeviceInfoProvider (in the app main thread) is called before Server::Init (in the chip main thread), since they run in different contexts. We need to set the DeviceInfoProvider before we start the chip main thread. Please see my fix #21759.
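
As a rough sketch of the ordering described here (not the contents of #21759): register the provider first, and only then start the thread that runs server init. All names below are hypothetical stand-ins, and `std::thread` is assumed in place of the FreeRTOS task. In standard C++, completion of the `std::thread` constructor synchronizes with the start of the thread function, so the new thread is guaranteed to see the pointer that was written beforehand.

```cpp
#include <thread>

// Hypothetical stand-ins; not the real Matter classes.
struct Storage {};

struct Provider {
    Storage* storage = nullptr;
};

Provider* gProvider = nullptr; // models the global device info provider

void SetProvider(Provider* provider) { gProvider = provider; }

// Models Server::Init: wires storage into the registered provider, if any.
void ServerInit(Storage& storage) {
    if (gProvider != nullptr) {
        gProvider->storage = &storage;
    }
}

// Startup in the fixed order: register the provider, then start the event loop thread.
bool StartupProviderFirst() {
    Provider provider;
    Storage storage;

    SetProvider(&provider); // 1. done while no other thread exists yet

    std::thread eventLoop([&storage] { // 2. thread creation orders step 1 before ServerInit
        ServerInit(storage);
    });
    eventLoop.join(); // joining makes the thread's writes visible here

    const bool wired = (provider.storage == &storage);
    SetProvider(nullptr); // do not leave a dangling pointer to the local provider
    return wired;
}
```

Because the registration happens before the thread is created, there is no interleaving in which `ServerInit` observes a null provider, so `StartupProviderFirst()` always reports the storage as wired.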
