Crash when scanning QR code - VerifyOrDie failure at ../../../../../../../../../../../connectedhomeip/src/lib/support/Pool.h:217: Allocated() == 0 #15760

Closed
kean-apple opened this issue Mar 2, 2022 · 4 comments · Fixed by #15816

Comments

@kean-apple

kean-apple commented Mar 2, 2022

Problem

iOS chiptool crashes when trying to scan a QR code.

SHA: 3f78165

  1. Launch iOS chiptool
  2. Go to the QR code scanner and try to scan a QR code to pair an M5 board.

In step 2, chiptool crashes roughly 20% of the time.

2022-03-02 15:01:20.221121-0800 localhost CHIPTool[3912]: (CHIP) [com.zigbee.chip:all] [1646262080221] [3912:194118] CHIP: [CTL] Setup in progress, stopping setup before shutting down
2022-03-02 15:01:20.221212-0800 localhost CHIPTool[3912]: (CHIP) DevicePairingDelegate status updated: 1
2022-03-02 15:01:20.221391-0800 localhost CHIPTool[3912]: (CHIP) DevicePairingDelegate Pairing complete. Status ../../../../../../../../../../../connectedhomeip/src/controller/CHIPDeviceController.cpp:660: CHIP Error 0x00000002: Connection aborted
2022-03-02 15:01:20.221516-0800 localhost CHIPTool[3912]: (CHIP) [com.zigbee.chip:all] [1646262080221] [3912:194118] CHIP: [CTL] Shutting down the controller
2022-03-02 15:01:20.221590-0800 localhost CHIPTool[3912]: (CHIP) [com.zigbee.chip:all] [1646262080221] [3912:194118] CHIP: [CTL] Shutting down the System State, this will teardown the CHIP Stack
2022-03-02 15:01:20.221647-0800 localhost CHIPTool[3912]: (CHIP) [com.zigbee.chip:all] [1646262080221] [3912:194118] CHIP: [DMG] IM WH moving to [Uninitialized]
2022-03-02 15:01:20.221611-0800 localhost CHIPTool[3912]: (CoreFoundation) [com.apple.CFBundle:strings] Bundle: CFBundle 0x10e1041e0 </private/var/containers/Bundle/Application/1F15B8D0-0430-49D7-AEE2-A589DB036320/CHIPTool.app> (executable, loaded), key: Undefined error:%u., value: , table: Localizable, localizationName: (null), result: Undefined error:%u.
2022-03-02 15:01:20.221701-0800 localhost CHIPTool[3912]: (CHIP) [com.zigbee.chip:all] [1646262080221] [3912:194118] CHIP: [DMG] IM WH moving to [Uninitialized]
2022-03-02 15:01:20.221754-0800 localhost CHIPTool[3912]: (CHIP) [com.zigbee.chip:all] [1646262080221] [3912:194118] CHIP: [DMG] IM WH moving to [Uninitialized]
2022-03-02 15:01:20.221806-0800 localhost CHIPTool[3912]: (CHIP) [com.zigbee.chip:all] [1646262080221] [3912:194118] CHIP: [DMG] IM WH moving to [Uninitialized]
2022-03-02 15:01:20.221820-0800 localhost CHIPTool[3912]: Got pairing error back Error Domain=CHIPErrorDomain Code=1 "Undefined error:2." UserInfo={NSLocalizedDescription=Undefined error:2., errorCode=2}
2022-03-02 15:01:20.222061-0800 localhost CHIPTool[3912]: (CHIP) [com.zigbee.chip:all] [1646262080222] [3912:194118] CHIP: [BLE] CancelConnection
2022-03-02 15:01:20.222123-0800 localhost CHIPTool[3912]: (CHIP) [com.zigbee.chip:all] [1646262080222] [3912:194118] CHIP: [DL] Inet Layer shutdown
2022-03-02 15:01:20.222243-0800 localhost CHIPTool[3912]: (CHIP) [com.zigbee.chip:all] [1646262080222] [3912:194118] CHIP: [DL] BLE shutdown
2022-03-02 15:01:20.222391-0800 localhost CHIPTool[3912]: (CHIP) [com.zigbee.chip:all] [1646262080222] [3912:194118] CHIP: [DL] System Layer shutdown
2022-03-02 15:01:20.222554-0800 localhost CHIPTool[3912]: (CHIP) [com.zigbee.chip:all] [1646262080222] [3912:194118] CHIP: [SPT] VerifyOrDie failure at ../../../../../../../../../../../connectedhomeip/src/lib/support/Pool.h:217: Allocated() == 0

Proposed Solution

<suggested fix, suggested enhancement>
chiptool-crash-scanning-qr-code2.txt
chiptool-crash-scanning-qr-code.txt

@sagar-apple
Contributor

@bzbarsky-apple here's an example of a crash occurring during shutdown.

@woody-apple
Contributor

This does not look iOS-related.

@woody-apple changed the title from "iOS chiptool crashes when scanning QR code - VerifyOrDie failure at ../../../../../../../../../../../connectedhomeip/src/lib/support/Pool.h:217: Allocated() == 0" to "Crash when scanning QR code - VerifyOrDie failure at ../../../../../../../../../../../connectedhomeip/src/lib/support/Pool.h:217: Allocated() == 0" on Mar 3, 2022
@bzbarsky-apple
Contributor

Relevant stack:

stop reason = signal SIGABRT
    frame #0: 0x00000001d5e01b78 libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x00000002103583bc libsystem_pthread.dylib`pthread_kill + 268
    frame #2: 0x00000001a957451c libsystem_c.dylib`abort + 168
    frame #3: 0x000000010e5ed218 CHIP`::chipAbort() + 16 [opt]
    frame #4: 0x000000010e671ac4 CHIP`chip::BitMapObjectPool<chip::Messaging::ExchangeContext, 8ul>::~BitMapObjectPool(this=0x0000000103028168) + 116 [opt]
  * frame #5: 0x000000010e671a28 CHIP`chip::BitMapObjectPool<chip::Messaging::ExchangeContext, 8ul>::~BitMapObjectPool(this=0x0000000103028168) + 32 [opt]
    frame #6: 0x000000010e6719e4 CHIP`chip::Messaging::ExchangeManager::~ExchangeManager(this=0x0000000103028000) + 52 [opt]
    frame #7: 0x000000010e670f7c CHIP`chip::Messaging::ExchangeManager::~ExchangeManager() + 32
    frame #8: 0x000000010e646118 CHIP`void chip::Platform::Delete<chip::Messaging::ExchangeManager>(p=0x0000000103028000) + 32 [opt]
    frame #9: 0x000000010e645f48 CHIP`chip::Controller::DeviceControllerSystemState::Shutdown(this=0x00000002818f8360) + 328 [opt]
    frame #10: 0x000000010e631ba8 CHIP`chip::Controller::DeviceControllerSystemState::Release(this=0x00000002818f8360) + 148 [opt]
    frame #11: 0x000000010e6316a4 CHIP`chip::Controller::DeviceController::Shutdown(this=0x0000000108027c00) + 124 [opt]
    frame #12: 0x000000010e63393c CHIP`chip::Controller::DeviceCommissioner::Shutdown(this=0x0000000108027c00) + 212 [opt]
    frame #13: 0x000000010d8a4420 CHIP`__32-[CHIPDeviceController shutdown]_block_invoke(.block_descriptor=0x0000000283285890) at CHIPDeviceController.mm:129:37

@bzbarsky-apple
Contributor

So presumably we have an outstanding unclosed exchange when the shutdown happens?

I see nothing in ExchangeManager::Shutdown closing outstanding exchanges. Should there be something there? Maybe not, if we are trying to detect leaked exchanges...

I also see nothing in SessionManager::Shutdown expiring sessions (which would presumably time out exchanges for those sessions, and thus close them). Should there be something here? I would think yes.
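
To make the failing check concrete, here is a minimal sketch of a fixed-size pool whose destructor aborts if any object is still allocated, which is the shape of the `Allocated() == 0` check reported at Pool.h:217. `SketchPool`, `FakeExchange`, and `CloseAllBeforeDestroy` are hypothetical names invented for illustration; this is not the connectedhomeip implementation.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <new>
#include <utility>

// Hypothetical stand-in for a bitmap-backed object pool (illustration only).
template <typename T, std::size_t N>
class SketchPool
{
public:
    // The pool must be empty when it is destroyed; otherwise abort.
    ~SketchPool()
    {
        if (Allocated() != 0)
        {
            std::fprintf(stderr, "pool destroyed with %zu live object(s)\n", Allocated());
            std::abort();
        }
    }

    // Construct an object in the first free slot, or return nullptr if full.
    template <typename... Args>
    T * CreateObject(Args &&... args)
    {
        for (std::size_t i = 0; i < N; ++i)
        {
            if (!mUsed.test(i))
            {
                mUsed.set(i);
                return new (Slot(i)) T(std::forward<Args>(args)...);
            }
        }
        return nullptr;
    }

    // Destroy the object and mark its slot free again.
    void ReleaseObject(T * obj)
    {
        for (std::size_t i = 0; i < N; ++i)
        {
            if (mUsed.test(i) && Slot(i) == static_cast<void *>(obj))
            {
                obj->~T();
                mUsed.reset(i);
                return;
            }
        }
    }

    std::size_t Allocated() const { return mUsed.count(); }

private:
    void * Slot(std::size_t i) { return mStorage + i * sizeof(T); }

    alignas(T) unsigned char mStorage[N * sizeof(T)];
    std::bitset<N> mUsed;
};

// Hypothetical stand-in for an exchange context.
struct FakeExchange
{
    int id;
};

// Opens two "exchanges", closes both, and reports how many remain allocated.
std::size_t CloseAllBeforeDestroy()
{
    SketchPool<FakeExchange, 8> pool;
    FakeExchange * a = pool.CreateObject(FakeExchange{ 1 });
    FakeExchange * b = pool.CreateObject(FakeExchange{ 2 });
    assert(pool.Allocated() == 2);
    pool.ReleaseObject(a);
    pool.ReleaseObject(b);
    return pool.Allocated(); // the pool destructor then runs without aborting
}
```

Skipping either `ReleaseObject` call in `CloseAllBeforeDestroy` would leave `Allocated()` nonzero, and the destructor would abort, which is consistent with the SIGABRT in the stack above.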

bzbarsky-apple added a commit to bzbarsky-apple/connectedhomeip that referenced this issue Mar 3, 2022
We already handled shutdown of any ongoing PASE bits.

This PR adds two more things:

1) Shutting down any ongoing CASE session establishment exchanges for
   which we are the initiator.  This is done by shutting down all the
   operational device proxies on our mCASESessionManager (since we own
   all of those anyway) and fixing operational device proxy
   shutdown/destruction to actually clean up the CASEClient if we're
   still in the middle of CASE establishment.

2) Expiring the SecureSessions for our fabric, so that any still-open
   operational exchanges for those sessions get closed correctly (with
   a timeout).  This is needed because our client cluster APIs don't
   give us any way to cancel the operation (invoke, read, write, etc)
   and we need to make sure those get cleaned up when we shut down.

In addition to that:

* Reject wrong-fabric results in
  DeviceCommissioner::OnOperationalNodeResolved (due to buggy minimal
  mdns), so if we start sharing a CASESessionManager across
  controllers we will not be in a position where we are ending up with
  CASE sessions we create but can't tear down.
* Fix CASE shutdown to not leave a dangling MRP entry after it shuts
  down the exchange.

Fixes project-chip#15760
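
Item 2 of the commit message above can be pictured with a small sketch: expiring every session that belongs to one fabric closes the exchanges still open on those sessions, so nothing from that fabric is left allocated at teardown. Every type and function name here (`SketchStack`, `ExpireSessionsForFabric`, and so on) is hypothetical, and the real change also delivers a timeout to the exchange owner, which this sketch leaves out.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical session record: which fabric it belongs to, and whether it expired.
struct SketchSession
{
    std::uint8_t fabricIndex;
    bool expired;
};

// Hypothetical exchange record: the session it rides on, and whether it is open.
struct SketchExchange
{
    std::size_t sessionIdx;
    bool open;
};

class SketchStack
{
public:
    std::size_t AddSession(std::uint8_t fabricIndex)
    {
        mSessions.push_back(SketchSession{ fabricIndex, false });
        return mSessions.size() - 1;
    }

    void OpenExchange(std::size_t sessionIdx)
    {
        mExchanges.push_back(SketchExchange{ sessionIdx, true });
    }

    // Expire all sessions on the given fabric and close exchanges that use them.
    void ExpireSessionsForFabric(std::uint8_t fabricIndex)
    {
        for (std::size_t s = 0; s < mSessions.size(); ++s)
        {
            if (mSessions[s].fabricIndex != fabricIndex || mSessions[s].expired)
            {
                continue;
            }
            mSessions[s].expired = true;
            for (SketchExchange & ex : mExchanges)
            {
                if (ex.sessionIdx == s)
                {
                    ex.open = false;
                }
            }
        }
    }

    std::size_t OpenExchangeCount() const
    {
        return static_cast<std::size_t>(std::count_if(
            mExchanges.begin(), mExchanges.end(),
            [](const SketchExchange & ex) { return ex.open; }));
    }

private:
    std::vector<SketchSession> mSessions;
    std::vector<SketchExchange> mExchanges;
};

// Two exchanges on our fabric (1) and one on another fabric (2);
// shutting down fabric 1 should leave only the other fabric's exchange open.
std::size_t OpenExchangesAfterShutdown()
{
    SketchStack stack;
    std::size_t ours  = stack.AddSession(1);
    std::size_t other = stack.AddSession(2);
    stack.OpenExchange(ours);
    stack.OpenExchange(ours);
    stack.OpenExchange(other);
    stack.ExpireSessionsForFabric(1);
    return stack.OpenExchangeCount();
}
```

Scoping the expiry to one fabric in the sketch reflects the commit message's wording ("SecureSessions for our fabric"); exchanges owned by other fabrics are left untouched.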
andy31415 pushed a commit that referenced this issue Mar 3, 2022
…15816)

Fixes #15760