Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SQA] Ble-wifi commissioning gets fail in RPI platform(Linux) #32033

Closed
Survensa opened this issue Feb 9, 2024 · 14 comments · Fixed by #32093
Closed

[SQA] Ble-wifi commissioning gets fail in RPI platform(Linux) #32033

Survensa opened this issue Feb 9, 2024 · 14 comments · Fixed by #32093
Assignees
Labels

Comments

@Survensa
Copy link

Survensa commented Feb 9, 2024

Feature Area

Other

Test Case

Ble-wifi Pairing

Reproduction steps

  1. Built all-clusters-app and chiptool in two seperate RPI
  2. Brought up DUT using commnad : ./chip-all-clusters-app --wifi
  3. Tried pairing CNTRL using ./chip-tool pairing ble-wifi 1 SSID SSID-PWD 20202021 3840

Expected behavior: All-clusters-app and Chip-tool should be commission over Ble-wifi transport without getting any error.

Actual behavior:

[1707463305.699491][9835:9836] CHIP:BLE: New device scanned: E4:5F:01:43:42:16
[1707463305.699583][9835:9836] CHIP:BLE: Device discriminator match. Attempting to connect.
[1707463305.705138][9835:9836] CHIP:BLE: ChipDeviceScanner has stopped scanning!
[1707463305.824149][9835:9836] CHIP:DL: FAIL: ConnectDevice : GDBus.Error:org.freedesktop.DBus.Error.UnknownObject: Method "Connect" with signature "" on interface "org.bluez.Device1" doesn't exist
(41)
[1707463305.824472][9835:9837] CHIP:DL: HandlePlatformSpecificBLEEvent 16386
[1707463305.824542][9835:9837] CHIP:DIS: Closing all BLE connections
[1707463305.824645][9835:9837] CHIP:TOO: Pairing Failure: src/platform/Linux/bluez/BluezEndpoint.cpp:748: CHIP Error 0x000000AC: Internal error
[1707463305.825235][9835:9835] CHIP:CTL: Shutting down the commissioner
[1707463305.825442][9835:9835] CHIP:CTL: Shutting down the controller
[1707463305.825674][9835:9835] CHIP:IN: Expiring all sessions for fabric 0x1!!
[1707463305.825726][9835:9835] CHIP:FP: Forgetting fabric 0x1
[1707463305.825797][9835:9835] CHIP:TS: Pending Last Known Good Time: 2023-10-14T01:16:48
[1707463305.826075][9835:9835] CHIP:TS: Previous Last Known Good Time: 2023-10-14T01:16:48
[1707463305.826135][9835:9835] CHIP:TS: Reverted Last Known Good Time to previous value
[1707463305.826226][9835:9835] CHIP:CTL: Shutting down the commissioner
[1707463305.826383][9835:9835] CHIP:CTL: Shutting down the controller
[1707463305.826440][9835:9835] CHIP:CTL: Shutting down the System State, this will teardown the CHIP Stack
[1707463305.826787][9835:9835] CHIP:DMG: All ReadHandler-s are clean, clear GlobalDirtySet
[1707463305.826979][9835:9835] CHIP:FP: Shutting down FabricTable
[1707463305.827055][9835:9835] CHIP:TS: Pending Last Known Good Time: 2023-10-14T01:16:48
[1707463305.827288][9835:9835] CHIP:TS: Previous Last Known Good Time: 2023-10-14T01:16:48
[1707463305.827358][9835:9835] CHIP:TS: Reverted Last Known Good Time to previous value
[1707463305.827898][9835:9835] CHIP:DL: writing settings to file (/tmp/chip_counters.ini-NTxUbk)
[1707463305.829240][9835:9835] CHIP:DL: renamed tmp file to file (/tmp/chip_counters.ini)
[1707463305.829391][9835:9835] CHIP:DL: NVS set: chip-counters/total-operational-hours = 0 (0x0)
[1707463305.829450][9835:9835] CHIP:DL: Inet Layer shutdown
[1707463305.829500][9835:9835] CHIP:DL: BLE shutdown
[1707463305.833918][9835:9835] CHIP:DL: System Layer shutdown
[1707463305.835734][9835:9835] CHIP:TOO: Run command failure: src/platform/Linux/bluez/BluezEndpoint.cpp:748: CHIP Error 0x000000AC: Internal error

Bug prevalence

Whenever I do

GitHub hash of the SDK that was being used

3ec1ba5

Platform

raspi

Anything else?

  • After merging this PR Linux: Implement PDC support in WiFiDriver / ConnectivityManagerImpl #31983, the All-clusters-app is not commissioning with the chip-tool over the Ble-wifi transport.
  • However the latest TOT Chip-tool is pairing successfully with older All-clusters-app at the same time latest All-clusters-app is not commissioning with older chip-tool in Ble-wifi transport.
  • So it seems, there is a bug in All-clusters-app.

DUT Log: All-clusters-app_log.txt
Chip-tool Log: Chip-tool_log.txt

@Survensa
Copy link
Author

Tried with other sample-apps as well and the ble-wifi commisioning gets fail by throwing an below error:

Run command failure: src/platform/Linux/bluez/BluezEndpoint.cpp:748: CHIP Error 0x000000AC: Internal error

@Survensa Survensa changed the title [SQA] Ble-wifi commissioning gets fail in RPI platform(All-Clusters-app) [SQA] Ble-wifi commissioning gets fail in RPI platform(Linux) Feb 12, 2024
@ksperling-apple
Copy link
Contributor

The error here is within the BLE code on the client-side stack (within chip-tool), failing to even establish a PASE connection with the device. No BLE code was changed in #31983, and we're not even to getting anywhere near exercising the WiFi driver code on the device side here, so this issue is unrelated to that PR.

The key error seems to be this:

CHIP:DL: FAIL: ConnectDevice : GDBus.Error:org.freedesktop.DBus.Error.UnknownObject: Method "Connect" with signature "" on interface "org.bluez.Device1" doesn't exist
(41)

From a brief web search it looks like this is some kind of issue where bluez is wedged with respect to the pairing state of the particular remote device. Possibly something that our Linux bluez integration needs to handle better?

See e.g.

@ksperling-apple ksperling-apple removed their assignment Feb 12, 2024
@tcarmelveilleux
Copy link
Contributor

Both #31828 and #31668 touched BlueZ-level stuff.

@arkq FYI

@Survensa
Copy link
Author

Survensa commented Feb 14, 2024

@arkq Tried pulling your PR to check the pairing

Sample app used : All Cluster App

I am able to pair via BLE-WiFi for the first time (i.e., first pair after I reboot the whole device). After that, I killed the app then cleared the temp files and attempted pairing again, but encountered some communication issues.

Attached log for your reference:

@arkq
Copy link
Contributor

arkq commented Feb 14, 2024

In the provided sample app log I can see Chip stack locking error at 'src/app/EventManagement.cpp:416'. Code is unsafe/racy which means that the app has terminated prematurely. That's the root cause of commissioning failure.

@yunhanw-google could you look at this? You've introduced that assertion a week ago.

@yunhanw-google
Copy link
Contributor

I think this is expected one, it means the application code call this API from non-matter thread, you need to call this from matter thread.

@arkq
Copy link
Contributor

arkq commented Feb 15, 2024

@Survensa please try again after applying PR #32093 and PR #32137

@Survensa
Copy link
Author

@arkq

@Survensa please try again after applying PR #32093 and PR #32137

I attempted multiple pairings by killing the app and clearing the temporary files. Here are the results: I was able to successfully pair each time when I defined my own discriminator, such as "--discriminator 3848". However, when attempting to pair with the default discriminator 3840, the pairing fails intermittently, occurring approximately 3 out of 10 times.

@arkq
Copy link
Contributor

arkq commented Feb 16, 2024

Could you attach logs from failed attempt?

@Survensa
Copy link
Author

@arkq
Copy link
Contributor

arkq commented Feb 16, 2024

It looks like BLE connection interference. In the current form of Matter specification (spec for Bluetooth Transport Protocol) we can no do anything about it:

Logs from controller:

[1708085434.753382][1979:1981] CHIP:IN: Message appended to BLE send queue
[1708085434.753442][1979:1981] CHIP:SC: Sent PBKDF param request
[1708085435.266876][1979:1981] CHIP:DL: HandlePlatformSpecificBLEEvent 16387
[1708085435.537691][1979:1981] CHIP:DL: HandlePlatformSpecificBLEEvent 16388
[1708085435.537840][1979:1981] CHIP:BLE: subscribe complete, ep = 0xaaaae5b76358
[1708085441.117986][1979:1980] CHIP:DL: Bluez disconnected
[1708085441.118061][1979:1980] CHIP:DL: Bluez notify CHIPoBluez connection disconnected
[1708085441.118282][1979:1981] CHIP:IN: Failed to establish BLE connection: src/platform/Linux/BLEManagerImpl.cpp:536: Ble Error 0x0000040C: Remote device closed BLE connection
[1708085441.118333][1979:1981] CHIP:IN: Clearing BLE pending packets.
[1708085451.764241][1979:1981] CHIP:SC: PASESession timed out while waiting for a response from the peer. Expected message type was 33
[1708085451.764334][1979:1981] CHIP:IN: SecureSession[0xffff84037cf0]: Released - Type:1 LSID:18197
[1708085451.764387][1979:1981] CHIP:TOO: Secure Pairing Failed
[1708085451.764459][1979:1981] CHIP:DIS: Closing all BLE connections
[1708085451.764530][1979:1981] CHIP:TOO: Pairing Failure: src/protocols/secure_channel/PASESession.cpp:258: CHIP Error 0x00000032: Timeout

And on the DUT side I can not see any logs related with BLE communication.

@cjandhyala
Copy link
Contributor

@arkq On my setup I saw two issues when I tried to pair over ble (using chip-tool pairing ble-wifi command i.e. using ./out/chip-tool/chip-tool pairing ble-wifi 80 SSID PASSWD 20202021 2555 and all-cluster-app launched with ./out/all-clusters-app/chip-all-clusters-app --wifi --discriminator 2555)

  1. In one iteration , I see the all-cluster-app (running on raspi) not connecting to the WiFi AP.
  2. In other iteration, I noticed chip-tool could not establish BT connection with the all-cluster-app.

Attached logs for both scenarios..

chip-tool-log_not_connecting_to_wifi.txt
raspi_ACA_log_not_connecting_to_wifi.txt

chip_tool_log_bt_not_Connected.txt
rasp_log_ACA_BT_not connected.txt

@arkq
Copy link
Contributor

arkq commented Feb 16, 2024

Ad.1: It seems that your RPi is not configured properly. D-Bus has rejected the call from the chip-all-clusters-app to wpa_supplicant. Either you need to configure properly D-Bus on your RPi, or run chip-all-clusters-app as root (e.g. with sudo):

[1708101548.913713][12429:12430] CHIP:DL: wpa_supplicant: Start WiFi management
[1708101548.922068][12429:12430] CHIP:DL: wpa_supplicant: connected to wpa_supplicant proxy
[1708101548.923692][12429:12430] CHIP:DL: wpa_supplicant: can't find interface wlan0: GDBus.Error:org.freedesktop.DBus.Error.AccessDenied: Rejected send message, 2 matched rules; type="method_call", sender=":1.152" (uid=1000 pid=12429 comm="./out/all-clusters-app/chip-all-clusters-app --wif" label="unconfined") interface="fi.w1.wpa_supplicant1" member="GetInterface" error name="(unset)" requested_reply="0" destination=":1.6" (uid=0 pid=810 comm="/sbin/wpa_supplicant -u -s -O /run/wpa_supplicant " label="unconfined")
[1708101548.923758][12429:12430] CHIP:DL: wpa_supplicant: try to create interface wlan0
[1708101548.925241][12429:12430] CHIP:DL: wpa_supplicant: failed to create interface wlan0: GDBus.Error:org.freedesktop.DBus.Error.AccessDenied: Rejected send message, 2 matched rules; type="method_call", sender=":1.152" (uid=1000 pid=12429 comm="./out/all-clusters-app/chip-all-clusters-app --wif" label="unconfined") interface="fi.w1.wpa_supplicant1" member="CreateInterface" error name="(unset)" requested_reply="0" destination=":1.6" (uid=0 pid=810 comm="/sbin/wpa_supplicant -u -s -O /run/wpa_supplicant " label="unconfined")
[1708101549.416176][12429:12429] CHIP:-: Wi-Fi Management taking too long to start - device configuration will be reset.

@arkq
Copy link
Contributor

arkq commented Feb 16, 2024

Ad.2: I would count is as BLE connection interference (it can happen from time to time):

[1708108374.296743][12951:12953] CHIP:EM: <<< [E:11845i S:0 M:179379553] (U) Msg TX to 0:0000000000000000 [0000] [BLE] --- Type 0000:20 (SecureChannel:PBKDFParamRequest)
[1708108374.296829][12951:12953] CHIP:IN: Message appended to BLE send queue
[1708108374.296895][12951:12953] CHIP:SC: Sent PBKDF param request
[1708108374.729119][12951:12953] CHIP:DL: HandlePlatformSpecificBLEEvent 16387
[1708108374.827783][12951:12953] CHIP:DL: HandlePlatformSpecificBLEEvent 16388
[1708108374.827868][12951:12953] CHIP:BLE: subscribe complete, ep = 0xaaaadd326358
[1708108389.310564][12951:12953] CHIP:BLE: connect handshake timed out, closing ep 0xaaaadd326358
[1708108389.310678][12951:12953] CHIP:IN: Failed to establish BLE connection: src/ble/BLEEndPoint.cpp:1522: Ble Error 0x00000415: Connect handshake timed out

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

6 participants