Lock concurrent access to mWpaSupplicant #9704

tewarid · 2021-09-14T20:25:16Z

Lock concurrent access to mWpaSupplicant.

Problem

State in GDBusWpaSupplicant may be accessed concurrently by multiple threads, leading to a race condition.

Change overview

Lock concurrent access to mWpaSupplicant.
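
As a rough illustration of the approach described here, the sketch below guards a small piece of shared state with a std::mutex held through std::lock_guard. The `SupplicantTracker` class, the `SupplicantState` enum, and `RunDemo` are hypothetical stand-ins for this example only; they are not the actual GDBusWpaSupplicant layout or the code merged in this PR.

```cpp
#include <mutex>
#include <thread>

// Hypothetical stand-in for state shared between the stack thread and the
// D-Bus callback thread. Not the real GDBusWpaSupplicant definition.
enum class SupplicantState { Init, Connecting, Connected };

class SupplicantTracker
{
public:
    void SetState(SupplicantState state)
    {
        std::lock_guard<std::mutex> lock(mMutex); // unlocked when `lock` leaves scope
        mState = state;
        ++mUpdates;
    }

    bool IsConnected()
    {
        std::lock_guard<std::mutex> lock(mMutex); // readers take the same mutex
        return mState == SupplicantState::Connected;
    }

    int Updates()
    {
        std::lock_guard<std::mutex> lock(mMutex);
        return mUpdates;
    }

private:
    std::mutex mMutex;
    SupplicantState mState = SupplicantState::Init;
    int mUpdates = 0;
};

// Two threads update the tracker concurrently; every update is counted.
int RunDemo()
{
    SupplicantTracker tracker;
    std::thread dbusThread([&] {
        for (int i = 0; i < 1000; ++i)
            tracker.SetState(SupplicantState::Connecting);
    });
    std::thread stackThread([&] {
        for (int i = 0; i < 1000; ++i)
            tracker.SetState(SupplicantState::Connected);
    });
    dbusThread.join();
    stackThread.join();
    return tracker.Updates();
}
```

Because both the state and the counter are only touched while the mutex is held, the two fields always change together from the point of view of any other thread.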

Testing

Tested using a custom embedded Linux system with the all-clusters-app example, and macOS chip-tool for provisioning.

mspang

Please use a mutex.

tewarid · 2021-09-14T22:39:32Z

> Please use a mutex.

@mspang Please see #9322 (comment) where decision was made with @bzbarsky-apple to use std::atomic. Why is a lock called for in this case?

Code uses std::atomic here under similar circumstance

connectedhomeip/src/include/platform/internal/GenericPlatformManagerImpl_POSIX.h

Line 113 in 181ab66

std::atomic<bool> mShouldRunEventLoop;

Code also uses volatile

connectedhomeip/src/inet/InetLayer.h

Line 167 in 181ab66

volatile enum {

mspang · 2021-09-14T23:14:27Z

> > Please use a mutex.

> @mspang Please see #9322 (comment) where decision was made with @bzbarsky-apple to use std::atomic. Why is a lock called for in this case?

> Code uses std::atomic here under similar circumstance

> connectedhomeip/src/include/platform/internal/GenericPlatformManagerImpl_POSIX.h

> Line 113 in 181ab66

> std::atomic<bool> mShouldRunEventLoop;

It's not quite the same, but this was probably a mistake.

> Code also uses volatile

That doesn't avoid the race except in certain compilers (MSVC) using nonstandard assumptions.

> connectedhomeip/src/inet/InetLayer.h

> Line 167 in 181ab66

> volatile enum {

kghost

I understand that there can be a race, because there are two directions from which calls can enter the class:

From the CHIP stack.
From the D-Bus callback.

A better solution would be to dispatch the D-Bus callback back onto the CHIP main thread, so that it is guarded by the CHIP global lock.

Anyway, if this PR resolves the race, there is no reason to block it. But it would be better to leave a note about it, and maybe we can improve it later.
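
The idea of handing callback work to the main thread can be sketched in a generic way as follows. This is only an illustration under assumptions: `MainLoopQueue`, `Post`, `Drain`, and `DispatchDemo` are hypothetical names invented for this example, and the sketch does not use or describe the CHIP platform API or GLib.

```cpp
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical work queue: other threads post closures, one thread runs them.
class MainLoopQueue
{
public:
    // Safe to call from any thread, for example from a D-Bus callback.
    void Post(std::function<void()> work)
    {
        std::lock_guard<std::mutex> lock(mMutex);
        mPending.push(std::move(work));
    }

    // Called only from the main thread. Each closure runs outside the queue lock.
    void Drain()
    {
        for (;;)
        {
            std::function<void()> work;
            {
                std::lock_guard<std::mutex> lock(mMutex);
                if (mPending.empty())
                    return;
                work = std::move(mPending.front());
                mPending.pop();
            }
            work();
        }
    }

private:
    std::mutex mMutex;
    std::queue<std::function<void()>> mPending;
};

int DispatchDemo()
{
    MainLoopQueue queue;
    int connectedEvents = 0; // modified only by the thread that calls Drain()

    // Simulated callback thread: it never touches connectedEvents directly.
    std::thread callbackThread([&] {
        for (int i = 0; i < 3; ++i)
            queue.Post([&connectedEvents] { ++connectedEvents; });
    });
    callbackThread.join();

    queue.Drain(); // the "main thread" applies the queued updates
    return connectedEvents;
}
```

With this shape only the queue itself needs a lock; the state the closures modify is confined to one thread, which is the property the comment above is after.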

src/platform/Linux/ConnectivityManagerImpl.cpp

tewarid · 2021-09-16T10:13:24Z

> A better solution would be to dispatch the D-Bus callback back onto the CHIP main thread, so that it is guarded by the CHIP global lock.

@kghost Could you please point to an example in the code?

tewarid · 2021-09-16T10:25:52Z

> > > Please use a mutex.

> > @mspang Please see #9322 (comment) where decision was made with @bzbarsky-apple to use std::atomic. Why is a lock called for in this case?
> > Code uses std::atomic here under similar circumstance

> > connectedhomeip/src/include/platform/internal/GenericPlatformManagerImpl_POSIX.h

> > Line 113 in 181ab66

> > std::atomic<bool> mShouldRunEventLoop;

> It's not quite the same, but this was probably a mistake.

You're right, but I don't think it is a mistake: a separate mutex is used to lock the critical section.

> > Code also uses volatile

> That doesn't avoid the race except in certain compilers (MSVC) using nonstandard assumptions.

Not a concern of this PR, but does this need to be fixed?

> > connectedhomeip/src/inet/InetLayer.h

> > Line 167 in 181ab66

> > volatile enum {

kghost · 2021-09-16T15:11:31Z

> @kghost Could you please point to an example in the code?

I mean modifying the main loop to use a single loop for both the CHIP SDK and D-Bus, so that there is only one thread and the race cannot occur.

It won't be an easy change; I need to investigate that.

src/platform/Linux/ConnectivityManagerImpl.cpp

mspang · 2021-09-16T18:22:32Z

> > > > Please use a mutex.

> > > @mspang Please see #9322 (comment) where decision was made with @bzbarsky-apple to use std::atomic. Why is a lock called for in this case?
> > > Code uses std::atomic here under similar circumstance

> > > connectedhomeip/src/include/platform/internal/GenericPlatformManagerImpl_POSIX.h

> > > Line 113 in 181ab66

> > > std::atomic<bool> mShouldRunEventLoop;

> > It's not quite the same, but this was probably a mistake.

> You're right, but I don't think it is a mistake: a separate mutex is used to lock the critical section.

The reasons it was a mistake:

Because people will copy it
Because atomics are subtle
Because relaxed atomics are especially subtle to the extent that the folks who standardized them aren't confident that they can be used safely

Not because it doesn't work or encounters a race condition (it does mean that the load that results in event loop termination does not synchronize-with the thread that stored the termination condition, which is where there is room for controversy). Basically, it's not worth the risk and the controversy. Mutex is a more understandable tool and gives you a real critical section.

> > > Code also uses volatile

> > That doesn't avoid the race except in certain compilers (MSVC) using nonstandard assumptions.

> Not a concern of this PR, but does this need to be fixed?

> > > connectedhomeip/src/inet/InetLayer.h

> > > Line 167 in 181ab66

> > > volatile enum {

mspang · 2021-09-16T18:23:46Z

> > > > Please use a mutex.

> > > @mspang Please see #9322 (comment) where decision was made with @bzbarsky-apple to use std::atomic. Why is a lock called for in this case?
> > > Code uses std::atomic here under similar circumstance

> > > connectedhomeip/src/include/platform/internal/GenericPlatformManagerImpl_POSIX.h

> > > Line 113 in 181ab66

> > > std::atomic<bool> mShouldRunEventLoop;

> > It's not quite the same, but this was probably a mistake.

> You're right, but I don't think it is a mistake: a separate mutex is used to lock the critical section.

The reasons it was a mistake:

Because people will copy it

Because atomics are subtle

Because relaxed atomics are especially subtle to the extent that the folks who standardized them aren't confident that they can be used safely

Not because it doesn't work or encounters a race condition (it does mean that the load that results in event loop termination does not synchronize-with the thread that stored the termination condition, which is where there is room for controversy). Basically, it's not worth the risk and the controversy. Mutex is a more understandable tool and gives you a real critical section.

BTW, I wrote that code, so you can believe me when I say it was a mistake.
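
To make the "real critical section" point concrete, here is a minimal sketch of a stop flag protected by a mutex and a condition variable. The `EventLoop` class and `StopDemo` function are hypothetical and written only for this illustration; they are not the GenericPlatformManagerImpl_POSIX code referenced in this thread. Unlocking the mutex in `Stop` synchronizes-with the later lock in `Run`, so the exit code written before the flag is guaranteed to be visible to the loop thread. A relaxed atomic load of the flag alone would not provide that guarantee.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical event loop with a mutex-guarded termination condition.
class EventLoop
{
public:
    // Blocks until Stop() has been called, then returns the stored exit code.
    int Run()
    {
        std::unique_lock<std::mutex> lock(mMutex);
        // The predicate is re-checked under the mutex, so a Stop() that
        // happens before Run() starts waiting is not lost.
        mWake.wait(lock, [this] { return !mShouldRun; });
        return mExitCode;
    }

    void Stop(int exitCode)
    {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mExitCode  = exitCode; // written inside the critical section
            mShouldRun = false;
        }
        mWake.notify_all();
    }

private:
    std::mutex mMutex;
    std::condition_variable mWake;
    bool mShouldRun = true;
    int mExitCode   = 0;
};

int StopDemo()
{
    EventLoop loop;
    int result = -1;
    std::thread runner([&] { result = loop.Run(); });
    loop.Stop(7);
    runner.join(); // join makes the runner's write to result visible here
    return result;
}
```
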

> > > Code also uses volatile

> > That doesn't avoid the race except in certain compilers (MSVC) using nonstandard assumptions.

> Not a concern of this PR, but does this need to be fixed?

> > > connectedhomeip/src/inet/InetLayer.h

> > > Line 167 in 181ab66

> > > volatile enum {

mspang · 2021-09-16T20:45:48Z

> Not a concern of this PR, but does this need to be fixed?

> > > connectedhomeip/src/inet/InetLayer.h

> > > Line 167 in 181ab66

> > > volatile enum {

Yes, this code is incorrect and should be fixed. C++11 is the first C++ standard with a memory model, and it did not make volatile a synchronization operation. In fact, volatile doesn't have any standard semantics; what it means is implementation-defined.
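
For contrast with a volatile enum, the sketch below shows one standard-conforming way to publish a state value between threads, using std::atomic with the default sequentially consistent ordering. The `LayerState` enum, the `Layer` struct, and `PublishDemo` are hypothetical names for this example and do not reflect the actual InetLayer code or how it was later fixed. As discussed earlier in this thread, a mutex may still be the easier tool to reason about.

```cpp
#include <atomic>
#include <thread>

// Hypothetical state enum, standing in for a field that was declared volatile.
enum class LayerState { NotInitialized, Initialized };

struct Layer
{
    std::atomic<LayerState> state{ LayerState::NotInitialized };
    int config = 0; // plain data, published by the atomic store below
};

int PublishDemo()
{
    Layer layer;
    int seen = -1;

    std::thread reader([&] {
        // Default (seq_cst) load: once Initialized is observed, the earlier
        // write to config by the storing thread is visible as well.
        while (layer.state.load() != LayerState::Initialized)
            std::this_thread::yield();
        seen = layer.config;
    });

    layer.config = 42;                          // ordinary write
    layer.state.store(LayerState::Initialized); // publishes the write above
    reader.join();
    return seen;
}
```
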

* Lock concurrent access to mWpaSupplicant
* Use std::lock_guard to manage std::mutex

boring-cyborg bot added linux platform labels Sep 14, 2021

restyled-io bot mentioned this pull request Sep 14, 2021

Restyle Declare state in GDBusWpaSupplicant as std::atomic #9705

Closed

mspang requested changes Sep 14, 2021

pullapprove bot requested review from bzbarsky-apple, chrisdecenzo, Damian-Nordic, hawk248, jepenven-silabs and msandstedt September 14, 2021 22:45

pullapprove bot added the review - pending label Sep 14, 2021

woody-apple added the SDK Approved label Sep 14, 2021

Lock concurrent access to mWpaSupplicant

6692707

tewarid force-pushed the make-state-atomic branch from e69ac41 to 6692707 September 15, 2021 12:12

tewarid requested a review from mspang September 15, 2021 12:15

tewarid changed the title from "Declare state in GDBusWpaSupplicant as std::atomic" to "Lock concurrent access to mWpaSupplicant" Sep 15, 2021

woody-apple approved these changes Sep 16, 2021

pullapprove bot requested review from cecille, emargolis, erjiaqing, harimau-qirex, holbrookt, jelderton, jmartinez-silabs, kghost, LuDuda, mlepage-google and mrjerryjohns September 16, 2021 01:07

pullapprove bot requested review from saurabhst and yufengwangca September 16, 2021 01:07

msandstedt approved these changes Sep 16, 2021

kghost reviewed Sep 16, 2021

src/platform/Linux/ConnectivityManagerImpl.cpp (outdated)

emargolis approved these changes Sep 16, 2021

Use std::lock_guard to manage std::mutex

ccbb4de

tewarid requested a review from kghost September 16, 2021 11:26

jmartinez-silabs approved these changes Sep 16, 2021

pullapprove bot added review - changed requested and removed review - pending labels Sep 16, 2021

mrjerryjohns reviewed Sep 16, 2021

src/platform/Linux/ConnectivityManagerImpl.cpp

mspang approved these changes Sep 16, 2021

pullapprove bot added review - approved and removed review - changed requested labels Sep 16, 2021

LuDuda approved these changes Sep 16, 2021

mspang merged commit 4198a3c into project-chip:master Sep 17, 2021

tewarid deleted the make-state-atomic branch September 17, 2021 11:47

mleisner pushed a commit to mleisner/connectedhomeip that referenced this pull request Sep 20, 2021

Lock concurrent access to mWpaSupplicant (project-chip#9704)

1a12da7

* Lock concurrent access to mWpaSupplicant
* Use std::lock_guard to manage std::mutex

arkq mentioned this pull request Oct 17, 2022

Check WPA supplicant state before saving config #22895

Merged
