-
-
Notifications
You must be signed in to change notification settings - Fork 30.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-112075: Use relaxed stores for places where we may race with when reading lock-free #115786
Conversation
0d9fef7
to
7b365a2
Compare
Objects/dictobject.c
Outdated
#define STORE_KEY(ep, key) FT_ATOMIC_STORE_PTR_RELAXED(ep->me_key, key) | ||
#define STORE_VALUE(ep, value) FT_ATOMIC_STORE_PTR_RELAXED(ep->me_value, value) | ||
#define STORE_SPLIT_VALUE(mp, idx, value) FT_ATOMIC_STORE_PTR_RELAXED(mp->ma_values->values[idx], value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we want "release", although I'm not 100% sure. It's possible that the synchronization from the incref is sufficient even with relaxes stores
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I've failed to capture exactly what you're concerned about, but I think the synchronization around keys and values provides what we need: https://gist.github.com/DinoV/fd83e80e5eb37ffa197dadd9791cca6e because reads to ma_keys and ma_values are fully sequential, and writes to them are done with release stores.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern is that the initialization of the fields within key
or value
may not be fully visible. I think it's probably okay, but only because our mutex implementation uses stronger orderings than conventional locks.
For example, let's say value is a PyLongObject
:
In pseudo-code:
PyLongObject *value = ...;
value->ob_digit = 123456; // (1) non-atomic store that I'm concerned about
...
Py_BEGIN_CRITICAL_SECTION(dict); // (2) lock dict
PyDictKeyEntry *entries = DK_ENTRIES(dict->ma_keys);
entries[...]->me_value = value; // (3) relaxed store
Py_END_CRITICAL_SECTION();
My concern was that (1) would be reordered past (3). I think it's okay because the mutexes used by critical sections use sequential consistency for both locking and unlocking, so (1) can't be reordered past (2). The "conventional" ordering is "acquire" for lock and "release" for unlock, which would not be sufficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I think we'll be forced to lock on both threads because _Py_TryXGetRef
is going to fail on the value that's not yet shared across threads and we'll be forced into the locking code path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After thinking about this a bit more, I think (3) really needs to be at least "release".
First, contrary to what I wrote above, the mutex locking isn't sufficient because a seq-cst operation on a variable is weaker than a seq-cst fence -- it doesn't ensure that the write in (1) is visible in this case.
I don't think we can rely on the _Py_TryXGetRef
failing either. The object might have a non-zero shared refcount for a number of reasons.
Here's a simplified model. If you change the relevant "memory_order_release" to "memory_order_relaxed" it will fail:
https://github.com/colesbury/c11-model-checker/blob/cpython-models/test/dict_value.c
7b365a2
to
596ceb4
Compare
8b4659f
to
7c8cd5a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
… when reading lock-free (python#115786)
… when reading lock-free (python#115786)
… when reading lock-free (python#115786)
With lock-free reads and iterations we can read from some places w/o locks. This adds relaxed atomic stores to those places.
dict
objects thread-safe in--disable-gil
builds #112075