arm64: alternative implementation of atomics using CAS #164
"atomic_store doesn't incorrectly set the atomic value if it changes while in the critical zone": based on the discussion in #160, we should simply say that any store between the load and the store makes the operation undefined.
"the wrong value is provided as "expected" through temp_reg": this already makes the operation undefined.
There should be a test for reading data between the load and the store. There should also be a test for op_flags (an existing test can be extended for this). The atomic load/store must satisfy many rules, so listing them explicitly might read better. Do you want to work on this, or shall I do it?
OK, but it is important to note that this might apply to both read and write operations, and not only to the memory address itself but also to its surrounding area. Indeed, really old ARM CPUs, which luckily we don't support, would invalidate locks if any write was done anywhere in memory [1].
That needs clarification in the documentation as well, note that the fact that ARM, and any other
I would also argue that calling it "temp_reg" is misleading, and that it should instead be "state_reg" or something that better identifies its actual use.
I would instead argue there should be no reads, so that there are no more surprises when all other CPUs are implemented.
Since I still seem to be confused about how this is meant to be used, it might be better if you do it.
[1] https://en.wikipedia.org/wiki/Load-link/store-conditional
I would not limit it to an area. All memory writes, regardless of the address, make the operation undefined. I cannot think of any practical use where a write cannot be postponed until after the store.
The comment says: "temp_reg must contain the value loaded into dst_reg by the sljit_emit_atomic_load operation (the store operation does not preserve the value of temp_reg)". I think "temp" emphasizes that its value is not preserved, but I am open to any ideas. I don't like the term "state", since the register does not contain any status.
This is certainly possible, but it would be strange if reads blocked a memory store.
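For readers following the temp_reg discussion, here is a rough C11 analogue of the load / compare-and-swap store pairing. It is only a sketch under stated assumptions: it uses `<stdatomic.h>` rather than the sljit API, and the helper name `cas_add_u32` is hypothetical. The local `expected` plays the role that temp_reg plays above: it must hold the value returned by the load, and its content is not preserved when the store fails.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper (not part of sljit): atomically add delta to *addr
 * and return the value that was there before the addition. */
static uint32_t cas_add_u32(_Atomic uint32_t *addr, uint32_t delta)
{
    /* Counterpart of the atomic load: remember the value we saw. */
    uint32_t expected = atomic_load(addr);

    /* Counterpart of the atomic store: it only succeeds if *addr still
     * equals expected. On failure, expected is overwritten with the
     * current value, so the loop retries with fresh data. */
    while (!atomic_compare_exchange_weak(addr, &expected, expected + delta)) {
        /* retry */
    }
    return expected;
}
```

With a CAS the "wrong expected value" case simply makes the store fail, which is why the retry loop is needed; whether sljit should define or leave undefined the same situation is exactly what is being discussed here.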
test_src/sljitTest.c
sljit_emit_atomic_load(compiler, SLJIT_MOV_U32, SLJIT_R2, SLJIT_S2);
sljit_emit_op1(compiler, SLJIT_MOV, SLJIT_R0, 0, SLJIT_R2, 0);
sljit_emit_op1(compiler, SLJIT_MOV, SLJIT_S3, 0, SLJIT_R2, 0);
sljit_emit_op1(compiler, SLJIT_MOV, SLJIT_R1, 0, SLJIT_IMM, 987654321);
sljit_emit_op1(compiler, SLJIT_MOV_U32, SLJIT_R1, 0, SLJIT_IMM, 987654321);
Testing this should be elsewhere.
I think the patch is quite nice now! You can add I realized one type of test is missing: the double load.
Only available with ARMv8.1 or higher, so it is behind a flag that an ACLE-compatible compiler would enable depending on -march. As a side effect, it avoids the races that could result in an infinite loop on the M1.
At least with ARM's LL/SC, a second load invalidates the reservation of the first, which IMHO would also be in line with the documented constraint of supporting only one atomic per thread.
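To make the double-load case concrete for a CAS-based implementation, here is a small single-threaded C11 sketch. It is an assumption-laden illustration, not sljit code: the function name `stale_expected_fails` is hypothetical, and the intervening write is simulated with a plain `atomic_store`. It shows that a store using the value from the first load fails once the word has changed, while a store using the value from the second load succeeds.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical demo: returns true when the CAS that uses the first
 * (stale) load fails and the CAS that uses the second load succeeds. */
static bool stale_expected_fails(void)
{
    _Atomic uint32_t word = 1;

    uint32_t first = atomic_load(&word);   /* first load sees 1 */
    atomic_store(&word, 2);                /* simulated intervening write */
    uint32_t second = atomic_load(&word);  /* second load sees 2 */

    /* The word is now 2, so a store expecting 1 must fail. */
    bool stale_ok = atomic_compare_exchange_strong(&word, &first, 100);
    /* A store expecting 2 matches the current value and succeeds. */
    bool fresh_ok = atomic_compare_exchange_strong(&word, &second, 200);

    return !stale_ok && fresh_ok && atomic_load(&word) == 200;
}
```

On LL/SC the second load drops the first reservation in hardware; with CAS the equivalent effect comes only from which loaded value is handed to the store, so a double-load test would need to pin down which of the two is expected.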
LGTM. Thank you for making all the changes.
Works only on CPUs that implement LSE, which is available since armv8.1-a (-march=armv8-a+lse), like the M1.
The first patch implements the feature; the last one is just leftover code that might be useful while debugging and is therefore optional.
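As a sketch of how the compile-time gating could look, the snippet below checks the ACLE predefined macro `__ARM_FEATURE_ATOMICS`, which compilers define when the LSE atomic instructions are enabled for the target (for example through -march). The function name `built_with_lse` is hypothetical, and the actual flag used by the patch may differ.

```c
/* Hypothetical helper: reports whether this translation unit was
 * compiled with LSE atomics enabled, based on the ACLE feature macro. */
static int built_with_lse(void)
{
#if defined(__ARM_FEATURE_ATOMICS)
    return 1;   /* LSE available: CAS-based atomics can be emitted */
#else
    return 0;   /* no LSE: fall back to the LL/SC implementation */
#endif
}
```

Note that this is only a build-time check; a binary built with LSE enabled would still fault on a CPU that lacks the instructions.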