Go's atomics are sequentially consistent, which requires memory fences or instructions with implicit memory fences even on strongly ordered architectures like x86/amd64.
It would be nice to have some more relaxed atomics so lock-free algorithms can be implemented efficiently in Go, because implementing them inefficiently defeats the purpose. Here's an example of a gopher discovering that porting C++ lock-free algorithms to Go results in poor performance (an order of magnitude slower, according to them). This is especially true on non-x86 architectures. I'm a bit of a lock-free junkie; my first Go program was a lock-free hashtable. So for me personally this has been a long-standing pet peeve with the language, and I'm volunteering my help if we can agree on a direction forward.
This is a fairly uncommon need, but when you do need more performance there are no good alternatives. You have to write your whole function in assembly, not just the load/store, for each architecture you want to optimize. It should be noted that the Go runtime itself requires and implements acquire loads and release stores, and uses them judiciously where more performance is required. This also means the feature could be implemented fairly easily by exporting those internal atomics from the sync/atomic package, with appropriate names and signatures.
There was an old feature request from fellow lock-free junkie @dvyukov asking for more relaxed atomics. I remember reading the thread, but could not find it in the issue tracker. The Go maintainer balked at the idea of adding all of the atomics from C++, but was more open to the idea of a limited set that covers the majority of use cases. Nothing concrete came out of the discussion though.
I think having Acquire versions of the loads and Release versions of the stores in sync/atomic covers 95% of the use cases. I point to the Go runtime as an example to validate that claim, and to underscore why it's useful. It would double the number of load/store functions in sync/atomic, which is reasonable in my opinion.
So to sum up, this is an advanced feature that is sometimes necessary with no good workarounds. It is already implemented and used by the Go runtime, so it can be implemented easily. Can this be my Christmas present to the Go community, and by extension, myself?
That gives you a naked load/store that the Go compiler can't optimize away. It's relaxed on weakly ordered architectures and acquire/release on x86/amd64. By using architecture-specific file suffixes as build constraints, plus assembly versions for other architectures, you can implement both relaxed and acquire/release atomics. The internal runtime atomics are currently implemented this way.
It's unsafe and may break in a future version of Go, but if you need weaker atomics you're already doing unsafe things.