-
Notifications
You must be signed in to change notification settings - Fork 118
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable C11 automatically if the compiler supports it #1729
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1729 +/- ##
=======================================
Coverage 78.31% 78.32%
=======================================
Files 580 580
Lines 97140 97145 +5
Branches 13926 13928 +2
=======================================
+ Hits 76073 76086 +13
+ Misses 20445 20437 -8
Partials 622 622 ☔ View full report in Codecov by Sentry. |
Should
|
I think we'll need to add a condition on #if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
#include <stdatomic.h>
// CRYPTO_refcount_t is a |uint32_t|
#define AWS_LC_ATOMIC_LOCK_FREE ATOMIC_LONG_LOCK_FREE
#else
#define AWS_LC_ATOMIC_LOCK_FREE 0
#endif
#if !defined(OPENSSL_C11_ATOMIC) && defined(OPENSSL_THREADS) && \
!defined(__STDC_NO_ATOMICS__) && defined(__STDC_VERSION__) && \
__STDC_VERSION__ >= 201112L && AWS_LC_ATOMIC_LOCK_FREE > 0
#define OPENSSL_C11_ATOMIC
#endif |
Hmmmm, it looks like our atomic version can't be enabled with just C11 then, I think we need two checks and flags:
|
Andrew, please excuse me for insisting, but as I already suggested, why not just rely on the compiler version in the source code instead of all the cmake machinery ? The atomics are not dependent on C11 but on (clang || gcc >= 4.7). That's much easier IMHO to just fix the condition in the code itself rather than risk to break stuff by adding complex version detection. It would even allow to keep the -std=c99 for now if it turns out to have some uses. |
Ahhh, I missed that part, doing an experiment locally confirms that it does compile as expected, let's see if it gets through all the CI compilers. |
Just using atomics if the compiler supports it does not work because we enabled |
OK, I can understand, of course! And the other option, why not simply drop the forced -std=c99 ? In which case exactly is it required to manually force and older version of the standard when 99% of the modern compilers will automatically default to the newer ones ? |
Hit a fun issue, GCC 4.8 actually doesn't support atomics even though it supports C11. This seems to be a known issue, you might want to increase your check to >= 4.9 which is when support and the header file got included. |
I'm using them there and I can assure you they work pretty well:
What might possibly be missing is stdatomic or some defines, but I still don't understand why these would be needed at all. I've never been aware of them and have been using atomics on gcc and clang for many years now. I really think that simply dropping -std=c99 which breaks the test and triggers your warning, and checking on gcc version should do the trick. I don't understand why there seems to be some complicated checks that strive to force certain version and later to check them, this might have been inherited from older legacy code maybe, but I really feel that there's some form of inconsistency there that is self-feeding and not necessary at all. |
Or maybe if you want, for non-gcc compilers that might require stdatomic.h, you could do something like this:
I.e. it wlil consider the C standard version for compilers that do not define the ATOMIC macros and that might require stdatomic.h (if there are any at all, but given that gcc and clang have them builtin, that would only be for possibly other ones). E.g. that's gcc 4.8 below:
|
Poking around we need stdatomic.h to get access to |
We have some active customers that still rely on very old versions of GCC and C99. Trust me I am very excited for the day when they update and we can drop a lot of this complicated build options. |
Then maybe it's easier to detect older compilers and force std=c99 only for them if that's the purpose ? |
Regrading stdatomic.h, on gcc you have the ATOMIC_LONG_LOCK_FREE definition by default without including anything. If stdatomic is needed for clang, what about including only with clang, since all of its versions support C99 ? |
### Description of changes: While working on #1729 I noticed the CMake builds took a while because it was all single threaded. The GitHub runners [have 4 cores](https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners/about-github-hosted-runners#standard-github-hosted-runners-for-public-repositories) so this should speed it up a bit. ### Testing: Waiting to see what happens here. By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and the ISC license.
I figured out why I didn't see any performance improvement on some platforms with this change:
The solution was increasing X from 100 to 1,000 so it takes longer for each thread to finish. TLDR yes C11 atomics are much faster than pthread |
efa04c7
to
9b0a2fb
Compare
Issues:
Resolves #1723
Description of changes:
#1723 pointed out the default AWS-LC behavior can be very slow because we enable C99 by default. The user has to opt into the better atomics by specifying -DCMAKE_C_STANDARD=11 and they probably won't find that option. This change turns C11 on by default if the compiler supports it.
Add a new benchmark to test our ref count performance.
Call-outs:
If users do not change their CMake options those with modern compilers will get C11 atomics automatically, legacy customers will continue with the pthread implementation. Both legacy and modern users can continue to manually control this by setting the CMAKE_C_STANDARD option. This option is only available to CMake >= 3.1 customers.
Testing:
Tried different options and verified the build.ninja included the expected
-std=gnu99
or-std=gnu11
. Performance numbers on an Graviton 4 R8g show a 1.6 times increase for single threaded uncontested locking performance, and 33.0 times increase in heavily contested locking performance. depending on the use case real world applications will see less of an increase. Each thread attempts to increment the sameCRYPTO_refcount_t
1,000 times:C99 (previous performance):
C11 (current performance):
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and the ISC license.