-
Notifications
You must be signed in to change notification settings - Fork 180
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GCC optimization flag incompatibility #22
Comments
One solution at your level would be something like #define POINT_MULT_SCALAR_W5_IMPL(ptype) \
#if (__GNUC__ == 4 && __GNUC_MINOR__ == 8 && __GNUC_PATCHLEVEL__ == 5) \
__attribute__((optimize("no-tree-vectorize"))) \
#endif \
static void ptype##_gather_booth_w5(ptype *restrict p, const ptype table[16], \
limb_t booth_idx) \
{ \
size_t i; \
limb_t booth_sign = (booth_idx >> 5) & 1; \
\
booth_idx &= 0x1f; \
vec_zero(p, sizeof(ptype)); /* implicit infinity at table[-1] */\
/* ~6% with -Os, ~2% with -O3 ... */\
for (i = 1; i <= 16; i++) \
ptype##_ccopy(p, table + i - 1, i == booth_idx); \
\
ptype##_cneg(p, booth_sign); \
} \ (assuming this is the problematic function and problematic compiler version) |
Why not other way around? :-):-):-) But on serious note, if specific compiler version fails to compile a piece of code, while others can, it speaks rather in favour of compiler bug. This is not to say that it necessarily means compiler bug, but it's first assumption to make.
I for one am not big fan of compiler-specific workarounds, but suggestion is not the way to go. Because nested pre-processor directives don't work. But function can have separate declaration with designated attributes... Another way to solve it would be ... more assembly, so that compiler won't be in position to make the self-defeating assumptions... On side note. Keep in mind that blst is not that dependent on optimization level, because most of the "magic" happens in assembly. In other words difference between -O2 and -O3 is effectively negligible, so you don't actually have to compile blst with -O3. Unless of course if your C code is sensitive to optimization level, and you want to compile everything in the same go... |
Can you confirm that compiling with [Just in case for reference, this is not a suggested solution, just an attempt to pinpoint the problem.] |
There is a related GCC bug that has been lurking for 10 years at least, for example x264 https://mailman.videolan.org/pipermail/x264-devel/2010-June/007462.html. Clang doesn't exhibit this which also supports a GCC bug. Yes
Yes that's the case, I don't compile BLST as a separate DLL but compile it at the same time as the rest of the Nim/C code. |
In other words bug is so old that it's considered a feature:-) This is exactly why I'm not fond of compiler-specific workarounds, they effectively let compiler off the hook... Either way, could you double-check vec_select_n? |
vec_select_n is merged. Closing... |
Sorry for the late reply, I didn't have time to upgrade earlier . Unfortunately I seem to still get wrong results with GCC -O3 with the master from yesterday (a8398ed) unless I pass fno-tree-vectorize, despite that branch being merged and f8a77bd For now I'll keep using |
* Bump BLST * Test for supranational/blst#22 regression * Use SHA256 from BLST + bump nim-blscurve to reenable fno-tree-vectorize * SHA256 on non-blst platforms import fixes * import fixes again * can't prefix with nimcrypto * address review comment [skip ci] * {.noInit.} on the digests
Still? Hmm... Since -Drestrict= helps, I assume it still does, can you test one thing? Drop the qualifiers from ptype##_ccopy in src/point.h. |
Just in case for reference. |
FYI, BLST is incompatible with the following GCC flag:
-ftree-loop-vectorize
This causes the scalar multiplication to miscompile and easily noticeable on
blst_sk_to_pk_in_g1
function.Unfortunately it is automatically activated with
-O3
. You might want to have a note about not compiling with-O3
or using-fno-tree-loop-vectorize
to deactivate the offending flag.Tested with GCC v10.1.0
Clang works fine (v10.0.1)
The text was updated successfully, but these errors were encountered: