Parallel hashes crash when built with AVX2 on Windows. #12

Closed
damaki opened this issue Aug 1, 2019 · 2 comments

damaki (Owner) commented Aug 1, 2019

Description:
When libkeccak is built on Windows with AVX2 instructions enabled (ARCH=x86_64 and SIMD=AVX2), parallel hashes (KangarooTwelve, ParallelHash, etc.) crash with a Program_Error raised with EXCEPTION_ACCESS_VIOLATION as the message when processing input data or generating output data. This only occurs if the input/output data buffer is large enough to trigger the use of AVX2 instructions.

The problem has only been observed on Windows. Builds on Linux using the same version of the compiler (GNAT Community 2019) are confirmed to be working at the time of writing.

Steps to reproduce:
Compiler version: 64-bit GCC 8.3.1 20190518 (for GNAT Community 2019 20190517)
Operating system: Windows

  1. On Windows, run make test ARCH=x86_64 SIMD=AVX2
  2. The crash occurs when the tests are run.

Workaround:
The workaround is to avoid building libkeccak with AVX2 on Windows. Instead, use SSE2 instructions only, i.e. build libkeccak with ARCH=x86_64 SIMD=SSE2. This will result in slightly lower performance compared to AVX2, but is still pretty fast and at least it doesn't crash.

Root cause:
The root of the problem is that GCC is not respecting the requested 32-byte alignment on objects of type Keccak.Arch.AVX2.V4DI_Vectors.V4DI allocated on the stack, but is still generating AVX2 instructions (e.g. vmovdqa) that assume 32-byte alignment. The resulting attempt to load/store misaligned data on the stack causes the access violation in the AVX2 instantiations of Keccak.Generic_Parallel_Keccakf.Permute_All.
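To make the alignment requirement concrete, here is a minimal sketch in C (not the library's Ada code) of the check that an aligned 32-byte load or store such as vmovdqa implicitly relies on. The helper name `is_aligned` is hypothetical and exists only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns true if addr is a multiple of alignment.
   alignment must be a power of two (32 for a 256-bit AVX2 vector). */
bool is_aligned(uintptr_t addr, size_t alignment)
{
    return (addr & (alignment - 1)) == 0;
}
```

An address ending in 0x20 passes the 32-byte check, while an address ending in 0x10 is 16-byte aligned but fails it. A 32-byte vmovdqa at the latter kind of address faults.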

By contrast, on Linux GCC adjusts the stack pointer to ensure it is 32-byte aligned, as the following disassembly of the function prologue shows:

   0x000000000040ecf0 <+0>:    push   %rbp
   0x000000000040ecf1 <+1>:    mov    $0x432540,%eax
   0x000000000040ecf6 <+6>:    mov    $0x4326c0,%edx
   0x000000000040ecfb <+11>:    mov    %rsp,%rbp
   0x000000000040ecfe <+14>:    and    $0xffffffffffffffe0,%rsp
   0x000000000040ed02 <+18>:    sub    $0x368,%rsp
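The effect of the `and $0xffffffffffffffe0,%rsp` instruction can be sketched in C as plain integer arithmetic. This only models the two prologue steps shown above; the function names are hypothetical, and how GCC then lays out the vector slots in the frame is not shown in the issue and is assumed here.

```c
#include <stdint.h>

/* Models `and $0xffffffffffffffe0,%rsp`: clearing the low five bits
   rounds rsp down to the nearest multiple of 32. */
uint64_t align_stack_down_32(uint64_t rsp)
{
    return rsp & UINT64_C(0xffffffffffffffe0);
}

/* Models the `and` followed by `sub $0x368,%rsp`.
   0x368 mod 32 is 8, so the result is always 24 mod 32,
   whatever value rsp had on entry. */
uint64_t linux_prologue_rsp(uint64_t rsp_after_push)
{
    return align_stack_down_32(rsp_after_push) - 0x368;
}
```

Because the residue after the `sub` no longer depends on the caller, the compiler can choose frame offsets at compile time that are guaranteed to land on 32-byte boundaries.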

The disassembly of the same function when built on Windows with the same version of GNAT does not align the stack pointer:

   0x0000000000452e50 <+0>:     sub    $0x468,%rsp
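A short worked example in C shows why the bare `sub` is not enough. It assumes the Microsoft x64 calling convention, under which the stack is 16-byte aligned at the call site, so rsp is 8 mod 16 on function entry once the return address has been pushed. The function names and the sample addresses are hypothetical.

```c
#include <stdint.h>

/* Models `sub $0x468,%rsp` with no preceding alignment step. */
uint64_t windows_prologue_rsp(uint64_t rsp_at_entry)
{
    return rsp_at_entry - 0x468;
}

/* Residue of an address modulo 32 (0 means 32-byte aligned). */
unsigned residue_mod_32(uint64_t addr)
{
    return (unsigned)(addr & 31u);
}
```

With an entry value such as 0x62fe08 (8 mod 32) the adjusted rsp is 0 mod 32, but with 0x62fe18 (24 mod 32, still a valid entry value) it is 16 mod 32. No fixed frame offset is 32-byte aligned in both cases, which is consistent with the crash depending on the caller's stack depth.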

This seems to be a known bug in 64-bit GCC on Windows, judging by the following links:

damaki added the bug label Aug 1, 2019

damaki (Owner, Author) commented Aug 24, 2019

GCC bug 54412 is also relevant: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

damaki (Owner, Author) commented Jun 4, 2022

Closing this since this is a GCC bug and is outside the scope of this library. The top-level README.md was updated in #18 to add a warning that AVX2 is not guaranteed to work on Windows with a reference to the GCC bug.

damaki closed this as not planned Jun 4, 2022