Only use the ARM NEON 32-way unrolled rANS on AArch64.
__ARM_NEON alone isn't a sufficient guard, as AArch32 also has some limited
NEON capabilities.  While we could no doubt write a 32-bit alternative,
for now the simple fix is to let AArch32 use the scalar
implementation.

Writing a 32-bit NEON version is a complex task, and without access to the
hardware it's pretty much impossible.  I also wouldn't have high hopes
for any significant speed gains over scalar with only half the lanes
available.

Fixes samtools#81
jkbonfield committed Apr 18, 2023
1 parent 5aecc6e commit 16347f9
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion htscodecs/rANS_static32x16pr_neon.c
@@ -32,7 +32,7 @@
  */
 
 #include "config.h"
-#ifdef __ARM_NEON
+#if defined(__ARM_NEON) && defined(__aarch64__)
 #include <arm_neon.h>
 
 #include <limits.h>
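
As a rough illustration of what the tightened guard does, here is a minimal sketch of compile-time backend selection. The macro EXAMPLE_HAVE_NEON32X16 and the function example_rans_backend() are hypothetical names invented for this example and are not part of htscodecs; only the preprocessor condition itself comes from the diff.

```c
#include <stdio.h>

/* Hypothetical feature macro, for illustration only (not an htscodecs name).
 * It is 1 only when the compiler targets AArch64 with NEON enabled. */
#if defined(__ARM_NEON) && defined(__aarch64__)
#  define EXAMPLE_HAVE_NEON32X16 1
#else
#  define EXAMPLE_HAVE_NEON32X16 0
#endif

/* Hypothetical helper: reports which code path this build would select.
 * An AArch32 build with NEON enabled defines __ARM_NEON but not
 * __aarch64__, so it falls through to "scalar". */
const char *example_rans_backend(void) {
#if EXAMPLE_HAVE_NEON32X16
    return "neon";
#else
    return "scalar";
#endif
}

/* Hypothetical helper: prints the selected backend to stdout. */
void example_print_backend(void) {
    printf("rANS 32-way backend: %s\n", example_rans_backend());
}
```

Under these assumptions, a 32-bit ARM build with NEON enabled would report "scalar" rather than attempting to compile the 32-way unrolled NEON code, which matches the intent stated in the commit message.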
