arm: add support for detecting NEON (WIP) #144

Draft
wants to merge 1 commit into
base: master
from

Conversation

carenas
Contributor

@carenas carenas commented Dec 9, 2022

This probably needs some refactoring to make it more generic so that it also applies to s390x, or more extensible, considering there are now multiple vector instruction sets per architecture, with the most recent ones being variable width.

Is there any value in having implementation-specific names? Wouldn't it be better to use SLJIT_HAS_SIMD? And how would we differentiate vectors of different widths?

The ARM SIMD code in PCRE2 has a lot of issues, but one of its advantages was that it was less verbose and offloaded some of the hand coding to compiler intrinsics (though it should be using sljit intrinsics instead). Why not add them?

@zherczeg
Owner

These days "vector" seems more popular than "simd", and it should be generic, of course. Sljit does not emit any vector instructions at the moment.

@carenas carenas force-pushed the neon branch 3 times, most recently from 06c92fe to 522d11d Compare December 11, 2022 04:00
Requires getauxval(), which is available at least on Linux/Android
with recent versions of the libc, and is therefore behind a
configure-like macro.

The equivalent function on FreeBSD >= 12, Windows and NetBSD is
used in each of those cases.

While at it, consolidate the code to use the same externally
visible flag that is used in x86 for SSE2.
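
As an illustration of the getauxval() approach described in the commit message, here is a minimal sketch of runtime detection on Linux. It is not the code from this PR: the helper name `detect_neon_sketch` and the macro `SKETCH_HAVE_GETAUXVAL` are hypothetical, and the sketch assumes a Linux ARM target where `<sys/auxv.h>` and `<asm/hwcap.h>` are available. On any other platform it simply reports 0.

```c
#if defined(__linux__) && (defined(__arm__) || defined(__aarch64__))
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP */
#include <asm/hwcap.h>  /* HWCAP_NEON (32-bit) or HWCAP_ASIMD (64-bit) */
#define SKETCH_HAVE_GETAUXVAL 1 /* hypothetical guard, not an sljit macro */
#endif

/* Hypothetical helper: returns 1 if the kernel reports Advanced SIMD
   (NEON) support for this process, 0 otherwise or when unknown. */
static int detect_neon_sketch(void)
{
#if defined(SKETCH_HAVE_GETAUXVAL) && defined(__arm__) && defined(HWCAP_NEON)
	return (getauxval(AT_HWCAP) & HWCAP_NEON) != 0;
#elif defined(SKETCH_HAVE_GETAUXVAL) && defined(__aarch64__) && defined(HWCAP_ASIMD)
	return (getauxval(AT_HWCAP) & HWCAP_ASIMD) != 0;
#else
	/* No known detection method in this sketch: assume unavailable. */
	return 0;
#endif
}
```

A caller would typically run this once and cache the result, since the hardware capability bits do not change during the lifetime of the process.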
#define SLJIT_DETECT_SSE2 1
#if (defined SLJIT_CONFIG_X86_32 && SLJIT_CONFIG_X86_32) || \
(defined SLJIT_CONFIG_ARM_V7 && SLJIT_CONFIG_ARM_V7)
/* Auto detect availability of SSE2 (using CPUID) or NEON.
Contributor Author


It might be worth doing the checks here to avoid adding the "detect" support on platforms that are known not to require it (e.g. OpenBSD or macOS).
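
One way the suggestion above could look is a compile-time guard that turns runtime detection off on the listed platforms. This is only a sketch: `SKETCH_NEON_RUNTIME_DETECT` and `sketch_needs_runtime_detect` are hypothetical names, not existing sljit macros or functions, and the platform list is just the two examples given in the comment.

```c
/* Hypothetical macro: 0 on platforms assumed not to need runtime
   detection (the two examples from the review comment), 1 elsewhere. */
#if defined(__APPLE__) || defined(__OpenBSD__)
#define SKETCH_NEON_RUNTIME_DETECT 0
#else
#define SKETCH_NEON_RUNTIME_DETECT 1
#endif

/* Hypothetical helper exposing the compile-time decision. */
static int sketch_needs_runtime_detect(void)
{
	return SKETCH_NEON_RUNTIME_DETECT;
}
```

With a guard like this, the detection code itself could be wrapped in `#if SKETCH_NEON_RUNTIME_DETECT` so it is not compiled at all where it is not needed.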

return;

#if defined(__ARM_ARCH) && __ARM_ARCH == 8
/* TODO: confirm if optional with armv9 */
Contributor Author

@carenas carenas Dec 12, 2022


If true (because NEON is being "replaced" by SVE2), this might also apply to the 64-bit version, and therefore some of it might need to move to a "common" file.

Comment on lines +10756 to +10761
if (simd && fpu)
strcpy(features, " (with: fpu, simd)");
else if (fpu)
strcpy(features, " (with fpu)");
else
strcpy(features, " (without fpu)");
Contributor Author


This might misrepresent the case where (!fpu && simd), but that is by design: that combination, while possible, is not available in the market yet.

SLJIT_IS_FPU_AVAILABLE and its use cases are also missing and worth discussing, but IMHO that might be able to wait until SIMD support is added to sljit proper.
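
For comparison only, the following sketch shows what reporting all four fpu/simd combinations could look like if the (!fpu && simd) case ever needs to be distinguished. It is not what this PR does, which deliberately folds that case into " (without fpu)". The helper name `sketch_format_features` and the exact strings are assumptions made for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes a feature suffix into buf for every
   combination of the fpu and simd flags. */
static void sketch_format_features(char *buf, size_t size, int fpu, int simd)
{
	if (fpu && simd)
		snprintf(buf, size, " (with fpu, simd)");
	else if (fpu)
		snprintf(buf, size, " (with fpu)");
	else if (simd)
		snprintf(buf, size, " (with simd, without fpu)");
	else
		snprintf(buf, size, " (without fpu)");
}
```

Using snprintf() with an explicit size instead of strcpy() also keeps the write bounded if the strings grow later.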


2 participants