Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

emsdk/upstream/emscripten/cache/sysroot/include/compat/emmintrin.h is not lib/clang/19/include/**/emmintrin.h #22226

Closed
gblikas opened this issue Jul 12, 2024 · 16 comments

Comments

@gblikas
Copy link

gblikas commented Jul 12, 2024

I'm looking to narrow down a build issue. I am compiling opencv4 via

vcpkg install opencv4:wasm32-emscripten --editable

and have modified a bad header to #include <emmintrin.h> as it is a missed dependency. I did this because the documentation for Using SIMD with WebAssembly indicates that _mm_setr_epi64 should be available via #include <emmintrin.h>. Judging by my error's output:

/Users/gblikas/projects/vcpkg/buildtrees/opencv4/src/4.8.0-2bf495557d/modules/core/include/opencv2/core/hal/intrin_sse.hpp:252:15: error: use of undeclared identifier '_mm_setr_epi64'; did you mean '_mm_set_epi64'?
  252 |         val = _mm_setr_epi64((__m64)v0, (__m64)v1);
      |               ^~~~~~~~~~~~~~
      |               _mm_set_epi64
/Users/gblikas/projects/emsdk/upstream/emscripten/cache/sysroot/include/compat/emmintrin.h:1133:1: note: '_mm_set_epi64' declared here
 1133 | _mm_set_epi64(long long q1, long long q0)
      | ^

and

$ cat /Users/gblikas/projects/emsdk/upstream/emscripten/cache/sysroot/include/compat/emmintrin.h  | grep "_setr"
_mm_setr_pd(double __c0, double __c1)
_mm_setr_epi32(int i0, int i1, int i2, int i3)
_mm_setr_epi16(short w0, short w1, short w2, short w3, short w4, short w5, short w6, short w7)
_mm_setr_epi8(char b0, char b1, char b2, char b3, char b4, char b5, char b6, char b7, char b8, char b9, char b10, char b11, char b12, char b13, char b14, char b15)

it appears as if the included header emmintrin.h from **/upstream/emscripten/cache/sysroot/include/compat/emmintrin.h is not the same as any of the lib/clang/19/include/**/emmintrin.h headers.

I'm not sure what emscripten/cache/sysroot is, so this issue is most-likley user error. However,

$ grep -lR "_mm_setr_epi64" emsdk
emsdk/upstream/emscripten/test/sse/test_sse2.cpp
emsdk/upstream/emscripten/test/sse/benchmark_sse2.cpp
emsdk/upstream/lib/clang/19/include/emmintrin.h
emsdk/upstream/lib/clang/19/include/ppc_wrappers/emmintrin.h

does show that _mm_setr_epi64 should be found in emsdk/upstream/lib/clang/19/include/**/emmintrin.h. Is there a known work around for this?ng the

If this issue is better suited for the opencv repository, or vcpkg, I'll move the issue over there. However I believe this is an #include issue with emscripten-core.


Potentially Related issue

Version Info

$ emcc -v 
emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 3.1.62 (34c1aa36052b1882058f22aa1916437ba0872690)
clang version 19.0.0git (https:/github.com/llvm/llvm-project c00ada070207979f092be9046a02fcfff8b9f9ce)
Target: wasm32-unknown-emscripten
Thread model: posix
InstalledDir: /Users/gblikas/projects/emsdk/upstream/bin
@sbc100
Copy link
Collaborator

sbc100 commented Jul 12, 2024

The compat/emmintrin.h in the emscripten sysroot looks like it uses wasm intrinsics which is likely better than the clang builtin one that is using __builtin_ia32 intrinsics.

I'm guess we simply have not updated it in a while and we need to add the missing _mm_setr_epi64 declaration.

@tlively do you know about the history of these two files?

@sbc100
Copy link
Collaborator

sbc100 commented Jul 12, 2024

It looks like the old version of this header (which was removed in #7924) did contain this function but when it was re-introduced in #11243 that function was not included.

CC @juj

@gblikas
Copy link
Author

gblikas commented Jul 12, 2024

The compat/emmintrin.h in the emscripten sysroot looks like it uses wasm intrinsics which is likely better than the clang builtin one that is using __builtin_ia32 intrinsics.

I'm guess we simply have not updated it in a while and we need to add the missing _mm_setr_epi64 declaration.

@tlively do you know about the history of these two files?

@sbc100 Thanks for the prompt response. If it's not too involved, I can push a PR.

@gblikas
Copy link
Author

gblikas commented Jul 12, 2024

It looks like the old version of this header (which was removed in #7924) did contain this function but when it was re-introduced in #11243 that function was not included.

CC @juj

@sbc100 So, I got a working solution. I'm now formalizing the patch in emscripten-core, and will post a PR we can edit/tune in, in a moment.

Additionally, I'm looking at the definitions,

static __inline__ __m128i __attribute__((__always_inline__, __nodebug__))
_mm_setr_epi32(int i0, int i1, int i2, int i3)
{
return (__m128i)wasm_i32x4_make(i0, i1, i2, i3);
}
static __inline__ __m128i __attribute__((__always_inline__, __nodebug__))
_mm_setr_epi16(short w0, short w1, short w2, short w3, short w4, short w5, short w6, short w7)
{
return (__m128i)wasm_i16x8_make(w0, w1, w2, w3, w4, w5, w6, w7);
}
static __inline__ __m128i __attribute__((__always_inline__, __nodebug__))
_mm_setr_epi8(char b0, char b1, char b2, char b3, char b4, char b5, char b6, char b7, char b8, char b9, char b10, char b11, char b12, char b13, char b14, char b15)
{
return (__m128i)wasm_i8x16_make(b0, b1, b2, b3, b4, b5, b6, b7, b8, b9, b10, b11, b12, b13, b14, b15);
}
and wondering why they are preferring builtin types: char, short, int, long, long long. It seems to be that it's better to transition them to their int8_t, int16_t, int32_t, int64_t, and __m64 specific types, in order to provide more portable and readable code. @tlively or @juj do you have any suggestions on this?

I understand that wasm_simd128.h typedef's the builtin types, and perhaps the reason for leaving

static __inline__ __m128i __attribute__((__always_inline__, __nodebug__))
_mm_setr_epi64(__m64 q0, __m64 q1)

out was the fact that wasm_simd128.h doesn't actually contain a direct typedef for __m64?

@tlively
Copy link
Member

tlively commented Jul 13, 2024

I would be fine with using the intX_t types.

__m64 is apparently defined in xmmintrin.h, so if it's missing, that would be the place to add it, not wasm_simd128.h.

@gblikas
Copy link
Author

gblikas commented Jul 14, 2024

@tlively or @sbc100 do you have an idea why the conditional in #7924 (was the PR it was removed):

#ifndef __EMSCRIPTEN__ // MMX support is not available in Emscripten/SIMD.js.
static __inline__ __m128i __attribute__((__always_inline__, __nodebug__))
_mm_setr_epi64(__m64 q0, __m64 q1)
{
  return (__m128i){ (long long)q0, (long long)q1 };
}
#endif

was based on the __EMSCRIPTEN__ macro instead of something MMX/SEE specific; what's the history here wrt the comment?

@tlively
Copy link
Member

tlively commented Jul 15, 2024

These headers are only ever included in Emscripten builds, so #ifndef __EMSCRIPTEN__ is functionally equivalent to #if 0. I guess this function was removed from the build because it could not be efficiently supported on top of SIMD.js back in the day, but I don't know why exactly.

@gblikas
Copy link
Author

gblikas commented Jul 25, 2024

@tlively #22243 should be good then.

@juj
Copy link
Collaborator

juj commented Aug 1, 2024

It looks like the old version of this header (which was removed in #7924) did contain this function but when it was re-introduced in #11243 that function was not included.

CC @juj

Originally when I wrote 128-bit SSE,SSE2, ... support, I did not add any of the old legacy 64-bit MMX support, which is considered obsolete: https://www.phoronix.com/news/LLVM-Clang-MMX-Intrinsics-SSE2 and https://groups.google.com/g/llvm-dev/c/7FEDyVe2ipQ/m/PY4F6hElAgAJ?pli=1

The reason was that Wasm SIMD128 did not support any 64-bit intrinsics.

Note that there is a little bit of overlap with 128-bit SSE* and 64-bit MMX, as the SSE* instruction sets also added (for completeness) some 64-bit MMX registers instructions in them: https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#ig_expand=5740&techs=SSE_ALL&text=__m64

so as result, not "faking" 64-bit intrinsics resulted in those SSE/SSE2/SSE3 functions being dropped out that had a __m64 in their signature.

I don't oppose adding support to 64-bit MMX functions, emulated by 128-bit Wasm SIMD128. That is the path that LLVM is taking as well, as mentioned in the above phoronix link.

@allsey87
Copy link

allsey87 commented Aug 27, 2024

@gblikas I think you might be taking the wrong approach with all this. OpenCV has support for WebAssembly SIMD and does not require SSE or SSE2 mappings/emulation.

The problem is that when building OpenCV for Emscripten, CV_CPU_COMPILE_SSE2 is defined by default via the CMake logic which results in the incorrect header files being included.

The solution to your problem is to disable SSE by passing the following options to CMake:

-DCV_ENABLE_INTRINSICS=ON -DCPU_BASELINE= -DCPU_DISPATCH=

Hope this helps!

@juj
Copy link
Collaborator

juj commented Aug 27, 2024

For anyone working close on OpenCV, it would actually be amazing to read a writeup of how the SSE path vs handwritten Wasm SIMD path vs a ARM Neon path compilation in Emscripten works out with performance. That is, since Emscripten supports all of those, users can choose which SIMD backend to target.

It would be a really nice investigation to see "here's how much of a performance difference there is if you use Wasm SIMD instead of SSE" in different OpenCV tasks.

@allsey87
Copy link

That is, since Emscripten supports all of those

Well, the SSE path for OpenCV is broken at the moment with some necessary functions missing (see #22243). I haven't tried the NEON path yet.

@gblikas
Copy link
Author

gblikas commented Oct 8, 2024

@gblikas I think you might be taking the wrong approach with all this. OpenCV has support for WebAssembly SIMD and does not require SSE or SSE2 mappings/emulation.

The problem is that when building OpenCV for Emscripten, CV_CPU_COMPILE_SSE2 is defined by default via the CMake logic which results in the incorrect header files being included.

The solution to your problem is to disable SSE by passing the following options to CMake:

-DCV_ENABLE_INTRINSICS=ON -DCPU_BASELINE= -DCPU_DISPATCH=

Hope this helps!

@allsey87 Totally right. Sorry for the long delay; after seeing your comment, I went through and modified the opencv4 portfile to support building Emscripten. In doing so, many of the flags don't work or result in broken builds related to what you mention:

-DCV_ENABLE_INTRINSICS=ON -DCPU_BASELINE= -DCPU_DISPATCH=

Also, there are some threading issues once you do build it - threads is pretty much broken.

@gblikas
Copy link
Author

gblikas commented Oct 8, 2024

For anyone working close on OpenCV, it would actually be amazing to read a writeup of how the SSE path vs handwritten Wasm SIMD path vs a ARM Neon path compilation in Emscripten works out with performance. That is, since Emscripten supports all of those, users can choose which SIMD backend to target.

It would be a really nice investigation to see "here's how much of a performance difference there is if you use Wasm SIMD instead of SSE" in different OpenCV tasks.

@juj I'm not close to the project, but am taking a stab at it. I've just finished my internal deployment, and am looking to do comparative analysis. Might be slow, but I'll get there.

@gblikas
Copy link
Author

gblikas commented Oct 8, 2024

That is, since Emscripten supports all of those

Well, the SSE path for OpenCV is broken at the moment with some necessary functions missing (see #22243). I haven't tried the NEON path yet.

@allsey87 I opended #22243 which might be irrelevant now... I also haven't heard anything back from OpenCV on how they actually use Emscripten internally... Their documentation says one thing, then the github runners another, then the internal Intel whatever you call it says another. See opencv/opencv#25956.

@gblikas
Copy link
Author

gblikas commented Oct 8, 2024

I'm going to close this comment until I hear back from the opencv team and finalize my own internal testing against Emscripten 2.0.13 (documented OpenCV deps) and Emscripten latest (what OpenCV looks to be running?...).

@gblikas gblikas closed this as completed Oct 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants