Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issues related to MXCSR register #1060

Merged
merged 4 commits into from
Sep 26, 2023

Conversation

M-HT
Copy link
Contributor

@M-HT M-HT commented Sep 5, 2023

These are fixes for issues described in #1059.

@mr-c
Copy link
Collaborator

mr-c commented Sep 5, 2023

Thank you @M-HT ! Can you add some tests to demonstrate the fixes?

@M-HT
Copy link
Contributor Author

M-HT commented Sep 5, 2023

The first two issues are missing definitions so tests don't make sense, because programs using them fail to compile without the fix.
As for the third issue, I don't know how to demonstrate it.

@mr-c
Copy link
Collaborator

mr-c commented Sep 6, 2023

In the native-aliases CI test, we convert all of the test case source code to not use prefixed version of the intrinsics and we enable SIMDE_ENABLE_NATIVE_ALIASES to ensure that the aliases work

To test the _MM_ROUND_MASK native alias, write a test that uses SIMDE_ENABLE_NATIVE_ALIASES; likewise for _MM_GET_FLUSH_ZERO_MODE via SIMDE_MM_GET_FLUSH_ZERO_MODE.

I'm less concerned about the third case.

@M-HT
Copy link
Contributor Author

M-HT commented Sep 6, 2023

I'm not sure what meaningful test can be written which uses _MM_ROUND_MASK / _MM_GET_FLUSH_ZERO_MODE.
Also native-aliases.sh script only removes simde prefix and not SIMDE prefix, and it ignores macro "constants" like _MM_ROUND_MASK, _MM_SHUFFLE, ...

@M-HT
Copy link
Contributor Author

M-HT commented Sep 10, 2023

How about this test, is it acceptable ?

static int
test_simde_MXCSR (SIMDE_MUNIT_TEST_ARGS) {
  uint32_t original_mxcsr = simde_mm_getcsr();
  uint32_t masked_mxcsr = original_mxcsr & ~(SIMDE_MM_ROUND_MASK | SIMDE_MM_FLUSH_ZERO_MASK);

  simde_mm_setcsr(masked_mxcsr | SIMDE_MM_ROUND_NEAREST | SIMDE_MM_FLUSH_ZERO_OFF);
  uint32_t rm_nearest_off = HEDLEY_STATIC_CAST(uint32_t, SIMDE_MM_GET_ROUNDING_MODE());
  uint32_t fzm_nearest_off = HEDLEY_STATIC_CAST(uint32_t, SIMDE_MM_GET_FLUSH_ZERO_MODE());

  simde_mm_setcsr(masked_mxcsr | SIMDE_MM_ROUND_NEAREST | SIMDE_MM_FLUSH_ZERO_ON);
  uint32_t rm_nearest_on = HEDLEY_STATIC_CAST(uint32_t, SIMDE_MM_GET_ROUNDING_MODE());
  uint32_t fzm_nearest_on = HEDLEY_STATIC_CAST(uint32_t, SIMDE_MM_GET_FLUSH_ZERO_MODE());

  simde_mm_setcsr(masked_mxcsr | SIMDE_MM_ROUND_DOWN | SIMDE_MM_FLUSH_ZERO_OFF);
  uint32_t rm_down_off = HEDLEY_STATIC_CAST(uint32_t, SIMDE_MM_GET_ROUNDING_MODE());
  uint32_t fzm_down_off = HEDLEY_STATIC_CAST(uint32_t, SIMDE_MM_GET_FLUSH_ZERO_MODE());

  simde_mm_setcsr(masked_mxcsr | SIMDE_MM_ROUND_DOWN | SIMDE_MM_FLUSH_ZERO_ON);
  uint32_t rm_down_on = HEDLEY_STATIC_CAST(uint32_t, SIMDE_MM_GET_ROUNDING_MODE());
  uint32_t fzm_down_on = HEDLEY_STATIC_CAST(uint32_t, SIMDE_MM_GET_FLUSH_ZERO_MODE());

  simde_mm_setcsr(masked_mxcsr | SIMDE_MM_ROUND_UP | SIMDE_MM_FLUSH_ZERO_OFF);
  uint32_t rm_up_off = HEDLEY_STATIC_CAST(uint32_t, SIMDE_MM_GET_ROUNDING_MODE());
  uint32_t fzm_up_off = HEDLEY_STATIC_CAST(uint32_t, SIMDE_MM_GET_FLUSH_ZERO_MODE());

  simde_mm_setcsr(masked_mxcsr | SIMDE_MM_ROUND_UP | SIMDE_MM_FLUSH_ZERO_ON);
  uint32_t rm_up_on = HEDLEY_STATIC_CAST(uint32_t, SIMDE_MM_GET_ROUNDING_MODE());
  uint32_t fzm_up_on = HEDLEY_STATIC_CAST(uint32_t, SIMDE_MM_GET_FLUSH_ZERO_MODE());

  simde_mm_setcsr(masked_mxcsr | SIMDE_MM_ROUND_TOWARD_ZERO | SIMDE_MM_FLUSH_ZERO_OFF);
  uint32_t rm_zero_off = HEDLEY_STATIC_CAST(uint32_t, SIMDE_MM_GET_ROUNDING_MODE());
  uint32_t fzm_zero_off = HEDLEY_STATIC_CAST(uint32_t, SIMDE_MM_GET_FLUSH_ZERO_MODE());

  simde_mm_setcsr(masked_mxcsr | SIMDE_MM_ROUND_TOWARD_ZERO | SIMDE_MM_FLUSH_ZERO_ON);
  uint32_t rm_zero_on = HEDLEY_STATIC_CAST(uint32_t, SIMDE_MM_GET_ROUNDING_MODE());
  uint32_t fzm_zero_on = HEDLEY_STATIC_CAST(uint32_t, SIMDE_MM_GET_FLUSH_ZERO_MODE());

  simde_mm_setcsr(original_mxcsr);

  simde_assert_equal_u32(rm_nearest_off, rm_nearest_on);
  simde_assert_equal_u32(rm_down_off, rm_down_on);
  simde_assert_equal_u32(rm_up_off, rm_up_on);
  simde_assert_equal_u32(rm_zero_off, rm_zero_on);

  simde_assert_equal_u32(fzm_nearest_off, fzm_down_off);
  simde_assert_equal_u32(fzm_nearest_off, fzm_up_off);
  simde_assert_equal_u32(fzm_nearest_off, fzm_zero_off);

  simde_assert_equal_u32(fzm_nearest_on, fzm_down_on);
  simde_assert_equal_u32(fzm_nearest_on, fzm_up_on);
  simde_assert_equal_u32(fzm_nearest_on, fzm_zero_on);

  return 0;
}

@mr-c
Copy link
Collaborator

mr-c commented Sep 12, 2023

Also native-aliases.sh script only removes simde prefix and not SIMDE prefix, and it ignores macro "constants" like _MM_ROUND_MASK, _MM_SHUFFLE, ...

Oh, I didn't see that. It would be good to add those SIMDE_ prefixed to the script, but I can see how that is a bit tricky with the private SIMDE_ constants and other defines..

How about this test, is it acceptable ?

I like them, thanks! Can you add them to this PR?

@M-HT
Copy link
Contributor Author

M-HT commented Sep 13, 2023

I looked at the how to fix the failing tests and noticed following:
In xmmintrin.h (from gcc, clang and visual studio), functions _mm_getcsr and _mm_setcsr use unsigned int datatype and _MM_GET_ROUNDING_MODE, _MM_SET_ROUNDING_MODE, _MM_GET_FLUSH_ZERO_MODE, _MM_SET_FLUSH_ZERO_MODE are either macros using _mm_getcsr and _mm_setcsr or functions using unsigned int datatype.

In simde, functions simde_mm_getcsr, simde_mm_setcsr, SIMDE_MM_GET_FLUSH_ZERO_MODE, SIMDE_MM_SET_FLUSH_ZERO_MODE use uint32_t datatype (which is an understandable change), but functions SIMDE_MM_GET_ROUNDING_MODE and SIMDE_MM_SET_ROUNDING_MODE use unsigned int datatype.

I think functions SIMDE_MM_GET_ROUNDING_MODE and SIMDE_MM_SET_ROUNDING_MODE should also use uint32_t datatype (to be consistent). Then fixing the failing tests would be simple.

@mr-c
Copy link
Collaborator

mr-c commented Sep 13, 2023

I think functions SIMDE_MM_GET_ROUNDING_MODE and SIMDE_MM_SET_ROUNDING_MODE should also use uint32_t datatype (to be consistent). Then fixing the failing tests would be simple.

Makes sense to me, please implement that; thank you!

@mr-c mr-c enabled auto-merge (squash) September 13, 2023 13:20
@mr-c
Copy link
Collaborator

mr-c commented Sep 14, 2023

@M-HT Do you have time to file the bug report with emscripten/llvm? The emscripten people are pretty fast at fixing things

Directions: https://emscripten.org/docs/getting_started/bug_reports.html

Error log: https://github.com/simd-everywhere/simde/actions/runs/6172090486/job/16751643532?pr=1060#step:8:1076

@M-HT
Copy link
Contributor Author

M-HT commented Sep 14, 2023

I created this issue: emscripten-core/emscripten#20250

@mr-c mr-c disabled auto-merge September 26, 2023 07:15
@mr-c mr-c enabled auto-merge (squash) September 26, 2023 11:58
@mr-c mr-c merged commit 653aba8 into simd-everywhere:master Sep 26, 2023
74 checks passed
@M-HT M-HT deleted the fix-issues-mxcsr branch September 26, 2023 17:18
yyctw pushed a commit to yyctw/simde that referenced this pull request Oct 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants