[Enhancement] Support select_if in arm #53093

Open
before-Sunrise wants to merge 4 commits into base: main

@before-Sunrise before-Sunrise commented Nov 21, 2024

Why I'm doing:

What I'm doing:

Benchmark for uint8_t (100000000 elements):
SIMD time: 15.9133 ms
Non-SIMD time: 290.615 ms
Speedup: 18.2624x

Benchmark for int16_t (100000000 elements):
SIMD time: 27.9742 ms
Non-SIMD time: 295.018 ms
Speedup: 10.5461x

Benchmark for int32_t (100000000 elements):
SIMD time: 51.5047 ms
Non-SIMD time: 291.931 ms
Speedup: 5.66804x

Benchmark for int64_t (100000000 elements):
SIMD time: 98.8005 ms
Non-SIMD time: 290.183 ms
Speedup: 2.93706x

Benchmark for double (100000000 elements):
SIMD time: 97.1446 ms
Non-SIMD time: 291.176 ms
Speedup: 2.99734x
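
For context on the numbers above, the non-SIMD baseline is presumably an element-wise loop of roughly the following shape. This is only a sketch, assuming select_if writes a[i] where selector[i] is non-zero and b[i] otherwise; `scalar_select_if` and `time_ms` are hypothetical names and not code from this PR.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar baseline (assumed semantics): dst[i] = selector[i] ? a[i] : b[i].
template <typename T>
void scalar_select_if(const uint8_t* selector, T* dst, const T* a, const T* b, size_t size) {
    assert(size == 0 || (selector && dst && a && b));
    for (size_t i = 0; i < size; ++i) {
        dst[i] = selector[i] ? a[i] : b[i];
    }
}

// Hypothetical timing helper: runs fn once and returns the elapsed wall time in milliseconds.
template <typename Fn>
double time_ms(Fn&& fn) {
    auto start = std::chrono::steady_clock::now();
    fn();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Wrapping both the scalar and the NEON call in `time_ms` over the same buffers would give a comparison of the kind reported above.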

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
  • I have added documentation for my new feature or new function
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 3.4
    • 3.3
    • 3.2
    • 3.1
    • 3.0

Signed-off-by: before-Sunrise <[email protected]>
Signed-off-by: before-Sunrise <[email protected]>
}

template <typename T, bool left_const = false, bool right_const = false>
inline void neon_select_if_common_implement(uint8_t*& selector, T*& dst, const T*& a, const T*& b, int size) {
Contributor commented:

Why does this need a const T*&? Could it be simplified to a const T*, or even a const void*?
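
One plausible reason for the reference-to-pointer parameters is that the SIMD routine advances the caller's pointers so that a scalar loop can finish the remaining tail. The sketch below puts that pattern next to the alternative this comment hints at, where pointers are passed by value and the routine returns how many elements it handled. All three functions are hypothetical stand-ins, and the per-block body is scalar so the sketch runs on any target.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr size_t kBlock = 16; // stand-in for the elements handled per SIMD iteration

// Style A (like the PR signature): pointers taken by reference and advanced in place.
template <typename T>
void select_blocks_by_ref(const uint8_t*& selector, T*& dst, const T*& a, const T*& b, size_t size) {
    for (size_t done = 0; done + kBlock <= size; done += kBlock) {
        for (size_t i = 0; i < kBlock; ++i) dst[i] = selector[i] ? a[i] : b[i];
        selector += kBlock;
        dst += kBlock;
        a += kBlock;
        b += kBlock;
    }
}

// Style B: pointers taken by value; the return value says how many elements were written.
template <typename T>
size_t select_blocks_by_value(const uint8_t* selector, T* dst, const T* a, const T* b, size_t size) {
    size_t done = 0;
    for (; done + kBlock <= size; done += kBlock) {
        for (size_t i = 0; i < kBlock; ++i) {
            dst[done + i] = selector[done + i] ? a[done + i] : b[done + i];
        }
    }
    return done;
}

// Caller for style B: the scalar tail resumes from the returned index.
template <typename T>
void select_if_with_tail(const uint8_t* selector, T* dst, const T* a, const T* b, size_t size) {
    size_t done = select_blocks_by_value(selector, dst, a, b, size);
    assert(done <= size);
    for (size_t i = done; i < size; ++i) dst[i] = selector[i] ? a[i] : b[i];
}
```

Style B keeps the caller's pointers untouched, which tends to be easier to reason about; style A avoids index arithmetic in the caller at the cost of hidden mutation.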

vec_a = vld1q_u16(reinterpret_cast<const uint16_t*>(a) + i * 8);
} else {
// vdupq_n_u16: Copy a 16-bit value to all elements in the register
vec_a = vdupq_n_u16(*reinterpret_cast<const uint16_t*>(a));
Contributor commented:

If it's a constant, you don't need to populate the register on every iteration, do you?
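
The point can be sketched as follows: when an operand is constant, the broadcast can happen once before the loop rather than inside it. This is a minimal illustration for 16-bit elements with a constant left operand, not the PR's code. `select_const_a_u16` is a hypothetical name, the mask is derived directly from the selector bytes for brevity, and the NEON path is compiled only on targets that define `__ARM_NEON`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

// Hypothetical helper: dst[i] = selector[i] ? a_const : b[i] for 16-bit elements.
inline void select_const_a_u16(const uint8_t* selector, uint16_t* dst, uint16_t a_const,
                               const uint16_t* b, size_t size) {
    assert(size == 0 || (selector && dst && b));
    size_t i = 0;
#if defined(__ARM_NEON)
    // Broadcast the constant (and the zero used for the compare) once, outside the loop.
    const uint16x8_t vec_a = vdupq_n_u16(a_const);
    const uint16x8_t zero = vdupq_n_u16(0);
    for (; i + 8 <= size; i += 8) {
        // Widen 8 selector bytes to 16 bits, then turn non-zero lanes into all-ones.
        uint16x8_t mask = vcgtq_u16(vmovl_u8(vld1_u8(selector + i)), zero);
        uint16x8_t vec_b = vld1q_u16(b + i);
        // vbslq_u16: take bits from vec_a where mask is set, otherwise from vec_b.
        vst1q_u16(dst + i, vbslq_u16(mask, vec_a, vec_b));
    }
#endif
    // Scalar tail (and the whole range on non-NEON targets).
    for (; i < size; ++i) {
        dst[i] = selector[i] ? a_const : b[i];
    }
}
```

A compiler may already hoist a loop-invariant `vdupq_n_u16`, so the benefit of writing it this way may be mostly clarity; measuring would settle it.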

}

uint8x16_t index = {0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1};
uint8x16_t mask = vqtbl1q_u8(loaded_mask, index);
Contributor commented:

This piece of code is almost the same everywhere except for the data_size parameter; it would be better to extract the common part.
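
One way the common part might be factored out is to derive the table-lookup index from data_size instead of spelling it out for each element width. The sketch below assumes the index pattern generalizes as lane i reading selector byte i / data_size, which matches the 8-byte table shown above. `make_expand_index`, `expand_mask_scalar`, and `expand_mask` are hypothetical names, and the `vqtbl1q_u8` wrapper is compiled only on AArch64.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#endif

// Hypothetical helper: lane i of the lookup table reads selector byte i / data_size.
// For data_size == 8 this yields {0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1}.
template <size_t data_size>
std::array<uint8_t, 16> make_expand_index() {
    static_assert(data_size == 1 || data_size == 2 || data_size == 4 || data_size == 8,
                  "element width must be 1, 2, 4 or 8 bytes");
    std::array<uint8_t, 16> index{};
    for (size_t i = 0; i < 16; ++i) {
        index[i] = static_cast<uint8_t>(i / data_size);
    }
    return index;
}

// Portable reference of the expansion, handy for checking the table on any target.
// Reads the first 16 / data_size selector bytes.
template <size_t data_size>
std::array<uint8_t, 16> expand_mask_scalar(const uint8_t* selector_bytes) {
    assert(selector_bytes != nullptr);
    const std::array<uint8_t, 16> index = make_expand_index<data_size>();
    std::array<uint8_t, 16> out{};
    for (size_t i = 0; i < 16; ++i) out[i] = selector_bytes[index[i]];
    return out;
}

#if defined(__aarch64__) && defined(__ARM_NEON)
// NEON version: one function for every element width instead of one copy per width.
template <size_t data_size>
inline uint8x16_t expand_mask(uint8x16_t loaded_mask) {
    static const std::array<uint8_t, 16> index = make_expand_index<data_size>();
    return vqtbl1q_u8(loaded_mask, vld1q_u8(index.data()));
}
#endif
```

Each per-width branch could then call `expand_mask<sizeof(T)>(loaded_mask)` and share the rest of the loop body.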

Signed-off-by: before-Sunrise <[email protected]>
Signed-off-by: before-Sunrise <[email protected]>

[Java-Extensions Incremental Coverage Report]

pass : 0 / 0 (0%)

[FE Incremental Coverage Report]

pass : 0 / 0 (0%)

[BE Incremental Coverage Report]

pass : 0 / 0 (0%)
