-
Notifications
You must be signed in to change notification settings - Fork 75
[NSE-927] Implement AVX512 optimization selection in Runtime and merge two C2R code files into one. #937
[NSE-927] Implement AVX512 optimization selection in Runtime and merge two C2R code files into one. #937
Conversation
Thanks for opening a pull request! Could you open an issue for this pull request on Github Issues? https://github.com/oap-project/native-sql-engine/issues Then could you also rename commit message and pull request title in the following format?
See also: |
cc @FelixYBW |
reminder_8x = _mm256_sub_epi32(x8_8x, reminder_8x); | ||
reminder_8x = _mm256_and_si256(reminder_8x, x7_8x); | ||
reminder_8x = _mm256_add_epi32(reminder_8x, length_8x); | ||
__m256i dst_length_8x = _mm256_loadu_si256((__m256i*)length_data); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here should be aligned load since we allocate length_data as 32B aligned
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
reminder_8x = _mm256_add_epi32(reminder_8x, length_8x); | ||
__m256i dst_length_8x = _mm256_loadu_si256((__m256i*)length_data); | ||
dst_length_8x = _mm256_add_epi32(dst_length_8x, reminder_8x); | ||
_mm256_storeu_si256((__m256i*)length_data, dst_length_8x); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similarly, aligned store
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
__m256i x8_8x = _mm256_load_si256((__m256i*)x_8); | ||
__m256i offsetarray_1_8x; | ||
if (j + 16 < num_rows_) { | ||
offsetarray_1_8x = _mm256_loadu_si256((__m256i*)&offsetarray[j]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fill an issue here. by default arrow should always allocate 64Byte aligned buffer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
can you fix the bencharm issue in this PR as well?
|
a7d0a59
to
b5c261e
Compare
Have fixed in #936 |
What changes were proposed in this pull request?
How was this patch tested?
(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)