diff --git a/neon.md b/neon.md
index 0735ac3..c72676b 100644
--- a/neon.md
+++ b/neon.md
@@ -1,6 +1,6 @@
 # Summary

-TL;DR: SIMDe currently implements 4664 out of 6670 (69.93%) NEON functions. If you don't count poly types, it's 4664 / 5692 (81.94%).
+TL;DR: SIMDe currently implements 5194 out of 6670 (77.87%) NEON functions. If you don't count poly types, it's 5194 / 5692 (91.25%).

 SIMDe does not currently support polynomial types, so they are excluded from this list (though separate totals are often provided to be transparent about what was skipped. We do plan to support these types in the future.

@@ -8,179 +8,17 @@ SIMDe does not currently support polynomial types, so they are excluded from thi

 | Architecture | Functions | Functions with supported types | Implemented by SIMDe | Percent implemented |
 |--------------|----------:|-------------------------------:|---------------------:|--------------------:|
-| ARMv7 | 3411 | 2984 | 2828 | 94.77% |
-| ARMv8 | 4290 | 3479 | 3091 | 88.85% |
-| AArch64 | 6670 | 5692 | 4664 | 81.94% |
+| ARMv7 | 3411 | 2984 | 2984 | 100.00% |
+| ARMv8 | 4290 | 3479 | 3320 | 95.43% |
+| AArch64 | 6670 | 5692 | 5194 | 91.25% |

 # Families

-There are 390 function families in NEON (based on how we define families). Discounting functions which use unsupported types, SIMDe has completely implemented 245 (62.82%) and partially implemented another 43 (11.03%).
+There are 390 function families in NEON (based on how we define families). Discounting functions which use unsupported types, SIMDe has completely implemented 303 (77.69%) and partially implemented another 16 (4.10%).

 ## Incomplete Families

-There are currently 43 incomplete families.
-
-### abd
-
-SIMDe currently implements 25 of 28 (89.29%) functions.
- - * [x] vabd_s8 - * [x] vabd_s16 - * [x] vabd_s32 - * [x] vabd_u8 - * [x] vabd_u16 - * [x] vabd_u32 - * [x] vabd_f32 - * [ ] vabd_f16 - * [x] vabd_s8 - * [x] vabdq_s8 - * [x] vabd_s16 - * [x] vabdq_s16 - * [x] vabd_s32 - * [x] vabdq_s32 - * [x] vabd_u8 - * [x] vabdq_u8 - * [x] vabd_u16 - * [x] vabdq_u16 - * [x] vabd_u32 - * [x] vabdq_u32 - * [x] vabd_f32 - * [x] vabdq_f32 - * [x] vabd_f64 - * [x] vabdq_f64 - * [x] vabds_f32 - * [x] vabdd_f64 - * [ ] vabd_f16 - * [ ] vabdq_f16 - -### cgez - -SIMDe currently implements 21 of 24 (87.50%) functions. - - * [x] vcgez_s8 - * [x] vcgez_s16 - * [x] vcgez_s32 - * [x] vcgez_s64 - * [x] vcgez_f32 - * [x] vcgez_f64 - * [ ] vcgez_f16 - * [x] vcgez_s8 - * [x] vcgezq_s8 - * [x] vcgez_s16 - * [x] vcgezq_s16 - * [x] vcgez_s32 - * [x] vcgezq_s32 - * [x] vcgez_s64 - * [x] vcgezq_s64 - * [x] vcgez_f32 - * [x] vcgezq_f32 - * [x] vcgez_f64 - * [x] vcgezq_f64 - * [x] vcgezd_s64 - * [x] vcgezs_f32 - * [x] vcgezd_f64 - * [ ] vcgez_f16 - * [ ] vcgezq_f16 - -### cgtz - -SIMDe currently implements 21 of 24 (87.50%) functions. - - * [x] vcgtz_s8 - * [x] vcgtz_s16 - * [x] vcgtz_s32 - * [x] vcgtz_s64 - * [x] vcgtz_f32 - * [x] vcgtz_f64 - * [ ] vcgtz_f16 - * [x] vcgtz_s8 - * [x] vcgtzq_s8 - * [x] vcgtz_s16 - * [x] vcgtzq_s16 - * [x] vcgtz_s32 - * [x] vcgtzq_s32 - * [x] vcgtz_s64 - * [x] vcgtzq_s64 - * [x] vcgtz_f32 - * [x] vcgtzq_f32 - * [x] vcgtz_f64 - * [x] vcgtzq_f64 - * [x] vcgtzd_s64 - * [x] vcgtzs_f32 - * [x] vcgtzd_f64 - * [ ] vcgtz_f16 - * [ ] vcgtzq_f16 - -### cle - -SIMDe currently implements 34 of 37 (91.89%) functions. 
- - * [x] vcle_s8 - * [x] vcle_s16 - * [x] vcle_s32 - * [x] vcle_u8 - * [x] vcle_u16 - * [x] vcle_u32 - * [x] vcle_f32 - * [x] vcle_s64 - * [x] vcle_u64 - * [x] vcle_f64 - * [ ] vcle_f16 - * [x] vcle_s8 - * [x] vcleq_s8 - * [x] vcle_s16 - * [x] vcleq_s16 - * [x] vcle_s32 - * [x] vcleq_s32 - * [x] vcle_u8 - * [x] vcleq_u8 - * [x] vcle_u16 - * [x] vcleq_u16 - * [x] vcle_u32 - * [x] vcleq_u32 - * [x] vcle_f32 - * [x] vcleq_f32 - * [x] vcle_s64 - * [x] vcleq_s64 - * [x] vcle_u64 - * [x] vcleq_u64 - * [x] vcle_f64 - * [x] vcleq_f64 - * [x] vcled_s64 - * [x] vcled_u64 - * [x] vcles_f32 - * [x] vcled_f64 - * [ ] vcle_f16 - * [ ] vcleq_f16 - -### cltz - -SIMDe currently implements 21 of 24 (87.50%) functions. - - * [x] vcltz_s8 - * [x] vcltz_s16 - * [x] vcltz_s32 - * [x] vcltz_s64 - * [x] vcltz_f32 - * [x] vcltz_f64 - * [ ] vcltz_f16 - * [x] vcltz_s8 - * [x] vcltzq_s8 - * [x] vcltz_s16 - * [x] vcltzq_s16 - * [x] vcltz_s32 - * [x] vcltzq_s32 - * [x] vcltz_s64 - * [x] vcltzq_s64 - * [x] vcltz_f32 - * [x] vcltzq_f32 - * [x] vcltz_f64 - * [x] vcltzq_f64 - * [x] vcltzd_s64 - * [x] vcltzs_f32 - * [x] vcltzd_f64 - * [ ] vcltz_f16 - * [ ] vcltzq_f16 +There are currently 16 incomplete families. ### cmla @@ -220,111 +58,6 @@ SIMDe currently implements 12 of 21 (57.14%) functions. * [x] vcmlaq_rot270_f32 * [x] vcmlaq_rot270_f64 -### cvt_n - -SIMDe currently implements 36 of 44 (81.82%) functions. 
- - * [x] vcvt_n_s32_f32 - * [x] vcvt_n_u32_f32 - * [x] vcvt_n_s64_f64 - * [x] vcvt_n_u64_f64 - * [x] vcvt_n_f32_s32 - * [x] vcvt_n_f32_u32 - * [x] vcvt_n_f64_s64 - * [x] vcvt_n_f64_u64 - * [x] vcvt_n_f16_s16 - * [x] vcvt_n_f16_u16 - * [x] vcvt_n_s16_f16 - * [x] vcvt_n_u16_f16 - * [x] vcvt_n_s32_f32 - * [x] vcvtq_n_s32_f32 - * [x] vcvt_n_u32_f32 - * [x] vcvtq_n_u32_f32 - * [ ] vcvts_n_s32_f32 - * [ ] vcvts_n_u32_f32 - * [x] vcvt_n_s64_f64 - * [x] vcvtq_n_s64_f64 - * [x] vcvt_n_u64_f64 - * [x] vcvtq_n_u64_f64 - * [ ] vcvtd_n_s64_f64 - * [ ] vcvtd_n_u64_f64 - * [x] vcvt_n_f32_s32 - * [x] vcvtq_n_f32_s32 - * [x] vcvt_n_f32_u32 - * [x] vcvtq_n_f32_u32 - * [ ] vcvts_n_f32_s32 - * [ ] vcvts_n_f32_u32 - * [x] vcvt_n_f64_s64 - * [x] vcvtq_n_f64_s64 - * [x] vcvt_n_f64_u64 - * [x] vcvtq_n_f64_u64 - * [ ] vcvtd_n_f64_s64 - * [ ] vcvtd_n_f64_u64 - * [x] vcvt_n_f16_s16 - * [x] vcvtq_n_f16_s16 - * [x] vcvt_n_f16_u16 - * [x] vcvtq_n_f16_u16 - * [x] vcvt_n_s16_f16 - * [x] vcvtq_n_s16_f16 - * [x] vcvt_n_u16_f16 - * [x] vcvtq_n_u16_f16 - -### cvta - -SIMDe currently implements 8 of 22 (36.36%) functions. - - * [x] vcvta_s32_f32 - * [x] vcvta_u32_f32 - * [ ] vcvta_s64_f64 - * [ ] vcvta_u64_f64 - * [ ] vcvta_s16_f16 - * [ ] vcvta_u16_f16 - * [x] vcvta_s32_f32 - * [x] vcvtaq_s32_f32 - * [x] vcvta_u32_f32 - * [x] vcvtaq_u32_f32 - * [x] vcvtas_s32_f32 - * [x] vcvtas_u32_f32 - * [ ] vcvta_s64_f64 - * [ ] vcvtaq_s64_f64 - * [ ] vcvta_u64_f64 - * [ ] vcvtaq_u64_f64 - * [ ] vcvtad_s64_f64 - * [ ] vcvtad_u64_f64 - * [ ] vcvta_s16_f16 - * [ ] vcvtaq_s16_f16 - * [ ] vcvta_u16_f16 - * [ ] vcvtaq_u16_f16 - -### cvth - -SIMDe currently implements 4 of 24 (16.67%) functions, not counting 2 which require currently unsupported types. 
- - * [ ] vcvth_f16_s16 - * [ ] vcvth_f16_s32 - * [ ] vcvth_f16_s64 - * [ ] vcvth_f16_u16 - * [ ] vcvth_f16_u32 - * [ ] vcvth_f16_u64 - * [x] vcvth_s16_f16 - * [ ] vcvth_s32_f16 - * [ ] vcvth_s64_f16 - * [x] vcvth_u16_f16 - * [ ] vcvth_u32_f16 - * [ ] vcvth_u64_f16 - * [ ] vcvth_f16_s16 - * [ ] vcvth_f16_s32 - * [ ] vcvth_f16_s64 - * [ ] vcvth_f16_u16 - * [ ] vcvth_f16_u32 - * [ ] vcvth_f16_u64 - * [x] vcvth_s16_f16 - * [ ] vcvth_s32_f16 - * [ ] vcvth_s64_f16 - * [x] vcvth_u16_f16 - * [ ] vcvth_u32_f16 - * [ ] vcvth_u64_f16 - ### div SIMDe currently implements 3 of 9 (33.33%) functions. @@ -439,82 +172,6 @@ SIMDe currently implements 2 of 12 (16.67%) functions, not counting 8 which requ * [x] vduph_lane_f16 * [ ] vduph_laneq_f16 -### ld3 - -SIMDe currently implements 32 of 33 (96.97%) functions, not counting 12 which require currently unsupported types. - - * [x] vld3_s8 - * [x] vld3_s16 - * [x] vld3_s32 - * [x] vld3_u8 - * [x] vld3_u16 - * [x] vld3_u32 - * [x] vld3_f16 - * [x] vld3_f32 - * [x] vld3_s64 - * [x] vld3_u64 - * [x] vld3_f64 - * [x] vld3_s8 - * [x] vld3q_s8 - * [x] vld3_s16 - * [x] vld3q_s16 - * [x] vld3_s32 - * [x] vld3q_s32 - * [x] vld3_u8 - * [x] vld3q_u8 - * [x] vld3_u16 - * [x] vld3q_u16 - * [x] vld3_u32 - * [x] vld3q_u32 - * [x] vld3_f16 - * [ ] vld3q_f16 - * [x] vld3_f32 - * [x] vld3q_f32 - * [x] vld3_s64 - * [x] vld3_u64 - * [x] vld3q_s64 - * [x] vld3q_u64 - * [x] vld3_f64 - * [x] vld3q_f64 - -### ld4 - -SIMDe currently implements 32 of 33 (96.97%) functions, not counting 12 which require currently unsupported types. 
- - * [x] vld4_s8 - * [x] vld4_s16 - * [x] vld4_s32 - * [x] vld4_u8 - * [x] vld4_u16 - * [x] vld4_u32 - * [x] vld4_f16 - * [x] vld4_f32 - * [x] vld4_s64 - * [x] vld4_u64 - * [x] vld4_f64 - * [x] vld4_s8 - * [x] vld4q_s8 - * [x] vld4_s16 - * [x] vld4q_s16 - * [x] vld4_s32 - * [x] vld4q_s32 - * [x] vld4_u8 - * [x] vld4q_u8 - * [x] vld4_u16 - * [x] vld4q_u16 - * [x] vld4_u32 - * [x] vld4q_u32 - * [x] vld4_f16 - * [ ] vld4q_f16 - * [x] vld4_f32 - * [x] vld4q_f32 - * [x] vld4_s64 - * [x] vld4_u64 - * [x] vld4q_s64 - * [x] vld4q_u64 - * [x] vld4_f64 - * [x] vld4q_f64 - ### maxnm SIMDe currently implements 6 of 9 (66.67%) functions. @@ -705,370 +362,6 @@ SIMDe currently implements 25 of 28 (89.29%) functions. * [ ] vpmin_f16 * [ ] vpminq_f16 -### reinterpret - -SIMDe currently implements 302 of 330 (91.52%) functions, not counting 316 which require currently unsupported types. - - * [x] vreinterpret_s16_s8 - * [x] vreinterpret_s32_s8 - * [x] vreinterpret_f32_s8 - * [x] vreinterpret_u8_s8 - * [x] vreinterpret_u16_s8 - * [x] vreinterpret_u32_s8 - * [x] vreinterpret_u64_s8 - * [x] vreinterpret_s64_s8 - * [x] vreinterpret_f64_s8 - * [x] vreinterpret_f16_s8 - * [x] vreinterpret_s8_s16 - * [x] vreinterpret_s32_s16 - * [x] vreinterpret_f32_s16 - * [x] vreinterpret_u8_s16 - * [x] vreinterpret_u16_s16 - * [x] vreinterpret_u32_s16 - * [x] vreinterpret_u64_s16 - * [x] vreinterpret_s64_s16 - * [x] vreinterpret_f64_s16 - * [x] vreinterpret_f16_s16 - * [x] vreinterpret_s8_s32 - * [x] vreinterpret_s16_s32 - * [x] vreinterpret_f32_s32 - * [x] vreinterpret_u8_s32 - * [x] vreinterpret_u16_s32 - * [x] vreinterpret_u32_s32 - * [x] vreinterpret_u64_s32 - * [x] vreinterpret_s64_s32 - * [x] vreinterpret_f64_s32 - * [x] vreinterpret_f16_s32 - * [x] vreinterpret_s8_f32 - * [x] vreinterpret_s16_f32 - * [x] vreinterpret_s32_f32 - * [x] vreinterpret_u8_f32 - * [x] vreinterpret_u16_f32 - * [x] vreinterpret_u32_f32 - * [x] vreinterpret_u64_f32 - * [x] vreinterpret_s64_f32 - * [x] vreinterpret_f64_f32 
- * [x] vreinterpret_f16_f32 - * [x] vreinterpret_s8_u8 - * [x] vreinterpret_s16_u8 - * [x] vreinterpret_s32_u8 - * [x] vreinterpret_f32_u8 - * [x] vreinterpret_u16_u8 - * [x] vreinterpret_u32_u8 - * [x] vreinterpret_u64_u8 - * [x] vreinterpret_s64_u8 - * [x] vreinterpret_f64_u8 - * [x] vreinterpret_f16_u8 - * [x] vreinterpret_s8_u16 - * [x] vreinterpret_s16_u16 - * [x] vreinterpret_s32_u16 - * [x] vreinterpret_f32_u16 - * [x] vreinterpret_u8_u16 - * [x] vreinterpret_u32_u16 - * [x] vreinterpret_u64_u16 - * [x] vreinterpret_s64_u16 - * [x] vreinterpret_f64_u16 - * [x] vreinterpret_f16_u16 - * [x] vreinterpret_s8_u32 - * [x] vreinterpret_s16_u32 - * [x] vreinterpret_s32_u32 - * [x] vreinterpret_f32_u32 - * [x] vreinterpret_u8_u32 - * [x] vreinterpret_u16_u32 - * [x] vreinterpret_u64_u32 - * [x] vreinterpret_s64_u32 - * [x] vreinterpret_f64_u32 - * [x] vreinterpret_f16_u32 - * [x] vreinterpret_s8_u64 - * [x] vreinterpret_s16_u64 - * [x] vreinterpret_s32_u64 - * [x] vreinterpret_f32_u64 - * [x] vreinterpret_u8_u64 - * [x] vreinterpret_u16_u64 - * [x] vreinterpret_u32_u64 - * [x] vreinterpret_s64_u64 - * [x] vreinterpret_f64_u64 - * [x] vreinterpret_f16_u64 - * [x] vreinterpret_s8_s64 - * [x] vreinterpret_s16_s64 - * [x] vreinterpret_s32_s64 - * [x] vreinterpret_f32_s64 - * [x] vreinterpret_u8_s64 - * [x] vreinterpret_u16_s64 - * [x] vreinterpret_u32_s64 - * [x] vreinterpret_u64_s64 - * [x] vreinterpret_f64_s64 - * [x] vreinterpret_f16_s64 - * [ ] vreinterpret_s8_f16 - * [ ] vreinterpret_s16_f16 - * [ ] vreinterpret_s32_f16 - * [ ] vreinterpret_f32_f16 - * [ ] vreinterpret_u8_f16 - * [x] vreinterpret_u16_f16 - * [ ] vreinterpret_u32_f16 - * [x] vreinterpret_u64_f16 - * [ ] vreinterpret_s64_f16 - * [ ] vreinterpret_f64_f16 - * [x] vreinterpret_s8_f64 - * [x] vreinterpret_s16_f64 - * [x] vreinterpret_s32_f64 - * [x] vreinterpret_u8_f64 - * [x] vreinterpret_u16_f64 - * [x] vreinterpret_u32_f64 - * [x] vreinterpret_u64_f64 - * [x] vreinterpret_s64_f64 - * [ ] 
vreinterpret_f16_f64 - * [x] vreinterpret_f32_f64 - * [x] vreinterpret_s16_s8 - * [x] vreinterpret_s32_s8 - * [x] vreinterpret_f32_s8 - * [x] vreinterpret_u8_s8 - * [x] vreinterpret_u16_s8 - * [x] vreinterpret_u32_s8 - * [x] vreinterpret_u64_s8 - * [x] vreinterpret_s64_s8 - * [x] vreinterpret_f64_s8 - * [x] vreinterpret_f16_s8 - * [x] vreinterpret_s8_s16 - * [x] vreinterpret_s32_s16 - * [x] vreinterpret_f32_s16 - * [x] vreinterpret_u8_s16 - * [x] vreinterpret_u16_s16 - * [x] vreinterpret_u32_s16 - * [x] vreinterpret_u64_s16 - * [x] vreinterpret_s64_s16 - * [x] vreinterpret_f64_s16 - * [x] vreinterpret_f16_s16 - * [x] vreinterpret_s8_s32 - * [x] vreinterpret_s16_s32 - * [x] vreinterpret_f32_s32 - * [x] vreinterpret_u8_s32 - * [x] vreinterpret_u16_s32 - * [x] vreinterpret_u32_s32 - * [x] vreinterpret_u64_s32 - * [x] vreinterpret_s64_s32 - * [x] vreinterpret_f64_s32 - * [x] vreinterpret_f16_s32 - * [x] vreinterpret_s8_f32 - * [x] vreinterpret_s16_f32 - * [x] vreinterpret_s32_f32 - * [x] vreinterpret_u8_f32 - * [x] vreinterpret_u16_f32 - * [x] vreinterpret_u32_f32 - * [x] vreinterpret_u64_f32 - * [x] vreinterpret_s64_f32 - * [x] vreinterpret_f64_f32 - * [x] vreinterpret_f16_f32 - * [x] vreinterpret_s8_u8 - * [x] vreinterpret_s16_u8 - * [x] vreinterpret_s32_u8 - * [x] vreinterpret_f32_u8 - * [x] vreinterpret_u16_u8 - * [x] vreinterpret_u32_u8 - * [x] vreinterpret_u64_u8 - * [x] vreinterpret_s64_u8 - * [x] vreinterpret_f64_u8 - * [x] vreinterpret_f16_u8 - * [x] vreinterpret_s8_u16 - * [x] vreinterpret_s16_u16 - * [x] vreinterpret_s32_u16 - * [x] vreinterpret_f32_u16 - * [x] vreinterpret_u8_u16 - * [x] vreinterpret_u32_u16 - * [x] vreinterpret_u64_u16 - * [x] vreinterpret_s64_u16 - * [x] vreinterpret_f64_u16 - * [x] vreinterpret_f16_u16 - * [x] vreinterpret_s8_u32 - * [x] vreinterpret_s16_u32 - * [x] vreinterpret_s32_u32 - * [x] vreinterpret_f32_u32 - * [x] vreinterpret_u8_u32 - * [x] vreinterpret_u16_u32 - * [x] vreinterpret_u64_u32 - * [x] vreinterpret_s64_u32 - * [x] 
vreinterpret_f64_u32 - * [x] vreinterpret_f16_u32 - * [x] vreinterpret_s8_u64 - * [x] vreinterpret_s16_u64 - * [x] vreinterpret_s32_u64 - * [x] vreinterpret_f32_u64 - * [x] vreinterpret_u8_u64 - * [x] vreinterpret_u16_u64 - * [x] vreinterpret_u32_u64 - * [x] vreinterpret_s64_u64 - * [x] vreinterpret_f64_u64 - * [x] vreinterpret_f16_u64 - * [x] vreinterpret_s8_s64 - * [x] vreinterpret_s16_s64 - * [x] vreinterpret_s32_s64 - * [x] vreinterpret_f32_s64 - * [x] vreinterpret_u8_s64 - * [x] vreinterpret_u16_s64 - * [x] vreinterpret_u32_s64 - * [x] vreinterpret_u64_s64 - * [x] vreinterpret_f64_s64 - * [x] vreinterpret_f16_s64 - * [ ] vreinterpret_s8_f16 - * [ ] vreinterpret_s16_f16 - * [ ] vreinterpret_s32_f16 - * [ ] vreinterpret_f32_f16 - * [ ] vreinterpret_u8_f16 - * [x] vreinterpret_u16_f16 - * [ ] vreinterpret_u32_f16 - * [x] vreinterpret_u64_f16 - * [ ] vreinterpret_s64_f16 - * [ ] vreinterpret_f64_f16 - * [x] vreinterpretq_s16_s8 - * [x] vreinterpretq_s32_s8 - * [x] vreinterpretq_f32_s8 - * [x] vreinterpretq_u8_s8 - * [x] vreinterpretq_u16_s8 - * [x] vreinterpretq_u32_s8 - * [x] vreinterpretq_u64_s8 - * [x] vreinterpretq_s64_s8 - * [x] vreinterpretq_f64_s8 - * [x] vreinterpretq_f16_s8 - * [x] vreinterpretq_s8_s16 - * [x] vreinterpretq_s32_s16 - * [x] vreinterpretq_f32_s16 - * [x] vreinterpretq_u8_s16 - * [x] vreinterpretq_u16_s16 - * [x] vreinterpretq_u32_s16 - * [x] vreinterpretq_u64_s16 - * [x] vreinterpretq_s64_s16 - * [x] vreinterpretq_f64_s16 - * [x] vreinterpretq_f16_s16 - * [x] vreinterpretq_s8_s32 - * [x] vreinterpretq_s16_s32 - * [x] vreinterpretq_f32_s32 - * [x] vreinterpretq_u8_s32 - * [x] vreinterpretq_u16_s32 - * [x] vreinterpretq_u32_s32 - * [x] vreinterpretq_u64_s32 - * [x] vreinterpretq_s64_s32 - * [x] vreinterpretq_f64_s32 - * [x] vreinterpretq_f16_s32 - * [x] vreinterpretq_s8_f32 - * [x] vreinterpretq_s16_f32 - * [x] vreinterpretq_s32_f32 - * [x] vreinterpretq_u8_f32 - * [x] vreinterpretq_u16_f32 - * [x] vreinterpretq_u32_f32 - * [x] 
vreinterpretq_u64_f32 - * [x] vreinterpretq_s64_f32 - * [x] vreinterpretq_f64_f32 - * [x] vreinterpretq_f16_f32 - * [x] vreinterpretq_s8_u8 - * [x] vreinterpretq_s16_u8 - * [x] vreinterpretq_s32_u8 - * [x] vreinterpretq_f32_u8 - * [x] vreinterpretq_u16_u8 - * [x] vreinterpretq_u32_u8 - * [x] vreinterpretq_u64_u8 - * [x] vreinterpretq_s64_u8 - * [x] vreinterpretq_f64_u8 - * [x] vreinterpretq_f16_u8 - * [x] vreinterpretq_s8_u16 - * [x] vreinterpretq_s16_u16 - * [x] vreinterpretq_s32_u16 - * [x] vreinterpretq_f32_u16 - * [x] vreinterpretq_u8_u16 - * [x] vreinterpretq_u32_u16 - * [x] vreinterpretq_u64_u16 - * [x] vreinterpretq_s64_u16 - * [x] vreinterpretq_f64_u16 - * [x] vreinterpretq_f16_u16 - * [x] vreinterpretq_s8_u32 - * [x] vreinterpretq_s16_u32 - * [x] vreinterpretq_s32_u32 - * [x] vreinterpretq_f32_u32 - * [x] vreinterpretq_u8_u32 - * [x] vreinterpretq_u16_u32 - * [x] vreinterpretq_u64_u32 - * [x] vreinterpretq_s64_u32 - * [x] vreinterpretq_f64_u32 - * [x] vreinterpretq_f16_u32 - * [x] vreinterpretq_s8_u64 - * [x] vreinterpretq_s16_u64 - * [x] vreinterpretq_s32_u64 - * [x] vreinterpretq_f32_u64 - * [x] vreinterpretq_u8_u64 - * [x] vreinterpretq_u16_u64 - * [x] vreinterpretq_u32_u64 - * [x] vreinterpretq_s64_u64 - * [x] vreinterpretq_f64_u64 - * [x] vreinterpretq_f64_s64 - * [x] vreinterpretq_f16_u64 - * [x] vreinterpretq_s8_s64 - * [x] vreinterpretq_s16_s64 - * [x] vreinterpretq_s32_s64 - * [x] vreinterpretq_f32_s64 - * [x] vreinterpretq_u8_s64 - * [x] vreinterpretq_u16_s64 - * [x] vreinterpretq_u32_s64 - * [x] vreinterpretq_u64_s64 - * [x] vreinterpretq_f16_s64 - * [ ] vreinterpretq_s8_f16 - * [ ] vreinterpretq_s16_f16 - * [ ] vreinterpretq_s32_f16 - * [ ] vreinterpretq_f32_f16 - * [ ] vreinterpretq_u8_f16 - * [x] vreinterpretq_u16_f16 - * [ ] vreinterpretq_u32_f16 - * [ ] vreinterpretq_u64_f16 - * [ ] vreinterpretq_s64_f16 - * [ ] vreinterpretq_f64_f16 - * [x] vreinterpret_s8_f64 - * [x] vreinterpret_s16_f64 - * [x] vreinterpret_s32_f64 - * [x] 
vreinterpret_u8_f64 - * [x] vreinterpret_u16_f64 - * [x] vreinterpret_u32_f64 - * [x] vreinterpret_u64_f64 - * [x] vreinterpret_s64_f64 - * [ ] vreinterpret_f16_f64 - * [x] vreinterpret_f32_f64 - * [x] vreinterpretq_s8_f64 - * [x] vreinterpretq_s16_f64 - * [x] vreinterpretq_s32_f64 - * [x] vreinterpretq_u8_f64 - * [x] vreinterpretq_u16_f64 - * [x] vreinterpretq_u32_f64 - * [x] vreinterpretq_u64_f64 - * [x] vreinterpretq_s64_f64 - * [ ] vreinterpretq_f16_f64 - * [x] vreinterpretq_f32_f64 - -### rev64 - -SIMDe currently implements 21 of 24 (87.50%) functions, not counting 6 which require currently unsupported types. - - * [x] vrev64_s8 - * [x] vrev64_s16 - * [x] vrev64_s32 - * [x] vrev64_u8 - * [x] vrev64_u16 - * [x] vrev64_u32 - * [x] vrev64_f32 - * [ ] vrev64_f16 - * [x] vrev64_s8 - * [x] vrev64q_s8 - * [x] vrev64_s16 - * [x] vrev64q_s16 - * [x] vrev64_s32 - * [x] vrev64q_s32 - * [x] vrev64_u8 - * [x] vrev64q_u8 - * [x] vrev64_u16 - * [x] vrev64q_u16 - * [x] vrev64_u32 - * [x] vrev64q_u32 - * [x] vrev64_f32 - * [x] vrev64q_f32 - * [ ] vrev64_f16 - * [ ] vrev64q_f16 - ### rnd SIMDe currently implements 5 of 8 (62.50%) functions. @@ -1124,558 +417,13 @@ SIMDe currently implements 6 of 9 (66.67%) functions. * [ ] vrndp_f16 * [ ] vrndpq_f16 -### st1_lane - -SIMDe currently implements 30 of 33 (90.91%) functions, not counting 12 which require currently unsupported types. 
- - * [x] vst1_lane_s8 - * [x] vst1_lane_s16 - * [x] vst1_lane_s32 - * [x] vst1_lane_s64 - * [x] vst1_lane_u8 - * [x] vst1_lane_u16 - * [x] vst1_lane_u32 - * [x] vst1_lane_u64 - * [ ] vst1_lane_f16 - * [x] vst1_lane_f32 - * [x] vst1_lane_f64 - * [x] vst1_lane_s8 - * [x] vst1q_lane_s8 - * [x] vst1_lane_s16 - * [x] vst1q_lane_s16 - * [x] vst1_lane_s32 - * [x] vst1q_lane_s32 - * [x] vst1_lane_s64 - * [x] vst1q_lane_s64 - * [x] vst1_lane_u8 - * [x] vst1q_lane_u8 - * [x] vst1_lane_u16 - * [x] vst1q_lane_u16 - * [x] vst1_lane_u32 - * [x] vst1q_lane_u32 - * [x] vst1_lane_u64 - * [x] vst1q_lane_u64 - * [ ] vst1_lane_f16 - * [ ] vst1q_lane_f16 - * [x] vst1_lane_f32 - * [x] vst1q_lane_f32 - * [x] vst1_lane_f64 - * [x] vst1q_lane_f64 - -### st1_x2 - -SIMDe currently implements 30 of 33 (90.91%) functions, not counting 12 which require currently unsupported types. - - * [x] vst1_s8_x2 - * [x] vst1_s16_x2 - * [x] vst1_s32_x2 - * [x] vst1_u8_x2 - * [x] vst1_u16_x2 - * [x] vst1_u32_x2 - * [ ] vst1_f16_x2 - * [x] vst1_f32_x2 - * [x] vst1_s64_x2 - * [x] vst1_u64_x2 - * [x] vst1_f64_x2 - * [x] vst1_s8_x2 - * [x] vst1q_s8_x2 - * [x] vst1_s16_x2 - * [x] vst1q_s16_x2 - * [x] vst1_s32_x2 - * [x] vst1q_s32_x2 - * [x] vst1_u8_x2 - * [x] vst1q_u8_x2 - * [x] vst1_u16_x2 - * [x] vst1q_u16_x2 - * [x] vst1_u32_x2 - * [x] vst1q_u32_x2 - * [ ] vst1_f16_x2 - * [ ] vst1q_f16_x2 - * [x] vst1_f32_x2 - * [x] vst1q_f32_x2 - * [x] vst1_s64_x2 - * [x] vst1_u64_x2 - * [x] vst1q_s64_x2 - * [x] vst1q_u64_x2 - * [x] vst1_f64_x2 - * [x] vst1q_f64_x2 - -### st1_x3 - -SIMDe currently implements 30 of 33 (90.91%) functions, not counting 12 which require currently unsupported types. 
- - * [x] vst1_s8_x3 - * [x] vst1_s16_x3 - * [x] vst1_s32_x3 - * [x] vst1_u8_x3 - * [x] vst1_u16_x3 - * [x] vst1_u32_x3 - * [ ] vst1_f16_x3 - * [x] vst1_f32_x3 - * [x] vst1_s64_x3 - * [x] vst1_u64_x3 - * [x] vst1_f64_x3 - * [x] vst1_s8_x3 - * [x] vst1q_s8_x3 - * [x] vst1_s16_x3 - * [x] vst1q_s16_x3 - * [x] vst1_s32_x3 - * [x] vst1q_s32_x3 - * [x] vst1_u8_x3 - * [x] vst1q_u8_x3 - * [x] vst1_u16_x3 - * [x] vst1q_u16_x3 - * [x] vst1_u32_x3 - * [x] vst1q_u32_x3 - * [ ] vst1_f16_x3 - * [ ] vst1q_f16_x3 - * [x] vst1_f32_x3 - * [x] vst1q_f32_x3 - * [x] vst1_s64_x3 - * [x] vst1_u64_x3 - * [x] vst1q_s64_x3 - * [x] vst1q_u64_x3 - * [x] vst1_f64_x3 - * [x] vst1q_f64_x3 - -### st1_x4 - -SIMDe currently implements 30 of 33 (90.91%) functions, not counting 12 which require currently unsupported types. - - * [x] vst1_s8_x4 - * [x] vst1_s16_x4 - * [x] vst1_s32_x4 - * [x] vst1_u8_x4 - * [x] vst1_u16_x4 - * [x] vst1_u32_x4 - * [ ] vst1_f16_x4 - * [x] vst1_f32_x4 - * [x] vst1_s64_x4 - * [x] vst1_u64_x4 - * [x] vst1_f64_x4 - * [x] vst1_s8_x4 - * [x] vst1q_s8_x4 - * [x] vst1_s16_x4 - * [x] vst1q_s16_x4 - * [x] vst1_s32_x4 - * [x] vst1q_s32_x4 - * [x] vst1_u8_x4 - * [x] vst1q_u8_x4 - * [x] vst1_u16_x4 - * [x] vst1q_u16_x4 - * [x] vst1_u32_x4 - * [x] vst1q_u32_x4 - * [ ] vst1_f16_x4 - * [ ] vst1q_f16_x4 - * [x] vst1_f32_x4 - * [x] vst1q_f32_x4 - * [x] vst1_s64_x4 - * [x] vst1_u64_x4 - * [x] vst1q_s64_x4 - * [x] vst1q_u64_x4 - * [x] vst1_f64_x4 - * [x] vst1q_f64_x4 - -### st2_lane - -SIMDe currently implements 30 of 33 (90.91%) functions, not counting 12 which require currently unsupported types. 
- - * [x] vst2_lane_s8 - * [x] vst2_lane_u8 - * [x] vst2_lane_s16 - * [x] vst2_lane_s32 - * [x] vst2_lane_u16 - * [x] vst2_lane_u32 - * [ ] vst2_lane_f16 - * [x] vst2_lane_f32 - * [x] vst2_lane_s64 - * [x] vst2_lane_u64 - * [x] vst2_lane_f64 - * [x] vst2_lane_s8 - * [x] vst2_lane_u8 - * [x] vst2_lane_s16 - * [x] vst2q_lane_s16 - * [x] vst2_lane_s32 - * [x] vst2q_lane_s32 - * [x] vst2_lane_u16 - * [x] vst2q_lane_u16 - * [x] vst2_lane_u32 - * [x] vst2q_lane_u32 - * [ ] vst2_lane_f16 - * [ ] vst2q_lane_f16 - * [x] vst2_lane_f32 - * [x] vst2q_lane_f32 - * [x] vst2q_lane_s8 - * [x] vst2q_lane_u8 - * [x] vst2_lane_s64 - * [x] vst2q_lane_s64 - * [x] vst2_lane_u64 - * [x] vst2q_lane_u64 - * [x] vst2_lane_f64 - * [x] vst2q_lane_f64 - -### st3 - -SIMDe currently implements 30 of 33 (90.91%) functions, not counting 12 which require currently unsupported types. - - * [x] vst3_s8 - * [x] vst3_s16 - * [x] vst3_s32 - * [x] vst3_u8 - * [x] vst3_u16 - * [x] vst3_u32 - * [ ] vst3_f16 - * [x] vst3_f32 - * [x] vst3_s64 - * [x] vst3_u64 - * [x] vst3_f64 - * [x] vst3_s8 - * [x] vst3q_s8 - * [x] vst3_s16 - * [x] vst3q_s16 - * [x] vst3_s32 - * [x] vst3q_s32 - * [x] vst3_u8 - * [x] vst3q_u8 - * [x] vst3_u16 - * [x] vst3q_u16 - * [x] vst3_u32 - * [x] vst3q_u32 - * [ ] vst3_f16 - * [ ] vst3q_f16 - * [x] vst3_f32 - * [x] vst3q_f32 - * [x] vst3_s64 - * [x] vst3_u64 - * [x] vst3q_s64 - * [x] vst3q_u64 - * [x] vst3_f64 - * [x] vst3q_f64 - -### st3_lane - -SIMDe currently implements 30 of 33 (90.91%) functions, not counting 12 which require currently unsupported types. 
- - * [x] vst3_lane_s8 - * [x] vst3_lane_u8 - * [x] vst3_lane_s16 - * [x] vst3_lane_s32 - * [x] vst3_lane_u16 - * [x] vst3_lane_u32 - * [ ] vst3_lane_f16 - * [x] vst3_lane_f32 - * [x] vst3_lane_s64 - * [x] vst3_lane_u64 - * [x] vst3_lane_f64 - * [x] vst3_lane_s8 - * [x] vst3_lane_u8 - * [x] vst3_lane_s16 - * [x] vst3q_lane_s16 - * [x] vst3_lane_s32 - * [x] vst3q_lane_s32 - * [x] vst3_lane_u16 - * [x] vst3q_lane_u16 - * [x] vst3_lane_u32 - * [x] vst3q_lane_u32 - * [ ] vst3_lane_f16 - * [ ] vst3q_lane_f16 - * [x] vst3_lane_f32 - * [x] vst3q_lane_f32 - * [x] vst3q_lane_s8 - * [x] vst3q_lane_u8 - * [x] vst3_lane_s64 - * [x] vst3q_lane_s64 - * [x] vst3_lane_u64 - * [x] vst3q_lane_u64 - * [x] vst3_lane_f64 - * [x] vst3q_lane_f64 - -### st4 - -SIMDe currently implements 30 of 33 (90.91%) functions, not counting 12 which require currently unsupported types. - - * [x] vst4_s8 - * [x] vst4_s16 - * [x] vst4_s32 - * [x] vst4_u8 - * [x] vst4_u16 - * [x] vst4_u32 - * [ ] vst4_f16 - * [x] vst4_f32 - * [x] vst4_s64 - * [x] vst4_u64 - * [x] vst4_f64 - * [x] vst4_s8 - * [x] vst4q_s8 - * [x] vst4_s16 - * [x] vst4q_s16 - * [x] vst4_s32 - * [x] vst4q_s32 - * [x] vst4_u8 - * [x] vst4q_u8 - * [x] vst4_u16 - * [x] vst4q_u16 - * [x] vst4_u32 - * [x] vst4q_u32 - * [ ] vst4_f16 - * [ ] vst4q_f16 - * [x] vst4_f32 - * [x] vst4q_f32 - * [x] vst4_s64 - * [x] vst4_u64 - * [x] vst4q_s64 - * [x] vst4q_u64 - * [x] vst4_f64 - * [x] vst4q_f64 - -### st4_lane - -SIMDe currently implements 30 of 33 (90.91%) functions, not counting 12 which require currently unsupported types. 
- - * [x] vst4_lane_s8 - * [x] vst4_lane_u8 - * [x] vst4_lane_s16 - * [x] vst4_lane_s32 - * [x] vst4_lane_u16 - * [x] vst4_lane_u32 - * [ ] vst4_lane_f16 - * [x] vst4_lane_f32 - * [x] vst4_lane_s64 - * [x] vst4_lane_u64 - * [x] vst4_lane_f64 - * [x] vst4_lane_s8 - * [x] vst4_lane_u8 - * [x] vst4_lane_s16 - * [x] vst4q_lane_s16 - * [x] vst4_lane_s32 - * [x] vst4q_lane_s32 - * [x] vst4_lane_u16 - * [x] vst4q_lane_u16 - * [x] vst4_lane_u32 - * [x] vst4q_lane_u32 - * [ ] vst4_lane_f16 - * [ ] vst4q_lane_f16 - * [x] vst4_lane_f32 - * [x] vst4q_lane_f32 - * [x] vst4q_lane_s8 - * [x] vst4q_lane_u8 - * [x] vst4_lane_s64 - * [x] vst4q_lane_s64 - * [x] vst4_lane_u64 - * [x] vst4q_lane_u64 - * [x] vst4_lane_f64 - * [x] vst4q_lane_f64 - -### trn - -SIMDe currently implements 21 of 24 (87.50%) functions, not counting 6 which require currently unsupported types. - - * [x] vtrn_s8 - * [x] vtrn_s16 - * [x] vtrn_u8 - * [x] vtrn_u16 - * [x] vtrn_s32 - * [x] vtrn_f32 - * [x] vtrn_u32 - * [ ] vtrn_f16 - * [x] vtrn_s8 - * [x] vtrn_s16 - * [x] vtrn_u8 - * [x] vtrn_u16 - * [x] vtrn_s32 - * [x] vtrn_f32 - * [x] vtrn_u32 - * [x] vtrnq_s8 - * [x] vtrnq_s16 - * [x] vtrnq_s32 - * [x] vtrnq_f32 - * [x] vtrnq_u8 - * [x] vtrnq_u16 - * [x] vtrnq_u32 - * [ ] vtrn_f16 - * [ ] vtrnq_f16 - -### trn1 - -SIMDe currently implements 24 of 27 (88.89%) functions, not counting 7 which require currently unsupported types. 
- - * [x] vtrn1_s8 - * [x] vtrn1_s16 - * [x] vtrn1_s32 - * [x] vtrn1_u8 - * [x] vtrn1_u16 - * [x] vtrn1_u32 - * [x] vtrn1_f32 - * [ ] vtrn1_f16 - * [x] vtrn1_s8 - * [x] vtrn1q_s8 - * [x] vtrn1_s16 - * [x] vtrn1q_s16 - * [x] vtrn1_s32 - * [x] vtrn1q_s32 - * [x] vtrn1q_s64 - * [x] vtrn1_u8 - * [x] vtrn1q_u8 - * [x] vtrn1_u16 - * [x] vtrn1q_u16 - * [x] vtrn1_u32 - * [x] vtrn1q_u32 - * [x] vtrn1q_u64 - * [x] vtrn1_f32 - * [x] vtrn1q_f32 - * [x] vtrn1q_f64 - * [ ] vtrn1_f16 - * [ ] vtrn1q_f16 - -### trn2 - -SIMDe currently implements 24 of 27 (88.89%) functions, not counting 7 which require currently unsupported types. - - * [x] vtrn2_s8 - * [x] vtrn2_s16 - * [x] vtrn2_s32 - * [x] vtrn2_u8 - * [x] vtrn2_u16 - * [x] vtrn2_u32 - * [x] vtrn2_f32 - * [ ] vtrn2_f16 - * [x] vtrn2_s8 - * [x] vtrn2q_s8 - * [x] vtrn2_s16 - * [x] vtrn2q_s16 - * [x] vtrn2_s32 - * [x] vtrn2q_s32 - * [x] vtrn2q_s64 - * [x] vtrn2_u8 - * [x] vtrn2q_u8 - * [x] vtrn2_u16 - * [x] vtrn2q_u16 - * [x] vtrn2_u32 - * [x] vtrn2q_u32 - * [x] vtrn2q_u64 - * [x] vtrn2_f32 - * [x] vtrn2q_f32 - * [x] vtrn2q_f64 - * [ ] vtrn2_f16 - * [ ] vtrn2q_f16 - -### uzp - -SIMDe currently implements 21 of 24 (87.50%) functions, not counting 6 which require currently unsupported types. - - * [x] vuzp_s8 - * [x] vuzp_s16 - * [x] vuzp_s32 - * [x] vuzp_f32 - * [x] vuzp_u8 - * [x] vuzp_u16 - * [x] vuzp_u32 - * [ ] vuzp_f16 - * [x] vuzp_s8 - * [x] vuzp_s16 - * [x] vuzp_s32 - * [x] vuzp_f32 - * [x] vuzp_u8 - * [x] vuzp_u16 - * [x] vuzp_u32 - * [x] vuzpq_s8 - * [x] vuzpq_s16 - * [x] vuzpq_s32 - * [x] vuzpq_f32 - * [x] vuzpq_u8 - * [x] vuzpq_u16 - * [x] vuzpq_u32 - * [ ] vuzp_f16 - * [ ] vuzpq_f16 - -### uzp1 - -SIMDe currently implements 26 of 27 (96.30%) functions, not counting 7 which require currently unsupported types. 
-
- * [x] vuzp1_s8
- * [x] vuzp1_s16
- * [x] vuzp1_s32
- * [x] vuzp1_u8
- * [x] vuzp1_u16
- * [x] vuzp1_u32
- * [x] vuzp1_f32
- * [x] vuzp1_f16
- * [x] vuzp1_s8
- * [x] vuzp1q_s8
- * [x] vuzp1_s16
- * [x] vuzp1q_s16
- * [x] vuzp1_s32
- * [x] vuzp1q_s32
- * [x] vuzp1q_s64
- * [x] vuzp1_u8
- * [x] vuzp1q_u8
- * [x] vuzp1_u16
- * [x] vuzp1q_u16
- * [x] vuzp1_u32
- * [x] vuzp1q_u32
- * [x] vuzp1q_u64
- * [x] vuzp1_f32
- * [x] vuzp1q_f32
- * [x] vuzp1q_f64
- * [x] vuzp1_f16
- * [ ] vuzp1q_f16
-
-### uzp2
-
-SIMDe currently implements 26 of 27 (96.30%) functions, not counting 7 which require currently unsupported types.
-
- * [x] vuzp2_s8
- * [x] vuzp2_s16
- * [x] vuzp2_s32
- * [x] vuzp2_u8
- * [x] vuzp2_u16
- * [x] vuzp2_u32
- * [x] vuzp2_f32
- * [x] vuzp2_f16
- * [x] vuzp2_s8
- * [x] vuzp2q_s8
- * [x] vuzp2_s16
- * [x] vuzp2q_s16
- * [x] vuzp2_s32
- * [x] vuzp2q_s32
- * [x] vuzp2q_s64
- * [x] vuzp2_u8
- * [x] vuzp2q_u8
- * [x] vuzp2_u16
- * [x] vuzp2q_u16
- * [x] vuzp2_u32
- * [x] vuzp2q_u32
- * [x] vuzp2q_u64
- * [x] vuzp2_f32
- * [x] vuzp2q_f32
- * [x] vuzp2q_f64
- * [x] vuzp2_f16
- * [ ] vuzp2q_f16
-
 ## Unimplemented Families

-There are currently 43 unimplemented families.
+There are currently 16 unimplemented families.

- * abdh (2 functions)
- * abdl_high (12 functions)
- * addhn_high (12 functions)
  * aes (8 functions)
  * bfdot (3 functions)
  * bfdot_lane (6 functions)
- * cgezh (2 functions)
- * cgtzh (2 functions)
- * cleh (2 functions)
- * cltzh (2 functions)
- * copy_lane (60 functions, plus 24 functions with unsupported types)
- * cvt_high (8 functions, plus 2 functions with unsupported types)
- * cvtah (12 functions, plus 2 functions with unsupported types)
- * cvth_n (24 functions)
- * cvtm (22 functions)
- * cvtmh (12 functions)
- * cvtp (22 functions)
- * cvtph (12 functions)
- * cvtx (3 functions)
- * cvtx_high (2 functions)
  * divh (2 functions)
  * dupb_lane (4 functions, plus 4 functions with unsupported types)
  * eor3 (9 functions)
@@ -1713,19 +461,9 @@ There are currently 43 unimplemented families.
  * qrdmlshh (2 functions)
  * qrdmlshh_lane (4 functions)
  * qrdmulhh_lane (4 functions)
- * qrshl (30 functions)
- * qrshlh (4 functions)
- * qrshrn_high_n (12 functions)
- * qrshrun_high_n (6 functions)
- * qshl_n (30 functions)
- * qshlh_n (4 functions)
  * qshluh_n (2 functions)
- * qshrn_high_n (12 functions)
- * qshrnh_n (4 functions)
  * qshrun_high_n (6 functions)
  * qshrunh_n (2 functions)
- * raddhn (12 functions)
- * raddhn_high (12 functions)
  * rax (2 functions)
  * recp (4 functions)
  * recpsh (2 functions)
@@ -1742,16 +480,12 @@ There are currently 43 unimplemented families.
  * rndph (2 functions)
  * rndx (9 functions)
  * rndxh (2 functions)
- * rshrn_high_n (12 functions)
- * rsubhn (12 functions)
- * rsubhn_high (12 functions)
  * sha1 (10 functions)
  * sha1h (2 functions)
  * sha256 (8 functions)
  * sha512 (8 functions)
  * shll_high_n (24 functions)
  * shrn_high_n (12 functions)
- * sli_n (26 functions, plus 9 functions with unsupported types)
  * sm3 (14 functions)
  * sm4 (4 functions)
  * subhn_high (12 functions)
@@ -1761,17 +495,21 @@ There are currently 43 unimplemented families.

 ## Complete Families

-SIMDe contains complete implementations of 245 functions families.
+SIMDe contains complete implementations of 303 functions families.
 
  * aba
  * abal
  * abal_high
+ * abd
+ * abdh
  * abdl
+ * abdl_high
  * abs
  * absh
  * add (10 functions with unsupported types)
  * addh
  * addhn
+ * addhn_high
  * addl
  * addl_high
  * addlv
@@ -1797,23 +535,44 @@ SIMDe contains complete implementations of 245 functions families.
  * ceqzh
  * cge
  * cgeh
+ * cgez
+ * cgezh
  * cgt
  * cgth
+ * cgtz
+ * cgtzh
+ * cle
+ * cleh
  * clez
  * clezh
  * cls
  * clt
  * clth
+ * cltz
+ * cltzh
  * clz
  * cmla_lane
  * cmla_rot_lane
  * cnt (3 functions with unsupported types)
  * combine (8 functions with unsupported types)
+ * copy_lane (24 functions with unsupported types)
  * create (8 functions with unsupported types)
  * cvt (4 functions with unsupported types)
+ * cvt_high (2 functions with unsupported types)
  * cvt_low (3 functions with unsupported types)
+ * cvt_n
+ * cvta
+ * cvtah (2 functions with unsupported types)
+ * cvth (2 functions with unsupported types)
+ * cvth_n
+ * cvtm
+ * cvtmh
  * cvtn
  * cvtnh
+ * cvtp
+ * cvtph
+ * cvtx
+ * cvtx_high
  * dot
  * dot_lane
  * dup_n (12 functions with unsupported types)
@@ -1843,8 +602,10 @@ SIMDe contains complete implementations of 245 functions families.
  * ld2 (12 functions with unsupported types)
  * ld2_dup (12 functions with unsupported types)
  * ld2_lane (12 functions with unsupported types)
+ * ld3 (12 functions with unsupported types)
  * ld3_dup (12 functions with unsupported types)
  * ld3_lane (12 functions with unsupported types)
+ * ld4 (12 functions with unsupported types)
  * ld4_dup (12 functions with unsupported types)
  * ld4_lane (12 functions with unsupported types)
  * ldr (2 functions with unsupported types)
@@ -1935,14 +696,22 @@ SIMDe contains complete implementations of 245 functions families.
  * qrdmulh_lane
  * qrdmulh_n
  * qrdmulhh
+ * qrshl
+ * qrshlh
+ * qrshrn_high_n
  * qrshrn_n
  * qrshrnh_n
+ * qrshrun_high_n
  * qrshrun_n
  * qrshrunh_n
  * qshl
+ * qshl_n
  * qshlh
+ * qshlh_n
  * qshlu_n
+ * qshrn_high_n
  * qshrn_n
+ * qshrnh_n
  * qshrun_n
  * qsub
  * qsubh
@@ -1954,29 +723,37 @@ SIMDe contains complete implementations of 245 functions families.
  * qtbx2 (3 functions with unsupported types)
  * qtbx3 (3 functions with unsupported types)
  * qtbx4 (3 functions with unsupported types)
+ * raddhn
+ * raddhn_high
  * rbit (3 functions with unsupported types)
  * recpe
  * recpeh
  * recps
+ * reinterpret (316 functions with unsupported types)
  * rev16 (3 functions with unsupported types)
  * rev32 (6 functions with unsupported types)
+ * rev64 (6 functions with unsupported types)
  * rhadd
  * rndn
  * rndnh
  * rshl
  * rshr_n
+ * rshrn_high_n
  * rshrn_n
  * rsqrte
  * rsqrteh
  * rsqrts
  * rsqrtsh
  * rsra_n
+ * rsubhn
+ * rsubhn_high
  * set_lane (12 functions with unsupported types)
  * shl
  * shl_n
  * shll_n
  * shr_n
  * shrn_n
+ * sli_n (9 functions with unsupported types)
  * sqadd
  * sqaddh
  * sqrt
@@ -1984,7 +761,16 @@ SIMDe contains complete implementations of 245 functions families.
  * sra_n
  * sri_n (9 functions with unsupported types)
  * st1 (12 functions with unsupported types)
+ * st1_lane (12 functions with unsupported types)
+ * st1_x2 (12 functions with unsupported types)
+ * st1_x3 (12 functions with unsupported types)
+ * st1_x4 (12 functions with unsupported types)
  * st2 (12 functions with unsupported types)
+ * st2_lane (12 functions with unsupported types)
+ * st3 (12 functions with unsupported types)
+ * st3_lane (12 functions with unsupported types)
+ * st4 (12 functions with unsupported types)
+ * st4_lane (12 functions with unsupported types)
  * str (2 functions with unsupported types)
  * sub
  * subh
@@ -2001,9 +787,15 @@ SIMDe contains complete implementations of 245 functions families.
  * tbx2 (2 functions with unsupported types)
  * tbx3 (2 functions with unsupported types)
  * tbx4 (2 functions with unsupported types)
+ * trn (6 functions with unsupported types)
+ * trn1 (7 functions with unsupported types)
+ * trn2 (7 functions with unsupported types)
  * tst (6 functions with unsupported types)
  * uqadd
  * uqaddh
+ * uzp (6 functions with unsupported types)
+ * uzp1 (7 functions with unsupported types)
+ * uzp2 (7 functions with unsupported types)
  * xar
  * zip (6 functions with unsupported types)
  * zip1 (7 functions with unsupported types)