Arm64: Implement SVE encodings #94549
Comments
cc: @dotnet/jit-contrib , @BruceForstall |
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
Issue Details
Summary
Based upon the model I prototyped in #94529, let us try to use the boilerplate code that the tool generated to implement the following methods.
I have split the implementation among Alan, Aman and Will. I will join the effort once I am done with register allocation support for predicate registers. Once I do some cleanup to the tool, I will share the repo of the tool so you can generate the boilerplate files on your own.
PR expectation
We still need to figure out the encoding validation story, but neither that nor the pending register allocation work should stop us from starting to implement these formats. Each PR should be small enough that it implements all the methods listed above for one group of format names.
Assignments
Alan Hayward (@a74nh)
Unique entries= 61, Total formats= 163
Aman Khalid (@amanasifkhalid)
Unique entries= 30, Total formats= 71
Will Smith (@TIHan)
Unique entries= 32, Total formats= 72
PS: "Unique entries" is the number of implementations the person has to write for their assignment. "Total formats" is the number of format names they will cover.
Contributes to #93095
Link text is wrong. Not sure how many of these are fixable in the script and which ones are intentional and should be fixed up manually.
emitInsSanityCheck_sve.cpp.txt
emitDispInsHelp_sve.cpp.txt
emitArm64EmitterUnitTests_sve.cpp.txt
theEmitter->emitIns_R_R_R(INS_sve_add, EA_8BYTE, REG_R0, REG_R1, REG_R2, INS_OPTS_8B); // ADD <Zdn>.<T>, <Pg>/M, <Zdn>.<T>, <Zm>.<T> IF_SVE_AB_3A
This way it's much easier to find all the lines required for a given group. |
This was done intentionally, so the developers fix them by hand while making sure that they are correct.
Yes, this can be fixed.
Sure.
Thanks for spotting that. Let me double check. Also, as I pointed out in #94811 (comment), I will see if I can differentiate |
I fixed this as well. |
@a74nh - if you get a chance, can you confirm if the encoding problem you were seeing was because of this issue? |
Done in #95105 and updated the |
These groups are currently unsupported in capstone. They are from the newer extensions, so we don't need them yet in coreclr. Probably best to not add the code yet until we can test with capstone? |
SVE instruction latency and throughput extracted from https://developer.arm.com/documentation/PJDOC-466751330-18256/0003.
|
We need to decide what to do for the instructions not in N2. |
This is the bash script I'm using to test my changes. It's not ideal - the awk bits are using hardcoded offsets to strip the non-sve parts from the output, which will change for a different source program. But it works for me for now. Note - the non-sve code needs stripping out because the output between capstone and coreclr is quite different (eg register names and offsets). Maybe this needs tidying up in coreclr so that the entirety of the codegen tests can be automatically tested.
|
Could you update the
Otherwise it's too easy to get C and X confused. |
Part of #94549. Implements the following encodings: IF_SVE_FL_3A IF_SVE_FM_3A IF_SVE_FN_3A IF_SVE_FN_3B IF_SVE_FO_3A IF_SVE_FP_3A IF_SVE_FQ_3A IF_SVE_FS_3A IF_SVE_FW_3A IF_SVE_FX_3A
Hey @a74nh @snickolls-arm , how should we coordinate who is going to do the rest of the formats? I ask because I don't want to implement formats that someone is already working on. I accidentally did that with #98882: I didn't see it marked in the list before I did the work. |
Part of #94549. Adds the following encodings: IF_SVE_AB_3B IF_SVE_HL_3B IF_SVE_GI_4A IF_SVE_HU_4A IF_SVE_GC_3A IF_SVE_GF_3A IF_SVE_GW_3A IF_SVE_HK_3A IF_SVE_AT_3B IF_SVE_AU_3B IF_SVE_BD_3B IF_SVE_EF_3A IF_SVE_EI_3A
Part of #94549. Adds the following encodings: IF_SVE_GJ_3A IF_SVE_GN_3A IF_SVE_GO_3A IF_SVE_GW_3B IF_SVE_HA_3A IF_SVE_HA_3A_E IF_SVE_HA_3A_F IF_SVE_HB_3A IF_SVE_HD_3A IF_SVE_HD_3A_A IF_SVE_HK_3B IF_SVE_AV_3A IF_SVE_BB_2A IF_SVE_BC_1A
@a74nh - Can you come up with a list of APIs that need to be implemented first to support this scenario? |
Expanding the |
Part of #94549. Adds the following encodings: SVE_AW_2A SVE_AX_1A SVE_AY_2A SVE_AZ_2A SVE_BM_1A SVE_BN_1A
Thank you @a74nh , @amanasifkhalid , @TIHan , @snickolls-arm and @SwapnilGaikwad for your contribution! Great work. |
Summary
Based upon the model I prototyped in #94529, let us try to use the boilerplate code that the tool generated to implement the following methods.
Code needed in various emitIns_* methods.
I have split the implementation among Alan, Aman and Will. I will join the effort once I am done with register allocation support for predicate registers. Once I do some cleanup to the tool, I will share the repo of the tool so you can generate the boilerplate files on your own.
PR expectation
Start sequentially with the format names that are assigned and send PRs. In the PR, it will be useful to paste the disassembly produced from instructions in #94549 (comment). The expectation is to have the encoding validated before submitting the PR.
References
List of format patterns
..........iiiiii ...iiinnnnn.TTTT
..........iiiiii ...iiinnnnnttttt
........xx...... ...gggmmmmmddddd
........xx.mmmmm ......nnnnnddddd
........xx...... ..hiiiiiiiiddddd
...........mmmmm ......nnnnnddddd
..............ii iiiiiiiiiiiddddd
........xx..gggg ..hiiiiiiiiddddd
........ii.xxxxx ......nnnnnddddd
........xx...... ......nnnnnddddd
........xx...... ...gggnnnnnddddd
........xx.mmmmm ..VVVVnnnnnddddd
............MMMM ..gggg.NNNN.DDDD
........xx...... ...gggxxiiiddddd
........xx.xxiii ......nnnnnddddd
........xx...... ...iiiiiiiiddddd
.........x.mmmmm ....hhnnnnnddddd
...........mmmmm ....hhnnnnnddddd
................ ...gggmmmmmddddd
........xx...... .......NNNN.DDDD
.........i.iimmm ......nnnnnddddd
...........iimmm ......nnnnnddddd
...........immmm ......nnnnnddddd
...........iiiii ...iiinnnnnddddd
...........iiiii ...iiimmmmmddddd
........xx..gggg ...iiiiiiiiddddd
........xx..gggg ...........ddddd
........xx...... ...........ddddd
........xx..MMMM .......NNNN.DDDD
............iiii ......nnnn.ddddd
................ ...gggnnnnnddddd
........xx...... ...gggnnnnn.DDDD
........xx.mmmmm ...gggnnnnn.DDDD
........xx.mmmmm ...gggnnnnnddddd
........xx...... ...ggg....iddddd
........i..mmmmm ......nnnnnddddd
........ii.mmmmm ......nnnnnddddd
........ii.mmmmm ...i..nnnnnddddd
...........mmmmm ......kkkkkddddd
................ ......nnnn.ddddd
...........iimmm ....i.nnnnnddddd
........xx.mmmmm .rrgggnnnnnddddd
...........immmm ....rrnnnnnddddd
........xx.....r ...gggmmmmmddddd
...........iimmm ....iinnnnnddddd
................ ......mmmmmddddd
................ ...........ddddd
........xx.xxiii ......mmmmmddddd
........xx.....M ...gggnnnnnddddd
................ ......nnnnnddddd
........xx.mmmmm ...gggaaaaaddddd
........xx.iiiii ......iiiiiddddd
........xx.mmmmm ......iiiiiddddd
........xx.iiiii ......nnnnnddddd
...........nnnnn .....iiiiiiddddd
................ .....iiiiiiddddd
............iiii ......pppppddddd
...........Xiiii ......pppppddddd
...........ixxxx ......nnnnnddddd
............iiii ......mmmmmddddd
........xx...... ......mmmmmddddd
................ ......nnnnn.DDDD
.........i...ii. ......nnnnn.DDDD
..............i. ......nnnnn.DDDD
.............ii. ......nnnnn.DDDD
................ .......NNNNddddd
.........i...ii. .......NNNNddddd
..............i. .......NNNNddddd
.............ii. .......NNNNddddd
................ .......NNNN.DDDD
........xx...... ...VVVnnnnnddddd
........xx...... ...VVVmmmmmddddd
........xx.iiiii ...gggnnnnn.DDDD
........xx.iiiii ii.gggnnnnn.DDDD
................ ..gggg.NNNNMDDDD
................ ..gggg.NNNN.DDDD
................ ..gggg.NNNN.MMMM
................ .......gggg.DDDD
........xx...... ......ppppp.DDDD
........xx...... .............DDD
........xx...... .......VVVV.DDDD
................ ............DDDD
................ ..gggg.NNNN.....
........xx...... ..gggg.NNNNddddd
........xx...... .....l.NNNNddddd
........xx...... .......MMMMddddd
........xx...... .....X.MMMMddddd
................ ................
................ .......NNNN.....
.........x.mmmmm ......nnnnn.....
........xx.mmmmm ...X..nnnnn.DDDD
........xx.mmmmm ......nnnnn.DDD.
........xx.mmmmm ..l...nnnnn..DDD
........xx.mmmmm ......nnnnn.DDDD
........ix.xxxvv ..NNNN.MMMM.DDDD
........xx...... ......iiNNN.DDDD
........xx...... .......iNNN.DDDD
........xx.mmmmm ....rrnnnnnddddd
...........iimmm ....rrnnnnnddddd
...........immmm ....i.nnnnnddddd
...........mmmmm ......aaaaaddddd
.........x.xxiii ......nnnnnddddd
........xx...... .....rmmmmmddddd
.........x.mmmmm ......nnnnnddddd
.........x.xx... ......nnnnnddddd
...........mmmmm ...gggnnnnnddddd
........xx...iii ......mmmmmddddd
.............xx. ...gggnnnnnddddd
........xx.aaaaa ...gggmmmmmddddd
.........h.mmmmm ...gggnnnnnttttt
...........mmmmm ...gggnnnnnttttt
...........iiiii ...gggnnnnnttttt
............iiii ...gggnnnnnttttt
.........h.mmmmm ...gggnnnnn.oooo
...........mmmmm ...gggnnnnn.oooo
..........iiiiii ...gggnnnnn.oooo
...........iiiii ...gggnnnnn.oooo
..........iiiiii ...gggnnnnnttttt
.........xxmmmmm ...gggnnnnnttttt
...........mmmmm .h.gggnnnnnttttt
.........xx.iiii ...gggnnnnnttttt
..........xmmmmm ...gggnnnnnttttt
..........x.iiii ...gggnnnnnttttt
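For anyone reading the patterns above, here is a minimal sketch of how one pattern string can be turned into bit-field positions. It assumes the leftmost character is bit 31 and the rightmost is bit 0, that the space only separates the two 16-bit halves, and that '.' marks a bit that is not part of an operand field. The names BitField, parseFormatPattern and insertField are hypothetical helpers for illustration only; they are not functions from the JIT emitter.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// One operand field recovered from a pattern string (hypothetical type).
struct BitField
{
    char     letter; // field letter as written in the pattern, e.g. 'd', 'n', 'g'
    unsigned lsb;    // position of the least significant bit of the field (0..31)
    unsigned width;  // number of consecutive bits in the field
};

// Parses a pattern such as "........xx...... ...gggmmmmmddddd".
// Assumption: leftmost character is bit 31, rightmost is bit 0.
// A letter that appears in two separate runs produces two entries.
std::vector<BitField> parseFormatPattern(const std::string& pattern)
{
    std::string bits;
    for (char c : pattern)
    {
        if (c != ' ')
        {
            bits.push_back(c);
        }
    }
    if (bits.size() != 32)
    {
        throw std::invalid_argument("pattern must describe exactly 32 bits");
    }

    std::vector<BitField> fields;
    std::size_t           i = 0;
    while (i < 32)
    {
        const char  c = bits[i];
        std::size_t j = i;
        while (j < 32 && bits[j] == c)
        {
            j++;
        }
        if (c != '.')
        {
            // The run covers string indices [i, j), i.e. bits (31 - i) down to (32 - j).
            fields.push_back({c, static_cast<unsigned>(32 - j), static_cast<unsigned>(j - i)});
        }
        i = j;
    }
    return fields;
}

// Places 'value' into field 'f' of the instruction word 'code',
// leaving every bit outside the field untouched.
uint32_t insertField(uint32_t code, const BitField& f, uint32_t value)
{
    const uint32_t mask = (f.width >= 32) ? 0xFFFFFFFFu : ((1u << f.width) - 1u);
    return (code & ~(mask << f.lsb)) | ((value & mask) << f.lsb);
}
```

Under those assumptions, "........xx...... ...gggmmmmmddddd" yields four fields: xx at bits 23:22, ggg at bits 12:10, mmmmm at bits 9:5 and ddddd at bits 4:0.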
Contributes to #93095
Assignments
Alan Hayward (@a74nh)
Unique entries= 61, Total formats= 181
SVE_BF_2A to SVE_FU_2A #98968
SVE_BI_2A to SVE_HF_2A #98784
SVE_BX_2A and SVE_BY_2A #99087
Aman Khalid (@amanasifkhalid)
Unique entries= 30, Total formats= 73
Will Smith (@TIHan)
Unique entries= 32, Total formats= 113
SVE_GD_2A ARM64 encoding #95618
SVE_GG_3A to SVE_GH_3A #98316
fmlalb, fmlalt, fmlallbb, fmlallbt, fmlalltb, fmlalltt instructions (capstone-engine/capstone#2270)
SVE_GP_3A to SVE_HV_4A #98141
SVE_HW_4A to SVE_HW_4B_D #97433
SVE_IF_4A to SVE_JK_4B #97739
SVE_HX_3A_B to SVE_JL_3A and SVE_IC_3A to SVE_IC_3A_C #98332
SVE_HY_3A to SVE_IB_3A #98468
SVE_ID_2A to SVE_JH_2A #98015
SVE_IH_3A to SVE_JO_3A #95994
SVE_JD_4A to SVE_JN_3B #97129
PS: "Unique entries" is the number of implementations the person has to write for their assignment. "Total formats" is the number of format names they will cover.