-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RISC-V] Add quirks for riscv to R2RDump #101683
Conversation
@dotnet-policy-service agree company="Samsung" |
cc @LuckyXu-HF, @shushanhf LA64 could use a similar approach. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do you test this PR? Thank you.
Thanks so much for remind us! We will implement it next week as we will start a holiday tomorrow. |
9ed77c6
to
2f70d10
Compare
2f70d10
to
68cfb45
Compare
It wasn't straightforward so i tested it manually. I crossgened four libraries (incl. |
auipc | ||
addi | ||
ld | ||
jarl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update jarl
to jalr
.
Can you share code locations in runtime where you checked the patterns?
+ Please don't force-push. #100962 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no a strict line of code which checks this pattern.
Function ProbeRiscV64Quirks
is looking for jalr
instruction. If it finds it, then it goes back through the assembly code to calculate callee address. It is more analytical approach, than the fixed pattern. Unfortunately, instructions calculating this address do not always follow each other, sometimes unrelated instruction appears between them. Other problem was with multiple-wrapped calle, then more than one ld
or lw
are used to calculate callee address.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well. you cannot give codes in runtime for the patterns.
However, in other archs, their implementation is different from yours. And it looks like they have some strict patterns.
And if this code should be checked manually without any tests, I cannot review whether this is correct implementation or not.
More, I don't know well about r2rdump. I am so sorry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know well about r2rdump
The point of this code in r2rdump is to make the disassembly output more human readable. Nothing in the rest of the runtime depends on it. There are no hard and fast rules for how it should behave.
@SzpejnaDawid Could you please share some examples for what the output looks like with this change, including the problematic cases like more than one ld or lw are used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that System.Private.CoreLib.dll
has some interesting examples:
System.Exception Interop.GetExceptionForIoErrno(Interop+ErrorInfo, string, bool)
...
14811c: 00e00397 auipc t2, 3584
148120: f443b383 ld t2, -188(t2)
148124: 000380e7 jalr t2 // WRITE_BARRIER (HELPER)
...
148528: 00e0cf17 auipc t5, 3596
14852c: 550f0f13 addi t5, t5, 1360
148530: 00000593 li a1, 0
148534: 000f3603 ld a2, 0(t5)
148538: 000600e7 jalr a2 // System.Exception Interop.GetIOException(Interop+ErrorInfo, string) (METHOD_ENTRY_DEF_TOKEN)
...
148840: 00e10f17 auipc t5, 3600
148844: a20f0f13 addi t5, t5, -1504
148848: 000f3583 ld a1, 0(t5)
14884c: 000580e7 jalr a1 // string System.SR.GetResourceString(string) (METHOD_ENTRY_DEF_TOKEN)
...
int Interop+Globalization.GetCalendars(string, System.Globalization.CalendarId[], int)
...
14935c: 00e22517 auipc a0, 3618
149360: c5450513 addi a0, a0, -940
149364: 00053503 ld a0, 0(a0)
149368: 00053683 ld a3, 0(a0)
14936c: fe443503 ld a0, -28(s0)
149370: fec42603 lw a2, -20(s0)
149374: fdc43583 ld a1, -36(s0)
149378: fd443f03 ld t5, -44(s0)
14937c: 000680e7 jalr a3 // int Interop+Globalization.<GetCalendars>g____PInvoke|0_0(ushort*, System.Globalization.CalendarId*, int) (INDIRECT_PINVOKE_TARGET)
I know that my way of calculation callee address is not fixed as for arm
. Nevertheless, the only way to be sure how jumps are generated is to read a code of coregen2
in order to see how it generates riscv
instructions. But the profit and loss ratio of this approach is unfavorable, imo.
{ | ||
instr = BitConverter.ToUInt32(_reader.Image, imageOffset + currentInstrOffset); | ||
|
||
if (IsRiscV64LdInstruction(instr) || IsRiscV64LwInstruction(instr)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am just curious. In your examples, check for lw
instruction is not necessary. Could you give an example with lw
instruction? Thank you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As i know there is no function call which uses lw
to calculate jump address. However, this check for lw
is necessary because this first if
is trying to skip all ld
and lw
instructions which are used to push arguments (likely). For example
int Interop+Globalization.GetCalendars(string, System.Globalization.CalendarId[], int)
...
14935c: 00e22517 auipc a0, 3618
149360: c5450513 addi a0, a0, -940
149364: 00053503 ld a0, 0(a0)
149368: 00053683 ld a3, 0(a0)
14936c: fe443503 ld a0, -28(s0)
149370: fec42603 lw a2, -20(s0)
149374: fdc43583 ld a1, -36(s0)
149378: fd443f03 ld t5, -44(s0)
14937c: 000680e7 jalr a3 // <calle name>
....
bool Interop+Globalization.EnumCalendarInfo(IntPtr, string, System.Globalization.CalendarId, System.Globalization.CalendarDataType, IntPtr)n
...
149470: 00e22517 auipc a0, 3618
149474: b5050513 addi a0, a0, -1200
149478: 00053503 ld a0, 0(a0)
14947c: 00053783 ld a5, 0(a0)
149480: fe843583 ld a1, -24(s0)
149484: ff442603 lw a2, -12(s0)
149488: fe043503 ld a0, -32(s0)
14948c: fd843703 ld a4, -40(s0)
149490: ff042683 lw a3, -16(s0)
149494: fd043f03 ld t5, -48(s0)
149498: 000780e7 jalr a5 // <calle name>
...
void Interop+Globalization.InitOrdinalCasingPage(int, char*)
...
149634: 00e22517 auipc a0, 3618
149638: 91c50513 addi a0, a0, -1764
14963c: 00053503 ld a0, 0(a0)
149640: 00053603 ld a2, 0(a0)
149644: ff443583 ld a1, -12(s0)
149648: ffc42503 lw a0, -4(s0)
14964c: fec43f03 ld t5, -20(s0)
149650: 000600e7 jalr a2 // <calle name>
...
int Interop+Globalization.CompareString(IntPtr, char*, int, char*, int, System.Globalization.CompareOptions)
...
149aa4: 00e21517 auipc a0, 3617
149aa8: 4bc50513 addi a0, a0, 1212
149aac: 00053503 ld a0, 0(a0)
149ab0: 00053803 ld a6, 0(a0)
149ab4: ffc42783 lw a5, -4(s0)
149ab8: fec43503 ld a0, -20(s0)
149abc: fe443583 ld a1, -28(s0)
149ac0: ff842603 lw a2, -8(s0)
149ac4: fdc43683 ld a3, -36(s0)
149ac8: ff442703 lw a4, -12(s0)
149acc: fd443f03 ld t5, -44(s0)
149ad0: 000800e7 jalr a6 // <calle name>
As you can see, the sequence of lw
and ld
instructions is not fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well. IMO, lw
instructions can be skipped in line 1334 in StaticAnalyzeRiscV64Assembly
. Am I wrong?
else
{
// check if "register" is calculated using an unsupported instruction
uint rd = (instr >> 7) & 0b_11111U;
if (rd == register)
{
return false;
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we want to skip lw
in StaticAnalyzeRiscV64Assembly
, we will have to skip ld
too. In other case, we would have a problem, for example, with this:
void Interop+Globalization.InitOrdinalCasingPage(int, char*)
...
149634: 00e22517 auipc a0, 3618
149638: 91c50513 addi a0, a0, -1764
14963c: 00053503 ld a0, 0(a0)
149640: 00053603 ld a2, 0(a0)
149644: ff443583 ld a1, -12(s0)
149648: ffc42503 lw a0, -4(s0)
14964c: fec43f03 ld t5, -20(s0)
149650: 000600e7 jalr a2 // <calle name>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. You are right. However, actually what I want are removing 'lw' and implementing simpler like #102146. Could you check loongarch PR and share your idea? Thank you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I simplified code which calculates callee address
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no problem :D
else | ||
{ | ||
// check if "register" is calculated using an unsupported instruction | ||
uint rd = (instr >> 7) & 0b_11111U; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If instr is BType, JType or SType like sd
which 11-7 bits are offset. It can return false when immediate value is same to register. I think this pattern is not produced in our runtime. I just want to check it is intended?
Thank you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my honest opinion, i have never seen a situation when an instruction, like sd
, would broke the pattern which is acceptable by ProbeRiscV64Quirks(...)
. So I assume that it is not produced in the runtime.
AnalyzeRiscV64Itype(instr, out rd, out rs1, out imm); | ||
if (rd == register) | ||
{ | ||
target =+ imm; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Have you seen a pattern which needs addition for addi
instruction? IMO, it doesn't generate two addi
s for an address calculation. And it doesn't need both immediate values in ld
and addi
because ld
and addi
has the same bit size (signed 12 bits) for immediate value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are patterns which use addi
. You can find them here, link. As you can see, last three examples show patterns which use addi
. Only the first one uses ld
to add an immediate value instead of addi
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean you don't need target += imm
. target = imm
is enough because address is calculated with imm in auipc and one imm in ld or addi in patterns. If I am wrong, please let me know
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with you that target = imm
is enough. My explanation for target += imm
is that i wanted to cover even those patterns which could exist and i didn't meet them. If you think that we shouldn't worry so much for the future I can change it to target = imm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, we don't need to worry such cases. If there are such cases, it is a bug we need to fix or optimization candidates. :)
If target already has a value when we check addi
, it means a value is set from other addi
or ld
before.
- We don't need to generate multiple
addi
for an address calculation. - And
ld
andaddi
have the same offset bits. Soaddi
is redundant code in case of that target is set byld
before.
However, if you want, you can leave it. That is what I set it a Nit
.
Thank you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you
* [RISC-V] Add quirks for riscv * [RISC-V] minimize code
This PR adds a functionality
ProbeRiscV64Quirks(...)
to R2RDump in order to provide target function labels in the form of comments at the end of anyjalr
instructions. it mimics a similar solution for x64 and arm.Part of #84834, cc @dotnet/samsung