Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Zfinx Calling Convention #166

Closed
lenary opened this issue Dec 15, 2020 · 25 comments · Fixed by #198
Closed

Zfinx Calling Convention #166

lenary opened this issue Dec 15, 2020 · 25 comments · Fixed by #198
Milestone

Comments

@lenary
Copy link
Contributor

lenary commented Dec 15, 2020

LLVM Recently received a patch to add Zfinx support to the compiler: https://reviews.llvm.org/D93298

@jrtc27 correctly brings up the issue that this is going to cause some ABI compatibility issues, which we need to address going forwards.

At first thought, It seems to me that Zfinx code is compatible with the current calling convention if the ABI's FLEN is zero.

Beyond that, we probably need a single bit in the e_flags which represents if the F registers share storage with the X registers.

These are just very initial thoughts about the easy path forwards, in order to start a discussion. I know there are harder issues to touch on when enabling extensions that are this intrusive to how ISA state is preserved.

@jim-wilson
Copy link
Collaborator

Perhaps instead of dedicating a bit for zfinx, we expand the current 2-bit FLOAT_ABI field to 3 bits, and use one of the 4 new encoding for zfinx. That leaves us with 3 free encodings that can be used for other float ABIs. There is other FP related stuff coming, like zfh and alternate FP formats like posits and bfloat. So we might have need for another float abi later on Kito has made the same suggestion though I don't remember exactly where. This does have the minor downside that the 3 bits won't be contiguous which might be a little confusing.

It is valid to use -mabi=ilp32 with -march=rv32gc, but zinfx is not compatible with rv32gc/ilp32 code, so we should not consider FLOAT_ABI_SOFT to be compatible with zfinx code.

@kito-cheng
Copy link
Collaborator

I think we have 3 decision point here, 1) calling convention, 2) ELF flag allocation and 3) Naming, any missing?

  • Calling Convention
    1. Same as soft-float.
      • Pros:
        • Simple and consistent
      • Cons:
      • Less efficient if parameter not aligned.
        • double precision instruction require register align to even register.
    2. Align to even register for parameter with 2 * XLEN data.
      • e.g. double and long long on RV32 and long double and _int128 for RV64
      • Pros:
        • No re-align issue for parameter.
      • Cons:
        • More parameters passing rule
    • Example code:
      • void foo(int a, double b);
        • Option i: a passed in a0, b passed in a1 and a2 for RV32
        • Option ii: a passed in a0, b passed in a2 and a3 for RV32, a1 are skipped for padding.
  • ELF Flag
    1. Allocate dedicated bit for zfinx.
      • Pros:
        • Simple?
      • Cons:
        • Waste ELF flag bits compare to other options.
    2. Extend EF_RISCV_FLOAT_ABI
      • Extend EF_RISCV_FLOAT_ABI to 0x0026
      • Add EF_RISCV_FLOAT_ABI_FINX as 0x0020
      • Pros:
        • Extra 3 ABI slots
      • Cons:
        • Spend one bits.
    3. Merge EF_RISCV_RVE to EF_RISCV_FLOAT_ABI
      • Change EF_RISCV_FLOAT_ABI to 0x000e
      • Add EF_RISCV_FLOAT_ABI_FINX as 0x0009
      • Pros:
        • No need to spend extra bits
      • Cons:
        • Backward compatible concern?
  • Name
    1. ilp32x, lp64x
    2. Other proposals?

@jim-wilson
Copy link
Collaborator

The ISA spec allows RVE to be combined with F, D, or Q. We don't implement it in the GNU toolchain, but it is allowed by the spec. So merging the RVE flag with the float abi flags would cause trouble if anyone ever implements rv32ef in hardware.

@kito-cheng
Copy link
Collaborator

@kito-cheng
Copy link
Collaborator

The ISA spec allows RVE to be combined with F, D, or Q. We don't implement it in the GNU toolchain, but it is allowed by the spec. So merging the RVE flag with the float abi flags would cause trouble if anyone ever implements rv32ef in hardware.

Yeah, but -march=rv32ef still using -mabi=ilp32e which means EF_RISCV_RVE=1 and EF_RISCV_FLOAT_ABI=0, but I admit it's might made people confused about that, so personally I prefer extend EF_RISCV_FLOAT_ABI to 0x0026

@jrtc27
Copy link
Collaborator

jrtc27 commented Mar 15, 2021

We can probably ignore Zqinx for now (and I also don't think that's a particularly worthwhile thing to do anything special for; the extension can exist in theory as the "obvious" specification but is unlikely to ever be anything more than academic), which just leaves RV32Zdinx as the awkward case. For that, there are a couple important questions:

  1. Is RV32Zdinx something that even makes sense to build and thus worth ensuring the ABI is optimal?
  2. If so, how does performance (or at least code size) on various standard benchmarks (actual ones, not the pointless benchmarks that are just "how fast is your integer ALU and branch unit") change between the two obvious choices for the calling convention ((i) doing nothing different to soft float and (ii) always aligning to even register pairs like is done for soft float varargs already)?

2 is likely to be quite nuanced; there's already a 50% chance you happen to use an even register and thus don't require moves in the callee, and by aligning registers you waste integer registers and risk forcing more arguments to be stored to the stack. Unless there is clear evidence that it's worth adding the extra special case to make a new Zfinx "hard"-float ABI variant (which isn't that hard, you just reuse the varargs case, it's just mildly annoying and more complexity) I'd be inclined to say it should just follow the existing soft float ABI's calling convention.

@jrtc27
Copy link
Collaborator

jrtc27 commented Mar 15, 2021

The ISA spec allows RVE to be combined with F, D, or Q. We don't implement it in the GNU toolchain, but it is allowed by the spec. So merging the RVE flag with the float abi flags would cause trouble if anyone ever implements rv32ef in hardware.

Yeah, but -march=rv32ef still using -mabi=ilp32e which means EF_RISCV_RVE=1 and EF_RISCV_FLOAT_ABI=0, but I admit it's might made people confused about that, so personally I prefer extend EF_RISCV_FLOAT_ABI to 0x0026

In theory someone could implement an ilp32ef for rv32ef, though I'm not sure it makes a huge amount of sense to do.

@kito-cheng
Copy link
Collaborator

Some ABI issue related to register/stack alignment

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82106

@kito-cheng
Copy link
Collaborator

In theory someone could implement an ilp32ef for rv32ef, though I'm not sure it makes a huge amount of sense to do.

My assumption is that: ilp32e is obsoleted, so ideally we should not have ilp32ef in future.

@jrtc27
Copy link
Collaborator

jrtc27 commented Mar 15, 2021

Some ABI issue related to register/stack alignment

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82106

Interesting, though that's at least limited to an edge case (and LLVM gets that right, it's just a GCC bug).

@tariqkurd-repo
Copy link

I agree about ZQinx being of academic interest, but not particularly useful in reality.

RV32_ZDinx has real applications, e.g. running GPS software on small embedded cores in IoT chips. Another advantage of this config is that it allows all the instructions from my proposed code-size reduction extension Zce to be used (because Zce reuses some of the D encodings), so if you specificy RV32_ZDinx_Zce you get all of Zce, if you specify RV32D_Zce the Zced subset of Zce will not be configured.

For reference, here's my proposal so you can see the ISA subsets:

https://github.com/riscv/riscv-code-size-reduction/blob/master/ISA%20proposals/Huawei/Zce_spec.adoc

so ideally RV32_ZDinx would be well optimised.

Tariq

@kito-cheng
Copy link
Collaborator

@tariqkurd-repo OK, so my understanding you would prefer optimize calling convention for passing double precision value on RV32_ZDinx, do you have any preference on other stuffs?

And I think it's also another decision point: we might consider change type size for long double for zfinx ABI? My assumption is zfinx targeting the embedded cores, so it might be useful to change long double to 64-bits to further optimize that, but it also could be fixed via EABI, so we might have a normal zfinx ABI and zfinx EABI in future.

@tariqkurd-repo
Copy link

tariqkurd-repo commented Mar 19, 2021

Hi @kito-cheng , yes I'd prefer to have an optimised ABI for passing double precision values for RV32_ZDinx if possible.
For long double - setting it to 64-bits would be very sensible I think and would be comparable with ARM (we have found cases where the RISC-V code is much larger than ARM because of this type).

I don't have any other requests.

kito-cheng added a commit that referenced this issue May 5, 2021
- In this draft, we allocated a e_flags bit to represent FINX ABI,
  toolchian could use that to identify how to disassemble the object
  file corretly.

- We add ilp32x and lp64x to used for RV32 and RV64 with zfinx,
  this design is distinguish from existing ilp32 and lp64 calling
  convention.

- In #166, we discuss the potential optimization of this ABI, like register
  argument alignment and changing size of `long double`, but I think
  it's scope of EABI, although it's still unclear.

- Interaction with EABI:
  - Add EABI variant of ilp32x and lp64x: eilp32x and elp64x.
@tariqkurd-repo
Copy link

@kito-cheng I like the idea of reducing the size of long double to 64-bits for Zfinx platform, having long double as 128-bits is for server class machines not for embedded devices. What I've found is that embedded build systems don't link the long double library by accident, they only link it when necessary, so maybe it's not a problem.

@jrtc27
Copy link
Collaborator

jrtc27 commented May 6, 2021

@kito-cheng I like the idea of reducing the size of long double to 64-bits for Zfinx platform, having long double as 128-bits is for server class machines not for embedded devices. What I've found is that embedded build systems don't link the long double library by accident, they only link it when necessary, so maybe it's not a problem.

RV32 doesn't have long double, only RV64, and making long double 64-bit when it exists is a pain. And yeah, if you don't want 128-bit floats, just don't use long double, use double.

@jrtc27
Copy link
Collaborator

jrtc27 commented May 6, 2021

@kito-cheng I like the idea of reducing the size of long double to 64-bits for Zfinx platform, having long double as 128-bits is for server class machines not for embedded devices. What I've found is that embedded build systems don't link the long double library by accident, they only link it when necessary, so maybe it's not a problem.

RV32 doesn't have long double, only RV64, and making long double 64-bit when it exists is a pain. And yeah, if you don't want 128-bit floats, just don't use long double, use double.

Hm, I got confused, that's only true of __int128, long double still exists on RV32. But, yeah, don't use it, it makes no sense, and defining it to be 64-bit just papers over stupid software.

@tariqkurd-repo
Copy link

Our Hi3861 Wifi IoT platform compiles with RV32 and uses long double, it links in __addtf3 etc. That's 128-bit support isn't it? but it is explicitly used in the source code, it's not accidentally linked.

@tariqkurd-repo
Copy link

@kito-cheng I like the idea of reducing the size of long double to 64-bits for Zfinx platform, having long double as 128-bits is for server class machines not for embedded devices. What I've found is that embedded build systems don't link the long double library by accident, they only link it when necessary, so maybe it's not a problem.

RV32 doesn't have long double, only RV64, and making long double 64-bit when it exists is a pain. And yeah, if you don't want 128-bit floats, just don't use long double, use double.

Hm, I got confused, that's only true of __int128, long double still exists on RV32. But, yeah, don't use it, it makes no sense, and defining it to be 64-bit just papers over stupid software.

Ha ha! I agree with you, but that was a major motivation behind making the EABI in the first place - reducing the size of the long double type for embedded, which I think is questionable. Yes - so let's leave it unchanged for Zfinx.

@aswaterman
Copy link
Contributor

@jrtc27 as much as I’d like to agree, it isn’t that simple in practice. The printf family brings in tens of KB of code because format specifiers might refer to long doubles in the argument list, even though they essentially never do.

In any case, my take is that the long double matter should be orthogonal. There’s nothing wrong with ABIs that have Zfinx and 128b long double, nor is there anything wrong with ABIs that have Zfinx and 64b long double. I know the latter will almost always be more popular in practice, but we shouldn’t arbitrarily hem ourselves in here.

We’ve already agreed that EABI has 64b long double, so the Zfinx variant of the EABI should follow suit.

We probably don’t need to define a Zfinx variant of the UABI at this time, but my intuition is that, if we did, we should keep the data types consistent across UABIs (i.e. 128b long double).

@tariqkurd-repo
Copy link

I agree that the Zfinx version of the ABI shouldn't change the width of long double so the data types should be consistent. We need an new ABI for RV32_Zdinx because of the register pair handling, but for other configurations e.g. RV32_Zfinx the current ABI is ok as far as I know.

@jrtc27
Copy link
Collaborator

jrtc27 commented May 6, 2021

I agree that the Zfinx version of the ABI shouldn't change the width of long double so the data types should be consistent. We need an new ABI for RV32_Zdinx because of the register pair handling, but for other configurations e.g. RV32_Zfinx the current ABI is ok as far as I know.

This keeps being a pain point. I really would like to see actual empirical evidence that the additional complexity matters in practice before we commit to adding a new ABI variant. Having it be the same as the existing soft-float ABI would be much more elegant if possible.

@kito-cheng
Copy link
Collaborator

Yeah, my intension is to keep that orthogonal with other UABI if possible at this moment, although this decision made we have 2x more zfinx ABI variant, but consider the current progress of EABI, I think it's best option from my point of view.

For zfinx, zdinx and zqinx I would strongly prefer just using same calling convention, and address the register pair issue by EABI, since P-extension also need the register pair too.

@tariqkurd-repo
Copy link

HI @kito-cheng does that mean you would require RV32IMC_Zdinx cores to use the EABI?

@jrtc27
Copy link
Collaborator

jrtc27 commented May 10, 2021

HI @kito-cheng does that mean you would require RV32IMC_Zdinx cores to use the EABI?

No. The proposal is that the Zdinx UABI uses the exact same calling convention as soft-float today, but the EABI working group is free to define something else if it makes sense for them, though I'd still encourage them to only do so if truly necessary. Please, if you want to have a different calling convention in the UABI for Zdinx then go and conduct the scientific experiments to demonstrate whether or not it's needed, as I do not want to define a whole new calling convention in the UABI based on idle speculation. That is, run some benchmarks with and without a modified calling convention and compare instructions retired and number of loads and stores executed (to see the overhead from moving registers around and from potentially having to pass slightly more on the stack), or at least get static instruction counts from the compiled code.

@tariqkurd-repo
Copy link

that's fine for me - I see no problems with the soft-float calling convention (it's what I had in mind from the beginning, and I think was in the first version of the Zfinx spec), it makes sense to me that instead of calling the softfloat function you use the instruction instead. The layout in the registers is the same.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants