Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add E_FLAGS for the Atomic ABIs #383

Closed
wants to merge 1 commit into from

Conversation

palmer-dabbelt
Copy link
Contributor

I wrote this before having coffee.

I wrote this before having coffee.

Signed-off-by: Palmer Dabbelt <[email protected]>
with the A.6 atomic mappings.

EF_RISCV_ATOMIC_A7 (0x0040)::: This bit is set when the binary is compatible
with the A.6 atomic mappings.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"I wrote this before having coffee." => I think you need to change A.6 to A.7 here :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it's been NAK'd, so whatever ;)

@jrtc27
Copy link
Collaborator

jrtc27 commented May 19, 2023

We're not using 1/16th of the encoding space for this. If it's going in the file somewhere, put it in the attributes section, or fold it into an OSABI bump that there's been mumbling of a while ago.

@palmer-dabbelt
Copy link
Contributor Author

We're not using 1/16th of the encoding space for this. If it's going in the file somewhere, put it in the attributes section, or fold it into an OSABI bump that there's been mumbling of a while ago.

OK, so do you just want me to close this?

@patrick-rivos
Copy link
Contributor

Using 2 bits seems sane and future-proof to me. It correctly describes the current state of things (where 00 is unknown since GCC has been emitting invalid mappings) and is able to describe all the mappings that we might want.

The challenge is with A.6 compatibility (which is why this PR is very related to #378).
IIUC, if A.6+A.7 compatible (A.6 with a trailing fence on SEQ_CST stores) is not a performance issue and is relatively sane, we could get away with only one bit to denote A.7 compatibility (since all A.6 mappings would be A.7 compatible).

There is a lot of discussion on #378 about that added trailing fence, so if we do care about that then it makes sense to add both bits.

@jrtc27
Copy link
Collaborator

jrtc27 commented May 19, 2023

Just two states also doesn't work, because the A.6+fence case is compatible with both schemes, but by only having two states you can only declare compatibility with one and are artificially restricting linking against the other (whether at link time or load time).

@jrtc27
Copy link
Collaborator

jrtc27 commented May 19, 2023

We're not using 1/16th of the encoding space for this. If it's going in the file somewhere, put it in the attributes section, or fold it into an OSABI bump that there's been mumbling of a while ago.

OK, so do you just want me to close this?

I am just one person, others may disagree, so let's leave it open for discussion at least. And if you intend to rework it to use an alternative scheme then you can reuse the PR.

@palmer-dabbelt
Copy link
Contributor Author

We're not using 1/16th of the encoding space for this. If it's going in the file somewhere, put it in the attributes section, or fold it into an OSABI bump that there's been mumbling of a while ago.

OK, so do you just want me to close this?

I am just one person, others may disagree, so let's leave it open for discussion at least. And if you intend to rework it to use an alternative scheme then you can reuse the PR.

You were pretty clear this is not happening, though, so what's the point of discussing it?

@jrtc27
Copy link
Collaborator

jrtc27 commented May 19, 2023

We're not using 1/16th of the encoding space for this. If it's going in the file somewhere, put it in the attributes section, or fold it into an OSABI bump that there's been mumbling of a while ago.

OK, so do you just want me to close this?

I am just one person, others may disagree, so let's leave it open for discussion at least. And if you intend to rework it to use an alternative scheme then you can reuse the PR.

You were pretty clear this is not happening, though, so what's the point of discussing it?

That is my personal opinion. Others may persuade me otherwise if they feel strongly (though unlikely given how precious e_flags bits are). And that says nothing about using an alternative scheme to indicate the same information without burning e_flags bits.

@palmer-dabbelt
Copy link
Contributor Author

We're not using 1/16th of the encoding space for this. If it's going in the file somewhere, put it in the attributes section, or fold it into an OSABI bump that there's been mumbling of a while ago.

OK, so do you just want me to close this?

I am just one person, others may disagree, so let's leave it open for discussion at least. And if you intend to rework it to use an alternative scheme then you can reuse the PR.

You were pretty clear this is not happening, though, so what's the point of discussing it?

That is my personal opinion. Others may persuade me otherwise if they feel strongly (though unlikely given how precious e_flags bits are). And that says nothing about using an alternative scheme to indicate the same information without burning e_flags bits.

OK, so I guess I just leave this sitting in the hopes that folks scroll past the "we're not doing this" comment?

@patrick-rivos
Copy link
Contributor

Just two states also doesn't work, because the A.6+fence case is compatible with both schemes, but by only having two states you can only declare compatibility with one and are artificially restricting linking against the other (whether at link time or load time).

If it turns out that the added fence is not a performance issue then A.6 will have no clear performance advantage over A.6+fence. If that's the case, there's not reason to distinguish between A.6/A.7 compatible since A.6 can be upgraded to A.6+fence with no performance issues (so it makes sense to always upgrade it).

@palmer-dabbelt
Copy link
Contributor Author

Just two states also doesn't work, because the A.6+fence case is compatible with both schemes, but by only having two states you can only declare compatibility with one and are artificially restricting linking against the other (whether at link time or load time).

If it turns out that the added fence is not a performance issue then A.6 will have no clear performance advantage over A.6+fence. If that's the case, there's not reason to distinguish between A.6/A.7 compatible since A.6 can be upgraded to A.6+fence with no performance issues (so it makes sense to always upgrade it).

IMO i t's pretty unlikely that the extra fence is not a performance problem, we just need someone who actually cares about the performance of their HW to go demonstrate it...

@jrtc27
Copy link
Collaborator

jrtc27 commented May 19, 2023

Just two states also doesn't work, because the A.6+fence case is compatible with both schemes, but by only having two states you can only declare compatibility with one and are artificially restricting linking against the other (whether at link time or load time).

If it turns out that the added fence is not a performance issue then A.6 will have no clear performance advantage over A.6+fence. If that's the case, there's not reason to distinguish between A.6/A.7 compatible since A.6 can be upgraded to A.6+fence with no performance issues (so it makes sense to always upgrade it).

So how do you mark an A.6+fence binary? Do you mark it as A.7-compatible? If so, how do you distinguish true A.7 from it, which is incompatible with A.6 (the whole point of having this flag)? Or if not, how do you distinguish it from A.6, given you need to detect A.6 as incompatible but A.6+fence as not? You have three cases to detect:

  1. A.6 mixing with A.6+fence
  2. A.6 mixing with A.7
  3. A.6+fence mixing with A.7

You need to be able to permit 1 and 3 but reject 2.

@patrick-rivos
Copy link
Contributor

If it turns out that the added fence is not a performance issue then A.6 will have no clear performance advantage over A.6+fence. If that's the case, there's not reason to distinguish between A.6/A.7 compatible since A.6 can be upgraded to A.6+fence with no performance issues (so it makes sense to always upgrade it).

So how do you mark an A.6+fence binary? Do you mark it as A.7-compatible? If so, how do you distinguish true A.7 from it, which is incompatible with A.6 (the whole point of having this flag)? Or if not, how do you distinguish it from A.6, given you need to detect A.6 as incompatible but A.6+fence as not? You have three cases to detect:

1. A.6 mixing with A.6+fence

2. A.6 mixing with A.7

3. A.6+fence mixing with A.7

You need to be able to permit 1 and 3 but reject 2.

We would just mark the binary as A.7 compatible or not. In this case we would only permit 3. The idea being that all existing A.6 binaries have been emitted with none of these flags set, so 1 is going to restricted for all of existing binaries anyways. It's only after these bits are added that A.6 binaries can be marked as A.6 compatible. At that same time, we'll know it doesn't make sense to only emit A.6, so A.6+fence will become standard and we won't be emitting A.6 binaries anymore.

Current state:
All binaries are unmarked

PSABI change:
A.6 binaries start getting marked (but very few people use A.6 since you can get A.6+fence with no perf issue).

So as a result the A.6 compatibility bit only sees minimal usage.

This situation is a hypothetical - as Palmer said, it's unlikely that we get this fence for free.

@jrtc27
Copy link
Collaborator

jrtc27 commented May 19, 2023

If it turns out that the added fence is not a performance issue then A.6 will have no clear performance advantage over A.6+fence. If that's the case, there's not reason to distinguish between A.6/A.7 compatible since A.6 can be upgraded to A.6+fence with no performance issues (so it makes sense to always upgrade it).

So how do you mark an A.6+fence binary? Do you mark it as A.7-compatible? If so, how do you distinguish true A.7 from it, which is incompatible with A.6 (the whole point of having this flag)? Or if not, how do you distinguish it from A.6, given you need to detect A.6 as incompatible but A.6+fence as not? You have three cases to detect:

1. A.6 mixing with A.6+fence

2. A.6 mixing with A.7

3. A.6+fence mixing with A.7

You need to be able to permit 1 and 3 but reject 2.

We would just mark the binary as A.7 compatible or not. In this case we would only permit 3. The idea being that all existing A.6 binaries have been emitted with none of these flags set, so 1 is going to restricted for all of existing binaries anyways. It's only after these bits are added that A.6 binaries can be marked as A.6 compatible. At that same time, we'll know it doesn't make sense to only emit A.6, so A.6+fence will become standard and we won't be emitting A.6 binaries anymore.

Current state:
All binaries are unmarked

PSABI change:
A.6 binaries start getting marked (but very few people use A.6 since you can get A.6+fence with no perf issue).

So as a result the A.6 compatibility bit only sees minimal usage.

This situation is a hypothetical - as Palmer said, it's unlikely that we get this fence for free.

How do you enforce 2 then? The whole point of having the bit is so you can stop people mixing incompatible things.

@patrick-rivos
Copy link
Contributor

A.6 would not be marked as A.7 compatible.

@jrtc27
Copy link
Collaborator

jrtc27 commented May 19, 2023

A.6 would not be marked as A.7 compatible.

So how do you permit 1?

@palmer-dabbelt
Copy link
Contributor Author

If it turns out that the added fence is not a performance issue then A.6 will have no clear performance advantage over A.6+fence. If that's the case, there's not reason to distinguish between A.6/A.7 compatible since A.6 can be upgraded to A.6+fence with no performance issues (so it makes sense to always upgrade it).

So how do you mark an A.6+fence binary? Do you mark it as A.7-compatible? If so, how do you distinguish true A.7 from it, which is incompatible with A.6 (the whole point of having this flag)? Or if not, how do you distinguish it from A.6, given you need to detect A.6 as incompatible but A.6+fence as not? You have three cases to detect:

1. A.6 mixing with A.6+fence

2. A.6 mixing with A.7

3. A.6+fence mixing with A.7

You need to be able to permit 1 and 3 but reject 2.

We would just mark the binary as A.7 compatible or not. In this case we would only permit 3. The idea being that all existing A.6 binaries have been emitted with none of these flags set, so 1 is going to restricted for all of existing binaries anyways. It's only after these bits are added that A.6 binaries can be marked as A.6 compatible. At that same time, we'll know it doesn't make sense to only emit A.6, so A.6+fence will become standard and we won't be emitting A.6 binaries anymore.

Current state:
All binaries are unmarked

PSABI change:
A.6 binaries start getting marked (but very few people use A.6 since you can get A.6+fence with no perf issue).

So as a result the A.6 compatibility bit only sees minimal usage.

This situation is a hypothetical - as Palmer said, it's unlikely that we get this fence for free.

Ya, it's essentially just using one of the bits in what's in the proposal. If nobody cares about A.6 alone that might be fine. This also has the advantage of avoiding the ambiguous state: in my proposal 0 is both the A6 mappings from LLVM and the broken mappings from GCC, but both of those are incompatible with A7.

We could always start with the single bit and see, I think Jeff was looking to get performance numbers before the GCC release so we don't end up regressing.

@jrtc27
Copy link
Collaborator

jrtc27 commented May 19, 2023

If it turns out that the added fence is not a performance issue then A.6 will have no clear performance advantage over A.6+fence. If that's the case, there's not reason to distinguish between A.6/A.7 compatible since A.6 can be upgraded to A.6+fence with no performance issues (so it makes sense to always upgrade it).

So how do you mark an A.6+fence binary? Do you mark it as A.7-compatible? If so, how do you distinguish true A.7 from it, which is incompatible with A.6 (the whole point of having this flag)? Or if not, how do you distinguish it from A.6, given you need to detect A.6 as incompatible but A.6+fence as not? You have three cases to detect:

1. A.6 mixing with A.6+fence

2. A.6 mixing with A.7

3. A.6+fence mixing with A.7

You need to be able to permit 1 and 3 but reject 2.

We would just mark the binary as A.7 compatible or not. In this case we would only permit 3. The idea being that all existing A.6 binaries have been emitted with none of these flags set, so 1 is going to restricted for all of existing binaries anyways. It's only after these bits are added that A.6 binaries can be marked as A.6 compatible. At that same time, we'll know it doesn't make sense to only emit A.6, so A.6+fence will become standard and we won't be emitting A.6 binaries anymore.

Current state:
All binaries are unmarked

PSABI change:
A.6 binaries start getting marked (but very few people use A.6 since you can get A.6+fence with no perf issue).

So as a result the A.6 compatibility bit only sees minimal usage.
This situation is a hypothetical - as Palmer said, it's unlikely that we get this fence for free.

Ya, it's essentially just using one of the bits in what's in the proposal. If nobody cares about A.6 alone that might be fine. This also has the advantage of avoiding the ambiguous state: in my proposal 0 is both the A6 mappings from LLVM and the broken mappings from GCC, but both of those are incompatible with A7.

We could always start with the single bit and see, I think Jeff was looking to get performance numbers before the GCC release so we don't end up regressing.

Well, from the sounds of it GCC's was buggy so wasn't even compatible with itself. You can't really do anything about that except encourage distributions to rebuild miscompiled packages (e.g. in Debian by looking at the buildinfo files to find all packages built with a broken GCC).

@patrick-rivos
Copy link
Contributor

A.6 would not be marked as A.7 compatible.

So how do you permit 1?

We don't. The idea being that existing A.6 binaries wouldn't have that bit set and after the PSABI change the bit would see minimal usage (A.6 would just be set in concert with A.7) since we would just emit A.6+fence binaries instead.

@jrtc27
Copy link
Collaborator

jrtc27 commented May 19, 2023

A.6 would not be marked as A.7 compatible.

So how do you permit 1?

We don't. The idea being that existing A.6 binaries wouldn't have that bit set and after the PSABI change the bit would see minimal usage (A.6 would just be set in concert with A.7) since we would just emit A.6+fence binaries instead.

Then there is no transition period, you have a flag day where you need to switch everything over all at once. Which is definitely not going to fly in any of the distribution ecosystems, FreeBSD or Linux.

@jrtc27
Copy link
Collaborator

jrtc27 commented May 19, 2023

And I'll point out that the whole idea behind doing A.6+fence in the first place was to allow for a transition. If you want to forbid a smooth transition and force a flag day then you might as well jump straight to A.7.

@jrtc27
Copy link
Collaborator

jrtc27 commented May 19, 2023

To be extremely clear:

A.6+fence exists solely to allow for interoperability with both A.6 and A.7 (though obviously not simultaneously). If A.6 is not detected as compatible with A.6+fence or A.7 is not detected as compatible with A.6+fence then this objective is not achieved, regardless of whether the generated code actually is, since the tooling/run-time will reject it and render it broken by virtue of emitting an error. Therefore any solution to detect the incompatibility between A.6 and A.7 must be able to identify A.6+fence in order to permit it being used with both A.6 and A.7. Otherwise A.6+fence serves zero purpose.

(And the proposed change here, whilst it burns 1/16th of the e_flags encoding space, does achieve that objective by having 0b11 as the third distinguishing state)

@patrick-rivos
Copy link
Contributor

A.6+fence also exists because some systems (all existing ones) can't be moved to A.7 as it includes new instructions (Here's the proposal draft for those new instructions)

kito-cheng added a commit that referenced this pull request May 24, 2023
Alternative version of #383

The diffrences are:

- Compatible with exising definitions of TSO field.
- Merge TSO field, so that we can have 4 more atomic ABI in the future.
- Specify the merge rule for atomic ABI.

EF_RISCV_ATOMIC_ABI_A6C means A.6 classic mapping.
EF_RISCV_ATOMIC_ABI_A6F means strengthened A.6 mapping.

EF_RISCV_ATOMIC_ABI_A7 and corresponding merge rule are not added,
I would like to defer that until we offically add the A7 mapping
into psABI.
kito-cheng added a commit that referenced this pull request Jun 9, 2023
Alternative version of #383

Defined 4 atomic abi variant.

- unknown, nonatomic, A6C and A6S

Unknown is compatilbe with exising object file.
Nonatomic is used when A-extension is absent
A6C is the table A.6 mapping defined in ISA spec.
And the A6S is the mapping defined in psABI spec.
@kito-cheng
Copy link
Collaborator

Let moving to #385

@kito-cheng kito-cheng closed this Jun 9, 2023
kito-cheng added a commit that referenced this pull request Jun 9, 2023
Alternative version of #383

Defined 4 atomic abi variant.

- unknown, nonatomic, A6C and A6S

Unknown is compatilbe with exising object file.
Nonatomic is used when A-extension is absent
A6C is the table A.6 mapping defined in ISA spec.
And the A6S is the mapping defined in psABI spec.
kito-cheng added a commit that referenced this pull request Jun 9, 2023
Alternative version of #383

Defined 4 atomic abi variant.

- unknown, nonatomic, A6C and A6S

Unknown is compatilbe with exising object file.
Nonatomic is used when A-extension is absent
A6C is the table A.6 mapping defined in ISA spec.
And the A6S is the mapping defined in psABI spec.
kito-cheng added a commit that referenced this pull request Jul 24, 2023
Alternative version of #383

Defined 4 atomic abi variant.

- unknown, nonatomic, A6C and A6S

Unknown is compatilbe with exising object file.
Nonatomic is used when A-extension is absent
A6C is the table A.6 mapping defined in ISA spec.
And the A6S is the mapping defined in psABI spec.
kito-cheng added a commit that referenced this pull request Aug 18, 2023
Alternative version of #383

Defined 4 atomic abi variant.

- unknown, nonatomic, A6C and A6S

Unknown is compatilbe with exising object file.
Nonatomic is used when A-extension is absent
A6C is the table A.6 mapping defined in ISA spec.
And the A6S is the mapping defined in psABI spec.
kito-cheng added a commit that referenced this pull request Sep 6, 2023
Alternative version of #383

Defined 4 atomic abi variant.

- unknown, nonatomic, A6C and A6S

Unknown is compatilbe with exising object file.
Nonatomic is used when A-extension is absent
A6C is the table A.6 mapping defined in ISA spec.
And the A6S is the mapping defined in psABI spec.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants