CSetBoundsRoundDown #74

Open
wants to merge 2 commits into
base: main

nwf
Member

@nwf nwf commented Oct 3, 2024

See #72.

@nwf nwf force-pushed the 202410-nwf-csetboundsrounddown branch from e1b377d to 0d2c6b6 Compare October 3, 2024 19:50
@nwf nwf force-pushed the 202410-nwf-csetboundsrounddown branch from 0d2c6b6 to 8ad5dfc Compare October 4, 2024 16:33
nwf added a commit to CHERIoT-Platform/cheriot-rtos that referenced this pull request Oct 7, 2024
This copies "the TLS stack buffer trick" into the base RTOS for broader
use. The implementation can be replaced with
[CSetBoundsRoundDown](CHERIoT-Platform/cheriot-sail#74)
if and when that lands in the ISA.
@kliuMsft
Contributor

@rmn30 can this be accomplished by making smaller changes to setCapBounds, for example by removing the T=T+1 part of the following code?

https://github.com/CHERIoT-Platform/cheriot-sail/blob/64b2563e2ffc19d6bfb5a9e97c47a2b7a9207cf8/src/cheri_cap_common.sail#L456C2-L463C1

If we implement this instruction the same way it is defined in the PR, the new length has to be computed before the setCapBounds logic. This would impact both critical-path timing and area.

@rmn30
Collaborator

rmn30 commented Nov 11, 2024

I think it might take a little more than just eliminating the T increment (because we also want to avoid rounding base down), but I think we should be able to come up with an implementation. It may even be a bit simpler than the existing CSetBounds.

@nwf
Member Author

nwf commented Nov 15, 2024

So, musing aloud... given CSetBoundsRoundDown cd, cs, rs, the resulting (decoded) cd.base (and cd.address) is cs.address and cd.top is min(cs.address + rs, [some expression involving the mantissa width and cs.address]). Does it follow that we can set the (encoded) cd.E to be min(ctz(cs.address), ctz(cs.address + rs)), to ensure that the least significant 1 in either cd.base or cd.top is the minimum bit of the (shifted, encoded) cd.T and cd.B fields? Is that then enough to give us fast computation of cd.B (extract mantissa width from cs.address at cd.E shift) and cd.T (cd.B + rs >> cd.E, if that doesn't overflow the mantissa width, or cd.B + (1 << mantissa_width) - 1 if it does)? We'd still need to check that cd.T is within bounds of cs, I think?

@kliuMsft
Contributor

Not quite sure I am following - wouldn't we want to compute the exponent of length (rs2) first and round the length down?

@rmn30
Collaborator

rmn30 commented Nov 15, 2024

I think we could use 23 - clz(len) to calculate the preferred exponent, e_l, as per the existing CSetBounds. If this allows us to represent base exactly then I think we can use it as is (although we shouldn't increment T by one in case of inexact top). We can compare e_l to e_b = ctz(base) to work this out: if e_l <= e_b we are good. Otherwise we should use e_b and check whether we can represent the requested range (modulo possibly inexact top) or whether we should return a maxlen cap for e_b.

I've not thought this through entirely and would need to do formal checks once we have it in Sail.

Edit: definitely needs more thought. base can be made more than e_b aligned by using a value of B with trailing zeros, so this may be suboptimal. I'm also worried we could end up generating non-canonical encodings (that wouldn't be generated by existing CSetBounds) which could confuse matters.

After more thought: since e_l is the smallest exponent that can represent the requested length, if we have to use the smaller e_b to align the base, we know we have to return a max-length cap. The only other thing we have to deal with is the case where e_l is 24 (max e) and 14 < e_b < 24: in this case we return a maxlen cap with e=14. If we adopted the optimised bounds encoding in #45 we wouldn't need that special case.
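
The case analysis in the comment above can be modelled in a few lines of Python. This is only a sketch of the reasoning as written in the comment, not the Sail from the PR. It assumes 32-bit addresses, a 9-bit mantissa, and that the usable exponents are 0 to 14 plus 24. The names `ctz32`, `preferred_exponent` and `round_down_bounds` are hypothetical, and the in-bounds check against the source capability is left out.

```python
MANTISSA_WIDTH = 9                         # assumed mantissa width
MAX_MANTISSA = (1 << MANTISSA_WIDTH) - 1   # largest length mantissa, 511
SMALL_E_MAX = 14                           # assumed largest directly encoded exponent
MAX_E = 24                                 # assumed only usable exponent above 14


def ctz32(x):
    """Trailing zero count of a 32-bit value; 32 when x is zero."""
    return 32 if x == 0 else (x & -x).bit_length() - 1


def preferred_exponent(length):
    """e_l: smallest usable exponent whose mantissa can hold `length`."""
    # The bit length of (length >> 9) plays the role of "23 - clz(len)".
    e = (length >> MANTISSA_WIDTH).bit_length()
    return e if e <= SMALL_E_MAX else MAX_E


def round_down_bounds(base, length):
    """Return (exponent, granted_length); base is always kept exact."""
    e_l = preferred_exponent(length)
    e_b = ctz32(base)
    if e_l <= e_b:
        # base is representable at e_l, so only top is truncated.
        return e_l, (length >> e_l) << e_l
    if e_l == MAX_E and e_b > SMALL_E_MAX:
        # 14 < e_b < 24: fall back to a max-length cap at e = 14.
        return SMALL_E_MAX, MAX_MANTISSA << SMALL_E_MAX
    # base forces the smaller exponent e_b: max-length cap at e_b.
    return e_b, MAX_MANTISSA << e_b
```

Under these assumptions, `round_down_bounds(0x10000, 1001)` returns `(1, 1000)` (top truncated, base exact), and `round_down_bounds(0x1001, 1024)` returns `(0, 511)` because the odd base forces e = 0.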

@rmn30
Collaborator

rmn30 commented Nov 16, 2024

Attempt at Sail for above: 823e75b

@nwf
Member Author

nwf commented Nov 17, 2024

That (comment and 823e75b) looks sensible to me and corrects a bug in my original attempt (I'd missed the 14 < e_b < 24 case).

@kliuMsft
Contributor

Ok, this looks good. I am still mulling ways to merge this with the existing setCapBounds logic to save some area (a little tricky, since setCapBounds is also the critical timing path). But even if that doesn't work out, it may not be too bad (maybe an additional 2% of area or so?).

@kliuMsft
Contributor

@rmn30 I also want to confirm: it looks like the inCapBounds check is still the same, i.e., compare the "requested" top vs. cs1.top and the new base (cs1.address) vs. cs1.base?

@rmn30
Collaborator

rmn30 commented Nov 18, 2024

> @rmn30 also want to confirm - looks like the inCapBounds check is still the same, i.e., compare the "requested" top vs the cs1.top and the new base (cs1.address) vs cs1.base?

Yes, this should never return a length that is greater than the requested length so the existing check works fine. Would like to have a proof of this, though.
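
An informal argument for that claim, which is not the formal proof asked for, can be written down under assumptions: a 9-bit mantissa, so the largest length mantissa is $m = 2^{9} - 1$, and the three cases described in the Nov 15 comment. Here $\ell$ is the requested length, $\ell'$ the granted length, $e$ the chosen exponent, and $e_l$ the smallest usable exponent with $\lfloor \ell / 2^{e_l} \rfloor \le m$.

```latex
\begin{aligned}
e = e_l \le e_b:\quad & \ell' = 2^{e} \lfloor \ell / 2^{e} \rfloor \le \ell \\
e = e_b < e_l:\quad & \lfloor \ell / 2^{e_b} \rfloor > m
  \;\Rightarrow\; \ell \ge (m + 1)\, 2^{e_b} > m\, 2^{e_b} = \ell' \\
e = 14,\ e_l = 24:\quad & \lfloor \ell / 2^{14} \rfloor > m
  \;\Rightarrow\; \ell > m\, 2^{14} = \ell'
\end{aligned}
```

In each case $\ell' \le \ell$, so the granted top `cs1.address` $+\ \ell'$ does not exceed the requested top, and the base is `cs1.address` in both. A check of the requested range against cs1 therefore also covers the granted range, given these assumptions.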

@rmn30
Collaborator

rmn30 commented Nov 18, 2024

> I am still mulling ways to merge this with the existing setCapBounds logic to save some area (a little tricky there since setCapBounds is also the critical timing path)

Could you look at the last commit in this branch that combines this with some encoding changes? I think this simplifies setCapBounds (both versions) as well as making the encoding more efficient.

Status: In Progress
3 participants