Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[csrng] Future improvement for reseed blocking #14505

Closed
tjaychen opened this issue Aug 23, 2022 · 13 comments
Closed

[csrng] Future improvement for reseed blocking #14505

tjaychen opened this issue Aug 23, 2022 · 13 comments
Assignees
Labels
Component:RTL IP:csrng Subsystem:Entropy entropy_src, csrng, or edn related issues triaged-security Type:FutureRelease Not relevant to currently planned releases/milestones

Comments

@tjaychen
Copy link

tjaychen commented Aug 23, 2022

In the current csrng, if one of the edn interfaces requests for reseed and wins arbitration, it blocks csrng from servicing any other request until the entropy material required for the reseed is ready. This means in the worst case, a generate request from another interface could be blocked for multiple ms while entropy is slowly being accumulated.

Instead of making entropy collection an action downstream of arbitration, we can move entropy collection as a function into the command handling stages. The various requests to the entropy_src can then arbitrate among themselves.

By moving the entropy collection out of the main data path, we can ask the various command stages to only arbitrate for drbg when all of its inputs are ready (inclusive of the entropy it is collecting). This would shrink the worst case latency from the entropy collection time down to the drbg run time in case of a generate command. The only case where we might actually wait many ms is if multiple interfaces requested reseed / instantiation at the same time and they serialized on the entropy_src's ability to generate random bits.

@tjaychen tjaychen added Priority:P2 Priority: medium Component:RTL IP:entropy_src Type:FutureRelease Not relevant to currently planned releases/milestones labels Aug 23, 2022
@tjaychen
Copy link
Author

tjaychen commented Aug 23, 2022

@moidx @mwbranstad just an idea for a future enhancement.
Let me know if this makes sense to you. It can be something we look into long term if we find it could be useful.

tjaychen pushed a commit to tjaychen/opentitan that referenced this issue Aug 23, 2022
- remove testutil's direct access to entropy complex registers
- added dis_entropy_src_stop to emulate the function in edn / csrng
- sever the dependency between entropy_testutil_boot_mode and auto_mode

- edn1 is currently set to reseed once every 4 generates, this is because
  if it were set to 1, artificial test conditions could end up significantly
  stalling entropy distribution due to lowRISC#14505.

Signed-off-by: Timothy Chen <[email protected]>
@mwbranstad
Copy link
Contributor

@tjaychen I think this generally could make sense. Some effort was put into csrng to make requests non-blocking, but this works only between GEN bits deliveries. I think the case you are talking about in when a seed request is blocking all requests, which I assumed would inevitably happen.
However, re-thinking the general arbitration at this point is a good idea. Some flow diagrams showing the specific cases would be very useful as well.

@tjaychen
Copy link
Author

yeah i just don't want a reseed operation from one edn to block the generate command from another edn.
It technically doesn't have to, and things should only become truly blocking if both interfaces decide to request reseed / instantiations.

Is my understanding of the current design correct?

@mwbranstad
Copy link
Contributor

I think the case you speak of can happen. Maybe this is a possible scenario:
edn0 has done an instanitate cmd, and now getting 10000 gen bit words return based on the gen req. meanwhile, edn1 make a seed-dependent request, which will get it into the arbitration (since csrng is non-blocking for that phase). But the command will block big time waiting for that seed, a step which blocks the command in the cmd_stage cmd, which i believe locks others out at that point. Maybe your solution was to only grant this request if a) a seed is requested, and b) the seed if available. I think this could be done.

@tjaychen
Copy link
Author

yeah that's right. What I'm basically suggesting here is that a csrng hw app only arbitrate for the actual drbg hardawre when all of its inputs (including seed) is available. So this would mean that as long as multiple interfaces are not asking for seed (as you say, an exercise of instantiate / reseeds happening at the same time would do this), our blocking time would be short. So this would mean that we possibly have a very long start-up time, but as long as one edn has a very high reseed count afterwards, we will not block more than the drbg generate latency. Practically speaking this means edn0, which in theory is used more heavily, will always be able to complete generate commands fairly quickly.

btw, i'm not suggesting we do any of this now, i feel like it could be some pretty major surgery. I just wanted to file the issue to talk about whether it made sense for the future.

tjaychen pushed a commit to tjaychen/opentitan that referenced this issue Aug 23, 2022
- remove testutil's direct access to entropy complex registers
- added dis_entropy_src_stop to emulate the function in edn / csrng
- sever the dependency between entropy_testutil_boot_mode and auto_mode

- edn1 is currently set to reseed once every 4 generates, this is because
  if it were set to 1, artificial test conditions could end up significantly
  stalling entropy distribution due to lowRISC#14505.

Signed-off-by: Timothy Chen <[email protected]>
tjaychen pushed a commit to tjaychen/opentitan that referenced this issue Aug 23, 2022
- remove testutil's direct access to entropy complex registers
- added dis_entropy_src_stop to emulate the function in edn / csrng
- sever the dependency between entropy_testutil_boot_mode and auto_mode

- edn1 is currently set to reseed once every 4 generates, this is because
  if it were set to 1, artificial test conditions could end up significantly
  stalling entropy distribution due to lowRISC#14505.

Signed-off-by: Timothy Chen <[email protected]>
tjaychen pushed a commit to tjaychen/opentitan that referenced this issue Aug 24, 2022
- remove testutil's direct access to entropy complex registers
- added dis_entropy_src_stop to emulate the function in edn / csrng
- sever the dependency between entropy_testutil_boot_mode and auto_mode

- edn1 is currently set to reseed once every 4 generates, this is because
  if it were set to 1, artificial test conditions could end up significantly
  stalling entropy distribution due to lowRISC#14505.

Signed-off-by: Timothy Chen <[email protected]>
tjaychen pushed a commit to tjaychen/opentitan that referenced this issue Aug 24, 2022
- remove testutil's direct access to entropy complex registers
- added dis_entropy_src_stop to emulate the function in edn / csrng
- sever the dependency between entropy_testutil_boot_mode and auto_mode

- edn1 is currently set to reseed once every 4 generates, this is because
  if it were set to 1, artificial test conditions could end up significantly
  stalling entropy distribution due to lowRISC#14505.

Signed-off-by: Timothy Chen <[email protected]>
tjaychen pushed a commit that referenced this issue Aug 25, 2022
- remove testutil's direct access to entropy complex registers
- added dis_entropy_src_stop to emulate the function in edn / csrng
- sever the dependency between entropy_testutil_boot_mode and auto_mode

- edn1 is currently set to reseed once every 4 generates, this is because
  if it were set to 1, artificial test conditions could end up significantly
  stalling entropy distribution due to #14505.

Signed-off-by: Timothy Chen <[email protected]>
@johngt
Copy link

johngt commented Nov 15, 2022

@tjaychen / @mwbranstad I'm trying to go through older issues that may have an impact on M2.
This discussion potentially has impact on RTL so would just like an update here on if there are changes here that will need to go into the Engineering Sample and as such should be worked on.
Also flagging for attention of @andreaskurth / @vogelpi

If this is not urgent I'll tag it explicitly as M3.

@vogelpi vogelpi added this to the Project: M3 milestone Nov 15, 2022
@vogelpi
Copy link
Contributor

vogelpi commented Nov 15, 2022

Thanks @johngt , I think this is a very interesting topic but certainly infeasible for M2. This would require RTL changes and DV effort. Let's tag it as M3 for now.

@moidx
Copy link
Contributor

moidx commented Nov 15, 2022

This was originally labeled as future release, so probably not even for M3. Most future release issues are been assigned to the backlog milestone.

@vogelpi vogelpi modified the milestones: Project: M3, Backlog Nov 15, 2022
@tjaychen
Copy link
Author

yeah what's described here i think would be a fairly substantial change of csrng. I dont think it's something we should try to take on for M3.

@andreaskurth
Copy link
Contributor

Triaged for csrng, keeping Type:FutureRelease Not relevant to currently planned releases/milestones as this seems out of scope for this release.

@msfschaffner msfschaffner added the Earlgrey-PROD Candidate Temporary label to triage issues into Earlgrey-PROD Milestones label Oct 6, 2023
@msfschaffner msfschaffner added the Hotlist:Security Security Opinion Needed label Nov 8, 2023
@msfschaffner msfschaffner modified the milestones: Backlog, Earlgrey-PROD.M2 Nov 8, 2023
@msfschaffner msfschaffner added the Subsystem:Entropy entropy_src, csrng, or edn related issues label Dec 21, 2023
@msfschaffner msfschaffner removed the Type:FutureRelease Not relevant to currently planned releases/milestones label Dec 21, 2023
@vogelpi
Copy link
Contributor

vogelpi commented Dec 21, 2023

Triaged with @msfschaffner and @h-filali . On one hand, we have a 4-entry FIFO in entropy_src holding 4 seeds, so after ~20ms of operating without reseeding this FIFO should be full and no CSRNG context should be blocked for long. However, having this feature would allow giving better worst-case guarantees. It might be relevant.

@johannheyszl johannheyszl removed the Priority:P2 Priority: medium label Jan 12, 2024
@vogelpi vogelpi added the Priority:P3 Priority: low label Jan 12, 2024
@vogelpi
Copy link
Contributor

vogelpi commented Jan 12, 2024

After having gone through all relevant issues on the entropy complex, I think this one is not relevant enough to give it higher priority. We can leave it in M2 for now and re-evaluate once more of the higher-priority issues have been solved.

@johannheyszl johannheyszl removed the Priority:P3 Priority: low label Jan 15, 2024
@vogelpi
Copy link
Contributor

vogelpi commented Jan 15, 2024

@johannheyszl and I discussed this. There doesn't seem to be a clear security benefit here.

This rather seems to be potential performance improvement but the effective gains seem minimal given the 4-entry deep FIFO at the ENTROPY_SRC output. We can revisit this if we later find out that the current behavior indeed negatively impacts performance. At the moment, we don't have any evidence for a real performance issue here. Let's close as not planned.

@vogelpi vogelpi closed this as not planned Won't fix, can't repro, duplicate, stale Jan 15, 2024
@vogelpi vogelpi added Type:FutureRelease Not relevant to currently planned releases/milestones triaged-security and removed Hotlist:Security Security Opinion Needed Earlgrey-PROD Candidate Temporary label to triage issues into Earlgrey-PROD Milestones labels Jan 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component:RTL IP:csrng Subsystem:Entropy entropy_src, csrng, or edn related issues triaged-security Type:FutureRelease Not relevant to currently planned releases/milestones
Projects
None yet
Development

No branches or pull requests

9 participants