Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[chip, dv] Update test description of chip_sw_keymgr_sideload_kmac_error #16661

Merged
merged 1 commit into from
Dec 12, 2022

Conversation

weicaiyang
Copy link
Contributor

Addressed the TBD in the testplan

Signed-off-by: Weicai Yang [email protected]

Copy link
Contributor

@ballifatih ballifatih left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the test plan @weicaiyang. I looked a bit around the error interaction between KMAC and Keymgr, and I hope it makes sense to ask them here!

Comment on lines +2443 to +2446
- Verify that KMAC returns an error signal to the keymgr via checking keymgr CSRs, when
the operation is done:
- Check `op_status` is set to `DONE_ERROR`.
- Check `fault_status.kmac_done` is set to 1.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be out of the scope of the test plan, but...

I couldn't figure out how faulting kmac_app's FSM propagates to keymgr's error/fault CSR. I wonder which of the following is wrong about my reasoning:

  1. Keymgr starts KMAC operation, and kmac_if moves to StOpWait after sending all message chunks. Note that kmac_data_i.error here will propagate to keymgr_ctrl when kmac_data_i.done is raised to 1. So, this is condition on kmac_app raising done signal.
  2. On the KMAC side, faulting kmac_app's FSM causes kmac_data_i.error = 1, but kmac_data_i.done is not raised to 1 (because faulty state is StTerminalError).
  3. So glitching kmac_app's FSM leads to kmac_if being stuck at StOpWait, and kmac_app sending error, but not done.

I might be missing something here because I couldn't properly go through keymgr_ctrl yet.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eunchan yeah i think your observation is right.
In the case of a glitched fsm, I guess the kmac interface would just never complete then? Maybe we need to change this to glitch an error, or update the kmac app fsm behavior to always say "done and error" when in error state.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah i had a quick look. It looks like we either need to make a keymgr update to always look at error regardless of done, or update kmac to output done whenever it's in the same error state.

The latter feels a bit more correct to me, but let me know what you think.
@cindychip

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latter sounds good to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the first approach is better. Consider this:

  1. kmac_app FSM glitch happens before all message chunks are sent, so assume that kmac_if FSM is at StTx when glitch happens.
  2. If we follow the second approach, I think kmac_if will be stuck at StTx even if both error and done are 1.

I think kmac_if FSM should always move to error state if error is asserted by kmac_app, regardless of done signal.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah i thought @ballifatih was suggesting making the "error" signal more like an interrupt into keymgr.
I think that would be fine too. I suggested the done approach mostly because it keeps the interface assumptions the same and would allow us to get around cases of being more flexible when the error condition when the transaction isn't done (like a bus).

@weicaiyang can chime in, but if we make the error more like an interrupt, we would also need to add a test case probably on keymgr block level, and it would be nice to avoid that if we can.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with @tjaychen, if the error becomes an interrupt, DV needs to verify it happens in any place (not just along with done). I slightly prefer to either send a done with error, or keep it as is so that keymgr gets stuck and the fatal alert eventually leads to a reset.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kinda got stuck on the part where kmac_app is supposed to flush the incoming message blocks despite the glitch on kmac_app's FSM. I still think that kmac_if might get stuck at StTx, while waiting for kmac_data_i.ready. In StTerminalError state, I couldn't figure out how ready signal is asserted by kmac_app. StError seems to gracefully perform flushing during error, but I think FSM moves to StTerminalError when glitched.

I think if we go with the latter solution (i.e. assert done when there is error), we probably also want to make sure none of kmac_if's FSM ends up waiting for another control signal from kmac_app. Besides done, should we also assert ready in this approach?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's continue the discussion in this issue #16792.

Comment on lines 2447 to 2448
- Note: at end of the C test, issue a reset if the test is running in DV, otherwise, the
fatal alert will prevent the simulation from finishing.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify if I understand it correctly:

  1. Error at KMAC and fault at Keymgr does not create an alert that blocks Ibex,
  2. Therefore, we can perform reset from SW side.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm... performing a reset just for that here seems a bit overkill though.
@weicaiyang couldn't we use the plusarg to just allow hanging alerts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both Kmac and keymgr will fire alerts, but the alerts shouldn't block ibex.

we could use this plusarg too. Let me remove this note. either a reset or plusarg is fine to me.

run_opts: ["+bypass_alert_ready_to_end_check=1"]


- Follow the steps in `chip_sw_keymgr_sideload_kmac` test.
- Configure keymgr to enter any of the 3 working states.
- Issue a keymgr operations.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super nit: we can drop the plural (a keymgr operation.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thanks.

Addressed the TBD in the testplan

Signed-off-by: Weicai Yang <[email protected]>
Copy link
Contributor

@ballifatih ballifatih left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @weicaiyang.

@weicaiyang weicaiyang merged commit c2d5fa8 into lowRISC:master Dec 12, 2022
@weicaiyang weicaiyang deleted the tp_km branch December 12, 2022 19:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants