Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[spi_device] Fix RDC Metastability #15773

Merged
merged 2 commits into from
Oct 31, 2022
Merged

Conversation

eunchan
Copy link
Contributor

@eunchan eunchan commented Oct 27, 2022

The Read FIFO in SPI TPM module uses TPM CS# as a reset for both write and read ports. The intention is to reset the FIFO at the end of every SPI TPM transaction.

As TPM CS# is controlled by TPM host system, the timing of CS# de-assertion (0 -> 1) is independant of the SYS_CLK. If release of CS# coincidently hit the posedge of the SYS_CLK, there's chance the Read FIFO depth signal may trigger metastability issues.

To avoid the issue, I may suggest two solutions quickly.

  1. Convert the CS# rst synchronizer to sync assert, sync de-assert.
  2. Add 2FF sync to the depth signal to eliminate metastability.

In this commit, I implement the option 2. Here's why I avoid the option 1.

Option 1 delays the assertion time of the reset on the write port of the read FIFO by 2 SYS_CLK. To reset the internal pointers correctly, both write/ read ports should be reset. To satisfy the requirement, CS# should remain high (inactive) for at least 2 SYS_CLK (+ 1 SYS_CLK at worst case). In real time scale, it is around 120ns. The minimum TPM CS# inactive time can be as low as 50ns. So, I wanted to avoid the case. However, in other logic inside TPM may require 2 SYS_CLK inactive time.

With the option 2, the TPM CS# inactive requirement is not needed. However, the status represented in CSRs will be delayed by 2 SYS_CLK. Fortunately, the read FIFO depth or notempty status are not latency sensitive signals.

@eunchan
Copy link
Contributor Author

eunchan commented Oct 27, 2022

Hey @a-will, please let me know if 2 clk delays of the read FIFO depth may cause issues on DIF.

.d_i (sys_rdfifo_wdepth),
.q_o (sys_rdfifo_depth_o)
);
assign sys_rdfifo_notempty_o = |sys_rdfifo_depth_o;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm...assuming wdepth went metastable, wouldn't you get a potentially corrupted value for a cycle? If that's the case, is it safe to have it calculate notempty_o? I think usually when I encounter issues like this, I sample one more cycle and check for signal stability before allowing the output to be used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wdepth may go metastable. After 2FF, the sys_rdfifo_depth_o is stable (but cannot guarantee if the value is correct or not). So it is OK to use the 2FF'd signal directly if the value is not a concern.

If value is a concern and latency is not that important (compare to the value correctness), then as you said [one more latch and compare with prev, then use the value only when both are same] is the best bet. Do you think it is better? I can quickly change the design.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah honestly you would know better than me whether the one cycle incorrect value would hurt.. i was just pointing out that it could happen. I can't remember how the read fifo write depth is used... i guess it's used for the wait comparison? And if that's the case, if CSB really released, I guess we really don't care right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After increase the read fifo size, I don't think any usecase of the depth nor rdfifo_notempty. Hum... we may just remove them from CSR also..

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah especially since we reset so that there can't be any left over..there might not be a need for this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tjaychen It's just a nice-to-have, I think, so no need to bend over backwards to implement the detection of an aborted transaction. With the command in TPM_CMD_ADDR, SW knows what should have been transferred, and it's probably fine that a mismatch vs. the actual transaction simply produces undefined behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so, are you all OK with removing them from CSRs? Then I can revise this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK as everyone agreed, I will remove the fields.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, right, I forgot about determining how much data we could put in. If this FIFO were dedicated to TPM_FIFO_XDATA / TPM_FIFO_DATA, then we would actually need the depth and the data would stay across multiple transactions. However, this FIFO is also used for a software response to various registers, right?

In that case, we'll just have to have all data for a transaction written in one call, and the check will be to ensure the size isn't larger than the maximum capacity. I'll note that this architecture is likely pretty slow, though--We are forced to insert considerably-long wait states for every transaction.

The Read FIFO in SPI TPM module uses TPM CS# as a reset for both write
and read ports. The intention is to reset the FIFO at the end of every
SPI TPM transaction.

As TPM CS# is controlled by TPM host system, the timing of CS#
de-assertion (0 -> 1) is independant of the SYS_CLK. If release of CS#
coincidently hit the posedge of the SYS_CLK, there's chance the Read
FIFO depth signal may trigger metastability issues.

To avoid the issue, I may suggest two solutions quickly.

1. Convert the CS# rst synchronizer to sync assert, sync de-assert.
2. Add 2FF sync to the depth signal to eliminate metastability.

In this commit, I implement the option 2. Here's why I avoid the option 1.

Option 1 delays the assertion time of the reset on the write port of the
read FIFO by 2 SYS_CLK. To reset the internal pointers correctly, both
write/ read ports should be reset. To satisfy the requirement, CS#
should remain high (inactive) for at least 2 SYS_CLK (+ 1 SYS_CLK at
worst case). In real time scale, it is around 120ns. The minimum TPM CS#
inactive time can be as low as 50ns. So, I wanted to avoid the case.
However, in other logic inside TPM may require 2 SYS_CLK inactive time.

With the option 2, the TPM CS# inactive requirement is not needed.
However, the status represented in CSRs will be delayed by 2 SYS_CLK.
Fortunately, the read FIFO depth or notempty status are not latency
sensitive signals.

Signed-off-by: Eli Kim <[email protected]>
@eunchan eunchan requested a review from a team as a code owner October 28, 2022 00:24
@eunchan eunchan requested review from timothytrippel and removed request for a team October 28, 2022 00:24
As the Read FIFO has been increased to max TPM transfer size, 64B,
current DIF, DV test sequence do not use the rdfifo not empty signal nor
rdfifo depth field in `TPM_STATUS`.

In previous commit, the approach to resolve the metastability issue on
RDC was to add 2FF. However, this commit removes the field so that the
data never crosses the reset domain.

Signed-off-by: Eli Kim <[email protected]>
Comment on lines -1832 to -1853
{SPI_DEVICE_TPM_STATUS_RDFIFO_NOTEMPTY_BIT, 1},
{SPI_DEVICE_TPM_STATUS_RDFIFO_DEPTH_OFFSET, 14},
{SPI_DEVICE_TPM_STATUS_WRFIFO_DEPTH_OFFSET, 4},
});
EXPECT_EQ(dif_spi_device_tpm_write_data(&spi_, /*length=*/9, data),
kDifOutOfRange);

EXPECT_READ32(SPI_DEVICE_TPM_STATUS_REG_OFFSET,
{
{SPI_DEVICE_TPM_STATUS_CMDADDR_NOTEMPTY_BIT, 0},
{SPI_DEVICE_TPM_STATUS_RDFIFO_NOTEMPTY_BIT, 1},
{SPI_DEVICE_TPM_STATUS_RDFIFO_DEPTH_OFFSET, 15},
{SPI_DEVICE_TPM_STATUS_WRFIFO_DEPTH_OFFSET, 4},
});
EXPECT_EQ(dif_spi_device_tpm_write_data(&spi_, /*length=*/5, data),
kDifOutOfRange);

EXPECT_READ32(SPI_DEVICE_TPM_STATUS_REG_OFFSET,
{
{SPI_DEVICE_TPM_STATUS_CMDADDR_NOTEMPTY_BIT, 0},
{SPI_DEVICE_TPM_STATUS_RDFIFO_NOTEMPTY_BIT, 1},
{SPI_DEVICE_TPM_STATUS_RDFIFO_DEPTH_OFFSET, 1},
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@a-will I removed these checkers as DIF has been changed to check only if SW writes more than the FIFO size. Please take a look.

@eunchan eunchan merged commit 3142c5d into lowRISC:master Oct 31, 2022
@eunchan eunchan deleted the tpm_rdfifo_meta_fix branch October 31, 2022 23:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants