-
Notifications
You must be signed in to change notification settings - Fork 781
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[spi_device] Fix RDC Metastability #15773
Conversation
Hey @a-will, please let me know if 2 clk delays of the read FIFO depth may cause issues on DIF. |
hw/ip/spi_device/rtl/spi_tpm.sv
Outdated
.d_i (sys_rdfifo_wdepth), | ||
.q_o (sys_rdfifo_depth_o) | ||
); | ||
assign sys_rdfifo_notempty_o = |sys_rdfifo_depth_o; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmmm...assuming wdepth
went metastable, wouldn't you get a potentially corrupted value for a cycle? If that's the case, is it safe to have it calculate notempty_o
? I think usually when I encounter issues like this, I sample one more cycle and check for signal stability before allowing the output to be used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wdepth
may go metastable. After 2FF, the sys_rdfifo_depth_o
is stable (but cannot guarantee if the value is correct or not). So it is OK to use the 2FF'd signal directly if the value is not a concern.
If value is a concern and latency is not that important (compare to the value correctness), then as you said [one more latch and compare with prev, then use the value only when both are same] is the best bet. Do you think it is better? I can quickly change the design.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah honestly you would know better than me whether the one cycle incorrect value would hurt.. i was just pointing out that it could happen. I can't remember how the read fifo write depth is used... i guess it's used for the wait comparison? And if that's the case, if CSB really released, I guess we really don't care right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After increase the read fifo size, I don't think any usecase of the depth
nor rdfifo_notempty
. Hum... we may just remove them from CSR also..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah especially since we reset so that there can't be any left over..there might not be a need for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tjaychen It's just a nice-to-have, I think, so no need to bend over backwards to implement the detection of an aborted transaction. With the command in TPM_CMD_ADDR, SW knows what should have been transferred, and it's probably fine that a mismatch vs. the actual transaction simply produces undefined behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so, are you all OK with removing them from CSRs? Then I can revise this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK as everyone agreed, I will remove the fields.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@a-will https://cs.opensource.google/opentitan/opentitan/+/master:sw/device/lib/dif/dif_spi_device.c;l=1577
This DIF uses read fifo depth.. Could you guide me how to revise?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, right, I forgot about determining how much data we could put in. If this FIFO were dedicated to TPM_FIFO_XDATA / TPM_FIFO_DATA, then we would actually need the depth and the data would stay across multiple transactions. However, this FIFO is also used for a software response to various registers, right?
In that case, we'll just have to have all data for a transaction written in one call, and the check will be to ensure the size isn't larger than the maximum capacity. I'll note that this architecture is likely pretty slow, though--We are forced to insert considerably-long wait states for every transaction.
The Read FIFO in SPI TPM module uses TPM CS# as a reset for both write and read ports. The intention is to reset the FIFO at the end of every SPI TPM transaction. As TPM CS# is controlled by TPM host system, the timing of CS# de-assertion (0 -> 1) is independant of the SYS_CLK. If release of CS# coincidently hit the posedge of the SYS_CLK, there's chance the Read FIFO depth signal may trigger metastability issues. To avoid the issue, I may suggest two solutions quickly. 1. Convert the CS# rst synchronizer to sync assert, sync de-assert. 2. Add 2FF sync to the depth signal to eliminate metastability. In this commit, I implement the option 2. Here's why I avoid the option 1. Option 1 delays the assertion time of the reset on the write port of the read FIFO by 2 SYS_CLK. To reset the internal pointers correctly, both write/ read ports should be reset. To satisfy the requirement, CS# should remain high (inactive) for at least 2 SYS_CLK (+ 1 SYS_CLK at worst case). In real time scale, it is around 120ns. The minimum TPM CS# inactive time can be as low as 50ns. So, I wanted to avoid the case. However, in other logic inside TPM may require 2 SYS_CLK inactive time. With the option 2, the TPM CS# inactive requirement is not needed. However, the status represented in CSRs will be delayed by 2 SYS_CLK. Fortunately, the read FIFO depth or notempty status are not latency sensitive signals. Signed-off-by: Eli Kim <[email protected]>
10d9600
to
fa8c27c
Compare
fa8c27c
to
f706837
Compare
f706837
to
2b17c40
Compare
2b17c40
to
a7ddf55
Compare
As the Read FIFO has been increased to max TPM transfer size, 64B, current DIF, DV test sequence do not use the rdfifo not empty signal nor rdfifo depth field in `TPM_STATUS`. In previous commit, the approach to resolve the metastability issue on RDC was to add 2FF. However, this commit removes the field so that the data never crosses the reset domain. Signed-off-by: Eli Kim <[email protected]>
a7ddf55
to
aea6fe9
Compare
{SPI_DEVICE_TPM_STATUS_RDFIFO_NOTEMPTY_BIT, 1}, | ||
{SPI_DEVICE_TPM_STATUS_RDFIFO_DEPTH_OFFSET, 14}, | ||
{SPI_DEVICE_TPM_STATUS_WRFIFO_DEPTH_OFFSET, 4}, | ||
}); | ||
EXPECT_EQ(dif_spi_device_tpm_write_data(&spi_, /*length=*/9, data), | ||
kDifOutOfRange); | ||
|
||
EXPECT_READ32(SPI_DEVICE_TPM_STATUS_REG_OFFSET, | ||
{ | ||
{SPI_DEVICE_TPM_STATUS_CMDADDR_NOTEMPTY_BIT, 0}, | ||
{SPI_DEVICE_TPM_STATUS_RDFIFO_NOTEMPTY_BIT, 1}, | ||
{SPI_DEVICE_TPM_STATUS_RDFIFO_DEPTH_OFFSET, 15}, | ||
{SPI_DEVICE_TPM_STATUS_WRFIFO_DEPTH_OFFSET, 4}, | ||
}); | ||
EXPECT_EQ(dif_spi_device_tpm_write_data(&spi_, /*length=*/5, data), | ||
kDifOutOfRange); | ||
|
||
EXPECT_READ32(SPI_DEVICE_TPM_STATUS_REG_OFFSET, | ||
{ | ||
{SPI_DEVICE_TPM_STATUS_CMDADDR_NOTEMPTY_BIT, 0}, | ||
{SPI_DEVICE_TPM_STATUS_RDFIFO_NOTEMPTY_BIT, 1}, | ||
{SPI_DEVICE_TPM_STATUS_RDFIFO_DEPTH_OFFSET, 1}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@a-will I removed these checkers as DIF has been changed to check only if SW writes more than the FIFO size. Please take a look.
The Read FIFO in SPI TPM module uses TPM CS# as a reset for both write and read ports. The intention is to reset the FIFO at the end of every SPI TPM transaction.
As TPM CS# is controlled by TPM host system, the timing of CS# de-assertion (0 -> 1) is independant of the SYS_CLK. If release of CS# coincidently hit the posedge of the SYS_CLK, there's chance the Read FIFO depth signal may trigger metastability issues.
To avoid the issue, I may suggest two solutions quickly.
In this commit, I implement the option 2. Here's why I avoid the option 1.
Option 1 delays the assertion time of the reset on the write port of the read FIFO by 2 SYS_CLK. To reset the internal pointers correctly, both write/ read ports should be reset. To satisfy the requirement, CS# should remain high (inactive) for at least 2 SYS_CLK (+ 1 SYS_CLK at worst case). In real time scale, it is around 120ns. The minimum TPM CS# inactive time can be as low as 50ns. So, I wanted to avoid the case. However, in other logic inside TPM may require 2 SYS_CLK inactive time.
With the option 2, the TPM CS# inactive requirement is not needed. However, the status represented in CSRs will be delayed by 2 SYS_CLK. Fortunately, the read FIFO depth or notempty status are not latency sensitive signals.