[spi_device] TPM poorly handles aborted transactions #20654

a-will · 2023-12-14T16:00:42Z

Description

In spi_device, the TPM state machine and FIFOs poorly handle aborted transactions, creating ambiguity and race conditions that cannot be readily solved by the information left behind.

For write commands, if a sequence of two commands enters the command FIFO and data is in the write FIFO, there is no way to know which command is associated with the data. It could be the first or the second command that was aborted, and the separate FIFOs for commands and data cause that association to be dropped.

For read commands, the read data FIFO is reset whenever CSB is de-asserted, allowing it to clear any data written in the FIFO for a past command. However, in a sequence of two read commands, if the first read command is aborted, software can potentially write a response to the first read command while the second is active (CSB is asserted). Thus, the host may see read data for the wrong command.

a-will · 2023-12-14T16:01:00Z

CC @msfschaffner

jesultra · 2023-12-19T19:51:36Z

CC @mazurek-michal
This sounds as if we may need a comprehensive review of the software/hardware interface of the TPM SPI component. I have not taken the time to myself study the details of the interface. I suggest that we regroup in the beginning of January, and see if we can work out some reasonable modifications.

mazurek-michal · 2023-12-20T10:02:28Z

@jesultra ACK
Current tests, even in the modified version by @a-will for hyperdebug310, do not support sending stand-alone read transaction. We need test program that work on device and give more control over what happens with transaction.

I am actively working on more robust device code for SPI-TPM test.

jesultra · 2024-01-25T18:35:34Z

I have not yet looked into the details of the SPI TPM device hardware functionality. But at first glance, it sounds as if some of the features we put into the I2C device IP block could be appropriate.

Basically, the I2C block has an "incoming" queue of requests received from the AP, which could be reads or writes. For writes, the data is inlined into the same queue as the transaction headers (commands).

Then we added a funny feature, which was to make the hardware treat the data read FIFO (outgoing data from OT) as if it was empty, at any time when the incoming queue was non-empty. The intent was the same as clearing the data read FIFO when chip select is deasserted, but as Alex points out, doing so at the edge is insufficient to guard against the software populating a response just after the AP has given up and deasserted CS. If instead the hardware would decide based on the incoming queue being non-empty, then a second read by the AP, which the software has not yet seen or processed, would mean that the incoming queue would be non-empty, with the effect that any stale data put in the data read FIFO by software would be ignored by the hardware, until the software has had a chance to respond to the second read command.

Not having a foolproof way of telling requests apart, or guarding against stale data being transmitted, would potentially seriously impact the ability of ChromeOS to use the OT chip in designs that use the SPI bus for TPM. In my opinion, this needs to be addressed before the first production tapeout.

mazurek-michal · 2024-01-25T19:08:34Z

@jesultra Is it possible to manually abort spi transaction (in-flight) in current hyperdebug firmware?

jesultra · 2024-01-25T19:19:18Z

@jesultra Is it possible to manually abort spi transaction (in-flight) in current hyperdebug firmware?

HyperDebug firmware just knows about SPI read/write transactions. Opentitantool has low-level commands e.g. opentitantool spi raw-write ..., and it has a higher level commands such as opentitantool spi tpm .... The latter will instruct HyperDebug do several separate writes and reads on the spi bus, to e.g. execute a TPM command, or fewer to read/write a single TPM register. The high-level commands to not have support for issuing "invalid" or truncated sequences of SPI operations on the SPI TPM bus.

You could probably manually compose the SPI sequence to read a TPM register with opentitantool spi raw-write-read ..., though it would not properly do the repeated polling to see when the actual response begun. You could certainly issue a request to read a register, without reading any data (aborting the register read), using just opentitantool spi raw-write ....

If you want more advanced things, like aborting a TPM command after doing part of the dance around the STATUS and FIFO register, you may need to implement something in Rust that interfaces with opentitanlib, since opentitantool does not have support for keeping chip select asserted across multiple separate calls to raw-write and raw-read. (It is possible that such support could be added, if needed.)

msfschaffner · 2024-01-31T22:34:57Z

Note: if we're changing this part we should probably look at fixing #13996 and #14045 at the same time.

a-will · 2024-02-10T04:17:26Z

The plan currently in motion (and generally completed, except for the WrFIFO buffer release CSR and the little shift in RdFIFO reset release):

The WrFIFO and RdFIFO move to the SRAM.
- The command/address FIFO is still separate from the SRAM.
- Saves 128 Bytes of flops.
For write commands, the command/address FIFO gets written only when the last bit of the last byte is clocked in.
- There is then no need for a separate interrupt for the WrFIFO. All required data is present when the command/address FIFO entry arrives.
- Write commands continue to be gated by a non-empty command/address FIFO.
- Aborted write commands never reach the FIFO.
The WrFIFO's interface changes to a simple addressable memory, just like the flash modes' payload upload.
- Unlike the flash mode case, the payload size is known ahead of time.
- With the command/address FIFO only getting written at the time of the last bit, the entry in that FIFO describes the size of the payload in the memory.
The WrFIFO / write payload buffer changes to have an explicit "buffer release" CSR.
- The "buffer busy" status is another input that blocks write commands at wait states.
- The RAM interface + "buffer release" CSR is a little simpler for handling resets to the WrFIFO. When released to begin the data phase, the next command always begins at offset 0. Relying purely on circular buffer occupancy appears to require more awkward back-and-forth communication between the two domains.
The RdFIFO reset is extended until a read command pops from the command / address FIFO, and that FIFO is empty.
- Previously, it would reset only while CSB was inactive.
- This prevents firmware from writing stale data to the RdFIFO.
Otherwise...
- The interfaces to the command/address FIFO and RdFIFO remain the same.

I believe this will take care of the two hazards mentioned here. Only one command can be active, and both command types should have the synchronization points necessary to prevent ambiguous associations of commands and data.

a-will added Type:Bug Bugs Component:RTL IP:spi_device labels Dec 14, 2023

msfschaffner added this to the Earlgrey-PROD.M2 milestone Dec 19, 2023

msfschaffner added the Earlgrey-PROD Candidate Temporary label to triage issues into Earlgrey-PROD Milestones label Dec 19, 2023

msfschaffner mentioned this issue Dec 19, 2023

[spi_device,rtl] Consider adding signalling mechanism to abort #18357

Closed

msfschaffner added the Priority:P2 Priority: medium label Dec 19, 2023

msfschaffner assigned a-will Jan 12, 2024

a-will mentioned this issue Feb 13, 2024

[spi_device] Fix up TPM command handling #21322

Merged

a-will linked a pull request Feb 13, 2024 that will close this issue

[spi_device] Fix up TPM command handling #21322

Merged

a-will closed this as completed in #21322 Feb 22, 2024

This was referenced Feb 27, 2024

[aon_timer] V2(S) Signoff #21030

Closed

[spi_device] V1 Signoff #21009

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[spi_device] TPM poorly handles aborted transactions #20654

[spi_device] TPM poorly handles aborted transactions #20654

a-will commented Dec 14, 2023 •

edited

Loading

a-will commented Dec 14, 2023

jesultra commented Dec 19, 2023

mazurek-michal commented Dec 20, 2023 •

edited

Loading

jesultra commented Jan 25, 2024 •

edited

Loading

mazurek-michal commented Jan 25, 2024

jesultra commented Jan 25, 2024 •

edited

Loading

msfschaffner commented Jan 31, 2024

a-will commented Feb 10, 2024 •

edited

Loading

[spi_device] TPM poorly handles aborted transactions #20654

[spi_device] TPM poorly handles aborted transactions #20654

Comments

a-will commented Dec 14, 2023 • edited Loading

Description

a-will commented Dec 14, 2023

jesultra commented Dec 19, 2023

mazurek-michal commented Dec 20, 2023 • edited Loading

jesultra commented Jan 25, 2024 • edited Loading

mazurek-michal commented Jan 25, 2024

jesultra commented Jan 25, 2024 • edited Loading

msfschaffner commented Jan 31, 2024

a-will commented Feb 10, 2024 • edited Loading

a-will commented Dec 14, 2023 •

edited

Loading

mazurek-michal commented Dec 20, 2023 •

edited

Loading

jesultra commented Jan 25, 2024 •

edited

Loading

jesultra commented Jan 25, 2024 •

edited

Loading

a-will commented Feb 10, 2024 •

edited

Loading