Credit system in TL-UL to not block the fabric #1644

Open
eunchan opened this issue Feb 28, 2020 · 3 comments
Assignees
Labels
Component:RTL Earlgrey-PROD Triaged Temporary label to triage issues into Earlgrey-PROD Milestones Hotlist:Wishlist Wishlist items IP:rv_core_ibex Priority:P3 Priority: low Type:Enhancement Feature requests, enhancements Type:FutureRelease Not relevant to currently planned releases/milestones

Comments

@eunchan
Contributor

eunchan commented Feb 28, 2020

Just a reminder to myself to discuss a credit system in TL-UL using the d_user signal, after dropping the early netlist or even after ES tape-out.

CC: @sjgitty @tjaychen @msfschaffner


Yeah. The returned credit value is not that reliable if multiple hosts access
the same device. A true credit system would look like this:

  1. A dedicated broadcast interface that reports the current credit
    periodically, and every device should keep track of it.
  2. The fabric ordering rule allows a later write access to bypass an earlier
    one if the target interface is stalled or has run out of credit.
  3. An AccessAck response can bypass an AccessAckData response.

Or, we could follow the solution to the producer-consumer problem, which is
similar to the TileLink Cached protocol. The host gives a hint of the size it
wants to write, and the device chooses which of the multiple hosts can send
its data and grants it through a dedicated channel.

But, in my opinion, OpenTitan won't have multiple hosts writing to the same
device, so I think the above scheme is too much. Just a hint may be okay.
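
The reliability concern with multiple hosts can be illustrated with a small model. This is only a sketch: the `Device` and `Host` classes are hypothetical, the FIFO never drains, and a refused write stands in for a_ready being low. It shows that a host which only sees the credit returned in its own last response can hold a stale value once a second host writes to the same device.

```python
class Device:
    """Toy device with a FIFO; each accepted write returns the remaining space."""

    def __init__(self, depth):
        self.depth = depth
        self.occupancy = 0

    def write(self):
        # Returns (accepted, credit). A refused write stands in for a_ready low.
        if self.occupancy >= self.depth:
            return False, 0
        self.occupancy += 1
        return True, self.depth - self.occupancy


class Host:
    """Toy host that remembers the credit from its own last response only."""

    def __init__(self):
        self.credit = None

    def send(self, device):
        accepted, credit = device.write()
        if accepted:
            self.credit = credit
        return accepted


device = Device(depth=4)
host_a, host_b = Host(), Host()

host_a.send(device)              # A writes one word and is told 3 slots remain
for _ in range(3):
    host_b.send(device)          # B fills the remaining slots

stale_credit = host_a.credit                    # A still believes it has 3
actual_free = device.depth - device.occupancy   # but the FIFO is now full
a_accepted = host_a.send(device)                # so A's next write would stall
```

With a single host per device, which is the case expected for OpenTitan in the message above, the returned value cannot go stale in this way.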

On Thu, Feb 27, 2020 at 12:02:21PM -0800, Michael Schaffner wrote:

This sounds like a good idea Eunchan! I would however try to make sure that
the protocol is "non-persistent" like an optional QoS protocol. I.e., I
fear that if we went for a true credit system, we need to be careful about
multiple master agents on the bus and credits that could potentially be
locked in under special error conditions... I have the feeling that even
just returning the current FIFO fill state in the user metadata could
provide enough guidance to the DMA to "intelligently" issue the writes or
back off for a few cycles. That could use 1. and a variant of 2. that you
described above.

My 2 cents :)...

On Thu, Feb 27, 2020 at 11:44 AM Timothy Chen wrote:

Yeah I like either 1 or 2. I think a dma host can always know that it
doesn't know how many it can send yet, and send almost like a ping request
to get that information.

Both your suggestions (assuming 1 is possible) sound reasonable to me!

On Thu, Feb 27, 2020, 11:35 Eunchan Kim wrote:

I hope that, by adopting this scheme, the way we avoid blocking the fabric
can be systematic.

How to get the first buffer update is actually what I am currently thinking
about. :) The issue is not limited to the first transaction; it arises every
time the DMA pauses because it ran out of credit and then re-initiates the
transaction.

A couple of ideas here. They could be complete nonsense :)

  1. Support a zero write transaction to acquire the buffer space. I didn't
    check whether TL-UL allows a zero write (PutPartial with 0 strobe) or not.
  2. Add a buffer-space checking register to any device that supports the
    credit system.
  3. A weird, wacky broadcast system that lets any device send its available
    space when it exits the full condition.
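
The first idea can be sketched as a behavioral model. Everything here is an assumption for illustration: the `Request` and `CreditDevice` classes are hypothetical, and, as noted above, whether TL-UL permits a PutPartialData with an all-zero mask has not been checked. The sketch only shows the intended semantics: a zero-strobe write changes nothing in the device but still returns the free space.

```python
from dataclasses import dataclass

# TileLink A-channel opcode value for PutPartialData.
PUT_PARTIAL_DATA = 1


@dataclass
class Request:
    opcode: int
    mask: int      # byte strobes; 4 bits assumed for a 32-bit data bus
    data: int = 0


class CreditDevice:
    """Hypothetical device that reports free FIFO space with every response."""

    def __init__(self, depth):
        self.depth = depth
        self.fifo = []

    def handle(self, req):
        """Process one write and return the credit that d_user would carry."""
        if req.mask != 0:
            if len(self.fifo) >= self.depth:
                # A real device would hold a_ready low instead of failing.
                raise RuntimeError("FIFO full: request would stall the fabric")
            self.fifo.append(req.data)
        # mask == 0 is treated as a pure probe: nothing is stored.
        return self.depth - len(self.fifo)


device = CreditDevice(depth=4)
probe = Request(PUT_PARTIAL_DATA, mask=0b0000)

initial_credit = device.handle(probe)        # learn the space before writing
for word in range(initial_credit):           # then send only what fits
    device.handle(Request(PUT_PARTIAL_DATA, mask=0b1111, data=word))
credit_after = device.handle(probe)          # probe again after pausing
```

The same probe would serve both the very first transaction and each restart after the host has paused on zero credit.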

And, yes, it is not d_ready but a_ready :) Thanks for catching it.

Eunchan

On Wed, Feb 26, 2020 at 11:02:49PM -0800, Timothy Chen wrote:

I'm also kind of wondering...since our system doesn't support burst, is the
credit system significantly better than if hmac just sent a !full indication
directly to the DMA (I think these two would also be on the same clock
domain)? Kind of like the various handshaking channels you see on a lot of
these modules.
I do think using the user signal is cleaner...since it's in-band, and the
"not knowing" issue I guess really only exists the very first time...

On Wed, Feb 26, 2020 at 10:59 PM Timothy Chen wrote:

That sounds like a good idea. Since the information is conveyed via user
though, does that mean the DMA does not get the information until it sends
at least one transaction first?
Also do you mean a_ready?

On Wed, Feb 26, 2020 at 10:42 PM Eunchan Kim wrote:

Hi Tim and Michael,

I thought about the HMAC/SHA2 performance a little more deeply tonight and
am sending this email so I don't forget. :)

I think we had better use the d_user signal to carry the credit
information. The credit tells how many words the device has available when
it returns the response (for the Put*Data opcodes).

The benefit is that the DMA could halt the write data before it actually
hangs the fabric. Let's take the case where the software feeds the hash.
As I explained to Tim today, Ibex currently takes around 5 clocks to read
the data, calculate the address, and write the data to MSG_FIFO. Filling the
16-entry MSG_FIFO (one block) takes 80 clocks, which is the same as the time
to hash a block.
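
As a quick check of the figures in this paragraph, here is the arithmetic written out. The constant names are illustrative, and the 4-byte word size is inferred from the 16-entry FIFO holding one 64B block.

```python
CLOCKS_PER_WORD = 5        # Ibex: read, calculate address, write to MSG_FIFO
MSG_FIFO_DEPTH = 16        # entries, one hash block
BYTES_PER_WORD = 4         # inferred: 64B block / 16 entries
HASH_CLOCKS_PER_BLOCK = 80 # time to hash one block, as stated above

fill_clocks = CLOCKS_PER_WORD * MSG_FIFO_DEPTH      # clocks to fill the FIFO
block_bytes = MSG_FIFO_DEPTH * BYTES_PER_WORD       # bytes per block
feed_to_hash_ratio = fill_clocks / HASH_CLOCKS_PER_BLOCK
```

A ratio of 1.0 means software feeding takes as long as the hashing itself, so the core cannot get ahead of the hash engine.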

So, I expect that the software may use the DMA to be relieved of feeding
the hashing operation. I said to Tim that the software may create a
descriptor per block (64B) to not block the fabric. But that doesn't gain
the software much, as it needs to trigger the DMA every time a block
computation is done. It means the DMA should be smart enough to feed only
the necessary amount of data at a time, while the software creates one
descriptor for the entire message.

When the DMA sees d_ready drop, it is already too late, as the next
Put*Data transaction has already been sent out (back-to-back requests). If
credit information is given, the DMA engine can send only as much data as
the device has room for and, in the meantime, read the next chunk of data
(if it has a buffer).
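
The difference between a blind DMA and a credit-aware one can be sketched with a small cycle-based model. It rests on several assumptions that are not in the proposal: a single host, a response that returns in the same cycle as the request, a device that drains one word every `drain_period` cycles (5 is chosen to match 80 clocks per 16 words), and a hypothetical non-blocking probe that the host uses to refresh its credit while paused.

```python
def simulate(total_words, fifo_depth=16, drain_period=5, use_credit=True):
    """Return (total_cycles, stalled_cycles) for sending total_words.

    A stalled cycle is one where the host drives a write while the device
    FIFO is full, i.e. the request sits in the fabric.
    """
    occupancy = 0
    sent = 0
    credit = 0            # a credit-aware host starts with no knowledge
    stalled_cycles = 0
    cycle = 0
    while sent < total_words:
        # Device side: consume one word every drain_period cycles.
        if cycle % drain_period == drain_period - 1 and occupancy > 0:
            occupancy -= 1
        if use_credit and credit == 0:
            # Paused: refresh the credit with a hypothetical probe
            # instead of driving a write that might stall.
            credit = fifo_depth - occupancy
        elif occupancy < fifo_depth:
            # Write accepted; the response carries the updated credit.
            occupancy += 1
            sent += 1
            credit = fifo_depth - occupancy
        else:
            # Blind host only: write issued into a full FIFO.
            stalled_cycles += 1
        cycle += 1
    return cycle, stalled_cycles


blind_cycles, blind_stalls = simulate(64, use_credit=False)
credit_cycles, credit_stalls = simulate(64, use_credit=True)
```

In this model the credit-aware host never drives a write into a full FIFO, while the blind host spends cycles with a request stuck in the fabric. The total transfer time is bounded by the drain rate in both cases, so the gain is fabric availability rather than throughput.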

Regards,
Eunchan

@eunchan eunchan added Hotlist:Wishlist Wishlist items Priority:P3 Priority: low Type:Enhancement Feature requests, enhancements Component:RTL labels Feb 28, 2020
@eunchan eunchan self-assigned this Feb 28, 2020
@GregAC
Contributor

GregAC commented Mar 11, 2020

Would this apply to iside interface on Ibex as well? This could be useful for the icache design.

@eunchan
Contributor Author

eunchan commented Mar 13, 2020

Would this apply to iside interface on Ibex as well? This could be useful for the icache design.

My plan is (if approved) to let every TL-UL host use the d_user signal if it wants to. If not, it can simply behave the same as before. So, yes, if Ibex supports it.

@GregAC GregAC added the Type:FutureRelease Not relevant to currently planned releases/milestones label Feb 24, 2023
@GregAC GregAC added this to the Integrated: M0 milestone Feb 24, 2023
@GregAC
Contributor

GregAC commented Feb 24, 2023

Worth considering this for integrated given the DMA

@msfschaffner msfschaffner added the Earlgrey-PROD Triaged Temporary label to triage issues into Earlgrey-PROD Milestones label Oct 6, 2023