LDP and STP forwarding #90

Open
BugraEryilmaz opened this issue Nov 26, 2024 · 2 comments

Description
LDP and STP instructions each load from or store to 2 memory locations. Hardware normally implements each of them as two uOps, but Flexus models each as a single uOp. This causes unintended bugs, such as broken store-load forwarding.

Steps to Reproduce
Run the data caching image with 2 cores.
After some time, the following sequence of instructions appears:
str to address x + 8
ldp to address x

Expected Behavior
The load pair should get part of its value from the str, because the address of the second load in the pair matches the str address.

Actual Behavior
The value is not forwarded, because the addresses x and x + 8 are not the same.
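
To make the mismatch concrete, here is a minimal sketch that assumes 8-byte registers and takes x = 0x1000. The `MemAccess` struct and the functions `matchesExact`, `overlaps`, and `checkReportedSequence` are hypothetical names for illustration and are not taken from the Flexus sources. It contrasts a comparison of base addresses with a byte-range overlap test.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of one memory access: base address and size in bytes.
struct MemAccess {
    std::uint64_t addr;
    std::uint32_t size;
};

// Exact-match check: true only when both accesses start at the same address.
constexpr bool matchesExact(const MemAccess& store, const MemAccess& load) {
    return store.addr == load.addr;
}

// Overlap check: true when the two byte ranges share at least one byte.
constexpr bool overlaps(const MemAccess& store, const MemAccess& load) {
    return store.addr < load.addr + load.size &&
           load.addr < store.addr + store.size;
}

// Reproduces the reported sequence with x = 0x1000 (assumed value).
inline void checkReportedSequence() {
    const MemAccess str{0x1008, 8};   // str to address x + 8
    const MemAccess ldp{0x1000, 16};  // ldp to address x, modeled as one 16-byte access
    assert(!matchesExact(str, ldp));  // base addresses differ
    assert(overlaps(str, ldp));       // yet the store covers the second half of the pair
}
```

If the ldp is instead modeled as two 8-byte accesses at x and x + 8, the second one matches the str exactly, which is the situation the expected behavior describes.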

Solution:
After a lot of back and forth with Yuanlong and Shanqing, we came up with three possible solutions:

  1. We can divide the pair operation into 2 micro ops like a real machine
  2. We can have a single instruction flowing with multiple LSQ entries
  3. We can modify the forwarding checks so that it forwards properly
@BugraEryilmaz BugraEryilmaz self-assigned this Nov 26, 2024
@BugraEryilmaz

I decided to go with option 1. Option 2 had a lot of problems because some searches and snoops in the LSQ depend on each instruction having a single entry. Option 3 required a lot of changes to the forwarding logic.

Overall changes:

  1. Added logic to support microops
  2. Modified the decode tree so that LDP and STP each generate 2 microops, one per memory access
  3. Moved the validation to the second microop so that the first microop does not trigger validation errors
  4. Moved the branch misspeculation correction to the second microop

Fix: afcf2fc
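
The overall shape of option 1 can be sketched as follows. This is only an illustration under stated assumptions, not the code from the fix commit: `MicroOp`, `splitPair`, and the `isLast` flag are hypothetical names, and the pair is assumed to use two equally sized registers at consecutive addresses.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical micro-op record; the real Flexus structures are not shown in this thread.
struct MicroOp {
    std::uint64_t addr;  // address accessed by this micro-op
    std::uint32_t size;  // access size in bytes
    bool isLoad;         // true for LDP halves, false for STP halves
    bool isLast;         // validation and misspeculation handling attach to the last micro-op
};

// Splits one pair instruction into two single-access micro-ops.
inline std::vector<MicroOp> splitPair(std::uint64_t base, std::uint32_t regSize, bool isLoad) {
    return {
        {base, regSize, isLoad, false},           // first register
        {base + regSize, regSize, isLoad, true},  // second register
    };
}
```

With this split, each micro-op has a single address, so an LSQ that assumes one entry per operation can compare the second micro-op of an ldp directly against an earlier str to x + 8.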

branylagaffe pushed a commit that referenced this issue Nov 28, 2024

BugraEryilmaz commented Dec 3, 2024

Very small bug in the fix from this thread. I used auto for a value used in an offset calculation. The compiler deduced an unsigned 32-bit type, but the value needed to be signed 64-bit. This caused a bug in a special case where the negated value is used. Fix: a64c475
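
For readers unfamiliar with this class of bug, here is a minimal sketch. The actual expression in Flexus is not shown in this thread, so `addrWithAuto`, `addrWithInt64`, and the `imm` parameter are hypothetical stand-ins. It assumes the offset originates from a `std::uint32_t` field, which is what makes `auto` deduce an unsigned 32-bit type.

```cpp
#include <cstdint>

// Buggy pattern: auto copies the unsigned 32-bit type of imm.
inline std::uint64_t addrWithAuto(std::uint64_t base, std::uint32_t imm) {
    auto offset = imm;        // deduced as std::uint32_t
    return base + (-offset);  // -offset wraps modulo 2^32, then zero-extends to 64 bits
}

// Corrected pattern: widen to signed 64 bits before negating.
inline std::uint64_t addrWithInt64(std::uint64_t base, std::uint32_t imm) {
    std::int64_t offset = imm;                          // explicit signed 64-bit type
    return base + static_cast<std::uint64_t>(-offset);  // modular arithmetic yields base - imm
}
```

With base 0x1000 and imm 16, the first function returns 0x100000FF0 while the second returns 0xFF0, so the two agree only when the offset is never negated.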

branylagaffe pushed a commit that referenced this issue Dec 18, 2024