Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

repo: created a new branch for review #275

Open
wants to merge 25 commits into
base: develop
Choose a base branch
from

Conversation

kaustav-goswami
Copy link
Contributor

A setup to do disaggregated memory simulation using gem5 and SST.

@kaustav-goswami kaustav-goswami requested a review from powerjg April 9, 2024 08:18
@kaustav-goswami kaustav-goswami self-assigned this Apr 9, 2024
@kaustav-goswami kaustav-goswami requested a review from mbabaie April 9, 2024 09:14
Comment on lines +147 to +148
self.local_memory = local_memory
self.remote_memory = remote_memory
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why renaming these here? Above you called them _local_memory and here it's renamed. Why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I am using self._localMemory and self._remoteMemory for the init phase. I wanted local_memory and remote_memory conventions to exist, similar to the other board, so I kept them as well. I can get rid of these and test the code.

# The remote memory starts anywhere after the local memory ends. We
# rely on the user to start and end this range.
self._remote_mem_ranges = [
self.get_remote_memory().get_mem_ports()[0][0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why [0][0]? Should the interface to get_mem_ports be changed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So get_mem_ports' returns a tuple of address range and a port, which is directly used while setting up ruby caches. I did not want to create another function just to get the port or the range. I agree [0][0] does look hard-coded.

Copy link
Member

@powerjg powerjg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, this looks very good! Great job cleaning things up.

Let's focus on the experiments now, and we can clean things up futher soon.

Let's focus on writing a paper on this and improving the code so we can release it.

disaggregated_memory/memories/external_remote_memory_v2.py Outdated Show resolved Hide resolved
# by the user.
else:
# There is no range information provided by the user. Depending
# upon the ISA, we have to fix the address.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's ISA/system dependent, then do it in the Board class, not the memory class!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is something that we need to discuss. Ranges should be set on the board imo. But we need a way to plug and play ExternalMemory with a range. This is because the remote memory may not be contiguous. So the question is: Do the board or the memory need to know where the remote memory?

# If an XBar is required, it should be added in the connect_things to
# avoid initializing an orphan node.
raise NotImplementedError("This is a deprecated method for v2")
return self._xbar_required
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll fix this in the next patch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is completely deleted now! Even the methods are gone.

disaggregated_memory/memories/external_remote_memory_v2.py Outdated Show resolved Hide resolved
disaggregated_memory/configs/riscv-main.py Outdated Show resolved Hide resolved
@@ -79,14 +80,22 @@ SSTResponderSubComponent::setOutputStream(SST::Output* output_)
{
output = output_;
}

/*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete unused code

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll fix this in the next patch.

responseReceiver->getBackdoor(data);
assert(data->readable());

// this can be a huge number
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this comment mean???

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll fix this in the next patch. The comment is not really relevant but I'll try to word this in a better way.

@@ -238,11 +309,59 @@ SSTResponderSubComponent::handleRecvRespRetry()
responseQueue.pop();
}

// void
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete unused code

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll fix this in the next patch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Also added a small comment.

Comment on lines +342 to +344
case gem5::MemCmd::ReadReq: {
request = new SST::Interfaces::StandardMem::Read(addr, size);
break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I buy this is reasonable. Does this ever happen? If so, what is the result?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to mirror functional accesses in SST. I'd still need to decode the gem5 packet so that I can send corresponding timing requests to SST's memory.

// gem5::OutgoingRequestBridge* responseReceiver;
// responseReceiver for this branch is hardcoded to ExternalMemory*.
// TODO: We need to make a better design to handle multiple types of
// outgoing request classes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This must be done sooner rather than later. We can chat about the best approach.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I ran into this issue yesterday so I tried to solve it hastily.

kaustav-goswami and others added 21 commits April 11, 2024 02:16
This change addresses most of the comments from the PR.

Signed-off-by: Kaustav Goswami <[email protected]>
This change adds a document to explain how this system works and how a
user can start simulations using this platform. In addition, this change
also adds a SST-side runscript to start the phase 2 simulations.

Signed-off-by: Kaustav Goswami <[email protected]>
This change includes the following:
- Checkpoint restore/base clock was missing in the previous update
- The L3 crossbar bug was fixed by increasing the size of the snoop
  filter.
- Remote link is now deprecated. A fatal message is now displayed.
- Remote memory connection with the membus is moved back to the
  cachehierarchies.

Signed-off-by: Kaustav Goswami <[email protected]>
Signed-off-by: Kaustav Goswami <[email protected]>
Signed-off-by: Kaustav Goswami <[email protected]>
…o count outstanding packets

Signed-off-by: Kaustav Goswami <[email protected]>
This change adds a X86 Composable Memory board to the repository. A
sample script is also added. The user needs to fake NUMA information to
the OS/board to enable gem5 + SSt simulations.

Signed-off-by: Kaustav Goswami <[email protected]>
This change adds a new ARM shared memory board, which allows users to
perform shared memory programs across two independent systems. The board
uses a UIO device to expose a physical memory address range to the OS,
which can be then mmaped by a C program in each of the systems.

Signed-off-by: Kaustav Goswami <[email protected]>
This change fixes the kernel bug on startup and adds madt entries to the
X86 board.

Change-Id: Idd60cc485f8b38628d10998036f034e519eb11e6
Signed-off-by: Kaustav Goswami <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants