Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gem5+SST13 Changes #269

Open
wants to merge 26 commits into
base: develop
Choose a base branch
from
Open

gem5+SST13 Changes #269

wants to merge 26 commits into from

Conversation

kaustav-goswami
Copy link
Contributor

This PR compares the disaggregated memory changes.

This change updates the gem5 SST Bridge to use SST 13.0.0. Changes
are made to replace SimpleMem class to StandardMem class as
SimpleMem will be deprecated in SST 14 and above. In addition, the
translator.hh is updated to translate more types of gem5 packets.
A new parameter `ports` was added on SST's side when invoking the
gem5 component which does not require recompiling the gem5
component whenever a new outgoing bridge is added in a gem5 config.

Change-Id: I45f0013bc35d088df0aa5a71951422cabab4d7f7
Signed-off-by: Kaustav Goswami <[email protected]>
This change adds necessasry tools to simulate multiple gem5
nodes to simulate a disaggregated memory setup using gem5 and SST.

Change-Id: I6e1e52d4ba8df7c161b3151c9b2c02b72fc7cc31
Signed-off-by: Kaustav Goswami <[email protected]>
This change adds functional accesses to the gem5/SST bridge to
boot disk images using the bridge. The change also adds more
information in INSTALL.md for MacOS users on how to get the bridge
setup working.

Signed-off-by: Kaustav Goswami <[email protected]>
This change updates the gem5 SST Bridge to use SST 13.0.0. Changes
are made to replace SimpleMem class to StandardMem class as
SimpleMem will be deprecated in SST 14 and above. In addition, the
translator.hh is updated to translate more types of gem5 packets.
A new parameter `ports` was added on SST's side when invoking the
gem5 component which does not require recompiling the gem5
component whenever a new outgoing bridge is added in a gem5 config.

Change-Id: I45f0013bc35d088df0aa5a71951422cabab4d7f7
Signed-off-by: Kaustav Goswami <[email protected]>
This change adds necessasry tools to simulate multiple gem5
nodes to simulate a disaggregated memory setup using gem5 and SST.

Change-Id: I6e1e52d4ba8df7c161b3151c9b2c02b72fc7cc31
Signed-off-by: Kaustav Goswami <[email protected]>
This change adds functional accesses to the gem5/SST bridge to
boot disk images using the bridge. The change also adds more
information in INSTALL.md for MacOS users on how to get the bridge
setup working.

Signed-off-by: Kaustav Goswami <[email protected]>
@kaustav-goswami kaustav-goswami self-assigned this Nov 17, 2023
Copy link
Member

@powerjg powerjg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very quick review... can you please clean up the commits before sending this out for review? There are changes in commits that should be in other commits, and please don't use "merge" and instead use rebase. When there are merge commits in a PR, I cannot tell what are changes that I should be reviewing vs changes that are actually already in upstream.

ext/sst/translator.hh Outdated Show resolved Hide resolved
src/sst/outgoing_request_bridge.cc Show resolved Hide resolved
src/sst/outgoing_request_bridge.cc Show resolved Hide resolved
src/sst/outgoing_request_bridge.cc Outdated Show resolved Hide resolved
break;
}
default:
panic("handleRecvFunctional: Unable to convert gem5 packet: %s\n", pkt->cmd.toString());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line is >80 chars

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be fixed in the latest patchset

src/sst/outgoing_request_bridge.cc Show resolved Hide resolved
ext/sst/sst_responder_subcomponent.cc Outdated Show resolved Hide resolved
ext/sst/Makefile Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this file deleted in this commit?

```sh
# go to the base gem5 directory
cd ../..
export DYLD_LIBRARY_PATH=:`pwd`/build/RISCV/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why specifically build/RISCV/?

This would be better if it was generic. E.g., export DYLD_LIBRARY_PATH=:$GEM5_BASE/<build directory>

This being said, this change has nothing to do with "added functional accesses to gem5/SST bridge" it should be in a different change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we are supporting the RISCV version as the default one. We can definitely do this but we'd also need to change the README and the INSTALL files to make it more generic. I agree on making it more generic as the default examples we have are example.py -> RISCV/default and arm_example.py -> ARM.

ext/sst/gem5.hh Outdated
{"system_port", "Connection to gem5 system_port", "gem5.gem5Bridge"},
{"cache_port", "Connection to gem5 CPU", "gem5.gem5Bridge"}
)
// SST_ELI_DOCUMENT_SUBCOMPONENT_SLOTS(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WHy this change? and why in this commit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was updated in this patchset. It spilled over to this change as I felt it was no longer necessary to explicitly document system_port and memory_port as we are now using a vector of outgoing ports

This change adds a lot of new features to the gem5-SST bridge.
The following list of changes describes each of them:
1. The bridge now supports limited functional accesses, which
enables users to work with vda devices in ARM/RISCV systems.
2. There are new methods added to the bridge which allows future
extension of the bridge to swap ports to enable delayed INIT
phase for SST. This will allow the bridge to start the system in
gem5 and then swap the memory ports to SST at RUN phase.
3. A lot of stray debug statements/cout statements were removed
in this patch.
4. New documentation was added to the SST component for gem5.

Signed-off-by: Kaustav Goswami <[email protected]>
This change fixes minor changes with the gem5's outgoing bridge,
mostly in comments and gem5 syle.

Signed-off-by: Kaustav Goswami <[email protected]>
This change drops all the boards and configs created previously
to simulate disaggregated memory systems as those were not strictly
adhering to the stdlib rules. This new change posts most of the
previously written code to create a new abstract disaggregated
memory board for both ARM and RISCV architectures, which minimizes
code duplication and can be easily extended. Extension suggestions
and tips are now commented thoughout the python files. This is a
proper directory structure to navigate these changes.

Signed-off-by: Kaustav Goswami <[email protected]>
This change create config script for both gem5 and SST and allows
SST to run these scripts using one or more ISAs at the same time.

Signed-off-by: Kaustav Goswami <[email protected]>
@kaustav-goswami
Copy link
Contributor Author

@powerjg I am sorry that I had such a messy code. I restructured the whole disaggregated memory code. I had a conversation with @studyztp who was trying to extend this code for her work and it was not trivial for her. In this version, I have split the whole setup into boards, cachehierarchies, memories and configs, which should be easy to read and extend. The code extends the stdlib in a systematic manner. I have also removed a lot of unnecessary config files for both gem5 and SST and made the new configs flexible.

This change updates the numactl calls to correctly allocate memory
in the local and the remote memory nodes.

Signed-off-by: Kaustav Goswami <[email protected]>
This change adds stats counters to the gem5 SST external bridge to
monitor the number and size of packets going out and coming into
gem5. This numbers should also be reflected in SST's stats output
file.

Signed-off-by: Kaustav Goswami <[email protected]>
This change adds a private l1 private l2 and shared l3 cache
hierarchy as a stdlib component.

Signed-off-by: Kaustav Goswami <[email protected]>
This change updates the disaggregated memory scripts to use a 3-level
cache and dumps starts after each STREAM kernel.

Signed-off-by: Kaustav Goswami <[email protected]>
This commit fixes a compilation error on the gem5 SST bridge.

Signed-off-by: Kaustav Goswami <[email protected]>
This patch fixes the previous git merge that duplicated a few lines
in the ext library of sst.

Signed-off-by: Kaustav Goswami <[email protected]>
This commit fixes a compilation error on the gem5 SST bridge.

Signed-off-by: Kaustav Goswami <[email protected]>
This commit further fixes repeted lines of code.

Signed-off-by: Kaustav Goswami <[email protected]>

@overrides(ArmAbstractDMBoard)
def _set_remote_memory_ranges(self):
pass
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should you add an error here?

clk_freq=clk_freq,
processor=processor,
local_memory=local_memory,
remote_memory_addr_range=self._remoteMemory.remote_memory.physical_address_ranges[
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be a function of the remote memory.

# we need to find whether there is any external latency. if yes, then
# add xbar to add this latency.

if self.get_remote_memory().is_xbar_required():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should try to make this more generic

)

@overrides(RiscvBoard)
def generate_device_tree(self, outdir: str) -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this different from the base?


:returns: The memory system.
"""
raise NotImplementedError
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?? This is odd. Can you explain why this is not implemented?

# we need to find whether there is any external latency. if yes, then
# add xbar to add this latency.

if self.get_remote_memory().is_xbar_required():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This xbar should bein the remote memory object and simply exposed as a single port.

Let's generalize this by assuming the remote memory will always be a single port (i.e., never channeled). Then, you can make this function be exactly the same for sst/gem5 and arm/riscv.

In fact, there should be a "mixin" class that allows disaggregated memory.

@@ -0,0 +1,218 @@
# Copyright (c) 2023 The Regents of the University of California
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are "dm" caches and why are they different from the normal cache hiearchies?

]

@overrides(ChanneledMemory)
def get_mem_ports(self) -> Sequence[Tuple[AddrRange, Port]]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make this a single port and put the xbar in this object.

from m5.objects import OutgoingRequestBridge, AddrRange, Tick


class ExternalRemoteMemoryInterface:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should inherit from AbstractMemorySystem

board._pre_instantiate()
root = Root(full_system=True, board=board)
board._post_instantiate()
m5.instantiate()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These four lines are very odd. Can you explain what it's doing?

This change makes the remote memory a 4 channel DDR4-like memory
installed on the remote node.

Signed-off-by: Kaustav Goswami <[email protected]>
This change restricts any functional reads at runtime to go to
SST.

Signed-off-by: Kaustav Goswami <[email protected]>
This change adds a KVM capable runscript for testing the NUMA
setup.

Signed-off-by: Kaustav Goswami <[email protected]>
This change adds a WIP working x86 board that allows memory
offlining with compatible kernel.

Signed-off-by: Kaustav Goswami <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants