Add Litmus tests #170

Open
wants to merge 3 commits into
base: main

ezelioli

@ezelioli ezelioli commented Dec 3, 2024

Contributions:

  • Refactor bootrom source code to be parametric (note no changes in actual bootrom content)
  • Add SMP support to software runtime
  • Add simple SMP hello software test
  • Add PULP's fork of Litmus tests as submodule (in sw/deps)
  • Add script with utility functions to parse output of Litmus tests (in utils/litmus)
  • Add make flow to run tests
  • Extend zero-stage boot-loader for SMP

@ezelioli ezelioli mentioned this pull request Dec 3, 2024
@ezelioli ezelioli marked this pull request as ready for review December 3, 2024 12:41
@ezelioli ezelioli requested review from paulsc96 and niwis December 3, 2024 12:41
@ezelioli ezelioli self-assigned this Dec 3, 2024

ezelioli commented Dec 3, 2024

The bootrom SMP support pauses all secondary cores after a first common reset sequence and lets the main core perform the initialization. The main (non-SMP) core is statically determined by a macro at the beginning of the bootrom. The secondary cores are then woken up before moving to the next boot stage, i.e. in boot_next_stage.

The wakeup sequence consists of:

  1. Sending a software interrupt to all cores (here). Note that this includes the main core itself, which sends an IPI to itself.
  2. Waiting for the IPI to be received in each core. First, each core waits in a WFI loop. When the IPI is received, the core clears its own CLINT IPI register, which clears the interrupt. Then, each core reads all the CLINT IPI registers to check that all other cores have already cleared theirs.
  3. Both cores proceed to the next stage.
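
The three steps above can be sketched on a host machine as follows. This is only an illustration under stated assumptions: the `msip` array stands in for the CLINT IPI registers (one 32-bit word per hart), `NUM_HARTS` is fixed to the dual-core case, and the function names are hypothetical rather than taken from the bootrom.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_HARTS 2u /* assumption: dual-core configuration */

/* Host-side stand-in for the CLINT IPI (MSIP) registers, one word per hart. */
static volatile uint32_t msip[NUM_HARTS];

/* Step 1: the main hart raises a software interrupt on every hart,
 * itself included. */
static void send_ipi_to_all(void) {
    for (uint32_t h = 0; h < NUM_HARTS; h++) {
        msip[h] = 1;
    }
}

/* Step 2, first half: a hart that left its WFI loop acknowledges the IPI
 * by clearing its own register. */
static void ack_ipi(uint32_t hartid) {
    assert(hartid < NUM_HARTS);
    msip[hartid] = 0;
}

/* Step 2, second half: a hart may move on to step 3 only once this
 * returns true, i.e. once every register reads zero. */
static bool all_ipis_cleared(void) {
    for (uint32_t h = 0; h < NUM_HARTS; h++) {
        if (msip[h] != 0) {
            return false;
        }
    }
    return true;
}
```

On hardware, each hart would sit in a WFI loop before calling the equivalent of `ack_ipi`, and would spin on the equivalent of `all_ipis_cleared` before jumping to the next stage.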

Possible problems:

  • We could avoid sending interrupts to the main core itself, simply resuming the other cores. This would then have some possible implications on how the "synch" step (point 2 above) happens, since secondary cores would not know when the first core has completed the wakeup sequence (by reading CLINT IPI registers). However, do we need this synchronization? Also, is this synchronization based on IPIs really race-free?


ezelioli commented Dec 3, 2024

The SMP support in the software runtime (crt0.S) instead fixes the main core to core 0. All other cores are paused after some common required initialization steps in crt0.S. Non-main cores wait in a WFI loop for software interrupts. The wake-up sequence in this case only sends IPIs to all cores except core 0. The smp_resume routine also waits for the interrupt to be cleared by the secondary cores before proceeding (here). This ensures that when smp_resume returns, the IPIs have been propagated to all cores and the cores have woken up. However, this has the downside of potentially deadlocking if another core does not wake up properly. Also, if another core has not reached the WFI loop for any reason, this will stall core 0 until then. Finally, is this really race-free?
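
To make the deadlock concern concrete, here is a minimal sketch of a bounded variant of the wake-up loop. It is not what this PR implements: the function name `smp_resume_bounded`, the `max_polls` budget, and passing the register base as a pointer are all assumptions made for illustration, and on hardware a fence would still be needed before raising the IPIs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bounded wake-up loop (not the PR's smp_resume).
 * msip:      per-hart software-interrupt words (the CLINT IPI registers on
 *            hardware, a plain array when run on a host).
 * max_polls: poll budget per hart before giving up.
 * Returns the number of secondary harts that did not clear their IPI. */
static uint32_t smp_resume_bounded(volatile uint32_t *msip, uint32_t num_harts,
                                   uint32_t max_polls) {
    assert(msip && num_harts >= 1);
    uint32_t missing = 0;
    for (uint32_t i = 1; i < num_harts; i++) { /* hart 0 is the caller */
        msip[i] = 1;                           /* raise the IPI */
        uint32_t polls = 0;
        while (msip[i] != 0 && polls < max_polls) {
            polls++;                           /* wait for the hart to clear it */
        }
        if (msip[i] != 0) {
            missing++;                         /* IPI is left pending */
        }
    }
    return missing;
}
```

With such a budget, core 0 can report a hart that never woke up instead of spinning forever; the cost is choosing a budget that is long enough for a hart that simply has not reached its WFI loop yet.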


ezelioli commented Dec 3, 2024

Zero-stage bootloader also required some adaptations wrt #85 due to the different behavior upon resuming secondary harts in the Cheshire runtime (crt0.S). When calling smp_resume the secondary harts jump to main - exactly as for the primary hart, but skipping some cold init steps - instead of jumping to the point in the code where the smp_resume is placed.
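
A minimal sketch of what this implies for a program's entry point, assuming hart 0 is the primary hart and that the hart ID is available to main: because every hart enters main, the call that resumes the others has to be guarded so only the primary hart issues it. `hart_main` and `fake_smp_resume` are hypothetical names used only for this host-side illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side stand-in for smp_resume that only counts how often it is called. */
static unsigned resume_calls;
static void fake_smp_resume(void) { resume_calls++; }

/* Body of a main() that every hart enters.
 * Only hart 0 wakes the others; secondary harts arrive here after being
 * resumed and must skip the resume call. Returns the hart ID for clarity. */
static int hart_main(uint32_t hartid, void (*resume)(void)) {
    assert(resume);
    if (hartid == 0) {
        resume(); /* primary hart: wake up the secondary harts */
    }
    /* work common to all harts would go here */
    return (int)hartid;
}
```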

niwis
niwis previously approved these changes Dec 4, 2024
Contributor

@niwis niwis left a comment

LGTM. Can we add smp_hello to the Cheshire CI? I remember that we previously had issues with executing from either DRAM or SPM because of the way the stack was set up. It would be great to see if this is working now. Just out of curiosity, did you test the bootloader for a multicore configuration (e.g. SMP Linux)?

Regarding your comments:

We could avoid sending interrupts to the main core itself, simply resuming the other cores.

Why do you think this might be a problem? If synchronisation is needed, we could also add a barrier. Not sure if there would be a reason for it, though.

Zero-stage bootloader also required some adaptations wrt #85 due to the different behavior upon resuming secondary harts in the Cheshire runtime (crt0.S). When calling smp_resume the secondary harts jump to main - exactly as for the primary hart, but skipping some cold init steps - instead of jumping to the point in the code where the smp_resume is placed.

I think this makes sense!

sw/lib/crt0.S
fence();
for (uint32_t i = 1; i < num_harts; i++) {
*reg32(&__base_clint, i << 2) = 0x1;
while (*reg32(&__base_clint, i << 2))
Contributor

The smp_resume routine also waits for the interrupt to be cleared by the secondary cores before proceeding (here). This ensures that when smp_resume returns, the IPIs have been propagated to all cores and the cores have woken up. However, this has the downside of potentially deadlocking if another core does not wake up properly. Also, if another core has not reached the WFI loop for any reason, this will stall core 0 until then. Finally, is this really race-free?

The main possible drawback that I see here is that it might introduce a delay between waking up cores. Could the same be achieved by adding a barrier after smp_resume if necessary?

Author

Yes, we could remove the CLINT register polling and leave the synchronization up to the programmer (e.g. by adding a barrier) if needed.
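
As a rough idea of what such a programmer-side barrier could look like, here is a counter-based sketch using C11 atomics. It assumes the target supports atomic read-modify-write operations and that all harts share the barrier object; the type and function names are hypothetical and not part of the Cheshire runtime, and counter wrap-around is ignored.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical counter-based barrier shared by all harts. */
typedef struct {
    atomic_uint arrived; /* total arrivals so far, across all rounds */
    uint32_t num_harts;  /* participants per round */
} smp_barrier_t;

static void barrier_init(smp_barrier_t *b, uint32_t num_harts) {
    assert(num_harts > 0);
    atomic_init(&b->arrived, 0u);
    b->num_harts = num_harts;
}

/* Announce arrival; returns the arrival count that completes this round. */
static uint32_t barrier_arrive(smp_barrier_t *b) {
    uint32_t ticket = atomic_fetch_add(&b->arrived, 1u); /* 0-based ticket */
    return (ticket / b->num_harts + 1u) * b->num_harts;
}

/* True once every participant of the round has arrived. */
static bool barrier_passed(smp_barrier_t *b, uint32_t target) {
    return atomic_load(&b->arrived) >= target;
}

/* Blocking form: arrive, then spin until the round is complete. */
static void barrier_wait(smp_barrier_t *b) {
    uint32_t target = barrier_arrive(b);
    while (!barrier_passed(b, target)) {
        /* spin */
    }
}
```

Each hart would call `barrier_wait` right after it starts running, so core 0 no longer has to poll the CLINT registers inside smp_resume.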


ezelioli commented Dec 4, 2024

Regarding CI:

LGTM. Can we add smp_hello to the Cheshire CI? I remember that we previously had issues with executing from either DRAM or SPM because of the way the stack was set up. It would be great to see if this is working now. Just out of curiosity, did you test the bootloader for a multicore configuration (e.g. SMP Linux)?

Yes, that should be possible. I have only tested smp_hello myself; I will add it to CI as well.

Regarding bootrom SMP:

We could avoid sending interrupts to the main core itself, simply resuming the other cores.

Why do you think this might be a problem? If synchronisation is needed, we could also add a barrier. Not sure if there would be a reason for it, though.

I think both approaches would be fine. I am just not sure whether we need to synchronize the cores, and whether this is a proper way of doing it. However, the current approach works and I don't see major issues with it.

- Add dual-core configuration in testbench
- Add number of cores parameter for consistent CLINT/PLIC generation
- Add PLIC configuration file generation according to number of cores
- Bump nonfree to version with baremetal SMP tests