-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add stacked vdso and interval timer support #3
base: master
Are you sure you want to change the base?
Conversation
This allows vdso functions to be virtualized within multiple program contexts, e.g., within signal handlers, where previously each thread had only one vdso state. A simple illustrative example of the problem is a program that sets an alarm() and then spins on gettimeofday(): If the program is in the vdso call when it is interrupted by SIGALRM, and the handler in turn calls gettimeofday() (or another vdso), there was no way to keep the two vdso functions distinct. Now new entries are pushed and popped as needed on context changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall it looks good! Other than the inlined comments, there are two general comments:
-
Could you add a few test cases? You can put them under
misc/testProgs/
, similar to my pull request Implement thread affinity s5z/zsim#114. -
Is it better to move
interval_timer.h/cpp
intosrc/virt/
? I am fine with either way, but it would be good to consider the source tree organization.
src/scheduler.h
Outdated
@@ -169,7 +170,7 @@ class Scheduler : public GlobAlloc, public Callee { | |||
|
|||
public: | |||
Scheduler(void (*_atSyncFunc)(void), uint32_t _parallelThreads, uint32_t _numCores, uint32_t _schedQuantum) : | |||
atSyncFunc(_atSyncFunc), bar(_parallelThreads, this), numCores(_numCores), schedQuantum(_schedQuantum), rnd(0x5C73D9134) | |||
atSyncFunc(_atSyncFunc), bar(_parallelThreads, this), numCores(_numCores), schedQuantum(_schedQuantum), rnd(0x5C73D9134), intervalTimer(64) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use (uint32_t)zinfo->lineSize
instead of const 64 for maxProcesses
. Because line size could change. See the constraint here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, although zinfo->lineSize was not initialized in time for this constructor!
src/interval_timer.cpp
Outdated
//Update per-process cycles if there is at least one running countdown | ||
//TODO: Dedup this work from process_stats | ||
if (processVTimers.size() > 0) { | ||
for (uint32_t cid = 0; cid < zinfo->numCores; cid++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to do the same thing as ProcessStats::getProcessCycles()
. Why re-implement this?
src/virt/time.cpp
Outdated
// SYS_getitimer | ||
|
||
PostPatchFn PatchGetitimerSyscall(PrePatchArgs args) { | ||
return NullPostPatch; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this patch do nothing? We don't need to patch getitimer()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, it was missing! Implemented.
break; | ||
case CONTEXT_CHANGE_REASON_SIGRETURN: | ||
reasonStr = "SIGRETURN"; | ||
assert(signalStackDepth[tid] > 0); | ||
signalStackDepth[tid]--; | ||
break; | ||
case CONTEXT_CHANGE_REASON_APC: | ||
reasonStr = "APC"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a question: do the other context switch reasons need to be similarly handled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For Linux, SIGNAL, FATALSIGNAL, and SIGRETURN are all that matter.
// info("vDSO function (%d) called from vdso (%d), level %d, skipping", func, vdsoPatchData[tid].func, vdsoPatchData[tid].level); | ||
uint32_t vdsoDepth = vdsoPatchData.size(tid) - 1; | ||
uint32_t sigStackDepth = signalStackDepth[tid]; | ||
if (vdsoDepth < sigStackDepth) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the comment above.
|
||
// Analysis functions | ||
|
||
// Helper function to laziliy clean up old vdso signal stacks. The alternative would be to register these actions for the SIGRETURN callback | ||
void vdsoDeferredStackCleanup(THREADID tid) { | ||
if ((vdsoPatchData.size(tid) - 1) > signalStackDepth[tid]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do the depth of vdso calls and sigStackDepth
need to match? What if there are 4 signal contexts, but just the first and last one called a vdso? Shall we use while
instead of if
here?
In general, some comments need to be added to explain the assumed invariance between vdso depth and signal stack depth.
…value. 'lineSize' is an unfortunate name for a process limit, but it affects cache aliasing and thus (indirectly) the process count (see process_tree.h).
the wrong OS PID by using the internal process index. Also clarify the meaning of 'pid' in the interval timer logic. Prevent alarm(0) from queueing a zero-delay signal.
…mers. - Switch to using stacked syscall state in order to support system calls within signal handlers and auto-restarting syscalls. - Add additional syscall information, including whether it has been interrupted and what the effective (simulated) call is. - Handle the behavior of SYS_rt_sigreturn, which never exits. - Never call SyscallExit in the signal callback. - In the post-patch for nanosleep, assume EINTR (a signal interruption) if the current phase is earlier than the wakeup phase. [scheduler] Change notifySleepEnd to return the canceled wakeup phase. [tests] Update/extend the interval-timer test program. [misc] Make sure lineSize is read from the zsim config before initializing the scheduler (which depends on it via the interval timer).
I don't think I have fully understood and checked the reasoning behind the changes related to the syscall stack in commit acd9f70. The overall design is reasonable, so I think it is OK to merge them in. But I cannot guarantee it is bug-free since I myself knows little about those corner cases mentioned in the inlined comments. Besides that, I think most my comments in the last review have been resolved. So I am going to approve the changes. If you don't have further changes, I can merge the pull request. |
This is for improved general signal support as well as new system call virtualization for interval timers. First we allow multiple vdso contexts (e.g., calling a vdso from a signal handler that interrupted a vdso), then add an interval timer API that hooks into the scheduler to enable timer virtualization.