thread: support mprotect-based stack guard #327
Merged: shintaro-iwasaki merged 8 commits into pmodels:main from shintaro-iwasaki:pr/mprotect_stack on Apr 20, 2021
--enable-stack-overflow-check=mprotect (or mprotect-strict) is a configure-time option that enables the mprotect-based stack overflow check.
The new mprotect-based stack overflow check does not add a large overhead to the existing ULT mechanism when it is disabled. Since it should be useful for debugging, this patch makes the mprotect-based stack overflow check dynamically configurable. The user can set ABT_STACK_OVERFLOW_CHECK=mprotect or ABT_STACK_OVERFLOW_CHECK=mprotect_strict to enable this feature.
ABTU_mprotect() wraps mprotect().
shintaro-iwasaki force-pushed the pr/mprotect_stack branch from b19b27a to 435422c (April 19, 2021 22:45)
shintaro-iwasaki changed the title from "support mprotect-based stack guard" to "thread: support mprotect-based stack guard" (Apr 19, 2021)
shintaro-iwasaki force-pushed the pr/mprotect_stack branch from 435422c to 83097ba (April 19, 2021 22:47)
test:argobots/all |
mprotect() needs the system page size (usually 4KB). In most cases it can be obtained automatically by getpagesize(), but this patch also adds a new environment variable that accepts a user-specified value, to deal with cases where getpagesize() is undefined or mprotect() on that system has a different restriction. The next commit will use this value.
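The lookup order described above (user override first, system query second) could be sketched as follows. The environment variable name EXAMPLE_PAGE_SIZE is a placeholder, since the commit message does not name the real variable; the power-of-two check and the 4096-byte last-resort default are also assumptions of this sketch, and sysconf(_SC_PAGESIZE) is used here in place of getpagesize().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/* Returns the page size to use for mprotect().
 * env_name is a hypothetical override variable (the real Argobots
 * variable name is not given in this commit message). */
static size_t query_page_size(const char *env_name)
{
    const char *val = getenv(env_name);
    if (val != NULL) {
        char *end = NULL;
        unsigned long v = strtoul(val, &end, 10);
        /* Accept only a fully parsed, nonzero power of two (assumption). */
        if (end != val && *end == '\0' && v != 0 && (v & (v - 1)) == 0)
            return (size_t)v;
    }
    long sys = sysconf(_SC_PAGESIZE);
    /* 4096 is an assumed last-resort default if the query fails. */
    return sys > 0 ? (size_t)sys : (size_t)4096;
}
```

An invalid override silently falls back to the system value in this sketch; a real runtime might prefer to report the bad setting instead.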
Since mprotect()'ed stacks are cached in a memory pool, the mprotect() cost is zero when using a cached stack.
shintaro-iwasaki force-pushed the pr/mprotect_stack branch from 83097ba to 93838a6 (April 19, 2021 23:41)
test:argobots/all |
shintaro-iwasaki force-pushed the pr/mprotect_stack branch from 93838a6 to c123db9 (April 20, 2021 03:38)
This patch supports the mprotect-based guard for stacks that are allocated without memory pools. A few branches are added to the fork-join path even if the stack guard is disabled, but this overhead should be acceptable considering that non-default ULT allocation is not performance critical. If this overhead is unacceptable, we should create another patch to remove those branches at configure time.
The user can find out which stack overflow check mechanism is in use by passing ABT_INFO_QUERY_KIND_ENABLED_STACK_OVERFLOW_CHECK to ABT_info_query_config().
stack_guard tests the mprotect() behavior by catching the SEGV signal that should be issued when a program accesses the mprotect()'ed stack guard.
shintaro-iwasaki force-pushed the pr/mprotect_stack branch from c123db9 to be72bd6 (April 20, 2021 03:53)
test:argobots/all |
Pull Request Description
As requested by many users (see #274, #324, or this thread), this patch introduces an mprotect()-based stack guard. The program will receive SEGV if the program accesses the protected 4KB page located at the bottom of the ULT stack.
How to use this feature.
Method A. Pass --enable-stack-overflow-check=mprotect at configure time. This enables the mprotect()-based stack guard by default.
OR
Method B. Set ABT_STACK_OVERFLOW_CHECK=mprotect at execution time. This does not require a configure-time setting.
Expected behavior.
When the program accesses the protected 4KB page located at the bottom of the ULT stack, the program will receive SEGV. The user cannot continue this program (since it accesses an invalid memory region), but the user can find out which thread overflowed its stack. The user should increase the stack size (e.g., by setting ABT_THREAD_STACKSIZE=XXX for ULTs with the default attribute) and run the program again.
Resource consumption by mprotect().
In a typical environment, this mprotect() mechanism works for only about 30,000 ULTs.
One distinct mprotect() region consumes an mmap() system resource, and the system typically allows a process to create at most about 60,000 regions (see sysctl vm.max_map_count). Since this mprotect() mechanism creates one protected region and one unprotected region per ULT, the user can use this mechanism for about 30,000 ULTs in total.
By default, Argobots ignores the failure of mprotect(), which happens when the user maintains more than 30,000 ULTs. At that point, however, the user should be aware that the program cannot allocate new mmap()'ed regions, so an error can happen in different places.
Alternatively, the user can set ABT_STACK_OVERFLOW_CHECK=mprotect_strict (or pass --enable-stack-overflow-check=mprotect-strict) to assert on this mprotect() error. Currently, this error is unrecoverable; the Argobots runtime aborts.
Most programs do not create as many as 30,000 ULTs, but if the program manages more than 30,000 ULTs at the same time (precisely speaking, vm.max_map_count / 2 ULTs), the user should consider either disabling this feature or increasing vm.max_map_count.
Stack size
This mechanism uses at most 8KB (precisely speaking, system page size * 2) of the stack. The default stack size is automatically increased, but the stack size is not increased if the user gives a stack size or a stack memory. In that case, it is the user's responsibility to set a proper stack size.
Stack canary vs. mprotect-based stack guard.
A stack canary is good because it is lightweight and does not use any system resources. However, a stack canary cannot pinpoint when the ULT overflows the stack. Sometimes another problem is caused before the runtime detects the death of the stack canary. The stack canary is useful for checking that there is no problem: if a stack canary is alive, the user knows that the program did not break the ULT stacks. To enable the stack canary feature, set --enable-stack-overflow-check=canary or --enable-stack-overflow-check=canary-XXX, where XXX is the canary size.
The mprotect()-based stack guard is good for debugging since the user can find out where the stack smash happens. However, it is heavy (especially when the ULT stack is not obtained from a built-in stack pool) and limited by mmap resources. It should be used only for debugging.
Performance impact.
If mprotect()'ed stacks are cached, there is no overhead. A pair of mprotect() calls costs around 1 microsecond on an enterprise Intel machine, so the additional overhead is about 1 microsecond per ULT fork-join if that ULT's stack is not cached (e.g., a user-given stack size / stack memory). The Argobots ULT fork-join operation can take a few hundred nanoseconds (excluding external malloc()/free() overheads), so this microsecond-order cost is not negligible. The user should be aware of this overhead.
Checklist
module: short description
and follows good practice