-
Notifications
You must be signed in to change notification settings - Fork 69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow nursery size to be proportional to the heap size #1087
Conversation
/// Return upper bound of the nursery size (in number of bytes)
pub fn get_max_nursery_bytes(&self) -> usize {
self.nursery.max.unwrap_or_else(|| {
if !vm_layout().force_use_contiguous_spaces {
DEFAULT_MAX_NURSERY_32
} else {
DEFAULT_MAX_NURSERY
}
})
} This check is removed. I am not sure about the consequences. Maybe I should change the default value for max nursery, depending on the VM layout. |
I quickly tried |
The performance results for now (using 2.5x minheap from GenImmixNormalized to Immix/master Normalized to GenImmix/master StickyImmixNormalized to Immix/master Normalized to StickyImmix/master Issues I will look into
Issues I am not going to look into
|
Hmm. |
I fixed a few mistakes based on the previous evaluation. The following is the new results. I ran with 2.5x of the G1 heap size ( The results show that the PR is an improvement for both GenImmix and StickyImmix in most benchmarks, 8% improvement for GenImmix and 3% improvement for StickyImmix (using 0.25x heap size as the min nursery). However, both plans are still performing worse than Immix (2% and 1.5% slower, respectively). There are still other performance issues with our generational plans that should be looked into. The results also show slowdown in GenImmixStickyImmix |
src/util/options.rs
Outdated
max: usize, | ||
}, | ||
/// A bounded nursery that is porportional to the current heap size. By default, a proportional bounded | ||
// nursery has a lower bound of 20% of the heap size, and has an upper bound of 100% of the heap size. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// nursery has a lower bound of 20% of the heap size, and has an upper bound of 100% of the heap size. | |
// nursery has a lower bound of 25% of the heap size, and has an upper bound of 100% of the heap size. |
/// The default min nursery size. This does not affect the actual space we create as nursery. It is | ||
/// only used in the GC trigger check. | ||
#[cfg(target_pointer_width = "32")] | ||
pub const DEFAULT_MIN_NURSERY: usize = 2 << LOG_BYTES_IN_MBYTE; | ||
const DEFAULT_MAX_NURSERY_32: usize = 32 << LOG_BYTES_IN_MBYTE; | ||
/// The default max nursery size for 32 bits. | ||
pub const DEFAULT_MAX_NURSERY_32: usize = 32 << LOG_BYTES_IN_MBYTE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These numbers were based off of JikesRVM and an older DaCapo benchmark suite. I'm not sure if they reflect good defaults for the larger workloads + compressed pointers would also use these defaults instead of the 64-bit one (which might explain the performance issues you saw).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah wait -- this was a bit confusing: for estimate_virtual_memory_in_bytes
, if force_use_contiguous_space
is false
(which it is for compressed pointers and 32-bit VMs), then 32 MB is used as the max nursery size when creating the space. But then in the above code in GCTrigger
the DEFAULT_MAX_NURSERY
is used which is much larger than 32 MB for 64-bit VMs with compressed pointers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One major problem is the virtual memory size of the nursery for confined address space. On 32-bit systems, or 64-bit systems with compressed oops, the nurser of GenCopy
and GenImmix
occupies a contiguous range separated from discontiguous range shared by other spaces. The size is determined by estimate_virtual_memory_in_bytes
, and it returns 32MB on systems with confined address space. This size may be smaller than the desired nursery size, regardless whether the nursery size is fixed, bounded or proportionally bounded.
I think the 32MB limitation is only relevant if the nursery is contiguous and the address space is confined.
- If the nursery can be discontiguous (this is possible in theory, but I don't know why we didn't implement that. But the conceptual nursery for StickyImmix is indeed discontiguous.), we can just let it share the discontiguous address range (managed by FreeList) with other spaces.
- If the address space is not confined, the code beneath
VMRequest::fixed_extent
will ignore the size and use1TB
instead (inVMRequest::common64bit
).
I don't think we want to make the nursery discontiguous for now, because that may require changes to many different places, but not impossible. And it may affect performance.
The easiest way to make the design consistent is making two explicit rules (at least temporarily) that
- The nursery of
GenCopy
andGenImmix
is always contiguous, and - If the address space is confined, the nursery of
GenCopy
andGenImmix
can not be larger than 32MB, regardless whether the nursery size is fixed, bounded or proportional.
We can document the second rule on Options::nursery
. This may not be ideal, but we don't need to change the 32MB size limit for now.
Alternatively, instead of making 32MB a constant, we can let the VM configure it using VMLayout
.
A slightly more elegant but more difficult way to fix it is making use of VMRequest::fraction
which is currently unused in the Rust MMTk, and probably needs testing.
src/plan/generational/global.rs
Outdated
args.global_args | ||
.options | ||
.nursery | ||
.estimate_virtual_memory_in_bytes(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may make use of VMRequest::fraction
if it is appropriate.
src/util/options.rs
Outdated
// Just use the default max nursery size -- the nursery won't get larger than that. | ||
NurserySize::ProportionalBounded { min: _, max: _ } => { | ||
if !crate::util::heap::vm_layout::vm_layout().force_use_contiguous_spaces { | ||
DEFAULT_MAX_NURSERY_32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the space is confined (not using contiguous spaces), the return value will be DEFAULT_MAX_NURSERY_32
which is 32MB. But the Options::nursery
can be set to a larger proportion and can be larger than this. As @k-sareen commented in #1087 (comment), GCTrigger::get_max_nursery_bytes
may return a larger size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. You are right. I have changed this. I added a check: if we do not have pressure about virtual memory, the virtual memory for nursery can be as large as the max nursery (even if the max nursery is 100% heap size). Otherwise, it just panics.
Let me know what you think. @wks @k-sareen
I will rerun the benchmarks for this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Panicking is pretty drastic, but if we don't have a nice way of saying "the rest of the heap" then maybe for the time being it is fine. We should file an issue though. Also is there any way to test for a 32-bit architecture? The JikesRVM binding doesn't support generational collectors and the MockVM doesn't stress test the GC much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it might be better to panic instead of running with an unexpected value silently -- we could change this if the panic turns out to be an issue.
What do you think we should test? The OpenJDK uses compressed pointer which tests most of the implementation for 32bits layout. I also added two unit tests that check the virtual memory we set for nursery.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I mainly wanted to test the virtual memory and ensuring that full-heap GCs happen more promptly. I guess OpenJDK with compressed pointers and the unit tests are good enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. But we should make an issue for resolving the virtual memory issue. Also did you get the performance results in the end?
I am running the benchmarks. I will post the results once I got it. |
GenImmixStickyImmixThe improvement for GenImmix is 9.5% and the improvement for Stickyimmix is 3.1% (compared to the 8% and 3% improvement in the previous evaluation). However, one issue is that If the nursery size is set explicitly, I think panicking may be a good idea when we cannot satisfy the virtual memory for the nursery. But when we use the default nursery size, we would expect it run, rather than panicking. Performance-wise, I think this PR achieved what we want. |
Yes. If the user doesn't specify a nursery size, we shouldn't blame the user for providing a bad value. One way to solve this is to provide another option "force_nursery_size" which defaults to false. When true, it will panic if the give nursery size cannot be satisfied; when false, MMTk core will adjust the size to the nearest possible value. |
I am thinking about adding |
The performance looks good (better than master for most things) but still not much better than Immix which is odd. But this PR does not need to solve all the issues at once. I agree with @wks and your |
We checked with Steve. The reason that nursery was using a contiguous space is that we can do range check for mature objects in barriers. However, this is not important for now. I changed the code to use discontiguous nursery. I am running another round of evaluation. |
The results with a discontiguous nursery. GenImmixStickyImmixThe results are pretty similar to the previous run (with contiguous nursery), but slower. The improvement is 8.7% for GenImmix and 2.4% for StickyImmix, compared with 9.5% and 3.1% in the previous run. There is a consistent 0.8%/0.7% difference. The PR currently uses discontiguous nursery. But I can change it to contiguous nursery, and only use discontiguous nursery if we may run out of virtual address. Just let me know if that's desired. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This PR introduces different kinds of nursery size options, and by default, we use a proportion of the heap size as the min nursery. This PR should generally improve the generational plans' performance by triggering a full heap GC more promptly. This PR mitigates the issue identified in mmtk#594, but does not fully fix the problem.
This PR introduces different kinds of nursery size options, and by default, we use a proportion of the heap size as the min nursery. This PR should generally improve the generational plans' performance by triggering a full heap GC more promptly. This PR mitigates the issue identified in #594, but does not fully fix the problem.