JEP 318: Epsilon: An Arbitrarily Low-Overhead Garbage Collector #1370

Closed
DanHeidinga opened this issue Mar 7, 2018 · 34 comments

Comments

@DanHeidinga
Member

http://openjdk.java.net/jeps/318

This is an "Epic" to contain requirements for supporting this JEP on OpenJ9.

@DanHeidinga
Member Author

DanHeidinga commented Mar 7, 2018

Discussed this idea with @charliegracie and it should be relatively straightforward to create a "zero gc" policy based off optthruput.

Likely by extending one of the classes (TODO: remember which one) and adding a flag to determine whether or not to GC on allocation failure.

Update:
Add an "allowed to GC" boolean to each of the sub (semi?) space classes.
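
To make the idea concrete, here is a toy model of a subspace with an "allowed to GC" boolean. It is only a sketch: `SubSpaceSketch`, its fields, and the `collector` callback are hypothetical names for illustration, not OpenJ9 classes, and the real GC code is not written in Java.

```java
import java.util.function.IntSupplier;

// Toy model only: SubSpaceSketch and its members are hypothetical names,
// not OpenJ9 classes.
final class SubSpaceSketch {
    private final int capacity;
    private final boolean allowedToGC;
    private final IntSupplier collector; // returns the number of bytes it reclaimed
    private int used;
    private int collections;

    SubSpaceSketch(int capacity, boolean allowedToGC, IntSupplier collector) {
        this.capacity = capacity;
        this.allowedToGC = allowedToGC;
        this.collector = collector;
    }

    /** Returns true if the request fit; false means the caller takes the OOM path. */
    boolean allocate(int size) {
        if (used + size <= capacity) {
            used += size;
            return true;
        }
        if (!allowedToGC) {
            return false; // "zero gc": never collect on allocation failure
        }
        collections++;
        used = Math.max(0, used - collector.getAsInt());
        if (used + size <= capacity) {
            used += size;
            return true;
        }
        return false;
    }

    int collections() {
        return collections;
    }

    public static void main(String[] args) {
        SubSpaceSketch withGc = new SubSpaceSketch(100, true, () -> 60);
        SubSpaceSketch noGc = new SubSpaceSketch(100, false, () -> 60);
        // The second allocation does not fit; only the first subspace may collect.
        System.out.println(withGc.allocate(80) && withGc.allocate(40));
        System.out.println(noGc.allocate(80) && noGc.allocate(40));
    }
}
```

The only difference between the two subspaces is the boolean, which is what makes this look like a small change on top of an existing policy.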

@ashu-mehra
Contributor

@Param-S and I were discussing something similar a few weeks back: it could be useful in an OpenWhisk (serverless) kind of environment, where the JVM does not run long enough to exhaust the heap. We wouldn't need to worry about GC cycles at all; the JVM would just handle allocation.
Any idea if that's the scenario this JEP is targeting, or are there other use cases?

@Param-S - fyi

@DanHeidinga
Member Author

@ashu-mehra Possibly, provided the Whisk container has sufficient memory to live through the request(s). My understanding of OpenWhisk is that the container is reused for subsequent requests and killed every ~15 minutes.

The use of this kind of GC in that environment would require tuning of the max heap to live through multiple requests.
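
As a rough illustration of that tuning, the arithmetic might look like the sketch below. Since nothing is ever reclaimed, the heap has to hold the startup footprint plus everything every request allocates. The class name, method names, and all the numbers are made up for the example; real values would have to be measured.

```java
// Back-of-the-envelope sizing for a heap that is never collected.
// All names and numbers are hypothetical.
final class NoGcHeapEstimate {
    /** Heap needed when no allocation is ever reclaimed. */
    static long requiredHeapBytes(long baselineBytes, long bytesPerRequest, long requests) {
        return Math.addExact(baselineBytes, Math.multiplyExact(bytesPerRequest, requests));
    }

    /** Number of requests a given max heap can absorb before running out. */
    static long maxRequests(long heapBytes, long baselineBytes, long bytesPerRequest) {
        return (heapBytes - baselineBytes) / bytesPerRequest;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        // Assumed: 64 MiB after startup, 8 MiB allocated per request,
        // 100 requests before the container is killed.
        long needed = requiredHeapBytes(64 * mib, 8 * mib, 100);
        System.out.println("max heap needed: " + (needed / mib) + " MiB");
        System.out.println("requests that fit in 1 GiB: " + maxRequests(1024 * mib, 64 * mib, 8 * mib));
    }
}
```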

@JamesKingdon
Contributor

I wonder if there's an opportunity to reduce the number of async checks the JIT inserts if we were running in a no-GC environment. The JEP seems to be looking at two scenarios, one for testing and one for short lived JVMs wanting to extract maximum performance.

@pshipton pshipton added the jdk11 label Mar 8, 2018
@DanHeidinga
Member Author

@dmitripivkine Can someone on the GC team take a look at this? It's (tentatively?) targeted to JDK11 but would be nice to support all the way back to 8.

@dmitripivkine dmitripivkine self-assigned this Apr 9, 2018
@dmitripivkine
Contributor

dmitripivkine commented Apr 11, 2018

Optthruput should be a good base because all barriers are off. The call to GC in the allocation failure path can be disabled; instead, the JVM must try to expand the heap or, if that is not possible, take the OOM path. The generation of core files for this event should be disabled by default. There is no need to create GC Slave threads at startup or to reserve the LOA. The collector should be set up to expand on allocation failure. New option parsing and a new flag in GC_Extensions are expected. The JIT team might provide a number of optimizations as well. A testing plan is required. This feature should be implemented for Java 8 and higher versions. The feature should be documented.
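
The allocation failure path described here (expand if possible, otherwise OOM) can be sketched as a small model. This is an assumption-laden illustration: `ExpandOnFailureSketch`, the fixed expansion step, and the byte counts are hypothetical and do not reflect how OpenJ9 actually sizes heap expansion.

```java
// Toy model of "expand on allocation failure, else take the OOM path".
// Names and the fixed expansion step are hypothetical.
final class ExpandOnFailureSketch {
    private final long maxHeap;
    private final long expandStep;
    private long committed;
    private long used;

    ExpandOnFailureSketch(long initialHeap, long maxHeap, long expandStep) {
        if (expandStep <= 0 || initialHeap > maxHeap) {
            throw new IllegalArgumentException("need expandStep > 0 and initialHeap <= maxHeap");
        }
        this.committed = initialHeap;
        this.maxHeap = maxHeap;
        this.expandStep = expandStep;
    }

    void allocate(long size) {
        // No collector is ever invoked: the only responses to failure are
        // growing the committed heap or giving up.
        while (used + size > committed) {
            if (committed >= maxHeap) {
                throw new OutOfMemoryError("sketch: heap exhausted at " + committed + " bytes");
            }
            committed = Math.min(maxHeap, committed + expandStep);
        }
        used += size;
    }

    long committed() {
        return committed;
    }

    long used() {
        return used;
    }

    public static void main(String[] args) {
        ExpandOnFailureSketch heap = new ExpandOnFailureSketch(16, 64, 16);
        heap.allocate(40); // forces two expansions
        System.out.println("committed=" + heap.committed() + " used=" + heap.used());
    }
}
```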

@dmitripivkine
Contributor

@LinHu2016 Would you please start work on this?

@LinHu2016
Contributor

sure

@LinHu2016
Contributor

not sure if we need to support finalization for "no-gc collector"

@dmitripivkine
Contributor

I believe we should not touch the finalization code for the primary implementation. There are a number of possible improvements (including finalization) we can make in the future if necessary.

@LinHu2016
Contributor

first step:

  • add new gcpolicy:noop (or match Oracle naming, "gcpolicy:epsilon"; any other suggestions? Should we support -XX:+UseNoGC and -XX:+UseEpsilonGC for compatibility?)
  • initialize heap, collector, and classmanager based on Optthruput with the configuration for no-op
    (noScavenge, noConcurrentMark, noConcurrentSweep, noLOA, noSplitAddressOrderedList, disable processLargeAllocateStats/estimateFragmentation, disable system gc, ...; for now the ParallelGlobalCollector, markingScheme, SweepScheme, and CompactScheme are still initialized to minimize subspace and memorypool changes; in future we might use an "empty collector" to save footprint)
  • disable garbage collection on allocation failure in MemorySubSpaceFlat; instead expand the heap
  • disable dumps on OOM exception by default; they can be enabled with Java options
  • update verbose gc and MXBeans for the new gcpolicy
  • no finalization
  • test plan: try to reuse openjdk test/hotspot/jtreg/gc/epsilon; some tests need to be updated to work with OpenJ9
  • no extra work for supporting Java 8

@pshipton
Member

pshipton commented May 1, 2018

I suggest -gcpolicy:nogc instead of -gcpolicy:noop. +1 for supporting -XX:+UseNoGC for compatibility. -1 for supporting options using epsilon as this seems like a transient name that wouldn't necessarily be supported in the future.

@dmitripivkine
Contributor

dmitripivkine commented May 1, 2018

I agree with Pete: -gcpolicy:nogc and -XX:+UseNoGC look better.
We don't need GC Slave threads to be created.
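
For illustration, treating the two spellings as aliases could look like the sketch below. It models option handling only; `GcPolicyOptions` and its "last option wins" rule are assumptions for the example, not a description of OpenJ9's actual option parser. The `-Xgcpolicy:` prefix is the form used later in this thread.

```java
// Hypothetical model of mapping both option spellings to one policy.
// The class, the enum, and the "last one wins" rule are assumptions.
final class GcPolicyOptions {
    enum Policy { DEFAULT, NOGC }

    static Policy parse(String... jvmArgs) {
        Policy policy = Policy.DEFAULT;
        for (String arg : jvmArgs) {
            if (arg.equals("-Xgcpolicy:nogc") || arg.equals("-XX:+UseNoGC")) {
                policy = Policy.NOGC;
            } else if (arg.startsWith("-Xgcpolicy:")) {
                policy = Policy.DEFAULT; // some other policy; not modeled here
            }
        }
        return policy;
    }

    public static void main(String[] args) {
        System.out.println(parse("-Xmx512m", "-XX:+UseNoGC"));
        System.out.println(parse("-Xgcpolicy:nogc"));
    }
}
```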

@LinHu2016
Contributor

LinHu2016 commented May 3, 2018

first step: Updated

- add new gcpolicy:nogc and support -XX:+UseNoGC for compatibility
- initialize heap, collector, and classmanager based on Optthruput with the configuration for no-op
(noScavenge, noConcurrentMark, noConcurrentSweep, noLOA, noSplitAddressOrderedList, disable processLargeAllocateStats/estimateFragmentation, disable system gc, ...; for now the ParallelGlobalCollector, markingScheme, SweepScheme, and CompactScheme are still initialized to minimize subspace and memorypool changes; in future we might use an "empty collector" to save footprint)
- disable garbage collection on allocation failure in MemorySubSpaceFlat; instead trigger heap expansion
- disable dumps on OOM exception by default; they can be enabled with the Java option -Xdump
- update verbose gc and JMX for the new gcpolicy
- enable reportGCStart and reportGCEnd for gcpolicy:nogc in order to report heap memory statistics for verbose gc and JMX
- no finalization
- test plan: try to reuse openjdk test/hotspot/jtreg/gc/epsilon; some tests need to be updated to work with OpenJ9
- no extra work for supporting Java 8
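
On the JMX side, an application can read heap and collection statistics through the standard `java.lang.management` API, as in the sketch below. It runs on any JVM; which collector beans and counts OpenJ9 reports under the nogc policy is not stated in this thread, so the code makes no assumption about them. `GcStatsProbe` is just a name for the example.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Reads GC and heap statistics via the standard platform MXBeans.
final class GcStatsProbe {
    /** Sum of collection counts over all collector beans the JVM exposes. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 when the count is undefined
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    static MemoryUsage heapUsage() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    public static void main(String[] args) {
        MemoryUsage heap = heapUsage();
        System.out.println("collections so far: " + totalCollections());
        System.out.println("heap used/committed: " + heap.getUsed() + "/" + heap.getCommitted());
    }
}
```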

@dmitripivkine
Contributor

After discussion we decided that -Xgcpolicy:nogc should generate a set of dumps on OOM by default.
This is the default behaviour for the JVM, and it is the customer's responsibility to set up the application properly to avoid OOM.

@DanHeidinga
Member Author

> -Xgcpolicy:nogc should generate set of dumps on OOM by default.

Agreed. In a scenario tuned to run with the nogc policy, an OOM should generate dumps to allow the user to figure out the cause.

@fjeremic
Contributor

fjeremic commented Jun 5, 2018

FYI if we see people starting to use this new mode to squeeze out maximum performance, there should be opportunities for the JIT to eliminate and relax some checks if we can assume no GC will ever happen.

@pshipton
Member

pshipton commented Jun 5, 2018

Well, I was just about to ask @dmitripivkine @LinHu2016 if there is more work to do, or if we can close this issue.

@fjeremic do we need to keep it open for potential future JIT work, or should we create another issue for that?

@fjeremic
Contributor

fjeremic commented Jun 5, 2018

> @fjeremic do we need to keep it open for potential future JIT work, or should we create another issue for that?

I would say create another issue or leave things as is, since we already took note of it here. I see this as an opportunity for future improvement; however, we would need to justify investment in this area, which means this GC policy has to see some traction from users first.

Edit:

Actually I don't think this warrants a new issue. We've already brought up the point here which we can reference in the future.

@dmitripivkine
Contributor

I believe we have a good starting point and can close this issue. JIT support and further GC improvements can be done under other issues.

@pshipton
Member

pshipton commented Jun 5, 2018

Seems the consensus is to close.

@pshipton pshipton closed this as completed Jun 5, 2018
@pshipton
Member

pshipton commented Jun 5, 2018

Perhaps a little hasty. I should have confirmed, did the appropriate testing for the new gc policy get added?

@pshipton pshipton reopened this Jun 5, 2018
@LinHu2016
Contributor

@pshipton the testing plan has not been completed. Currently we are using the openjdk hotspot jtreg epsilon tests to verify JEP 318 functions and SPECjbb2005 for performance testing, but there are some limitations in using the epsilon tests:
1. some epsilon tests depend on a new Java 9 API (for pid) and cannot run on Java 8
2. the epsilon tests are not in the openjdk main branch (Epsilon targets Java 11)
3. some tests need to be updated to run with OpenJ9 due to incompatible JVM options between hotspot and OpenJ9.
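
As an example of the first limitation: the Java 9 pid API is presumably `ProcessHandle.current().pid()`, which does not exist on Java 8. One common workaround, sketched below, is to call it reflectively when present and otherwise fall back to parsing the `RuntimeMXBean` name. The `pid@hostname` format of that name is a widespread convention rather than a guarantee, and `PidLookup` is a hypothetical helper, not part of the OpenJ9 tests.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper: obtains the current process id on Java 8 and later.
final class PidLookup {
    static long currentPid() {
        try {
            // Java 9+: ProcessHandle.current().pid(), called reflectively so
            // that this source still compiles against Java 8.
            Class<?> handleClass = Class.forName("java.lang.ProcessHandle");
            Object current = handleClass.getMethod("current").invoke(null);
            return (Long) handleClass.getMethod("pid").invoke(current);
        } catch (ReflectiveOperationException e) {
            // Java 8 fallback: the runtime name is conventionally "pid@hostname".
            String name = ManagementFactory.getRuntimeMXBean().getName();
            int at = name.indexOf('@');
            if (at <= 0) {
                throw new IllegalStateException("unexpected runtime name: " + name);
            }
            return Long.parseLong(name.substring(0, at));
        }
    }

    public static void main(String[] args) {
        System.out.println("pid: " + currentPid());
    }
}
```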

@pshipton
Member

pshipton commented Sep 5, 2018

@LinHu2016 where are we for the testing? I'd like to assign this to a milestone and close it off. The 0.11.0 milestone closes Oct 8, which I'm guessing is too early. I'll tentatively assign this to the 0.12.0 milestone which closes in Jan.

@pshipton pshipton added this to the Release 0.12.0 milestone Sep 5, 2018
@LinHu2016
Contributor

@pshipton the original plan for testing was to try to reuse openjdk test/hotspot/jtreg/gc/epsilon.
12 of the 15 tests can be reused without major code changes, but some of the tests depend on new Java 9 APIs, which would not work on Java 8, so I am going to write a new pack of tests for gcpolicy:nogc.

@pshipton
Member

pshipton commented Sep 5, 2018

@LinHu2016 do you have a timeline for completing the work? Can it be completed by early January?

@LinHu2016
Contributor

plan to complete it in October.

@pshipton
Member

pshipton commented Sep 7, 2018

ok. I'll leave it in the Jan 0.12.0 milestone for now, since the Oct 0.11.0 milestone work needs to be completed by Oct 8. If it happens to be finished in time we can change the milestone.

@pshipton
Member

pshipton commented Dec 3, 2018

@LinHu2016 is there still some testing to complete? I was under the impression it's not quite done yet.

@LinHu2016
Contributor

@pshipton the tests for nogc are done (#3355). They are now part of the extended tests; I will promote them to sanity tests after waiting another week, if there are no issues on any of the platforms.

@smlambert
Contributor

We are in the process of moving tests in the other direction (move more sanity tests to extended), to ensure sanity set stays small / runs fast.

Is there a specific request to move the nogc tests to sanity, or can we leave them where they are?

@pshipton
Member

pshipton commented Dec 4, 2018

@LinHu2016 see Shelley's comment #1370 (comment)

Separately from the above, is there any reason to keep this Issue open or can it now be closed?

@LinHu2016
Contributor

@smlambert keeping the tests at the extended level is fine; moving them to sanity would just be for catching regressions earlier. BTW most of the tests take under 10 secs, but 2 of them on each platform might take a couple of minutes because they generate dumps.

@LinHu2016
Contributor

LinHu2016 commented Dec 4, 2018

@pshipton yes, the issue can be closed now. Thanks

@pshipton pshipton closed this as completed Dec 4, 2018