Native Image GC Improvements #2386
To your list, @christianwimmer, I'd like to add mechanisms that could be useful to language implementors, such as:
Several of these reference- or object-tracking features are currently provided in HotSpot through JVMTI. If native-image eventually supports JVMTI (even partially), these would come naturally. The downside of most of these features is that the more hooks you add to the GC implementation to call back user code, the slower the GC becomes (more checks for callbacks), and the harder it becomes to evolve, because you are locked into the user-facing APIs. Tracing APIs (in JVMTI) are an example, as they require header space that might also be needed by the GC. Another issue is finalizers, which are tricky to implement correctly and efficiently. Pinning, on the other hand, is a relatively simple feature to implement and can unlock very interesting use cases where accelerators read/write directly from the managed heap. In HotSpot only Shenandoah supports pinning, but I think G1 could also support it (and thus so could native image).
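To make the pinning contract concrete, here is a toy sketch in plain Java. It is not GraalVM's implementation and the names (`PinRegistry`, `Pin`, `mayMove`) are hypothetical; it only illustrates the invariant a moving collector must honor: while an object is pinned, it must not be relocated.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Toy illustration of the pinning contract discussed above. All names here
// are hypothetical, not part of any GraalVM API.
public class PinRegistry {
    // Identity-based set: pinning is about object identity, not equals().
    private final Set<Object> pinned =
            Collections.newSetFromMap(new IdentityHashMap<>());

    // AutoCloseable so a pin can be scoped with try-with-resources,
    // mirroring the shape of handle-style pinning APIs.
    public final class Pin implements AutoCloseable {
        private final Object object;
        Pin(Object object) { this.object = object; }
        @Override public void close() { pinned.remove(object); }
    }

    public Pin pin(Object object) {
        pinned.add(object);
        return new Pin(object);
    }

    // A moving collector would consult this before relocating an object.
    public boolean mayMove(Object object) {
        return !pinned.contains(object);
    }

    public static void main(String[] args) {
        PinRegistry registry = new PinRegistry();
        byte[] buffer = new byte[64];
        try (Pin p = registry.pin(buffer)) {
            // While pinned, e.g. an accelerator could read/write the buffer directly.
            System.out.println(registry.mayMove(buffer)); // false
        }
        System.out.println(registry.mayMove(buffer)); // true
    }
}
```

The try-with-resources shape matters: it makes the unpin structurally guaranteed, so the window in which the GC is constrained stays as short as possible.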
Thanks for your comment, @rodrigo-bruno! I'm somewhat familiar with JVMTI and know that it supports some of this functionality. Since it already exists, it shouldn't be too hard to provide similar APIs in Truffle.
Hi @fniephaus, that is correct: JVMTI was (AFAIK) originally intended as an interface for external tools. However, JVMTI is implemented on top of JNI, which you can use to build other abstractions. I am not super familiar with Truffle's internals, but couldn't you create an interface on top of JNI and expose it to Truffle languages (with some extra plumbing inside Truffle)? Most likely the implementation of such an interface would be JVM-specific (i.e., one for HotSpot, one for native image).
@fniephaus Some features you mention already exist. For example, SVM supports object pinning; this is actually a feature that we need in every GC due to the way the low-level C interface is implemented.
Yup, you are right! I was under the impression that JVMTI relied heavily on existing JNI functions, but after checking the code it seems that it is quite tightly coupled to the JVM internals.
Of course, there are always trade-offs. If some APIs needed for such mechanisms are not common enough and possibly hinder future optimizations, it may not make sense to provide them. My main goal here was to give some examples of what language implementers might want to do with the GC. Forcing incremental/full GCs and enumerating objects should be relatively easy to provide. Another thing to consider is API compatibility between JVM and SVM: it's probably undesirable if a language behaves significantly differently when running on the JVM. As an example, there's no proper Java API to force incremental GCs on the JVM at the moment, and forcing full GCs instead may cause a significant performance penalty.
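The last point can be shown with standard APIs alone: the only portable way to request a collection is `System.gc()`, which is merely a hint to the VM and, when honored, typically triggers a full collection; there is no standard counterpart for requesting a young-generation-only GC. A minimal sketch:

```java
// Demonstrates the only portable GC-request API in Java. System.gc() is a
// hint (it can be disabled with -XX:+DisableExplicitGC) and, when honored,
// usually triggers a full collection -- there is no standard API for
// requesting an incremental (young-only) GC.
public class ForceGc {
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        // Allocate some short-lived garbage.
        for (int i = 0; i < 10_000; i++) {
            byte[] garbage = new byte[1024];
        }
        long before = usedHeapBytes();
        System.gc(); // full-GC hint; may pause all application threads
        long after = usedHeapBytes();
        System.out.println("used before: " + before + " bytes, after: " + after + " bytes");
    }
}
```

A guest language that exposes a "collect garbage" primitive and maps it to `System.gc()` on the JVM therefore pays full-GC cost, which is exactly the JVM/SVM behavior gap described above.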
Hello, I worked on Mono a long time ago, and they did a lot of work on the GC. There is a good article describing its features, some of which could be very interesting here. At the beginning they used an external GC (the Boehm GC), but it was not accurate enough, so they chose to implement a new one, which took years to finish. So why not reuse an external library (or fork one)? Just my 2 cents.
GraalVM Native Image CE currently provides a simple (non-parallel, non-concurrent) stop-and-copy GC. There are various areas where the GC can be improved. This issue captures ideas. Actual work should be done under separate issues, but linked from this issue, so that everyone who wants to work on GC performance gets an overview of who is working on what.
TLAB implementation and sizing algorithm
The heap is divided into chunks, and currently full chunks are used as the TLAB. It is desirable to have a reasonably large chunk size (currently 1 MByte), which is often too big for a TLAB. Especially when many threads are started and some threads have very low allocation rates compared to others, GCs are started too often. It can also lead to pathological cases: when `ChunkSize * NumberOfThreads > YoungGenerationSize`, there are not enough chunks for all threads and the system starts a GC continuously. To improve this, the TLAB implementation should be decoupled from the chunk management so that many TLABs can fit in one chunk. The TLAB size can then be adjusted per thread based on its allocation rate, i.e., threads that allocate a lot still get a whole chunk, while threads that barely allocate get a small TLAB.
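The pathological condition is easy to check numerically. The sketch below uses the 1 MByte chunk size stated above; the young-generation size is an assumed example value, and the class and method names are hypothetical.

```java
// Numeric illustration of the pathological case described above: if every
// thread claims a whole chunk as its TLAB, the young generation is exhausted
// once ChunkSize * NumberOfThreads exceeds it, and the system GCs continuously.
public class TlabMath {
    static final long CHUNK_SIZE = 1L << 20; // 1 MByte, the current chunk size

    static boolean gcsContinuously(long numberOfThreads, long youngGenerationSize) {
        return CHUNK_SIZE * numberOfThreads > youngGenerationSize;
    }

    public static void main(String[] args) {
        long youngGen = 256L << 20; // assumed 256 MByte young generation
        System.out.println(gcsContinuously(128, youngGen)); // 128 MB of TLABs -> false
        System.out.println(gcsContinuously(512, youngGen)); // 512 MB of TLABs -> true
    }
}
```

With per-thread TLAB sizing, the left-hand side becomes the sum of per-thread TLAB sizes instead of `CHUNK_SIZE * NumberOfThreads`, so many low-allocation threads no longer force this condition.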
Performance improvements in the GC/heap implementation itself
(`CommittedMemoryProvider`)

Implement a mark&compact GC for the old generation
The stop-and-copy GC has a high memory overhead during GC. In the worst case, twice as much memory is needed when the whole heap is reachable, because all objects are copied during a full GC. If the OS cannot provide any memory during GC, then the VM exits because the heap is in an inconsistent state.
For the old generation, a mark&compact algorithm avoids additional memory overhead because compaction happens in place.
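A minimal sketch of why in-place compaction needs no extra space, using a toy heap of single-slot "objects". This is only an illustration of the sliding-compaction idea, not the SVM design: a real mark&compact GC also computes forwarding addresses and fixes up all references, which is omitted here.

```java
import java.util.Arrays;

// Toy sliding compaction over a heap of single-slot "objects": live slots
// slide toward the start of the SAME array, so no second semispace is
// required (unlike stop-and-copy, whose worst case doubles memory).
// Reference fix-up, which a real collector needs, is deliberately omitted.
public class MarkCompact {
    // heap[i] holds a value; live[i] says whether the mark phase found it reachable.
    static int compact(int[] heap, boolean[] live) {
        int free = 0; // next free slot; always <= i, so copying is safe in place
        for (int i = 0; i < heap.length; i++) {
            if (live[i]) {
                heap[free++] = heap[i]; // slide live object down
            }
        }
        Arrays.fill(heap, free, heap.length, 0); // clear the reclaimed tail
        return free; // new top of the old generation
    }

    public static void main(String[] args) {
        int[] heap = {10, 20, 30, 40, 50};
        boolean[] live = {true, false, true, false, true};
        int top = compact(heap, live);
        System.out.println(top + " " + Arrays.toString(heap)); // 3 [10, 30, 50, 0, 0]
    }
}
```

The key invariant is `free <= i` throughout the loop: the destination never overtakes the scan position, which is what lets compaction reuse the space being collected instead of requiring a to-space.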
Error handling