Implement Truffle compiler control based on HotSpot's CompileBroker compilation activity #10135

Open
wants to merge 1 commit into
base: master

@simonis simonis commented Nov 21, 2024

Truffle compilations run in "hosted" mode, i.e. the Truffle runtime triggers compilations independently of HotSpot's CompileBroker. But the results of Truffle compilations are still stored as ordinary nmethods in HotSpot's code cache (with the help of the JVMCI method jdk.vm.ci.hotspot.HotSpotCodeCacheProvider::installCode()). The regular JIT compilers are controlled by the CompileBroker, which is aware of the code cache occupancy. If the code cache runs full, the CompileBroker temporarily pauses any subsequent JIT compilations until the code cache gets swept (if running with -XX:+UseCodeCacheFlushing -XX:+MethodFlushing, which is the default) or completely shuts down the JIT compilers (if running with -XX:-UseCodeCacheFlushing).
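To make "code cache occupancy" concrete, here is a minimal sketch that reads it from plain Java through the standard java.lang.management API. This is not what the PR does (the PR polls the CompileBroker's activity mode through JVMCI); it only illustrates the quantity the CompileBroker reacts to. The pool name prefixes are an assumption about how HotSpot names its code cache pools, and the class name CodeCacheOccupancy is made up for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Illustrative helper, not part of the PR.
final class CodeCacheOccupancy {
    // Sums the used bytes of all memory pools that look like HotSpot code cache pools.
    static long usedBytes() {
        long used = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            // Assumption: HotSpot reports "CodeCache" (non-segmented) or "CodeHeap '...'" (segmented).
            if (name.startsWith("CodeCache") || name.startsWith("CodeHeap")) {
                MemoryUsage usage = pool.getUsage();
                if (usage != null) {
                    used += usage.getUsed();
                }
            }
        }
        return used;
    }
}
```

On a VM that does not expose pools with these names, the method simply returns 0.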

Truffle compiled methods can contribute significantly to the overall code cache occupancy, and they can trigger JIT compilation stalls if they fill up the code cache. But the Truffle framework itself is aware of neither the current code cache occupancy nor the compilation activity of the CompileBroker. If Truffle tries to install a compiled method through JVMCI and the code cache is full, the installation fails silently. Currently Truffle interprets such failures as transient errors and basically ignores them. Whenever the corresponding method gets hot again (usually immediately at the next invocation), Truffle recompiles it, only to fail again in the nmethod installation step if the code cache is still full.

When the code cache is tight, this can lead to situations where Truffle unnecessarily and repeatedly compiles methods which can't be installed in the code cache, producing a significant CPU load. Instead, Truffle should poll HotSpot's CompileBroker compilation activity and pause compilations for as long as the CompileBroker is pausing JIT compilations (or completely shut down Truffle compilations if the CompileBroker has shut down the JIT compilers).

The corresponding JVMCI change is tracked under JDK-8344727: [JVMCI] Export the CompileBroker compilation activity mode for Truffle compiler control.

This PR fixes the problem by checking HotSpot's compilation activity mode in OptimizedTruffleRuntime::submitForCompilation() before actually submitting a compilation task to a compile queue. If the compilation activity mode is RUN_COMPILATION, the task is submitted as before without any changes. However, if the compilation activity mode is STOP_COMPILATION (i.e. the CompileBroker has temporarily stopped JIT compilations), we flush the current compile queue (because compiled methods cannot be installed anyway) and return null. We also start a timer which can be configured with the new StoppedCompilationRetryDelay parameter (defaults to 1000ms). Once StoppedCompilationRetryDelay has elapsed, we submit a new compilation task even if the compilation activity mode is still not RUN_COMPILATION. This can help to trigger a code cache cleanup in situations when there are no JIT compilations, because code cache sweeping is only triggered when new nmethods are installed. Finally, when the compilation activity mode is SHUTDOWN_COMPILATION, we simply shut down the compilation queue and issue a warning to inform users about the code cache shortage.
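The decision logic described above can be sketched as a small, single-threaded model. This is a simplification under stated assumptions, not the code in the PR: ActivityMode, CompilationGate, and the string targets are hypothetical stand-ins, the clock is injected so the retry delay can be exercised without sleeping, and returning false plays the role of returning null.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.LongSupplier;

// Hypothetical stand-in for the activity mode exported through JVMCI.
enum ActivityMode { RUN_COMPILATION, STOP_COMPILATION, SHUTDOWN_COMPILATION }

// Simplified, single-threaded model of the submission gate (not the real runtime class).
final class CompilationGate {
    private final Deque<String> queue = new ArrayDeque<>();
    private final long retryDelayMillis;      // models StoppedCompilationRetryDelay
    private final LongSupplier clockMillis;   // injected clock, in milliseconds
    private long stoppedSince = -1;           // -1 means "not currently stopped"
    private boolean shutDown;

    CompilationGate(long retryDelayMillis, LongSupplier clockMillis) {
        this.retryDelayMillis = retryDelayMillis;
        this.clockMillis = clockMillis;
    }

    // Returns true if the target was enqueued, false if the request was dropped.
    boolean submit(String target, ActivityMode mode) {
        if (shutDown || mode == ActivityMode.SHUTDOWN_COMPILATION) {
            shutDown = true;                  // permanent: never accept work again
            queue.clear();
            return false;
        }
        if (mode == ActivityMode.RUN_COMPILATION) {
            stoppedSince = -1;
            queue.add(target);
            return true;
        }
        // STOP_COMPILATION: drop requests until the retry delay has elapsed.
        long now = clockMillis.getAsLong();
        if (stoppedSince >= 0 && now - stoppedSince >= retryDelayMillis) {
            stoppedSince = now;               // restart the timer
            queue.add(target);                // probe submission that may trigger a sweep
            return true;
        }
        if (stoppedSince < 0) {
            stoppedSince = now;               // first time we observe the stop
        }
        queue.clear();                        // queued work could not be installed anyway
        return false;
    }

    int queued() {
        return queue.size();
    }
}
```

With a delay of 1000 and a clock that starts at 0, a STOP_COMPILATION request at time 0 or 500 is dropped, while one at time 1000 is let through as a probe.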

I've manually tested the change by running an Octane benchmark with a very small code cache. Before the change, we could see the following results:

RayTrace: 484
----
Score (version 9): 484

[engine] Truffle runtime statistics for engine 1
    Compilations                : 2255
      Success                   : 549
      Temporary Bailouts        : 1703
        jdk.vm.ci.code.BailoutException: Code installation failed: code cache is full: 1702
      Permanent Bailouts        : 0
      Failed                    : 0
      Interrupted               : 3
    Invalidated                 : 0
    Queues                      : 2480
    Dequeues                    : 30
        Target inlined into only caller: 30
    Splits                      : 156
    Compilation Accuracy        : 1.000000
    Queue Accuracy              : 0.987903
    Compilation Utilization     : 2.948298
    Remaining Compilation Queue : 195
    Time to queue               : count=2480, sum=187025989722, min=   60029, average= 75413705.53, max=181007679, maxTarget=cross
    Time waiting in queue       : count=2254, sum=1385581671, min=      54, average=   614721.24, max=156182217, maxTarget=Function.prototype.apply <split-322>

    JVMCI CompileBroker Time:
       Compile:          0,000 s
       Install Code:     0,000 s (installs: 0, CodeBlob total size: 0, CodeBlob code size: 0)

    JVMCI Hosted Time:
       Install Code:    27,227 s (installs: 565, CodeBlob total size: 5916736, CodeBlob code size: 1713432)

  nmethod code size         :  6834896 bytes
  nmethod total size        : 14834520 bytes
CodeCache: size=3896Kb used=3458Kb max_used=3586Kb free=437Kb
 bounds [0x00007ffff436d000, 0x00007ffff473b000, 0x00007ffff473b000]
 total_blobs=1378 nmethods=566 adapters=714
 compilation: disabled (not enough contiguous free space left)
              stopped_count=26, restarted_count=25
 full_count=3200

real	3m11,398s
user	11m19,363s
sys	0m9,387s

As you can see, out of 2255 compilations, 1702 were wasted: they failed in the final installation step because the code cache was full (i.e. BailoutException: Code installation failed: code cache is full: 1702). The other interesting observation is that although the benchmark terminated after 3m11s of wall clock time, it actually consumed 11m19s of CPU time, because the Truffle compiler threads were continuously doing useless work.

With my changes applied, the picture looks as follows:

RayTrace: 531
----
Score (version 9): 531


[engine] Truffle runtime statistics for engine 1
    Compilations                : 128
      Success                   : 111
      Temporary Bailouts        : 17
        jdk.vm.ci.code.BailoutException: Code installation failed: code cache is full: 16
        org.graalvm.compiler.core.common.CancellationBailoutException: Compilation cancelled.: 1
      Permanent Bailouts        : 0
      Failed                    : 0
      Interrupted               : 0
    Invalidated                 : 0
    Queues                      : 410
    Dequeues                    : 283
        Compilation temporary disabled due to full code cache.: 277
        Target inlined into only caller: 6
    Splits                      : 156
    Compilation Accuracy        : 1.000000
    Queue Accuracy              : 0.309756
    Compilation Utilization     : 0.069104
    Remaining Compilation Queue : 0
    Time to queue               : count= 410, sum=24979066360, min=   88714, average= 60924552.10, max=186708180, maxTarget=blend
    Time waiting in queue       : count= 128, sum=  22955820, min=       6, average=   179342.35, max=  724639, maxTarget=:anonymous <split-344>

    JVMCI CompileBroker Time:
       Compile:          0,000 s
       Install Code:     0,000 s (installs: 0, CodeBlob total size: 0, CodeBlob code size: 0)

    JVMCI Hosted Time:
       Install Code:     0,637 s (installs: 124, CodeBlob total size: 1156760, CodeBlob code size: 321496)

  nmethod code size         :  2460456 bytes
  nmethod total size        :  4833272 bytes
CodeCache: size=3896Kb used=3464Kb max_used=3537Kb free=431Kb
 bounds [0x00007ffff436d000, 0x00007ffff473b000, 0x00007ffff473b000]
 total_blobs=1557 nmethods=745 adapters=714
 compilation: disabled (not enough contiguous free space left)
              stopped_count=5, restarted_count=4
 full_count=33

real	3m14,756s
user	3m22,013s
sys	0m1,044s

As you can see, we compile considerably fewer methods, and only 16 compilations fail due to a full code cache. At the same time, of the 410 methods which were enqueued for compilation, 277 were dequeued with the reason "Compilation temporary disabled due to full code cache." Again, the wall clock time of the benchmark was 3m14s, but this time the overall CPU time was only slightly higher at 3m22s, because we no longer perform such a huge number of unnecessary compilations. Also notice how HotSpot's code cache full_count, which is incremented every time an nmethod can't be installed because of a full code cache, dropped from 3200 to 33.
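To put the two runs side by side, the following small calculation derives the share of wasted compilations and the ratio of user CPU time to wall clock time from the figures reported above. The class and method names are made up for this example; only the input numbers come from the two runs.

```java
// Illustrative arithmetic on the figures reported above; names are hypothetical.
final class RunComparison {
    // Fraction of compilations that failed at installation because the code cache was full.
    static double wastedFraction(int codeCacheFullBailouts, int compilations) {
        return (double) codeCacheFullBailouts / compilations;
    }

    // Converts a time given as minutes plus seconds (as printed by `time`) to seconds.
    static double seconds(int minutes, double secs) {
        return minutes * 60 + secs;
    }

    public static void main(String[] args) {
        double wastedBefore = wastedFraction(1702, 2255);
        double wastedAfter = wastedFraction(16, 128);
        double cpuPerWallBefore = seconds(11, 19.363) / seconds(3, 11.398);
        double cpuPerWallAfter = seconds(3, 22.013) / seconds(3, 14.756);
        System.out.printf("wasted compilations: %.1f%% -> %.1f%%%n",
                wastedBefore * 100, wastedAfter * 100);
        System.out.printf("user CPU per wall clock second: %.2f -> %.2f%n",
                cpuPerWallBefore, cpuPerWallAfter);
    }
}
```

In other words, roughly three quarters of all compilations were wasted before the change and one eighth after it, while the user CPU time per wall clock second dropped from about 3.5 to about 1.04.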

These examples were run with -XX:+UseCodeCacheFlushing -XX:+MethodFlushing. If the JVM is configured with -XX:-UseCodeCacheFlushing, the benefits of this change are even higher.

Fixes #10133

@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Nov 21, 2024
@@ -912,10 +913,45 @@ private void notifyCompilationFailure(OptimizedCallTarget callTarget, Throwable
protected void onEngineCreated(EngineData engine) {
}

private long stoppedCompilationTime = 0;
private boolean logShutdownCompilations = true;

submitForCompilation is not thread-safe. These fields can be modified from multiple threads, so we need to synchronize access to them.
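One possible shape for this, sketched with java.util.concurrent.atomic instead of explicit locks. The holder class StoppedCompilationState and its method names are hypothetical; only the two pieces of state come from the diff. The sketch assumes timestamps are always greater than zero, so that 0 can mean "not stopped".

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical holder for the two fields from the diff, made safe for concurrent callers.
final class StoppedCompilationState {
    private static final long NOT_STOPPED = 0L;   // assumption: real timestamps are > 0
    private final AtomicLong stoppedCompilationTime = new AtomicLong(NOT_STOPPED);
    private final AtomicBoolean shutdownLogged = new AtomicBoolean(false);

    // Records the time of the first observed stop; later callers see the original timestamp.
    long markStopped(long nowMillis) {
        return stoppedCompilationTime.updateAndGet(
                previous -> previous == NOT_STOPPED ? nowMillis : previous);
    }

    // Clears the timestamp once compilation is running again.
    void markRunning() {
        stoppedCompilationTime.set(NOT_STOPPED);
    }

    // Returns true for exactly one caller, which is then responsible for logging the warning.
    boolean claimShutdownWarning() {
        return shutdownLogged.compareAndSet(false, true);
    }
}
```

The compare-and-set in claimShutdownWarning replaces the unsynchronized "check the flag, then clear it" sequence, so two threads cannot both log the warning.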

// The logger can be null if the engine is closed.
if (logger != null && logShutdownCompilations) {
logShutdownCompilations = false;
logger.log(Level.WARNING, "Truffle host compilations permanently disabled because of full code cache. " +

We do not really call Truffle compilations "host compilations"; that term is typically reserved for compilations done by HotSpot directly. Maybe we should just change it to "Truffle compilations" to avoid confusion.
Sidenote: it would not be a problem to show this warning once per engine.

Priority priority = new Priority(optimizedCallTarget.getCallAndLoopCount(), lastTierCompilation ? Priority.Tier.LAST : Priority.Tier.FIRST);
return getCompileQueue().submitCompilation(priority, optimizedCallTarget);
BackgroundCompileQueue queue = getCompileQueue();
CompilationActivityMode compilationActivityMode = getCompilationActivityMode();

Maybe we should move this logic into OptimizedCallTarget#compile instead and have a more efficient check, similar to how we already check for OptimizedCallTarget.compilationTask? Currently we would take a lock for every OptimizedCallTarget.call that triggers a compilation if compilation is paused.

We should also use the OptimizedCallTarget.EngineData class to store any state associated with the retry time. I know it will most likely be the same for all engines, but it's a principle we work by, so that engines cannot affect each other. That way we also avoid sharing locks.
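A rough sketch of what such a cheap, per-engine check could look like. EngineCompilationPause is a hypothetical class standing in for state that would live in the engine data object; the point is that the hot path is a single volatile read with no lock, and that each engine owns its own instance.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical per-engine holder; each engine would own one instance.
final class EngineCompilationPause {
    private static final long NOT_PAUSED = Long.MIN_VALUE;
    // Deadline (in nanoseconds) until which compilation requests are rejected.
    private final AtomicLong retryAtNanos = new AtomicLong(NOT_PAUSED);

    void pauseUntil(long deadlineNanos) {
        retryAtNanos.set(deadlineNanos);
    }

    void resume() {
        retryAtNanos.set(NOT_PAUSED);
    }

    // Fast path: one volatile read and a comparison, no lock taken.
    boolean maySubmit(long nowNanos) {
        long deadline = retryAtNanos.get();
        // Subtraction keeps the comparison valid if nanosecond clock values wrap around.
        return deadline == NOT_PAUSED || nowNanos - deadline >= 0;
    }
}
```

Because the state is per instance, pausing one engine leaves the check for every other engine unaffected.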

}
// Flush the compilations queue. There's still a chance that compilation will be re-enabled
// eventually, if the hosts code cache can be cleaned up.
for (OptimizedCallTarget target : queue.getQueuedTargets(optimizedCallTarget.engine)) {

This just cancels compilation for one engine only, whereas the JVMCI indication is for all engines. Either we move this code to the engine level or we pass null to filter for all engines in the queue.
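To illustrate the second option, here is a miniature model of a queue shared by several engines in which a null filter selects the targets of all engines. SharedQueue and its string-based engines and targets are hypothetical; the real BackgroundCompileQueue is not used here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature of a compile queue shared by several engines.
final class SharedQueue {
    // A queued target together with the engine that owns it (both plain strings here).
    static final class Entry {
        final String engine;
        final String target;

        Entry(String engine, String target) {
            this.engine = engine;
            this.target = target;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    void enqueue(String engine, String target) {
        entries.add(new Entry(engine, target));
    }

    // Returns the queued targets of one engine, or of every engine when the filter is null.
    List<String> queuedTargets(String engineOrNull) {
        List<String> result = new ArrayList<>();
        for (Entry entry : entries) {
            if (engineOrNull == null || engineOrNull.equals(entry.engine)) {
                result.add(entry.target);
            }
        }
        return result;
    }
}
```

Since the code cache is shared by the whole VM, flushing with the null filter matches the scope of the JVMCI signal.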

"Increase the code cache size using '-XX:ReservedCodeCacheSize=' and/or run with '-XX:+UseCodeCacheFlushing -XX:+MethodFlushing'.");
}
try {
queue.shutdownAndAwaitTermination(100 /* milliseconds */);

Can you describe a bit what you are trying to achieve here? Why would it help to block the entire interpreter thread if this happens?
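For comparison, a plain java.util.concurrent executor can be shut down without the caller waiting at all. The sketch below is only an analogy under that assumption: it uses ExecutorService.shutdownNow(), not the PR's queue.shutdownAndAwaitTermination(), and the class NonBlockingShutdown is made up for this example.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative only; models a compile queue with a single worker thread.
final class NonBlockingShutdown {
    // Starts one long-running task, queues `pending` more, then shuts down without waiting.
    // Returns the number of tasks that never started.
    static int shutdownWithoutWaiting(int pending) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch neverReleased = new CountDownLatch(1);
        executor.execute(() -> {
            started.countDown();
            try {
                neverReleased.await();              // simulates a long-running compilation
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // interrupted by shutdownNow()
            }
        });
        try {
            started.await();                        // make sure the first task is running
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        for (int i = 0; i < pending; i++) {
            executor.execute(() -> { });
        }
        // Interrupts the running task and returns the tasks still waiting in the queue;
        // it does not wait for the running task to finish.
        List<Runnable> neverStarted = executor.shutdownNow();
        return neverStarted.size();
    }
}
```

Whether the PR's queue can be shut down in the same fire-and-forget way depends on its own API, which is not shown here.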

* Returns the current host compilation activity mode which is one of:
* {@code STOP_COMPILATION}, {@code RUN_COMPILATION} or {@code SHUTDOWN_COMPILATION}
*/
default CompilationActivityMode getCompilationActivityMode() {

There is no need to have getCompilationActivityMode in this interface. This interface is intended for methods that get called by the Graal compiler. You can move this abstract specification directly into OptimizedTruffleRuntime and implement it concretely in HotSpotTruffleRuntime.
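Structurally, the suggestion could look roughly like the following. All type names here (ActivityMode, RuntimeBase, HotSpotLikeRuntime, OtherRuntime) are hypothetical stand-ins for the real classes, and the setter is only a test hook in place of the actual JVMCI query.

```java
// Hypothetical stand-in for the mode enum used by the PR.
enum ActivityMode { RUN_COMPILATION, STOP_COMPILATION, SHUTDOWN_COMPILATION }

// Models the shared runtime base class: the query is declared here,
// not in the interface that the compiler calls into.
abstract class RuntimeBase {
    protected abstract ActivityMode getCompilationActivityMode();

    boolean compilationAllowed() {
        return getCompilationActivityMode() == ActivityMode.RUN_COMPILATION;
    }
}

// Models a HotSpot runtime that can ask the host VM for its activity mode.
final class HotSpotLikeRuntime extends RuntimeBase {
    private volatile ActivityMode hostMode = ActivityMode.RUN_COMPILATION;

    // Test hook standing in for the JVMCI query.
    void setHostMode(ActivityMode mode) {
        hostMode = mode;
    }

    @Override
    protected ActivityMode getCompilationActivityMode() {
        return hostMode;
    }
}

// Models a runtime without such a host signal: it always reports "running".
final class OtherRuntime extends RuntimeBase {
    @Override
    protected ActivityMode getCompilationActivityMode() {
        return ActivityMode.RUN_COMPILATION;
    }
}
```

With this layout, code in the shared base class can call compilationAllowed() without knowing which runtime it is running on.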

"compilation activity mode in the host VM. If the activity mode is 'STOP_COMPILATION' because " +
"of a full code cache, no new compilation requests are submitted and the compilation queue is flushed. " +
"After 'StoppedCompilationRetryDelay' milliseconds new compilations will be submitted again " +
"(which might trigger a sweep of the code cache and a reset of the compilation activity mode in the host JVM).",

Well, this sounds like it would be supported in all runtimes, but it is not: SVM does not support this yet. So we should mention that this only works on HotSpot right now. Note that OptimizedRuntimeOptions are shared across all optimizing runtimes, including SVM.

@@ -158,6 +158,14 @@ public ExceptionAction apply(String s) {
// TODO: GR-29949
public static final OptionKey<Long> CompilerIdleDelay = new OptionKey<>(10000L);

@Option(help = "Before the Truffle runtime submits an OptimizedCallTarget for compilation, it checks for the " +

Minor: The help text is extensive but seems a bit implementation-heavy. I think it is enough for a user of this option to know what happens that is observable to the user. In other words, specific enum names like STOP_COMPILATION won't help in understanding the semantics of this option.


Also, this option is ignored if not running on HotSpot, right? Might be worth pointing that out.

Sorry - just saw now that @chumer already pointed this out.


/**
* Returns the current host compilation activity mode which is one of:
* {@code STOP_COMPILATION}, {@code RUN_COMPILATION} or {@code SHUTDOWN_COMPILATION}

I would not list the enum constants in this javadoc - just one more location you need to remember to update if enum constants are added (or deleted/renamed).
