Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More efficient use of GroovyCategorySupport #877

Merged
merged 2 commits into from
Apr 2, 2024

Conversation

jglick
Copy link
Member

@jglick jglick commented Mar 29, 2024

Follows up #124 / cloudbees/groovy-cps#56 by calling GroovyCategorySupport.use only once per thread, rather than on every chunk.

Background: CpsVmExecutorService has a subtle design in that it is physically hosted on a fairly generic elastic thread pool; the CPS VM “thread” associated with a Pipeline build is virtual (does not correspond to any permanent Java Thread) and just makes use of some free physical thread whenever it has a chunk of work to run. Typically a chunk would involve as much Groovy logic as can run synchronously (milliseconds’ worth, you hope), up to the point where an asynchronous operation needs to take place, such as starting a step. (parallel is actually coöperative multitasking: step implementations can certainly run concurrently, but any interactions with the Groovy program are single-threaded. Thus the maximum number of physical threads potentially needed is the number of concurrently running builds, regardless of the number of branches in a single build.)

Now in order to support some Groovy hacks we do to make normal-looking language idioms transparently run in CPS mode (#124), we use GroovyCategorySupport which sets up some hooks in a dynamic scope (applicable to the current Thread), and its sole API is to take a Closure to run with those hooks in effect. Normally setting up and tearing down the hooks is expected to be very quick, but in a heavily loaded controller with complex Pipeline libraries it seems it can be slow and even a source of contention: pool threads from unrelated builds are waiting on the same static (global) lock. (CloudBees-internal reference: SECO-3635)

waiting to lock <0x…> (a java.lang.Class)
owned by "Running CpsFlowExecution[…]]" id=… (0x…)
	at org.codehaus.groovy.vmplugin.v7.IndyInterface.invalidateSwitchPoints(IndyInterface.java:124)
	at org.codehaus.groovy.vmplugin.v7.Java7.invalidateCallSites(Java7.java:70)
	at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.endScope(GroovyCategorySupport.java:109)
	at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:138)
	at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:275)
	at com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:146)

The repeated setup and teardown of categories is actually entirely wasted work, since we set the same hooks for all builds, but Groovy (at least the v2 we run) does not offer an API to permanently install a hook. So to minimize the frequency with which we acquire this global lock, this PR passes Groovy a closure which runs any number of chunks of work from one or more builds in sequence, saving most of the overhead. Java still manages the thread pool as before, creating new threads when there are a lot of concurrent builds doing significant Groovy work, and discarding them when idle.

Note that we do not currently support categories from user code (cloudbees/groovy-cps#16) and it was never clear how we could.

@jglick jglick requested a review from a team as a code owner March 29, 2024 19:08
}
});
public Outcome run0(final Outcome cn) {
Next n = cn.resumeFrom(e,k);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(ignore WS)

Copy link
Member

@dwnusbaum dwnusbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, my main concerns with this kind of change were that it might introduce complex cross-script memory leaks and/or security issues. Looking at GroovyCategorySupport.java though, I don't really see any obvious problems, mainly given "Groovy (at least the v2 we run) does not offer an API to permanently install a hook" and cloudbees/groovy-cps#16.

One thing to note though is that user-defined category methods and use do work in @NonCPS methods (although use is not allowed in the sandbox by default), so it is at least possible for users to interact with user-defined categories in limited ways. Calls like https://github.com/apache/groovy/blob/41b990d0a20e442f29247f0e04cbed900f3dcad4/src/main/org/codehaus/groovy/runtime/GroovyCategorySupport.java#L187 in the implementation could end up storing user-defined classes in the global ClassInfo cache, but this seems no different than just using the class directly in a script, and CpsFlowExecution.cleanUpGlobalClassValue will clean all of that up when the Pipeline completes, so that seems fine.

@jglick
Copy link
Member Author

jglick commented Mar 29, 2024

Right. The categories is just a list of (plugin-defined) classes, and even a use call in @NonCPS from a trusted library would run endScope as soon as it is done, so I do not see anything that could leak.

This was not a flake. While CpsFlowExecutionMemoryTest.loaderReleased passes promptly in trunk, it passes only after many GC cycles in this branch:

Apparent soft references to org.jenkinsci.plugins.workflow.cps.CpsGroovyShell$CleanGroovyClassLoader@58beae71: {org.jenkinsci.plugins.workflow.cps.CpsGroovyShell$CleanGroovyClassLoader@58beae71=public static final java.util.concurrent.ExecutorService hudson.model.Computer.threadPoolForRemoting->
jenkins.util.ContextResettingExecutorService@1e05c0c3-base->
jenkins.security.ImpersonatingExecutorService@31029259-base->
jenkins.util.ErrorLoggingExecutorService@1c3661ef-base->
java.util.concurrent.ThreadPoolExecutor@4b0846a-workQueue->
java.util.concurrent.SynchronousQueue@70db6fb9-transferer->
java.util.concurrent.SynchronousQueue$Transferer@1a7a4c85-head->
java.util.concurrent.LinkedTransferQueue$DualNode@7679c893-waiter->
java.lang.Thread@24fe0a89-contextClassLoader->
org.jenkinsci.plugins.workflow.cps.CpsGroovyShell$CleanGroovyClassLoader@58beae71}

This seems to have been due to a core problem unrelated to GroovyCategorySupport, that another thread pool was initialized using whatever thread group it happened to have at the time. Probably core should ensure that stock executor services like Computer.threadPoolForRemoting and Timer use a special ThreadFactory that specifies a new top-level group?

I will run this through PCT.

@jglick jglick marked this pull request as draft March 29, 2024 21:12
… of things in a test case: `Computer.threadPoolForRemoting` was creating threads using an inappropriate group.
@jglick jglick marked this pull request as ready for review March 29, 2024 21:34
@jglick jglick requested a review from dwnusbaum March 29, 2024 21:34
@dwnusbaum dwnusbaum merged commit d0f0248 into jenkinsci:master Apr 2, 2024
14 checks passed
dwnusbaum added a commit to dwnusbaum/workflow-cps-plugin that referenced this pull request Apr 2, 2024
More efficient use of `GroovyCategorySupport`

(cherry picked from commit d0f0248)
@dwnusbaum dwnusbaum mentioned this pull request Apr 2, 2024
@jglick jglick deleted the GroovyCategorySupport branch April 4, 2024 12:35
basil added a commit to basil/bom that referenced this pull request Jun 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants