
Reduce CPU usage of gradle run #49055

Merged

jaymode merged 2 commits into elastic:master from gradle_run_stop_burning_me on Nov 14, 2019

Conversation

@jaymode (Member) commented Nov 13, 2019

The RunTask is responsible for logging output from nodes to the console
and also stays active since we want the cluster to keep running.
However, the implementation of the logging and waiting resulted in a
spin loop that continually polls for data to have been written to one
of the nodes' output files. On my laptop, this causes an idle
invocation of `gradle run` to consume an entire core.

The JDK provides a method to be notified of changes to files through
the use of a WatchService. While a WatchService based implementation
for logging and waiting works, a delay of up to ten seconds is
encountered when running on macOS. This is due to the lack of a native
WatchService implementation that uses kqueue or FSEvents; the current
WatchService implementation in the JDK uses polling with a default
interval of ten seconds. While the interval can be changed
programmatically, doing so is not an acceptable solution because it
requires access to the com.sun.nio.file.SensitivityWatchEventModifier
enum, which is in an internal package.
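
For reference, a minimal sketch of what the rejected WatchService-based approach looks like. This is illustrative only, not the RunTask code; the directory path and the event handling are placeholders:

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;

public class OutputFileWatcher {

    public static void watchAndLog(Path nodeOutputDir) throws IOException, InterruptedException {
        try (WatchService watchService = FileSystems.getDefault().newWatchService()) {
            // Register the directory containing the node output files for modification events.
            nodeOutputDir.register(watchService, StandardWatchEventKinds.ENTRY_MODIFY);
            while (true) {
                // Blocks until an event is available. On macOS the JDK falls back to a
                // polling implementation with a default interval of ten seconds, which is
                // the latency problem described above.
                WatchKey key = watchService.take();
                key.pollEvents().forEach(event -> {
                    // read newly appended data from the modified file and log it here
                });
                if (key.reset() == false) {
                    break; // the watched directory is no longer accessible
                }
            }
        }
    }
}
```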

The change in this commit instead introduces a check to see if any data
is available to read and log. If no data is available in any of the
node output files, the thread sleeps for 100ms. This is enough time to
prevent consuming large amounts of CPU while still providing output to
the console in a timely fashion.
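
A minimal sketch of the poll-then-sleep pattern described above (not the actual RunTask implementation; how the readers are created and closed is omitted for illustration):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;

public class NodeOutputPump {

    // Each reader tails one node's output file.
    static void pumpOutput(List<BufferedReader> nodeOutputReaders) throws InterruptedException {
        while (true) {
            boolean readData = false;
            for (BufferedReader reader : nodeOutputReaders) {
                try {
                    // ready() is non-blocking, so lines are only consumed when data is present.
                    while (reader.ready()) {
                        System.out.println(reader.readLine());
                        readData = true;
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
            if (readData == false) {
                // No node produced new output; back off instead of spinning on the files.
                Thread.sleep(100L);
            }
        }
    }
}
```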

@jaymode jaymode requested a review from alpar-t November 13, 2019 19:40
@elasticmachine (Collaborator) commented:

Pinging @elastic/es-core-infra (:Core/Infra/Build)

@alpar-t (Contributor) left a comment:


LGTM, thanks for catching it!

@pugnascotia (Contributor) left a comment:


LGTM.

In case we ever want to revisit this, there seem to be a couple of macOS implementations that use JNA and the Carbon APIs to provide a native watch implementation. For example:

https://github.com/gmethvin/directory-watcher
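
As a rough sketch of how that library could be wired in (assuming its builder API roughly as documented in the project README; the path and the event handling here are placeholders):

```java
import io.methvin.watcher.DirectoryChangeEvent;
import io.methvin.watcher.DirectoryWatcher;

import java.nio.file.Path;
import java.nio.file.Paths;

public class DirectoryWatcherExample {

    public static void main(String[] args) throws Exception {
        Path nodeOutputDir = Paths.get("build/testclusters"); // placeholder path
        DirectoryWatcher watcher = DirectoryWatcher.builder()
            .path(nodeOutputDir)
            .listener((DirectoryChangeEvent event) -> {
                if (event.eventType() == DirectoryChangeEvent.EventType.MODIFY) {
                    // read newly appended data from event.path() and log it here
                }
            })
            .build();
        // watchAsync() avoids the JDK's polling WatchService by using the library's
        // native macOS file-watching support.
        watcher.watchAsync();
    }
}
```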

@jasontedor (Member) commented:

The Carbon APIs were never made 64-bit and have been removed since macOS Catalina, when 32-bit support was dropped; we definitely wouldn't want to build a solution on them.

@mark-vieira (Contributor) commented:

I remember we encountered these same limitations when implementing continuous builds in Gradle. In the end we just lived with the fact that the macOS experience has the potential for high latency.

Thanks for catching this, Jay. The only other option I'd see is to refactor this to use JavaExec and stream stdout back directly. That has the potential to be a significant change though, as we'd effectively have to reimplement any logic we want from our launch script in the build.
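
For the record, a hypothetical sketch of that alternative: a plain JavaExec task that forks the node JVM and streams its output straight to the console. The task name, classpath, main class, and wiring are all placeholders, not the actual Elasticsearch launch configuration:

```java
import org.gradle.api.Plugin;
import org.gradle.api.Project;
import org.gradle.api.tasks.JavaExec;

public class RunNodePlugin implements Plugin<Project> {

    @Override
    public void apply(Project project) {
        project.getTasks().register("runNode", JavaExec.class, task -> {
            // JavaExec forks a JVM and streams its output directly, so no file polling
            // is needed; the drawback is reimplementing the launch script logic here.
            task.setClasspath(project.getConfigurations().getByName("runtimeClasspath"));
            task.setMain("org.elasticsearch.bootstrap.Elasticsearch"); // placeholder main class
            task.setStandardOutput(System.out);
            task.setErrorOutput(System.err);
        });
    }
}
```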

@jaymode jaymode merged commit e0df257 into elastic:master Nov 14, 2019
@jaymode jaymode deleted the gradle_run_stop_burning_me branch November 14, 2019 17:18
jaymode added a commit that referenced this pull request Nov 14, 2019
@mark-vieira added the Team:Delivery (Meta label for Delivery team) label on Nov 11, 2020
Labels: :Delivery/Build (Build or test infrastructure), >non-issue, Team:Delivery (Meta label for Delivery team), v7.6.0, v8.0.0-alpha1