-
Notifications
You must be signed in to change notification settings - Fork 64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/parallelize file parsing #28
Feature/parallelize file parsing #28
Conversation
@@ -87,8 +90,10 @@ public void execute() throws MojoExecutionException, MojoFailureException { | |||
} | |||
} | |||
|
|||
Formatter formatter = getFormatter(); | |||
|
|||
for (File directoryToFormat : directoriesToFormat) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not flatten the files from multiple root directories in order to process all of them in parallel?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a few comments, no showstoppers 👋
<artifactId>maven-compiler-plugin</artifactId> | ||
<configuration> | ||
<source>1.8</source> | ||
<target>1.8</target> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the project purposefully tried to use an earlier version of Java. I seem to remember avoiding lambdas.
Is this an okay change? Might require a major bump. Not sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The File.walk
is only available since java 8. If bumping is out of option then we might want to close/rework this PR.
Is bumping to a major version an ok option here @malaporte ?
formatter = new Formatter(JavaFormatterOptions.builder().style(style()).build()); | ||
} | ||
return formatter; | ||
private Formatter getFormatter() throws MojoFailureException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a change of behavior here, not sure how relevant it is:
It used to be that the formatter wouldn't be initialized if there was no file to format.
If we don't care, we should move the init of the formatter in the constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something tells me that it was lazy initialized for a reason 🤔
I'd look into the history of that line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I understand, it was done lazily to allow the formatter to take into account the JavaFormatterOption.Style that is now taken as a parameter (can be "oasp" or "google").
By providing the formatter as a parameter, I keep the same functionnality (initialize the Formatter only once the style is set) while removing the need for a synchronized getter.
try (Stream<Path> paths = Files.walk(Paths.get(directory.getPath()))) { | ||
paths | ||
.collect(Collectors.toList()) | ||
.parallelStream() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using this approach you'll get a better performance gain if you have a single folder with many files vs multiple folders with few files.
I would suggest flatmap, as proposed by @malaporte
Also, I would recommend using an approach that allows specifying a thread pool / executor. Is the default one use by parallelStream
sufficient here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that the vanilla parallelStream might not be optimal, but it does provide some improvement. I'll leave it as future work to tweek the necessary parameters to get the best performances out of it.
I just released 2.4.0 with this change included. Enjoy! |
Parallelize the file formatting, shows some interesting performance improvement on my setup:
These numbers were optained using jmh, on a medium sized projet that I own.