-XX:+UseConcMarkSweepGC JAVA_OPTS slows down logstash by ~20% #2914
I will put it as a blocker for 1.5; we need to figure out whether using the CMS garbage collector is necessary and has benefits over using the default ParallelGC. Currently the default logstash JAVA_OPTS enable CMS.
|
I tried to search the git log for the rationale behind these settings but did not find anything. These settings are the same as in elasticsearch; maybe it was decided that they would also benefit logstash, in particular when using the embedded elasticsearch? In any case, I do not believe we actually need to enable the CMS GC, since the heap usage signature of logstash is not like elasticsearch's. ES experiences memory usage spikes and wants to really minimize stop-the-world situations. Also, using embedded elasticsearch in logstash is not recommended for production and is only there for easier prototyping/testing, so we should not set default GC options for that use-case. Logstash will typically have a rather stable heap usage pattern; given enough memory, I do not believe there is any benefit to using the CMS GC at the cost of a ~20% overall performance drop. Also, a stop-the-world pause in logstash is not critical, as opposed to in elasticsearch. In fact, I think logstash would benefit from a larger GC young generation heap space, using either of the relevant sizing options. Thoughts?
I suggest we remove these JVM options by default and also make it possible to override them, e.g. JAVA_OPTS="-Xmx500m -XX:NewSize=400m" bin/logstash ... Also, as in elasticsearch, I suggest we set these options by default in logstash:
|
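A minimal sketch of how such an override could be tried from the shell, assuming the launcher lets a user-supplied JAVA_OPTS replace the shipped GC defaults rather than append to them (which is part of what is being proposed here); the heap and NewSize values and the config name are only illustrative:
# illustrative values: plain heap settings, a larger young generation, no CMS flags
JAVA_OPTS="-Xmx500m -XX:NewSize=400m -Djava.awt.headless=true" bin/logstash -f your-pipeline.conf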
Using the simple pipeline performance configurations defined in #2870, I can see a ~10% performance increase between the two sets of options. For these configs, using a larger young generation heap performs better. |
This is really interesting. It's also very important to understand and document well because JVM settings could have a profound impact on those who are still trying to use the embedded Elasticsearch. |
@colinsurprenant this makes complete sense to me to investigate prior to 1.5 GA; as @untergeek said, we have to understand our specific GC needs better, ... Not sure if this qualifies, but I see our GC needs as something that depends highly on our pipeline usage, right? I mean, what might be the difference between doing lots of groks or not, for example? Definitely something important to dig into. |
We agree on having different options + having the ability to override these options from the client side. |
Test:
So my proposal:
diff --git a/bin/logstash.lib.sh b/bin/logstash.lib.sh
index 1b1fc46..a6fe590 100755
--- a/bin/logstash.lib.sh
+++ b/bin/logstash.lib.sh
@@ -20,18 +20,15 @@ setup_java() {
fi
JAVA_OPTS="$JAVA_OPTS -Xmx${LS_HEAP_SIZE}"
- JAVA_OPTS="$JAVA_OPTS -XX:+UseParNewGC"
- JAVA_OPTS="$JAVA_OPTS -XX:+UseConcMarkSweepGC"
+ JAVA_OPTS="$JAVA_OPTS -XX:NewRatio=2"
JAVA_OPTS="$JAVA_OPTS -Djava.awt.headless=true"
- JAVA_OPTS="$JAVA_OPTS -XX:CMSInitiatingOccupancyFraction=75"
- JAVA_OPTS="$JAVA_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
-
if [ ! -z "$LS_USE_GC_LOGGING" ] ; then
JAVA_OPTS="$JAVA_OPTS -XX:+PrintGCDetails"
JAVA_OPTS="$JAVA_OPTS -XX:+PrintGCTimeStamps" |
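For clarity, with this patch applied the GC-related part of the script reduces to roughly the lines below (a sketch, not a verbatim copy of the file); -XX:NewRatio=2 sizes the young generation at about one third of the heap (old:young = 2:1), and with no collector flag set the JVM falls back to its default collector (the parallel collector on server-class Java 7/8):
JAVA_OPTS="$JAVA_OPTS -Xmx${LS_HEAP_SIZE}"        # heap size still taken from LS_HEAP_SIZE
JAVA_OPTS="$JAVA_OPTS -XX:NewRatio=2"             # old:young = 2:1, i.e. young gen is ~1/3 of the heap
JAVA_OPTS="$JAVA_OPTS -Djava.awt.headless=true"   # unchanged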
Could we do a check with real world data? Maybe using the HOW's complete.conf? |
@ph I'm doing this right now |
I'm also using different datasets that I have spare here. |
❤️ |
Hi, here is my report:
Executions are made with this set of ENV variables
and using the vendored JRuby. This is done using master, running it as in
so, as I said, using the vendored JRuby. Any thoughts on why I get these numbers? |
Now I'm doing some micro-benchmarking with more complex configurations and datasets: a set of Apache logs and a set of XML documents. Will report numbers here soon. |
Can you share the exact command lines you used to run the benchmarks? Also, exactly which LS version did you use? master? |
Yes, sure, updating my comment so it stays all together. |
Updating the numbers as requested by @colinsurprenant so the reported number is the one given back by pv, not the timing as it's now. |
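For readers unfamiliar with the setup, a rough sketch of the kind of pv-based measurement being referred to; the config and input file names are placeholders, not the actual files used:
# feed sample data through a simple stdin->stdout pipeline and let pv report bytes, rate and elapsed time
bin/logstash -f simple.conf < sample_input.log | pv -Wbart > /dev/null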
test: Vanilla JVM flags on master:
Disabling CMS:
Disabling CMS + Increasing New Gen size:
Vanilla flags again after previous tests:
So |
@jsvd note that a higher value of … Since the only significant difference we can measure comes from removing the CMS options, I would go ahead and remove them from the defaults but keep the rest as is. In other words, I would remove these options:
JAVA_OPTS="$JAVA_OPTS -XX:+UseConcMarkSweepGC"
JAVA_OPTS="$JAVA_OPTS -XX:CMSInitiatingOccupancyFraction=75"
JAVA_OPTS="$JAVA_OPTS -XX:+UseCMSInitiatingOccupancyOnly" |
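A sketch of what the GC block in bin/logstash.lib.sh would look like with only those three lines dropped; note that keeping -XX:+UseParNewGC without CMS pairs the ParNew young collector with the serial old collector, a combination JDK 8 flags as deprecated, so dropping it too (as the earlier diff does) may be preferable:
JAVA_OPTS="$JAVA_OPTS -Xmx${LS_HEAP_SIZE}"
JAVA_OPTS="$JAVA_OPTS -XX:+UseParNewGC"           # questionable on its own, see note above
JAVA_OPTS="$JAVA_OPTS -Djava.awt.headless=true"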
relates to #2942 |
We decided not to make it a blocker for RC3; this will be pushed back to the next release (either 1.5.1 or 1.6.0). OTOH we will merge #2942 so that we have a way to completely override JAVA_OPTS. |
If I use the elasticsearch output with the protocol option set to node, does that have an impact on this? |
@felipegs no, it should not. The CMS GC option is useful when memory pressure increases, and that typically happens on ES data nodes. |
+1 on adding a heap-dump-on-OOM flag like Elasticsearch, e.g.
|
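Presumably this refers to the standard HotSpot flags Elasticsearch ships with; a sketch (the dump path is only an example):
JAVA_OPTS="$JAVA_OPTS -XX:+HeapDumpOnOutOfMemoryError"   # write a heap dump when the JVM hits OOM
JAVA_OPTS="$JAVA_OPTS -XX:HeapDumpPath=/tmp"             # example location for the dump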
By setting … This is under real-world conditions and a pretty large & complex production configuration. |
If I recall correctly, I made this change because it was already what Elasticsearch was using. |
Bump. We should reassess this WRT ng-pipeline and logstash-core-event-java, but I believe there is no benefit to using CMS: the Elasticsearch heap usage pattern is nowhere near similar to logstash's, and so far all tests seem to indicate a non-negligible performance increase without it. |
+1 |
Hi all, I have just removed CMS, leaving the default GC behaviour, and I can't imagine an LS v5 release without a GC that can be (easily) set, or at least documented. For those interested in changing the GC just through config, use:
|
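A hedged sketch of one way to do this, assuming a launcher where a user-supplied JAVA_OPTS fully replaces the shipped GC flags (passing a second collector on top of the CMS defaults would make the JVM refuse to start with conflicting collector options); G1 is shown only as an example of a non-CMS collector:
JAVA_OPTS="-Xmx1g -XX:+UseG1GC -Djava.awt.headless=true" bin/logstash -f your-pipeline.conf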
Based on the evidence so far, I am +1 on changing the default GC parameters for Logstash 5.0.0 to whatever we find is best. |
+1... CMS is supposed to offer short stop-the-world phases, which are important for real-time or event-driven applications (i.e. get a response to an ES request as soon as possible). Logstash is about throughput, and having longer single GC pauses with smaller total GC overhead is definitely welcome. Possibly even earlier than 5.0 - it's a small change in the init scripts, does not depend on new code, and could be published soon. |
I realize it's late, but just to add something here: we haven't supported embedded ES for quite some time now. We pulled it out somewhere in 2.x. |
The changes in #5341 should make it easy both to change this default and for users to change it to something they like. |
Have we run tests where a full GC is actually triggered during ingestion? It seems very reasonable that CMS is slower because it avoids full GCs, but I'd like to see how long that takes on a busy logstash instance, with the added caveat that such stops mean packet loss for plugins like UDP input. |
Inputs without an ack mechanism (like udp) will surely be negatively affected. |
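One way to observe those pause lengths is the GC logging hook already present in the launch script (the LS_USE_GC_LOGGING branch in the diff above); a sketch:
# any non-empty value enables the PrintGCDetails/PrintGCTimeStamps flags set in logstash.lib.sh
LS_USE_GC_LOGGING=1 bin/logstash -f your-pipeline.conf
# each collection is then logged with its duration in seconds at the end of the line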
Any suggestions on how I can fix this issue? [GC (Metadata GC Threshold) 8530310K->2065630K(31574016K), 0.3831399 secs] |
It seems UseConcMarkSweepGC is to be removed shortly, and it currently produces a deprecation warning on Logstash startup on Java 11.
|
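For Logstash versions that ship a config/jvm.options file, the collector can be switched there rather than via JAVA_OPTS; a sketch, assuming the file still lists the CMS flags as defaults:
## in config/jvm.options, comment out the shipped CMS lines:
# -XX:+UseConcMarkSweepGC
# -XX:CMSInitiatingOccupancyFraction=75
# -XX:+UseCMSInitiatingOccupancyOnly
## and enable another collector instead, e.g. G1:
-XX:+UseG1GC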
per #2859 we realized that using the -XX:+UseConcMarkSweepGC JAVA_OPTS slows down logstash by about 20%.