Java Entrypoint #8161

@original-brownbear
Member

original-brownbear commented Sep 7, 2017

Java entry point:

  • Created org.logstash.Logstash as entrypoint
  • Safely handle Ruby runtime (which sadly is still a singleton, moving away from that will require a few iterations on top of this)
  • Adjusted bat and sh entry point wrappers
  • Verified manually that performance is unchanged (i.e. all Java opts are still loaded properly)
  • Flattened .jar path to make it a little less bothersome to build the -cp string
  • Retained ability to load jars from Ruby via the global $LS_JARS_LOADED variable hack, to keep plugin specs that load LS as a .gem functional (like e.g. the ITs in LS itself)
  • No need for the gem jars magic anymore, the downloading and moving into place of jars is now all handled by Gradle

Description incoming

Works on Windows:

[Screenshot: 2017-11-14 at 18:44:06]

@original-brownbear
Member Author

should work out of the box with /bin/logstash after running ./gradlew clean assemble btw

@jordansissel
Contributor

The last time we had lots of Ruby code (aka jruby-complete) living in a jar, it had really bad effects on Logstash startup time because (at the time) JRuby's require was very slow when loading Ruby code from a .jar. We accidentally worked around it by having the JRuby Ruby source as files when we moved away from the monolithic jar (years ago).

import org.jruby.exceptions.RaiseException;
import org.jruby.runtime.builtin.IRubyObject;

public final class Main implements Runnable {
Contributor

Let's call this Logstash so that when folks do jps "Logstash" is what shows up instead of "Main"

@andrewvc
Contributor

This is awesome.

What I think this needs to move to completion is a wrapper for arbitrary ruby tasks like rake and bin/rspec, + adaptation of the Gradle stuff to use this.

I'd rather move the whole thing over than just the main logstash entrypoint for environment consistency.

@original-brownbear
Member Author

@andrewvc sure we can do that, already figured out how and played with that a little, will add that here soon. @jordansissel's point is valid though, we probably shouldn't pack the .rb files into the .jar, so we don't make the start-up crazy slow.
Also, just as a note for myself, I need to add the jvm.properties file handling to this thing; those settings are currently ignored, making performance pretty meh :)

@original-brownbear
Member Author

@andrewvc just to summarize what we talked about:

@original-brownbear
Member Author

@jordansissel hah I learned a new thing :) Apparently the only reason loading .rbs from a Jar comes with a hit is the compression. Turning off the Jar compression via https://docs.gradle.org/current/javadoc/org/gradle/api/tasks/bundling/ZipEntryCompression.html#STORED should make packing everything into a single jar as fast as reading straight from the FS (if not faster, since it means only loading a single file instead of hundreds).
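
As a minimal sketch of what STORED means at the archive level: the entry bytes are written verbatim, so reading them back needs no inflation. The example uses the JDK's java.util.zip rather than the Gradle setting linked above; the class name StoredZipSketch and the entry name are hypothetical, and this is not the build configuration used in this PR.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical illustration, not code from the PR.
final class StoredZipSketch {

    // Writes one uncompressed (STORED) entry and returns the archive bytes.
    static byte[] storedArchive(final String name, final byte[] content) {
        final CRC32 crc = new CRC32();
        crc.update(content);
        final ZipEntry entry = new ZipEntry(name);
        entry.setMethod(ZipEntry.STORED);
        // STORED entries must have size and CRC-32 set before they are written.
        entry.setSize(content.length);
        entry.setCompressedSize(content.length);
        entry.setCrc(crc.getValue());
        final ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            zip.putNextEntry(entry);
            zip.write(content);
            zip.closeEntry();
        } catch (final IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return out.toByteArray();
    }

    public static void main(final String... args) {
        final byte[] source = "puts 'hello'".getBytes(StandardCharsets.UTF_8);
        final byte[] archive = storedArchive("lib/example.rb", source);
        System.out.println("archive bytes: " + archive.length);
    }
}
```

Because nothing is deflated, the archive is slightly larger than the content it holds, which is the trade-off being accepted for faster loading.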

@original-brownbear
Member Author

@andrewvc this version packages and works just fine (but there are a few open questions around the effects this will have on the IT and plugin test infrastructures ... ITs are red because loading Java first breaks the transitive inclusion of jars for the ITs that include core as a gem. Easy to fix though, it's just a question of what we want to achieve).

Let's look into this whenever you have some time next week:)

@jordansissel
Contributor

@original-brownbear I'm open to reviewing this; lemme know when you think it's ready for testing and review :)

@original-brownbear
Member Author

@jordansissel thanks, hopefully today, will ping you then :)

@original-brownbear
Member Author

@jordansissel alright, now it should be good to review :)

@original-brownbear changed the title from "POC: Java Entrypoint" to "Java Entrypoint" on Nov 15, 2017
@original-brownbear
Member Author

Jenkins test this please

@original-brownbear
Member Author

original-brownbear commented Nov 15, 2017

Sorry, for some reason the Kafka IT is broken now (was all green before a few trivial adjustments), will have to look into that and ping once it's green.

@jordansissel
Contributor

This is on my todo list for this week to test/review further.

Contributor

@jordansissel left a comment

Did a brief code review to check for user experience or deployment issues.

I'll test this separately.

bin/logstash Outdated
@@ -58,5 +58,8 @@ if [ "$1" = "-V" ] || [ "$1" = "--version" ]; then
 fi
 echo "logstash $LOGSTASH_VERSION"
 else
-ruby_exec "${LOGSTASH_HOME}/lib/bootstrap/environment.rb" "logstash/runner.rb" "$@"
+function join_cp { local IFS=":"; echo "$*"; }
+jars=(${LOGSTASH_HOME}/logstash-core/lib/jars/*.jar)
Contributor

Not all systems have bash, for example Docker containers or other similarly minimal Linux environments.

Alternatives

Option 1: Only use a single .jar

This means -cp logstash-core/lib/jars/logstash.jar and any build changes required to achieve it.

Option 2: Compute classpath at build-time, not runtime.

This would have the build process compute the classpath so that we wouldn't need to compute the classpath during bin/logstash execution.

Option 3: Same method as you, but with Bourne shell (not bash)

If you need to build a class path, you can do this without Bash's array features using $@, which has special meaning in Bourne shell.

There may be other ways to achieve this, but here is what I came up with without testing:

Example:

sh-4.4$ ls /tmp/*.jar
 /tmp/a.jar   /tmp/b.jar  '/tmp/c c.jar'   /tmp/d.jar

sh-4.4$ function classpath() {
>   for i in "$@" ; do
>     echo -n "${i}:"
>   done | sed -e 's/:$//'
> }

sh-4.4$ classpath /tmp/*.jar; echo
/tmp/a.jar:/tmp/b.jar:/tmp/c c.jar:/tmp/d.jar

## Alternative shell implementation 
function classpath() {  
  echo -n "$1"
  shift
  while [ $# -gt 0 ] ; do
    echo -n ":${1}"
    shift
  done
}
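
For comparison, the same join can be sketched in Java. This only illustrates the logic of the shell function above; the PR computes the classpath in the startup scripts, and the names ClasspathSketch, join, and jarsIn are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration, not code from the PR.
final class ClasspathSketch {

    // Joins entries with the separator, leaving no trailing separator.
    static String join(final List<String> entries, final String separator) {
        return String.join(separator, entries);
    }

    // Lists the *.jar files directly inside jarDir, sorted for a stable order.
    static List<String> jarsIn(final Path jarDir) {
        final List<String> jars = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(jarDir, "*.jar")) {
            for (final Path jar : stream) {
                jars.add(jar.toString());
            }
        } catch (final IOException ex) {
            throw new UncheckedIOException(ex);
        }
        Collections.sort(jars);
        return jars;
    }

    public static void main(final String... args) {
        // File.pathSeparator is ":" on Unix-like systems and ";" on Windows.
        System.out.println(join(jarsIn(Paths.get(".")), File.pathSeparator));
    }
}
```

File names containing spaces need no special handling here because each entry stays a single list element until the final join.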

Member Author

@jordansissel I'd go for 3 here.

  1. Is really hacky (and I think even problematic legally in some cases) since we have to flatten all jars (or do some other hack that deals with nested jars) into one which entails removing the checksums from their META-INF paths => not something we want to do imo.
  2. I don't have a good theoretical argument against this, but the build is already quite complicated isn't it? I'd rather not pile onto that :) Plus I was just able to remove the generated files like logstash_core_jars.rb from the production code paths, I'd rather remove these hacks and not just add a new generated piece of code.

=> sh is def more portable than bash, let me try your code :) thanks

Member Author

@jordansissel hmm, your solution works like a charm, but unfortunately only on newer sh. On vanilla OSX with sh 3.2, echo does not understand -n, so the solution only works with bash there.
But you know what, bash must be fine for us ... we are currently calling jruby/bin/jruby from our scripts and it's a bash script too => we're only compatible if bash is available right now anyways.
=> Use your solution but with bash to work on systems with older sh? :)

require "logstash-core/logstash-core.jar"
rescue Exception => e
raise("Error loading logstash-core/logstash-core.jar file, cause: #{e.message}")
unless $LS_JARS_LOADED
Contributor

Hmm.. If we have two places (bin/logstash and this file) where we compute some classpath things, can we move this to a single place or at least a single language (ruby vs shell vs java)?

Maybe have org.logstash.Logstash have a method compute this dynamically for use in various places (testing, main method, etc) ?

Member Author

@jordansissel can't do it :( org.logstash.Logstash can't load without loading the jars beforehand (doing this kind of thing from Java and then dynamically loading the jars would require a massive hack I think). This we really only need for having the bin/rspec tool still work.
If we are fine with dropping that tool we don't need it, but I'd rather keep it around so I can debug Ruby comfortably from the IDE :)

Contributor

This we really only need for having the bin/rspec tool still work.

Can you note this in a code comment?

public static void main(final String... args) {
final String lsHome = System.getenv("LS_HOME");
if (lsHome == null) {
throw new IllegalStateException("LS_HOME environment variable must be set.");
Contributor

Can we make this error message more actionable? If a user sees this exception, what should they do to resolve it, and can we include hints for resolving this in the message?

Member Author

@jordansissel hmm I think the user can't do anything if they see this message (other than open an issue here). This being thrown means there is a bug in bin/logstash or bin/logstash.bat. I simply put this check here for our convenience, to have a clean failure instead of some other failure with a strange null in some string.

Contributor

other than open an issue here

Ok cool. Let's tell them in the error message that it's probably a bug.

Contributor

Let's include an action to tell them to file an issue or otherwise note that this is a bug and not a user configuration problem.

Member Author

@jordansissel isn't "probably a bug" implied in this being a RuntimeException? :)

Contributor

As a user, I don't personally associate "this is a bug" with RuntimeException. I'd rather be explicit. Saying "must be set" implies to me that the user should take action to set LS_HOME, which likely isn't the case. Otherwise, a user will end up struggling with this action trying to figure out how, and what value, to set LS_HOME.
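
A minimal sketch of the kind of explicit message being asked for here. The wording, the class name LsHomeCheckSketch, and the helper requireLsHome are assumptions for illustration; the message that was eventually merged is not shown in this thread.

```java
// Hypothetical illustration, not code from the PR.
final class LsHomeCheckSketch {

    // Hypothetical wording: it says "this is a bug" instead of "set LS_HOME".
    static final String MISSING_LS_HOME_MESSAGE =
        "LS_HOME environment variable is not set. This is likely a bug in the "
            + "bin/logstash or bin/logstash.bat startup script rather than a "
            + "configuration problem. Please open an issue.";

    // Returns lsHome unchanged, or fails with the explicit message above.
    static String requireLsHome(final String lsHome) {
        if (lsHome == null) {
            throw new IllegalStateException(MISSING_LS_HOME_MESSAGE);
        }
        return lsHome;
    }

    public static void main(final String... args) {
        // A fixed sample value keeps this demo independent of the environment.
        System.out.println(requireLsHome("/opt/logstash"));
    }
}
```

Keeping the text in a constant makes it easy to assert on in a unit test, so the hint cannot silently regress to a bare "must be set".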

resolved = resolved.resolve(element);
}
if (!resolved.toFile().exists()) {
throw new IllegalArgumentException(String.format("Missing: %s.", resolved));
Contributor

Can we make this more actionable, or would a user never expect to see this?

Member Author

@jordansissel jup same here => user should never ever see this except for when there's a bug :)
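
The quoted lines resolve path elements beneath a root and fail when the result is missing. A self-contained sketch of that pattern, with a hint that the failure points at a bug, might look like this; the method name resolveExisting, its signature, and the extra wording are assumptions, since only the quoted fragment comes from the diff.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration, not code from the PR.
final class ResolveSketch {

    // Resolves each element beneath root and fails if the result does not exist.
    static Path resolveExisting(final Path root, final String... elements) {
        Path resolved = root;
        for (final String element : elements) {
            resolved = resolved.resolve(element);
        }
        if (!Files.exists(resolved)) {
            // The second sentence is an assumed addition, not text from the PR.
            throw new IllegalArgumentException(String.format(
                "Missing: %s. This is likely a bug in the startup scripts or packaging.",
                resolved));
        }
        return resolved;
    }

    public static void main(final String... args) {
        // The current directory always exists, so this prints the path.
        System.out.println(resolveExisting(Paths.get(".")));
    }
}
```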

if (lsHome == null) {
throw new IllegalStateException("LS_HOME environment variable must be set.");
}
final Path home = Paths.get(lsHome).toAbsolutePath();
Contributor

Can we verify some properties about this path before using it: Does the path exist? Does it have things we want in it? etc..

My thought is we can do a quick check here and provide actionable response to the user if the check fails.

Member Author

@jordansissel same as the above, this will actually throw something like "file not found" if the path is wrong which we will understand, for the user this means there is a bug in the startup script in 100% of cases imo => nothing actionable they can do right?

Contributor

I'm OK with this, no change required here.

@jordansissel
Contributor

should work out of the box with /bin/logstash after running ./gradlew clean assemble btw

% ./gradlew clean assemble
...
% bin/logstash -e ''
Unable to find JRuby.
If you are a user, this is a bug.
If you are a developer, please run 'rake bootstrap'. Running 'rake' requires the 'ruby' program be available.

@jordansissel
Contributor

If I try to work around the above error (can't find jruby) with this patch:

diff --git a/bin/logstash.lib.sh b/bin/logstash.lib.sh
index bb9090808..7f01bcd32 100755
--- a/bin/logstash.lib.sh
+++ b/bin/logstash.lib.sh
@@ -120,7 +120,7 @@ setup_vendored_jruby() {

 setup() {
   setup_java
-  setup_vendored_jruby
+  #setup_vendored_jruby
 }

 ruby_exec() {

Then run bin/logstash, it appears to do nothing?

% time bin/logstash --log.level debug -e 'input { generator { count => 1 } }'
bin/logstash --log.level debug -e 'input { generator { count => 1 } }'  11.93s user 0.84s system 292% cpu 4.366 total

Running bin/logstash with sh -x to see what's going on, I see this as the command, if this helps:

+ exec /bin/java -Xms1g -Xmx1g -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djruby.compile.invokedynamic=true -Djruby.jit.threshold=0 -XX:+HeapDumpOnOutOfMemoryError -Djava.security.egd=file:/dev/urandom -cp /home/jls/projects/logstash/logstash-core/lib/jars/commons-compiler-3.0.7.jar:/home/jls/projects/logstash/logstash-core/lib/jars/jackson-annotations-2.9.1.jar:/home/jls/projects/logstash/logstash-core/lib/jars/jackson-core-2.9.1.jar:/home/jls/projects/logstash/logstash-core/lib/jars/jackson-databind-2.9.1.jar:/home/jls/projects/logstash/logstash-core/lib/jars/jackson-dataformat-cbor-2.9.1.jar:/home/jls/projects/logstash/logstash-core/lib/jars/janino-3.0.7.jar:/home/jls/projects/logstash/logstash-core/lib/jars/jruby-complete-9.1.13.0.jar:/home/jls/projects/logstash/logstash-core/lib/jars/log4j-api-2.9.1.jar:/home/jls/projects/logstash/logstash-core/lib/jars/log4j-core-2.9.1.jar:/home/jls/projects/logstash/logstash-core/lib/jars/log4j-slf4j-impl-2.9.1.jar:/home/jls/projects/logstash/logstash-core/lib/jars/logstash-core.jar:/home/jls/projects/logstash/logstash-core/lib/jars/slf4j-api-1.7.25.jar org.logstash.Logstash --log.level debug -e 'input { generator { count => 1 } }'

I'm not sure how to test this as bin/logstash appears to do nothing. Any ideas?

@original-brownbear
Member Author

@jordansissel hmm, don't waste your time, apparently there is an issue here now. It was green a few weeks ago when I pinged, but I only rebased today without rerunning tests. Seems Jenkins isn't going anywhere either. Probably just some trivial packaging issue. Will fix tomorrow morning :)

@jordansissel
Contributor

@original-brownbear noted! :)

@original-brownbear force-pushed the cleanup-ruby-load-simpler branch 2 times, most recently from 8d328dd to 6d3dfe3 on December 29, 2017 05:28
@original-brownbear
Member Author

original-brownbear commented Dec 29, 2017

@jordansissel with your change to the startup script it works fine for me + on Jenkins (failure is unrelated OOM 05:50:14 +Cannot allocate memory - git ls-files from ).
Should be fine now. Making assemble work out of the box isn't so easy though I'm afraid :( We are currently calling that target from rake and making that work would require cleaning up the gradle and rake interaction some more first (we should do that asap imo, but not here, this one already is pretty long).
For now ./gradlew assembleTarDistribution should work fine though :)

@jordansissel
Contributor

@original-brownbear noted, will test again. Thanks!

@jordansissel
Contributor

jordansissel commented Jan 2, 2018

Manual test:

% rake artifact:tar
...
% bin/logstash -e ''
...

[separate terminal]
% jps
56479 Logstash

It's good to see Logstash here. Woo :)

@jordansissel
Contributor

Tests passing locally.

@@ -53,7 +53,7 @@ Gem::Specification.new do |gem|

 gem.add_runtime_dependency "sinatra", '~> 1.4', '>= 1.4.6'
 gem.add_runtime_dependency 'puma', '~> 2.16'
-gem.add_runtime_dependency "jruby-openssl", ">= 0.9.20" # >= 0.9.13 Required to support TLSv1.2
+gem.add_runtime_dependency "jruby-openssl", ">= 0.9.21" # >= 0.9.13 Required to support TLSv1.2
Contributor

Intended? Seems unrelated to the PR.

Member Author

@jordansissel ha good question so many weeks later :D I think I fixed the problem that this solved elsewhere already => let me revert this :)

@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/bin/bash
Contributor

this should still be /bin/sh

Member Author

@jordansissel bash is more portable here imo :( Your code only works with sh 4.x => bash 3.x works fine though => standard OSX and old Linux will break if we go with sh => bash wins, especially since JRuby requires bash anyways in the way we currently invoke it? :)

Contributor

Hmm, fair. I didn't realize that bin/jruby invoked bash.

@original-brownbear force-pushed the cleanup-ruby-load-simpler branch from 6d3dfe3 to 166c88c on January 3, 2018 19:06
@original-brownbear
Member Author

@jordansissel all points addressed I think/hope :)

@jordansissel
Contributor

Yep! I still need to test on windows. Should be done building shortly :)

@original-brownbear
Member Author

@jordansissel see the PR description, at least for me Windows worked fine :D *fingers crossed* :P

@jordansissel
Contributor

@original-brownbear 🤦‍♂️ I missed the screenshot. LGTM

@original-brownbear
Member Author

@jordansissel thanks! :)

@elasticsearch-bot

Armin Braun merged this into the following branches!

Branch Commits
master 543b722
6.x dff953d

elasticsearch-bot pushed a commit that referenced this pull request Jan 3, 2018

Fixes #8161
@original-brownbear deleted the cleanup-ruby-load-simpler branch on January 3, 2018 19:13
insukcho pushed a commit to insukcho/logstash that referenced this pull request Feb 1, 2018

Fixes elastic#8161
5 participants