Use Cloudflare's zlib in Docker images #81245
Closes elastic#81208.

Elasticsearch uses zlib for two purposes:

* Compression of stored fields with `index.codec: best_compression`, which we use for observability and security data.
* Request / response compression.

Historically, zlib was packaged within the JDK, so that users wouldn't have to have zlib installed for basic usage of Java. However, the original zlib optimizes for portability and misses a number of important optimizations, such as leveraging vectorization support for x86 and ARM architectures. Several forks have been created in order to address this. Since version 9, the JDK uses the system's zlib when available and falls back to the zlib that is packaged within the JDK if a system zlib cannot be found.

This commit changes the Docker image to install the Cloudflare fork of zlib, and run Java using the fork instead of the original zlib, so that users of the Docker image can get better performance.

Other ES distribution types are out-of-scope, since configuring the JVM to use an alternative zlib requires an environment config as well as installing another zlib, and Docker is the only distribution type where we can control both.
Pinging @elastic/es-delivery (Team:Delivery)
Hi @pugnascotia, I've created a changelog YAML for you.
Wow, thanks for putting up a PR so quickly! I wonder if there's a way we can test that this cloudflare-zlib is actually being picked up by the bundled JDK, such as running `ldd /path/to/java | grep libz` and making sure it's the cloudflare zlib and not the original zlib?
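A minimal sketch of how the `ldd` check suggested above could be scripted. The helper name `resolved_libz` is hypothetical and not part of the PR; it only parses `ldd`-style output from stdin. Note that `ldd` lists libraries resolved at load time, so anything the JVM loads later would not appear, and a check against the running process may be more conclusive.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): reads `ldd` output on stdin and
# prints the path that libz resolves to. Returns non-zero if no libz entry
# with a resolved path is present.
resolved_libz() {
  awk '/libz\.so/ && $2 == "=>" { print $3; found = 1 } END { exit !found }'
}

# Example invocation. The path to java is elided in the discussion, so it is
# left as a variable here:
#   LD_LIBRARY_PATH=/usr/local/cloudflare-zlib/lib ldd "$JAVA_BIN" | resolved_libz
```

The printed path can then be compared against `/usr/local/cloudflare-zlib/lib` to confirm that the fork is resolved rather than the original zlib.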
@@ -73,6 +73,10 @@ if [[ -n "$ES_LOG_STYLE" ]]; then
  esac
fi

if [[ -d /usr/local/cloudflare-zlib/lib ]]; then
  export LD_LIBRARY_PATH=/usr/local/cloudflare-zlib/lib
Should we try to preserve existing values of `LD_LIBRARY_PATH`, e.g.
- export LD_LIBRARY_PATH=/usr/local/cloudflare-zlib/lib
+ export LD_LIBRARY_PATH=/usr/local/cloudflare-zlib/lib:$LD_LIBRARY_PATH
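One caveat with the suggested form: when `LD_LIBRARY_PATH` is unset or empty, the result ends in a trailing colon, and the dynamic linker treats an empty entry as the current working directory. A minimal sketch of a prepend that avoids this, using a hypothetical helper name `prepend_ld_library_path` that is not part of the PR:

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): prepend a directory to
# LD_LIBRARY_PATH without leaving a trailing ':' when the variable is
# unset or empty.
prepend_ld_library_path() {
  # ${VAR:+:$VAR} expands to ":$VAR" only when VAR is set and non-empty,
  # so no empty path entry is ever produced.
  export LD_LIBRARY_PATH="${1}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
}

# Usage:
#   prepend_ld_library_path /usr/local/cloudflare-zlib/lib
```

Whether preserving an existing value is desirable at all for this image is the open question in this thread; the sketch only shows how to do it safely if the answer is yes.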
I wondered about that, but I also struggled to think of another reason why someone else would also be setting `LD_LIBRARY_PATH` with our image. Any ideas, anyone?
Also, I now think that this isn't the right place for this change, because Cloud do something different with the Docker entrypoint. I'll move this into the `elasticsearch` script and put a guard on it with `$ES_DISTRIBUTION_TYPE`.
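A rough sketch of what such a guard might look like. This is an illustration under stated assumptions, not the code that was merged: the function name `maybe_use_cloudflare_zlib`, the optional directory argument, and the comparison value `docker` are all assumptions. The assignment itself mirrors the one in the diff above.

```shell
#!/bin/sh
# Sketch only (not the merged code). Assumptions: the function name, the
# optional directory argument, and that the Docker distribution reports
# ES_DISTRIBUTION_TYPE=docker.
maybe_use_cloudflare_zlib() {
  zlib_dir="${1:-/usr/local/cloudflare-zlib/lib}"
  # Only switch zlib when running as the Docker distribution AND the fork
  # is actually installed; otherwise leave the environment untouched.
  if [ "${ES_DISTRIBUTION_TYPE:-}" = "docker" ] && [ -d "$zlib_dir" ]; then
    export LD_LIBRARY_PATH="$zlib_dir"
  fi
}
```

Checking both the distribution type and the directory keeps other distribution types unaffected even if a directory with that name happens to exist on the host.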
Looks like it gets picked up OK:
|
@mark-vieira the packaging tests on SLES are reporting failure again, with what looks like a failure to talk to HOMER?
A few thoughts on scope:
Now since the environment config is being applied from within If so:
|
a note on testing in docker, i think at this point we'd want to see the following show cloudflare-zlib in use, given setting the library path has moved out of the entrypoint script:
|
Thanks @DJRickyB, I added a test case to cover this.

final boolean matches = sh.run("bash -c 'pmap -p $(pidof java)'").stdout.lines().anyMatch(line -> line.contains("cloudflare-zlib"));

assertTrue("Expect java to be using cloudflare-zlib", matches);
nit: it would be nice to have the output of the command in the error message in case the problem is not fully reproducible?
Good idea @jpountz, I've modified the test.
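For anyone reproducing the check by hand inside a container, a shell analogue of the assertion with the command output included in the failure message might look like the sketch below. The helper name `assert_cloudflare_zlib_mapped` is hypothetical; the actual test in the PR is written in Java, and its modified form is not shown in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not the PR's Java test): takes the text produced by
# `pmap -p <pid>` as its only argument. Succeeds if any mapping mentions
# cloudflare-zlib; otherwise prints the full mapping text to stderr so that
# a failure can be diagnosed without re-running the command.
assert_cloudflare_zlib_mapped() {
  mappings="$1"
  case "$mappings" in
    *cloudflare-zlib*) return 0 ;;
  esac
  printf 'Expected java to be using cloudflare-zlib, but pmap reported:\n%s\n' "$mappings" >&2
  return 1
}

# Usage, mirroring the command from the test:
#   assert_cloudflare_zlib_mapped "$(pmap -p "$(pidof java)")"
```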
Infra merged a fix for this which might take a day to percolate through CI as the images get rebuilt.
@elasticmachine run elasticsearch-ci/packaging-tests-unix
Backported to |
Backported to |
Hmm, we may have missed the cut-off for 7.16.0, meaning that we'd either release this in 7.16.1 (which is questionable) or wait until 7.17.0.
Would this be considered breaking in any way? What are the user facing implications here?
FWIW my expectation was that this change would make it to 8.1 since it's an enhancement and we're past feature freeze for both 7.16 and 8.0.
If we back it out of 7.16 we definitely want it in for 7.17 though. That said, we haven't created a branch for that release yet so we'd have to hold off on the backport until that is done.
This reverts commit 6582acf.
OK, I've yanked it from |
Given recent communication that 7.17 should effectively be treated as a patch release, I think we should only target 8.x with this change.