Should we package an alternate zlib implementation in our distributions? #81662
Pinging @elastic/es-delivery (Team:Delivery) |
I don't have a lot of understanding here, but another concern is compatibility. Unlike our Docker distribution, in which we (mostly) own the runtime environment, enabling this for all Linux packages in general is tougher. To what degree does the Cloudflare zlib depend on system dynamic libraries, and can we build a statically-linked binary to minimize any system compatibility concerns? |
+1 to building a statically-linked library, to make cloudflare-zlib universally available in all distributions that can support it instead of the |
It's one thing to build a static library, but that can only get used if the program that currently expects a dynamic library is rebuilt to link the static library at build time. In this case I think that program that would need to get rebuilt is I agree The other possibility that's not quite as radical as rebuilding The way library search works is that first So if you set
So I think if you put the dynamic library Cloudflare zlib in |
I'd like to consider another option here, which is providing guidance for folks deploying on Linux (and not using Docker) on how to install cloudflare-zlib themselves, rather than having us hack a distribution that tries to load it for them. While we certainly want to expose this optimization to as many users as possible, we don't want to sacrifice or complicate portability, and that risk grows the more native code we package in Elasticsearch. |
I had considered this solution and I am -1 as it pushes the burden to users to configure it properly. There could be an added support burden as it may not always be clear which zlib is in use. |
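One way to reduce the "which zlib is in use" ambiguity raised above is to inspect the shared objects mapped into a running process. The following is a minimal sketch, assuming Linux and assuming the process loads zlib as a shared object that shows up in /proc/<pid>/maps. The helper names `find_zlib_mappings` and `zlib_for_pid` are hypothetical and not part of Elasticsearch or any zlib distribution.

```python
import re
from pathlib import Path

# Matches shared-object names such as libz.so, libz.so.1 or libz.so.1.2.11,
# but not unrelated libraries such as libzip.so.
_LIBZ_RE = re.compile(r"(?:^|/)libz\.so(?:\.\d+)*$")


def find_zlib_mappings(maps_text):
    """Return the distinct libz paths found in /proc/<pid>/maps content.

    Paths are returned in first-seen order. Lines without a pathname
    (anonymous mappings) are skipped.
    """
    seen = []
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, pathname
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue
        path = fields[5].strip()
        if _LIBZ_RE.search(path) and path not in seen:
            seen.append(path)
    return seen


def zlib_for_pid(pid):
    """Read /proc/<pid>/maps (Linux only) and report the mapped libz paths."""
    return find_zlib_mappings(Path(f"/proc/{pid}/maps").read_text())
```

Calling `zlib_for_pid(pid)` with the PID of the process of interest would list the libz files it has mapped. An empty list suggests zlib is statically linked or not loaded yet, so this is a diagnostic hint rather than proof of which implementation is active.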
Of course, I just wanted to make this explicit and ensure that we are triaging that burden against the technical complexity and maintenance cost of supporting bundling this functionality in. Let's keep in mind this is an optimization so it should be treated slightly differently than something that affects actual functionality.
I'd also argue that issues related to us effectively "hacking" Java to use our bundled dynamic library are sure to arise, and will have a cost as well. |
Thanks all. After recent discussion I've updated the title and the description to better reflect these additional concerns. |
While working on a local build of Elasticsearch in Docker for my company, I discovered the changes in 8.0+ to use the Cloudflare zlib (#81245, and elastic/dockerfiles#95). The problem with what has been merged is that these do not actually use the Cloudflare zlib. The Cloudflare zlib repo (https://github.com/cloudflare/zlib) is a fork of upstream zlib (https://github.com/madler/zlib). If you look, you will see that the tags present in the Cloudflare zlib repo are not from Cloudflare; they are inherited from upstream.
The net effect of what has been merged at this time is that Elastic is not seeing any performance improvements, as it's not running the Cloudflare code. Additionally, by using the older upstream tag, the Docker images have the following CVEs that would be fixed by using the zlib package provided in Ubuntu 20.04:
https://nvd.nist.gov/vuln/detail/CVE-2016-9840 - CVSS V3.x Score: 8.8
Since it seems there is some debate as to exactly what/how to provide this improved performance, I would recommend that the changes to this repo (#81245) and the public Docker build repo (elastic/dockerfiles#95) be reverted until this discussion is resolved. |
It seems to me that the issue with Cloudflare zlib is that it's very much not intended for general consumption. The only way to stay up to date with the latest upstream zlib is to consume HEAD from that repo, which is far too prone to errors, and I don't think we have sufficient coverage against our Docker packaging format to give us strong confidence here. My preference would be to revert this change until we can investigate a better way to ensure we are integrating an up-to-date fork of zlib and have better test coverage in this area. @jpountz thoughts? |
Relates to elastic#81662. This library isn't ready for public consumption. Remove it from the Docker build.
@mark-vieira +1 to revert for now |
We should also investigate the use of Intel's IPP zlib as another alternative option. |
As discussed in #81208, using Cloudflare's zlib implementation improves performance in the following cases:
- index.codec: best_compression, which we use by default for observability and security integrations' data.
- transport.compression_scheme: deflate (not the default for local transport, but the default for remote clusters when compression is enabled, as of "Remote compression scheme default to deflate" #76580)

Whereas #81208 scoped the initial approach to our Docker image, where we control both the distribution and the environment, this issue asks if we can go one step further and bundle it in our tarballs. Some general notes about why this is worth considering:
- elasticsearch script, alleviating need to control the environment variables externally.

If instead we packaged an improved zlib in the Elasticsearch distribution, we could reap the following advantages:
- best_compression, which can be considered for more uses potentially if it comes with a lower expected overhead

Disadvantages:
We recently brought up this issue in a Fix-It meeting, and open questions included:
- zlib-ng, which is actively maintained and spells out a few of its own advantages versus Cloudflare nicely here: 2.0.0 Benchmark comparisons zlib-ng/zlib-ng#871 (comment)