[fix] Fix issues with Pulsar Alpine docker image stability: remove glibc-compat #23762

Merged on Dec 21, 2024 (31 commits)

@lhotari lhotari commented Dec 19, 2024

Fixes #23717 #23306

Motivation

The Pulsar Alpine docker image has stability issues because it includes the glibc-package solution, which adds the glibc library to the Alpine docker image.

The stability issue causes JVM crashes. This happens mainly in Netty native library loading and usage. It's also a problem in other native libraries such as Conscrypt (#23364 (comment)) and Snappy (#22804). There are also Netty stability issues causing JVM crashes which could be caused by problems in native code (#23612 (comment)).

Alpine maintainers don't recommend adding glibc to Alpine since mixing real glibc in Alpine will result in an unstable environment. In Alpine maintainer Ariadne Conill's words: "Combining glibc and musl runtimes is basically all but guaranteed to create an unstable environment, unless the system is appropriately configured (glibc side uses glibc binaries only, and vice versa)."

The glibc solution was initially added to support the Pulsar IO Kinesis connector. The Amazon Kinesis Producer Library isn't pure Java: its Java API calls a native executable that is built for glibc. Amazon doesn't provide a binary for Alpine, but the source code of the native executable is available. The stable way to use the Amazon Kinesis Producer Library on Alpine is to compile this binary specifically for Alpine.

Additional context

Mailing list thread

Modifications

This PR includes changes to:

  • remove the glibc-package solution

  • provide an Amazon Kinesis Producer Library (KPL) executable compiled for Alpine

  • The kinesis_producer executable is copied from apachepulsar/pulsar-io-kinesis-sink-kinesis_producer:0.15.12 image to the apachepulsar/pulsar-all image and an environment variable PULSAR_IO_KINESIS_KPL_PATH is set to the executable path.

  • Amazon Kinesis Producer Library has been upgraded from 0.14.13 version to 0.15.12.

    • Version 0.14.13 contains several critical issues, such as a potential data loss issue.
    • In 0.15.12, the implementation uses AWS STS (Security Token Service) under the covers.
      • This requires updating the Localstack configuration of the unit and integration tests to include STS support, and overriding the endpoints in the Pulsar IO Kinesis Sink connector.
  • The Pulsar IO Kinesis Sink connector has been modified to support the AWS Kinesis Producer Library parameter nativeExecutable. When the PULSAR_IO_KINESIS_KPL_PATH env var is set, its value is used as the default for the nativeExecutable parameter. This is how the Pulsar IO Kinesis Sink connector uses the kinesis_producer binary compiled for Alpine when running in the pulsar-all image.

  • In the Pulsar docker image, LD_PRELOAD=/lib/libgcompat.so.0 is set. This is required for loading Netty native libraries on Alpine unless the JVM already loads the gcompat library, which provides a glibc compatibility layer for Alpine. The Alpine gcompat library is the recommended option for Netty native libraries on Alpine.
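
The nativeExecutable defaulting described in the list above can be sketched roughly as follows. This is a minimal illustration, not the connector's actual code: the class name KplPathResolver and the method resolve are hypothetical, and only the environment variable name PULSAR_IO_KINESIS_KPL_PATH and the parameter name nativeExecutable come from this PR description. The assumption is that an explicitly configured value takes precedence and the environment variable only supplies the default.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper for illustration only; not taken from the Pulsar code base.
final class KplPathResolver {
    // Environment variable name as given in the PR description.
    static final String ENV_VAR = "PULSAR_IO_KINESIS_KPL_PATH";

    private KplPathResolver() {
    }

    /**
     * Picks the value for the nativeExecutable parameter.
     * Assumed precedence: explicit connector configuration first,
     * then the environment variable, otherwise empty (nothing to set).
     */
    static Optional<String> resolve(String configuredNativeExecutable, Map<String, String> env) {
        if (configuredNativeExecutable != null && !configuredNativeExecutable.isBlank()) {
            return Optional.of(configuredNativeExecutable);
        }
        String fromEnv = env.get(ENV_VAR);
        if (fromEnv != null && !fromEnv.isBlank()) {
            return Optional.of(fromEnv);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // An optional first argument stands in for the connector's configured value.
        String configured = args.length > 0 ? args[0] : null;
        System.out.println(resolve(configured, System.getenv()).orElse("(nativeExecutable not set)"));
    }
}
```

Passing the environment in as a Map instead of calling System.getenv() inside resolve keeps the lookup testable without changing the process environment. When the result is empty, the caller would simply leave nativeExecutable unset.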

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

@lhotari lhotari requested a review from heesung-sn December 20, 2024 17:57
@heesung-sn
Contributor

> In order to use Amazon Kinesis Producer Library on Alpine, the stable solution is to compile this binary specifically for Alpine.

I think it would be hard to maintain this build script in this pulsar repo.

nit: are we planning to export those IO connectors from the repo and maintain them separately?

@lhotari
Member Author

lhotari commented Dec 20, 2024

> In order to use Amazon Kinesis Producer Library on Alpine, the stable solution is to compile this binary specifically for Alpine.
>
> I think it would be hard to maintain this build script in this pulsar repo.

@heesung-sn sure, it's possible to say that. The Amazon Kinesis Producer library doesn't provide Alpine support and there aren't other feasible options. I have contributed changes to the AWS repository so hopefully support would be added later and we could remove our custom solution. Please continue the review.

> nit: are we planning to export those IO connectors from the repo and maintain them separately?

Yes, that was decided already in 2020. 😀

Contributor

@heesung-sn heesung-sn left a comment

LGTM.

@heesung-sn
Contributor

heesung-sn commented Dec 20, 2024

nit: please provide more reference links (like a dependency location list) as comments in the build scripts for future updates.

@lhotari lhotari merged commit 906d10e into apache:master Dec 21, 2024
63 of 65 checks passed
lhotari added a commit that referenced this pull request Dec 21, 2024
Labels
cherry-picked/branch-3.3, cherry-picked/branch-4.0, doc-not-needed (Your PR changes do not impact docs), ready-to-test, release/blocker (Indicate the PR or issue that should block the release until it gets resolved), release/3.3.4, release/4.0.2
Development

Successfully merging this pull request may close these issues.

[Bug] Pulsar 4.0.1 image crashed when loading the native SSL library
2 participants