Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove any unnecessary dependency from distribution #9075

Closed
Tracked by #21356
pedroigor opened this issue Dec 10, 2021 · 7 comments · Fixed by #21426
Closed
Tracked by #21356

Remove any unnecessary dependency from distribution #9075

pedroigor opened this issue Dec 10, 2021 · 7 comments · Fixed by #21426
Assignees
Labels
area/dist/quarkus kind/enhancement Categorizes a PR related to an enhancement status/needs-discussion PR needs discussion on developer mailing list team/cloud-native
Milestone

Comments

@pedroigor
Copy link
Contributor

pedroigor commented Dec 10, 2021

Description

Not only to reduce the distribution size but also to reduce the vulnerability surface, we should review the dependencies included in the distribution and keep only those strictly necessary to runtime and re-augmentation.

As the server is a mutable jar, these additional dependencies are a trade-off as part of continuous testing support from Quarkus. However, in theory, these dependencies won't actually be loaded in the runtime application.

Discussion

No response

Motivation

Not only to reduce the distribution size but also to reduce the vulnerability surface, we should review the dependencies included in the distribution and keep only those strictly necessary to runtime and re-augmentation.

Details

Looks like the best approach should be to exclude these dependencies through Maven (instead of using Quarkus properties such as quarkus.class-loading.removed-artifacts).

The proposal is to have a specific profile in the root pom that explicitly excludes the unwanted dependencies. The trade-off here is that we need to make sure the distribution is built using this profile.

@pedroigor
Copy link
Contributor Author

Sent #9077. I'm still investigating an issue related to the quarkus-vault extension where testcontainers dependencies can not be be excluded for this one due to compilation errors when running augmentation. Looks like this extension is running (or referencing) dev mode related code when not running dev mode but it shoudln't (similar to other extensions).

@pedroigor
Copy link
Contributor Author

See quarkiverse/quarkus-vault#12.

@pedroigor pedroigor modified the milestones: 17.0.0, 16.0.0 Dec 13, 2021
@stianst stianst modified the milestones: 16.0.0, 17.0.0 Dec 17, 2021
@pedroigor pedroigor modified the milestones: 17.0.0, 18.0.0 Feb 2, 2022
@stianst stianst modified the milestones: 18.0.0, 17.1.0 Mar 8, 2022
@pedroigor pedroigor modified the milestones: 18.0.0, 19.0.0 Mar 30, 2022
@pedroigor pedroigor added status/needs-discussion PR needs discussion on developer mailing list and removed status/triage labels May 31, 2022
@pedroigor pedroigor moved this to In Discussion in Keycloak: Quarkus distribution May 31, 2022
@trixpan
Copy link
Contributor

trixpan commented Jul 13, 2022

@DGuhr @pedroigor sorry for coming late to the party but it feels to me this issue and #12125 are somewhat in conflict.

TL;DR:
Quarkus seems to be all build around the augmentation concept. Can those be used to create recipes that minimise the need to get quarkus extensions directly into our source?

Long version
JSON logging is avaiable on quarkus out of the box via quarkus-logging-json. It is not perfect but it works and it fits very well with the idea promoted by 12 Factor that Logs are "event streams" and :

A twelve-factor app never concerns itself with routing or storage of its output stream... archival destinations are not visible to or configurable by the app, and instead are completely managed by the execution environment. Open-source log routers (such as Logplex and Fluentd) are available for this purpose

Why does that matter? Because GELF is not really a requirement for logging, centralised or not. This is evidenced by its main reasons to exist being basically limitations in Syslog, something that Java applications actually never truly had.

Now, I am not suggesting we go on accepting Twelve Factor as "The Truth ™️" but some of its concepts help set boundaries in functionality and reduce issues around dependency management and that helps distinguishing critical features from things we should pass.

Should we be adding to keycloak a module that essentially defines a wire format between an application and a log aggregator? I don't think so and my rationale is the same as this issue: Removing "any unnecessary dependency from distribution".

We know that once a new dependency is added we will inevitable end up with upstream vulnerabilities in our code that cannot be easily patched because they are managed by external groups. Take as example:

  • The Hibernate bug.
  • The random "vulnerability in keycloak-quarkus-dist" fixed in upstream Quarkus (in reality some Kafka related feature in quarkus that we don't even use but is injected by io.quarkus.quarkus-vertx-http-deployment)

In both cases, we need to update Quarkus but we cannot do so yet due to RH's own internal (and valid) needs.

Maybe we can find a path that provides users with optional functionality without dragging the some of these Quarkus extentions into Keycloak... In fact, as a keycloak user, as long there's documentation, I rather install a plugin to enable GELF logging (if I need to) than having Vulnerability Management teams chasing me to fix third party bugs that cannot be fixed without a significant bump of the quarkus version in keycloak.

A few weeks ago @thomasdarimont stated he hoped keycloak-x will not be as dependent on Quarkus versions as it was on Wildfly. The more quarkus extensions and dependencies we add the worst this will become.


PS-I do appreciate this may read as a rant about the PR in mention but it is not. As an FOSS developer myself, I am profoundly thankful for the effort put Keycloak, including PR#12125.

The reason I am using this PR as an example is because when reading the quarkus-logging-gelf documentation I cannot stop noticing that half of the documentation is actually telling the developer using Quarkus how to integrate its app using the extension with an agent capable on interpreting GELF wire protocol.

One could argue the same outcome (sending logs to a centralised server) is automatically achieved - without any code dependency - by:

Even for those not using OpenShift (or any K8S type solution), the same can be achieved by simply:

  • Option 2: Deploying the same documented fluentd agent with a tail file input and fluent-plugin-gelf
  • Option 3: Deploying the logstash as an agent with a tail file input and gelf-output-plugin
  • Option N: Deploying <your favorite log collection agent> with a tail file input and <relevant gelf output plugin>

@DGuhr
Copy link
Contributor

DGuhr commented Jul 18, 2022

Hi @trixpan
To make it short: I don't feel offended at all, and I totally understand you w.r.t. 12125 - that was the reason why I avoided to add it to the core when we had the first story around logging showing up and people were asking for it in the discussions.

Your lines above exactly match some thoughts of mine. These thoughts were the reason why I just posted a proposal for a "layered extension framework" to the quarkus-dev mailing list and I would be happy for people interested here to give feedback there (hesitating to open up another discussion here atm, to have one place to get feedback first -> our discussion).

In the end, there was a trade-off to be made:

  1. holding back a core integration because the extension story is not that advanced atm, so people have to rely on a totally experimental approach not supported by anything
  2. Create a small, easy-to-remove-later-on layer and add the dependencies - at this point in time - to "core".

I decided to go with option 2, as people were demanding the specific feature and there are holding costs involved (ref)

I consider the core distribution not the right place for the centralized logging extension to rely mid- to longterm, instead it should be an extension for Keycloak, but I consider the current solution to bring value for our users, and to be easily removable later on when we make some significant progress around the extension story.

On a more general level, as I understand your lines, you are touching the decade-long "framework vs library" war here, am I right?

Here's my view on that: I generally prefer to use plain + clearly domain-separated libraries (technical impls) instead of any framework. But: The decision for Keycloak to go with Quarkus was made long ago, even long before I joined, and I can somehow see value in it for a big project that is Keycloak, and especially in the way we did it.

The approach we took provides a loose coupling, actually, because we are loading most parts of the codebase "as is", without using quarkus interna, into that framework, that is atm "just another" runtime besides Wildfly. We should definitely take care of not coupling too much of our "domain logic" to tightly to the underlying framework. That was, for example, the idea behind the chagnes we made to the config layer - separating general, runtime-independent "Options" keycloak needs from the "PropertyMapper" implementations we use with the specific quarkus property targets (ref). It was first planned to use it as its own independent module, and for now relies in the quarkus namespace just because there is only one impl for it and it somewhat had bad devex implications (you need to build two modules instead of one). Again: Easily extractable later on, and the devex issue can and should be solved. Usable no matter where we evolve to.

On the other hand, when you use a framework, you should definitely use its parts to get the plus of convenience that is the one selling point of frameworks, otherwise you would "work against it", re-implement parts of its functionality and should just get rid of it. I think we have a good balance here for now. Optimizable? sure.

I hope this answers some of your questions.

@trixpan
Copy link
Contributor

trixpan commented Jul 21, 2022

@DGuhr thanks for the write up. I am 100% in agreement with the approach suggested and glad the concerns of community are being taken care of. ;-)

@vmuzikar
Copy link
Contributor

Will be (partially) resolved by #9144

@stianst stianst modified the milestones: 21.0.0, 22.0.0 Feb 17, 2023
@vmuzikar
Copy link
Contributor

vmuzikar commented May 17, 2023

Consider removing:

./lib/lib/deployment/org.testcontainers.mariadb-1.17.6.jar
./lib/lib/deployment/org.testcontainers.database-commons-1.17.6.jar
./lib/lib/deployment/org.testcontainers.oracle-xe-1.17.6.jar
./lib/lib/deployment/org.testcontainers.jdbc-1.17.6.jar
./lib/lib/deployment/org.testcontainers.postgresql-1.17.6.jar
./lib/lib/deployment/org.testcontainers.mysql-1.17.6.jar
./lib/lib/deployment/org.testcontainers.testcontainers-1.17.6.jar
./lib/lib/deployment/org.testcontainers.mssqlserver-1.17.6.jar
./lib/lib/deployment/io.quarkus.quarkus-devservices-postgresql-3.0.2.Final.jar
./lib/lib/deployment/io.quarkus.quarkus-devservices-mysql-3.0.2.Final.jar
./lib/lib/deployment/io.quarkus.quarkus-devservices-h2-3.0.2.Final.jar
./lib/lib/deployment/io.quarkus.quarkus-devservices-mariadb-3.0.2.Final.jar
./lib/lib/deployment/io.quarkus.quarkus-devservices-common-3.0.2.Final.jar
./lib/lib/deployment/io.quarkus.quarkus-devservices-deployment-3.0.2.Final.jar
./lib/lib/deployment/io.quarkus.quarkus-devservices-oracle-3.0.2.Final.jar
./lib/lib/deployment/io.quarkus.quarkus-devservices-mssql-3.0.2.Final.jar

vmuzikar added a commit to vmuzikar/keycloak that referenced this issue Jul 4, 2023
@github-project-automation github-project-automation bot moved this from In Discussion to Done in Keycloak: Quarkus distribution Jul 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/dist/quarkus kind/enhancement Categorizes a PR related to an enhancement status/needs-discussion PR needs discussion on developer mailing list team/cloud-native
Projects
5 participants