[META] Split and modularize the server monolith #8110
Comments
/cc @rishabh6788 I have been advancing this work over the last few months and it is reaching some good breakthrough milestones. The Streamable classes (e.g., StreamInput, StreamOutput, Writeable, NamedWriteable) and OpenSearchException class PR is up to tease out a big part of the tightly coupled classes across the If you have some time I'd love for you to jump in and help review the PRs.
Thanks @nknize! I want to start the discussion on expediting the completion of phase 0 (elimination of packages split across jars) AND backporting it the 2.x branch. I think the current state where there is major structural divergence between |
@andrross Agree 💯 , the elimination of packages splits would be a huge win towards getting JPMS done, though for |
I think we should look at that on a case by case basis? Let's lean into progress not perfection here so we can iterate quickly. I also think we shouldn't have FUD around breaking java signatures and package names in minor releases at this juncture (as long as we don't do it too close to a release). The codebase isn't well defined enough for that (hence |
I agree with a case by case basis. I think the value here of getting rid of a blocker to adopting JPMS is clear, and the "breakages" will be super simple to resolve as it is almost exclusively a matter of fixing imports (and the plugins maintaining a version against I also agree that it is not practical to be 100% strict on not changing any Java APIs given how tightly coupled the code bases are. |
Semi-related: I opened an issue for someone to add useful Java annotations we could use to enforce bwc. Not a requirement for this but another mechanism that would be helpful once we start carving out the publicly exposed Java classes through |
The extensibility project (#2447) proposed that we first make plugins depend on an SDK, reducing the N plugins : 1 core dependency tree to N plugins : 1 SDK : 1 core and therefore avoiding this pain (only the SDK will need to change when core changes). We are at experimental release for 2.9, but it looks like we're not choosing to wait for that to deliver and aren't going to try to port plugins to extensions as a prerequisite. I don't have any solutions or workarounds to what the downstream dependencies are experiencing or will experience; you just can't both refactor and not cause disruption. Given that, I would keep doing what @nknize is doing, and GA 3.0 as soon as possible to stop the pain of backports to 2.x. The only other idea I have is to make extensions work as plugins (in-proc) so that we can still cut off the direct dependency on core. Finally, if we go the SDK route, I am not sure what additional benefits core having all this modularity brings other than to make it easier to work inside of it. Here's my original comment to the refactoring proposal along those lines: #5910 (comment).
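To make the "N plugins : 1 SDK : 1 core" shape concrete, here is a minimal Java sketch. Every name in it (ClusterInfo, CoreClusterState, CoreBackedClusterInfo, pluginA, pluginB) is hypothetical and invented for illustration; it is not the opensearch-sdk-java API. It only shows the layering idea: plugins compile against a small interface, and a single adapter is the place that touches core.

```java
import java.util.List;

class SdkLayeringSketch {
    // "SDK" surface (hypothetical): the only type plugins compile against.
    interface ClusterInfo {
        String clusterName();
    }

    // "Core" (hypothetical): free to be renamed or moved between releases.
    static class CoreClusterState {
        static String currentClusterName() {
            return "demo-cluster";
        }
    }

    // The single adapter that has to change when core changes.
    static class CoreBackedClusterInfo implements ClusterInfo {
        @Override
        public String clusterName() {
            return CoreClusterState.currentClusterName();
        }
    }

    // "Plugins" (hypothetical): they see ClusterInfo, never CoreClusterState.
    static String pluginA(ClusterInfo info) {
        return "plugin-a sees " + info.clusterName();
    }

    static String pluginB(ClusterInfo info) {
        return "plugin-b sees " + info.clusterName();
    }

    public static void main(String[] args) {
        ClusterInfo info = new CoreBackedClusterInfo();
        for (String line : List.of(pluginA(info), pluginB(info))) {
            System.out.println(line);
        }
    }
}
```

In this sketch, moving or renaming CoreClusterState would require editing only CoreBackedClusterInfo. Whether the real SDK achieves that isolation is exactly what the rest of this thread debates.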
Tagging correct Rishabh @rishabhmaurya |
+1 to work towards 3.0 GA. Backport pain will always exist
A big gap in this plan is the sdk still transitively exports all of the core classes to "downstream" plugins defeating strong encapsulation provided by JPMS (which is the primary objective of phase 0). This means core isolation through the sdk is not achieved and plugins can bypass the sdk and override core logic. This is exacerbated by the sdk also itself not providing JPMS support. Opening RFCs to deprecate core mechanisms like ThreadContext is a side effect we can eliminate with JPMS support. The near term pain (reduced by good DevOps mechanisms like These efforts are far better together. |
Major version upgrades are disruptive to all users (including non-plugin developer users). I have a hard time justifying the disruption to the community at large that a major version release brings with the refactoring changes here.
Assuming that downstream dependencies are maintaining a version against main and against 2.x, then they are already experiencing the disruption, and the way to resolve that disruption is to backport the refactorings and bring the branches in line with each other. Any community plugins (outside of the opensearch-project) that are only maintaining a version against the 2.x branch are blissfully unaware of these structural changes and would be disrupted by the backports. But we don't truly support semver with plugins today anyway, because plugins have to make code changes and recompile for all releases. We should still be thoughtful about the "breaking" changes that we backport, but I'm trying to make the case that the changes in phase 0 here (structural changes that are trivial for a consumer to resolve) are worth it on balance.
Once Phase 0 is complete we can selectively Note too that ML Commons, common utils, and parts of security can move to core libraries, further simplifying cross plugin dependencies. The one "gotcha" (blocker) is that the code implemented in Kotlin should be refactored to Java before bringing it to core, and we should enforce no Kotlin code in the core libraries.
This is only the current implementation at build time, not at runtime, to go faster in dev. At runtime extensions do not depend on core transitively.
This is misleading. You still have jar hell. The SDK doesn't just move classes around or annotate them, it introduces a proper r/w API, and allows |
At build time is exactly where all of core classes are exposed such that a developer can override core logic.
At runtime the classloader needs core classes directly (e.g., ClusterState for SDKClusterService); without the runtime dependency you'd get a ClassNotFoundException. This is what modularity prevents.
I don't understand what jar hell has to do with this. JPMS is about not exposing packages to downstreams, not about hijacking classes with the same package and classname.
A proper r/w API is about strong encapsulation and not allowing downstreams to access classes you don't want them to hijack at runtime. Take ThreadContext right now. Without JPMS any downstream can import In the context of an SDK I encourage anyone confused about the benefits of refactoring to support JPMS to achieve a proper r/w API with strong encapsulation to read JEP 403 and Project Jigsaw documentation.
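As a rough illustration of what "strong encapsulation" means at the module-descriptor level, the sketch below builds a descriptor with the JDK's java.lang.module.ModuleDescriptor builder and asks which packages a given consumer could see. The module and package names (example.core, example.sdk, example.plugin, org.example.core.*) are hypothetical and are not OpenSearch's actual layout. The check only models export declarations; in a real module graph the consumer must also read the module, and enforcement is done by the compiler and runtime rather than by code like this.

```java
import java.lang.module.ModuleDescriptor;
import java.util.Set;

class EncapsulationSketch {
    // Hypothetical descriptor: one public API package, one package exported
    // only to a named SDK module, and one concealed internal package.
    static final ModuleDescriptor CORE = ModuleDescriptor.newModule("example.core")
        .exports("org.example.core.api")
        .exports("org.example.core.spi", Set.of("example.sdk"))
        .packages(Set.of("org.example.core.internal"))
        .build();

    // True if the descriptor exports pkg either to everyone or to consumerModule.
    static boolean canAccess(String consumerModule, String pkg) {
        for (ModuleDescriptor.Exports export : CORE.exports()) {
            if (export.source().equals(pkg)) {
                return !export.isQualified() || export.targets().contains(consumerModule);
            }
        }
        return false; // package is concealed: not exported at all
    }

    public static void main(String[] args) {
        System.out.println("plugin -> api:      " + canAccess("example.plugin", "org.example.core.api"));
        System.out.println("plugin -> spi:      " + canAccess("example.plugin", "org.example.core.spi"));
        System.out.println("sdk    -> spi:      " + canAccess("example.sdk", "org.example.core.spi"));
        System.out.println("plugin -> internal: " + canAccess("example.plugin", "org.example.core.internal"));
    }
}
```

On a plain classpath none of these distinctions exist, which is the point being made above: any downstream can import any public class in any package.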
💯 And you don't need a separate opensearch-sdk to achieve this. Nor do you need a separate SDK to achieve separating JVM processes (which doesn't alone solve the encapsulation issue). |
The goal of the refactoring as proposed is to avoid leaking internals into plugins by establishing an r/w API with strong encapsulation. Even when achieved, it will be difficult to have a large ecosystem of plugins for all the reasons that are not solved by refactoring. If I think we should do both, but I believe that extensions give a lot more benefits a lot quicker, and more incrementally with less disruption, as plugins can migrate to extensions gradually, extension authors don't have to work or deal with core anymore, and don't need to release with every new minor version of OpenSearch, making developing plugins/extensions a lot cheaper.
Hey @nknize just for some context, the SDK's use of TLDR: we're trying to get rid of it, but we're not there yet. |
@dblock In an ideal future world (thinking 3.x) this is the goal for extensions. The only commonality is intended to be request/response classes auto-generated from spec files... and probably some needed XContent parsing classes. |
Even with opensearch-sdk the |
That is simply incorrect. You are right that to expose aggregation into the SDK without a transitive dependency a refactor in core will be required. Re: value of extensions I invite you to read https://github.com/opensearch-project/opensearch-sdk-java#introduction. |
AnomalyDetection takes a dependency on core. At least it's
I've read it...now we're in discussion mode. The part I've always liked about extensibility is |
Nomenclature-wise we have |
Related modularization discuss issue (more than just JPMS support): #5910
Related JPMS issue: #1588
[WIP] This meta issue tracks the scope for refactoring the monolithic :server module into separable libraries to support JPMS and serverless or cloud native implementation extensions.
Phase 0 - JPMS Support (Eliminate top level split packages, see #1838)
:libs:opensearch-common or :libs:opensearch-core where needed
:libs:opensearch-common to new o.o.common.bootstrap package
:libs:opensearch-core or o.o.lucene in :libs:opensearch-server
:libs:opensearch-core or o.o.cli in :libs:opensearch-cli
:libs:opensearch-core or to proper project in o.o.client
Phase 1 - Decoupling to support serverless and cloud native implementation extensions
:libs:opensearch-core (dependency of o.o.bootstrap)
:libs:opensearch-core (dependency of o.o.env)
:libs:opensearch-cluster library
Resulting libraries:
:libs:opensearch-common
:libs:opensearch-core
:libs:opensearch-cluster
relates #1838
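Phase 0 is about split packages, i.e. the same package name appearing in more than one jar, which JPMS does not allow across named modules. A minimal sketch of how such splits can be detected is below. The jar names and package lists in main are made up for illustration and do not reflect the actual contents of any OpenSearch artifact; a real check would read package names from the jars themselves.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class SplitPackageCheck {
    // Returns every package that is declared by two or more jars,
    // mapped to the set of jars that declare it.
    static Map<String, Set<String>> findSplitPackages(Map<String, List<String>> packagesByJar) {
        Map<String, Set<String>> owners = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : packagesByJar.entrySet()) {
            for (String pkg : entry.getValue()) {
                owners.computeIfAbsent(pkg, k -> new TreeSet<>()).add(entry.getKey());
            }
        }
        // Keep only the packages owned by more than one jar.
        owners.values().removeIf(jars -> jars.size() < 2);
        return owners;
    }

    public static void main(String[] args) {
        // Hypothetical inventory, for illustration only.
        Map<String, List<String>> inventory = new TreeMap<>();
        inventory.put("server.jar", List.of("org.opensearch.common", "org.opensearch.cluster"));
        inventory.put("libs-common.jar", List.of("org.opensearch.common"));
        inventory.put("libs-cli.jar", List.of("org.opensearch.cli"));

        findSplitPackages(inventory).forEach(
            (pkg, jars) -> System.out.println(pkg + " is split across " + jars));
    }
}
```

An empty result is the precondition for giving each jar its own module descriptor; every entry that remains needs one of the moves listed under Phase 0.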