[Discussion] Modify input manifest yml schema and assemble workflow to support pre-installation of native plugins #2849

Open
joshpalis opened this issue Nov 3, 2022 · 29 comments
Assignees
Labels
enhancement New Enhancement feature New feature v3.0.0

Comments

@joshpalis
Member

joshpalis commented Nov 3, 2022

Is your feature request related to a problem? Please describe

Now that the Job Scheduler project ownership has been transferred to the OpenSearch Core Team, we and the community have made a decision to relocate Job Scheduler to native plugins [1]. This change will have an effect on how the full bundle of OpenSearch is assembled, since multiple plugins (ISM, AD, Reporting) have a dependency on Job Scheduler.

Currently, the input build manifest includes Job Scheduler as a component[2], and this will be removed upon relocation. However, unless there is a mechanism to detect and pre-install native plugin dependencies, assembly of the aforementioned dependent plugins will fail.

[1] opensearch-project/OpenSearch#4218
[2] https://github.com/opensearch-project/opensearch-build/blob/main/manifests/2.4.0/opensearch-2.4.0.yml#L26

Describe the solution you'd like

In order to support the assembly of the Job Scheduler dependent components, I propose modifications to the input manifest yml schema such that native plugins that other components depend on will also be included. Consequently, changes to the input manifest yml schema will necessitate changes to the bundle_OpenSearch assemble workflow, such that native plugins listed within the input manifest are pre-installed prior to the dependent components.

CC: @prudhvigodithi @peterzhuamazon Open to discussion on how best to handle these changes.

Describe alternatives you've considered

No response

Additional context

No response

@joshpalis joshpalis added enhancement New Enhancement untriaged Issues that have not yet been triaged labels Nov 3, 2022
@joshpalis joshpalis assigned joshpalis and unassigned joshpalis Nov 3, 2022
@owaiskazi19
Member

Thanks for writing this up @joshpalis. How about adding a flag named depends_on in the input manifest for plugins such as ISM and AD that depend on Job Scheduler?

  - name: anomaly-detection
    repository: https://github.com/opensearch-project/anomaly-detection.git
    depends_on: ['Job-Scheduler']
    ref: '2.4'
    platforms:
      - linux
      - windows
    checks:
      - gradle:properties:version
      - gradle:dependencies:opensearch.version

While reading the input manifest, we would take this dependency into account and, based on the flag, decide whether to install the Job Scheduler (JS) native plugin.
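
A minimal sketch of how the assemble step could read such a flag, assuming the manifest has already been parsed into plain dictionaries. The helper name `native_plugins_to_preinstall` is hypothetical, `depends_on` is only the flag proposed in this comment (not an existing schema key), and `index-management` is included purely for illustration.

```python
from typing import Dict, List


def native_plugins_to_preinstall(components: List[Dict]) -> List[str]:
    """Collect the native plugins named in any component's depends_on list.

    Order of first appearance is preserved and duplicates are dropped.
    """
    seen: List[str] = []
    for component in components:
        # depends_on is the proposed (not yet existing) manifest key.
        for dependency in component.get("depends_on", []):
            if dependency not in seen:
                seen.append(dependency)
    return seen


# Hypothetical parsed manifest fragment; only the keys used here are shown.
components = [
    {"name": "OpenSearch", "ref": "2.4"},
    {"name": "anomaly-detection", "ref": "2.4", "depends_on": ["Job-Scheduler"]},
    {"name": "index-management", "ref": "2.4", "depends_on": ["Job-Scheduler"]},
]

preinstall = native_plugins_to_preinstall(components)
```

With this shape, a component without the flag simply contributes nothing, so existing manifest entries would not need to change.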

@prudhvigodithi
Collaborator

Hey @owaiskazi19, something similar to what was proposed here?
Another way is to have a new schema, as follows:

native_plugins:
- job-scheduler
- repository-s3
components:
  - name: OpenSearch
    repository: https://github.com/opensearch-project/OpenSearch.git
    ref: main
    checks:
      - gradle:publish
      - gradle:properties:version
  - name: common-utils
    repository: https://github.com/opensearch-project/common-utils.git
    ref: main
    platforms:
      - linux
      - windows
    checks:
      - gradle:publish
      - gradle:properties:version

@dblock @bbarani

@owaiskazi19
Member

Hey @prudhvigodithi. Looks like we are on the same page with the schema here.

I don't have a strong opinion on the above schema and am totally fine with it. My only concern is that it makes the manifest file a little more complex, which could be avoided by using the depends_on flag. Let me know what you think.

@joshpalis
Member Author

Thanks @owaiskazi19 and @prudhvigodithi for your contributions to this discussion. I am partial to Owais' proposal for the modifications to the input manifest yml schema. The modifications would be simpler, and the proposal places the onus on components to define the plugins they depend on, making it clear which components depend on which native plugin.

@prudhvigodithi prudhvigodithi added feature New feature and removed untriaged Issues that have not yet been triaged labels Nov 8, 2022
@prudhvigodithi
Collaborator

prudhvigodithi commented Nov 8, 2022

[Triage] Hey @joshpalis, what is the targeted release for job-scheduler as a native plugin? We should time this right to unblock 3.0.0 development.

@joshpalis
Member Author

@prudhvigodithi Currently the targeted release is 2.5.0

@bbarani
Member

bbarani commented Nov 21, 2022

@joshpalis I would recommend targeting the 2.6.0 release, since this is technically a breaking change for the build and development process and we need to synchronize the effort between multiple teams. CC: @CEHENKLE @peterzhuamazon @minalsha

@joshpalis
Member Author

Sure, I am fine with targeting the 2.6.0 release. I have no strong opinions against this. @minalsha is this alright with you?

@minalsha

@bbarani we are working closely with folks across different teams. I don't see why we need to push it out. Let's target 2.5.0 and see where we land on 01/02/2023. cc @CEHENKLE @prudhvigodithi @joshpalis @dagneyb

@bbarani
Member

bbarani commented Dec 5, 2022

@minalsha Is there a need to rush this change? This is not a new feature for users but rather a change to the existing plugin installation process. This change needs to be synchronized across multiple versions (3.x and 2.x). The available resources are currently working on the 1.3.7 release along with Windows support, hence I would not prioritize this for the 2.5.0 release. @CEHENKLE @prudhvigodithi @joshpalis @dagneyb @peterzhuamazon @gaiksaya

@saratvemulapalli Will this change be made only for the 2.x and 3.x versions? I assume 1.x versions are not affected by this change. Can you please confirm?

@saratvemulapalli
Member

@bbarani you are right. This change will only go out for 2.x and above. This will not impact any 1.x releases. Job Scheduler will still be supported with the same maven co-ordinates for 1.x.

@joshpalis
Member Author

joshpalis commented Dec 20, 2022

@bbarani @prudhvigodithi The relocation of Job Scheduler to native plugins has been determined to be a breaking change due to the necessary build.gradle modifications for dependent plugins. In our efforts to adhere to semantic versioning, we have modified our approach to relocate Job Scheduler to native plugins. Please refer to the updated section of this issue for additional information.

Moving forward, modifications to the input manifest yml schema will still be necessary. Support for the pre-installation of native plugins will only be needed for OpenSearch 3.x and onwards.

CC: @saratvemulapalli @minalsha

@gaiksaya
Member

gaiksaya commented Jan 6, 2023

[Proposal] Posting one approach here:

In order to add JS as a native plugin, a new schema needs to be introduced. OpenSearch consists of a number of native plugins. In this case, we choose to install just one of them, job scheduler, as part of the distribution (maybe more in the future). Hence the manifest needs to document it.

One way is to add type as one of the component keys in the schema. The type key can be mandatory, and the components can be defined accordingly. The following types seem valid:

  • min (eg: OpenSearch, OpenSearch Dashboards)
  • lib (eg: common-utils)
  • core-plugin (eg: job scheduler, discovery-ec2)
  • plugin (eg: anomaly detection, alerting)
  • client (eg: opensearch-high-level-rest client) → Example for the future, out of scope for now.

Each type will define how the component is installed.

The functionality to install components of type core-plugin needs to be added to the build system.

  • Building JS: From a build perspective, since native plugins are already built with the core, we do not need to build them again.
  • Dependent plugins: For plugins with job scheduler as a dependency, it can be retrieved from the core dependencies, which is already taken care of by the plugins themselves. The dependency will be retrieved from core modules, so from the build side, if min is present we should be good. See the issue "Verify dependencies between plugins in CI" #441 (comment) for one possible approach to dependency management. If we go with depends_on, that can be added as part of this schema too.
  • Assembling the distribution: When assembling the distribution, JS needs to be installed from the core plugins before proceeding with the other plugin installations.
  • Publishing: Since it is now a native plugin, it will continue to be published under https://repo1.maven.org/maven2/org/opensearch/plugin/opensearch-job-scheduler/

From offline discussion, it looks like the move will happen from the 2.x version but be enforced from 3.x due to breaking changes.
We can decide whether we want the new manifest schema to be introduced from the 2.x version. Since the build repository does not follow a branching strategy for builds (maybe it should), for 2.x the type for job-scheduler can be plugin, and starting with 3.0 it can be core-plugin.

The new input manifest can look like below:

---
schema-version: '1.1'
build:
  name: OpenSearch
  version: 3.0.0
ci:
  image:
    name: opensearchstaging/ci-runner:ci-runner-centos7-opensearch-build-v2
    args: -e JAVA_HOME=/opt/java/openjdk-17
components:
  - name: OpenSearch
    repository: https://github.com/opensearch-project/OpenSearch.git
    ref: main
    type: min
    checks:
      - gradle:publish
      - gradle:properties:version
  - name: common-utils
    repository: https://github.com/opensearch-project/common-utils.git
    type: lib
    ref: main
    platforms:
      - linux
      - windows
    checks:
      - gradle:publish
      - gradle:properties:version
  - name: job-scheduler
    repository: https://github.com/opensearch-project/job-scheduler.git
    type: core-plugin
    ref: main
    platforms:
      - linux
      - windows
    checks:
      - gradle:properties:version
      - gradle:dependencies:opensearch.version
  - name: ml-commons
    repository: https://github.com/opensearch-project/ml-commons.git
    type: plugin
    ref: main
    platforms:
      - linux
      - windows
    checks:
      - gradle:properties:version
      - gradle:dependencies:opensearch.version: opensearch-ml-plugin
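
As a rough illustration of how the assemble workflow might use the type key, here is a sketch that validates the type and orders components so that core plugins come before regular plugins. The ordering in `TYPE_ORDER` and the helper name `order_by_type` are assumptions for illustration; only the type values themselves come from the list above.

```python
from typing import Dict, List

# Assumed processing order: min first, then libs, core plugins, plugins.
TYPE_ORDER = ["min", "lib", "core-plugin", "plugin"]


def order_by_type(components: List[Dict]) -> List[str]:
    """Return component names ordered by type, rejecting unknown types."""
    for component in components:
        if component.get("type") not in TYPE_ORDER:
            raise ValueError(
                f"{component.get('name')}: unknown or missing type {component.get('type')!r}"
            )
    # sorted() is stable, so components of the same type keep manifest order.
    ranked = sorted(components, key=lambda c: TYPE_ORDER.index(c["type"]))
    return [c["name"] for c in ranked]


# Hypothetical parsed components; ml-commons is listed before job-scheduler
# on purpose to show that the core plugin is still ordered ahead of it.
components = [
    {"name": "OpenSearch", "type": "min"},
    {"name": "common-utils", "type": "lib"},
    {"name": "ml-commons", "type": "plugin"},
    {"name": "job-scheduler", "type": "core-plugin"},
]
ordered = order_by_type(components)
```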
   

@saratvemulapalli @joshpalis @owaiskazi19 @dblock @opensearch-project/engineering-effectiveness let me know what you think about this. More approaches and recommendations are welcome.
Thanks!

@dblock
Member

dblock commented Jan 8, 2023

Generally, adding a type is a crutch; I don't recommend it. It forks a lot of classes and leads to switching on type instead of switching on features.

  1. Does the current system work without changes and a working directory option, even if it means re-building the native plugin twice?
  2. Since the native plugin is already built, can it just be added to the published artifacts instead of being declared in the manifest at all? Existing components publish a plugin zip today, so why can't core also do that? What's missing if, along with opensearch-min, you get a job scheduler JAR/ZIP?

@bbarani
Member

bbarani commented Jan 13, 2023

@dblock Are you suggesting integrating the JS native plugin installation process outside of the distribution build process? In that case, would we always assume that -min contains job-scheduler (JS) pre-installed and then install the other plugins?

@saratvemulapalli @minalsha @joshpalis can you add your inputs here?

@saratvemulapalli
Member

This manifest is used to build and eventually assemble the distribution.
A native plugin is a plugin; I don't think we'd want to differentiate between the two for builds.

What changes is the installation, i.e. the way they are installed. As long as we have the right Maven coordinates to install JS (as a native plugin), wouldn't that solve the problem? @gaiksaya

The proposal also talks about order of execution, which is a problem by itself.

@bbarani bbarani added enhancement New Enhancement and removed enhancement New Enhancement feature New feature labels Feb 10, 2023
@bbarani bbarani added the feature New feature label Feb 10, 2023
@bbarani bbarani assigned prudhvigodithi and unassigned zelinh Feb 27, 2023
@prudhvigodithi
Collaborator

Hey all, just circling back to this issue. Thanks for the multiple proposals added. @joshpalis, just to refresh, can you please add the timeline? What would be the next step for 2.6? Does the chosen proposal have to be pushed before 2.6? As far as I know, the JS plugin is currently being tested both as a native and as a regular plugin, hence no change is needed from the build side. Will this strategy be the same for 2.6 as well?
Thank you

@prudhvigodithi
Collaborator

Hey @joshpalis just following up, can you please add your thoughts based on my previous message?
Thank you

@prudhvigodithi
Collaborator

The change is targeted for the 3.0.0 release. To support it, the build process has to be modified to ensure job-scheduler is installed during the assemble workflow. This can be done by installing the zip file from the core-plugins folder (code) and finally removing job-scheduler as a component from the manifest file.

  core-plugins:
        - core-plugins/discovery-gce-2.6.0.zip
        - core-plugins/repository-hdfs-2.6.0.zip
        - core-plugins/discovery-ec2-2.6.0.zip
        - core-plugins/analysis-icu-2.6.0.zip
        - core-plugins/discovery-azure-classic-2.6.0.zip
        - core-plugins/ingest-attachment-2.6.0.zip
        - core-plugins/analysis-stempel-2.6.0.zip
        - core-plugins/analysis-phonetic-2.6.0.zip
        - core-plugins/transport-nio-2.6.0.zip
        - core-plugins/repository-s3-2.6.0.zip
        - core-plugins/repository-gcs-2.6.0.zip
        - core-plugins/analysis-ukrainian-2.6.0.zip
        - core-plugins/repository-azure-2.6.0.zip
        - core-plugins/analysis-smartcn-2.6.0.zip
        - core-plugins/mapper-annotated-text-2.6.0.zip
        - core-plugins/store-smb-2.6.0.zip
        - core-plugins/analysis-nori-2.6.0.zip
        - core-plugins/mapper-murmur3-2.6.0.zip
        - core-plugins/mapper-size-2.6.0.zip
        - core-plugins/analysis-kuromoji-2.6.0.zip
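
One way to locate a given native plugin in such a list is sketched below, assuming every entry follows the `core-plugins/<name>-<version>.zip` shape seen above. The helper name `index_core_plugins` is hypothetical and this is not code from the build repository.

```python
import posixpath
from typing import Dict, List


def index_core_plugins(zip_paths: List[str], version: str) -> Dict[str, str]:
    """Map plugin name to its zip path for entries named <name>-<version>.zip."""
    suffix = f"-{version}.zip"
    index: Dict[str, str] = {}
    for path in zip_paths:
        filename = posixpath.basename(path)
        # Entries that do not match the assumed naming scheme are skipped.
        if filename.endswith(suffix):
            index[filename[: -len(suffix)]] = path
    return index


# A few entries copied from the list above.
core_plugins = [
    "core-plugins/discovery-ec2-2.6.0.zip",
    "core-plugins/repository-s3-2.6.0.zip",
    "core-plugins/analysis-icu-2.6.0.zip",
]
index = index_core_plugins(core_plugins, "2.6.0")
s3_zip = index["repository-s3"]
```

A lookup for a plugin that core does not ship yet (for example job-scheduler in this 2.6.0 list) would simply be absent from the index, which the assemble step could treat as an error.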

The native plugin zips (under the core-plugins folder) should also be considered for publishing to Maven. This gives users more options for installing a native plugin; currently it pulls from artifacts.opensearch.org.

./bin/opensearch-plugin install repository-s3
-> Installing https://artifacts.opensearch.org/releases/plugins/repository-s3/2.6.0/repository-s3-2.6.0.zip
-> Downloading https://artifacts.opensearch.org/releases/plugins/repository-s3/2.6.0/repository-s3-2.6.0.zip

But this can be an enhancement later and will not impact the installation of JS as a native plugin during assemble, as the assemble workflow uses the build manifest (the output of the build workflow) and installs the plugins using the zips from the local workspace.
@peterzhuamazon @gaiksaya @bbarani @dblock @saratvemulapalli @joshpalis please add if I'm missing anything.

Thank you

@dblock
Member

dblock commented Apr 3, 2023

Do we need to change the manifest at all? Would it be simpler to install whichever native plugins we want as part of install.sh for OpenSearch core?

@prudhvigodithi
Collaborator

Hey @dblock, with the existing flow install.sh is called during install_plugin, and it only has the logic to run some cp commands. The actual plugin installation for OpenSearch happens in the install_plugin method of bundle_opensearch.py, which internally uses the opensearch-plugin CLI from a random tmp folder (example: /tmp/tmpvfijvu7o/opensearch-2.8.0/bin/opensearch-plugin).

So it looks to me like install_plugin in bundle_opensearch.py has to be modified to take the native plugin list from somewhere (it could be the manifest). With that change it works well.
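
A sketch of what that modification amounts to, assuming the native plugin zips are already available locally: build the opensearch-plugin install commands so that native plugins come first. The function name `install_commands` and the example paths are hypothetical, this is not the actual bundle_opensearch.py code, and the commands are only constructed here, not executed.

```python
import posixpath
from typing import List


def install_commands(
    dist_root: str, native_zips: List[str], plugin_zips: List[str]
) -> List[List[str]]:
    """Build one opensearch-plugin install command per zip, native plugins first."""
    # Linux-style paths are assumed, matching the tmp folder example above.
    cli = posixpath.join(dist_root, "bin", "opensearch-plugin")
    return [
        [cli, "install", "--batch", f"file:{zip_path}"]
        for zip_path in [*native_zips, *plugin_zips]
    ]


# Hypothetical workspace paths, for illustration only.
commands = install_commands(
    "/tmp/work/opensearch-2.8.0",
    ["/tmp/work/core-plugins/repository-s3-2.8.0.zip"],
    ["/tmp/work/plugins/opensearch-job-scheduler-2.8.0.0.zip"],
)
```

Each inner list could then be passed to something like subprocess.run; keeping the native list separate makes the install order explicit.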

Tested with repository-s3.
During assemble:

2023-06-07 18:47:07 INFO     Installed plugins: ['repository-s3', 'opensearch-job-scheduler']

Inside the final tar file.

opensearch-2.8.0/plugins/repository-s3/
opensearch-2.8.0/plugins/repository-s3/LICENSE.txt
opensearch-2.8.0/plugins/repository-s3/NOTICE.txt
opensearch-2.8.0/plugins/repository-s3/aws-java-sdk-core-1.12.270.jar
opensearch-2.8.0/plugins/repository-s3/aws-java-sdk-s3-1.12.270.jar
opensearch-2.8.0/plugins/repository-s3/aws-java-sdk-sts-1.12.270.jar
opensearch-2.8.0/plugins/repository-s3/commons-codec-1.15.jar
opensearch-2.8.0/plugins/repository-s3/commons-logging-1.2.jar
opensearch-2.8.0/plugins/repository-s3/httpclient-4.5.13.jar
opensearch-2.8.0/plugins/repository-s3/httpcore-4.4.15.jar
opensearch-2.8.0/plugins/repository-s3/jackson-annotations-2.15.1.jar
opensearch-2.8.0/plugins/repository-s3/jackson-databind-2.15.1.jar
opensearch-2.8.0/plugins/repository-s3/jaxb-api-2.3.1.jar
opensearch-2.8.0/plugins/repository-s3/jmespath-java-1.12.270.jar
opensearch-2.8.0/plugins/repository-s3/log4j-1.2-api-2.17.1.jar
opensearch-2.8.0/plugins/repository-s3/plugin-descriptor.properties
opensearch-2.8.0/plugins/repository-s3/plugin-security.policy
opensearch-2.8.0/plugins/repository-s3/repository-s3-2.8.0.jar

I'm open to any other solutions.
@gaiksaya @peterzhuamazon @zelinh please add your thoughts.

Thank you

@prudhvigodithi
Collaborator

Another approach (first approach in previous comment) is to have a Gradle task in the OpenSearch core repo that handles the desired native plugin installation, and to have the install.sh script call this Gradle task, which ensures the native plugins are installed.
With this approach, the ordering and the required set of native plugins can be controlled directly from the OpenSearch core repo.
@gaiksaya @dblock @joshpalis @saratvemulapalli

@peterzhuamazon
Member

Replying to @prudhvigodithi's comment above about modifying install_plugin in bundle_opensearch.py (tested with repository-s3): I am OK with this approach, though the input manifest needs changes to know the type of the plugins, per @gaiksaya's suggestion.

@dblock
Member

dblock commented Jun 13, 2023

Another approach (first approach in previous comment) we can go with is having a gradle task from OpenSearch core repo that can handle the desired native plugin installation code, and the script install.sh can have this gradle task called which ensures the native plugins are installed. Having this approach the ordering and requirement of right set of native plugins can be directly controlled from OpenSearch core repo. @gaiksaya @dblock @joshpalis @saratvemulapalli

This sounds like a good idea!

@prudhvigodithi
Collaborator

Hey @joshpalis, coming from your comment opensearch-project/OpenSearch#5310 (comment), is the JS native plugin migration targeted for the 3.0.0 release?

@joshpalis
Member Author

Another approach (first approach in previous #2849 (comment)) we can go with is having a gradle task from OpenSearch core repo that can handle the desired native plugin installation code, and the script install.sh can have this gradle task called which ensures the native plugins are installed.
Having this approach the ordering and requirement of right set of native plugins can be directly controlled from OpenSearch core repo.
@gaiksaya @dblock @joshpalis @saratvemulapalli

I support this approach, as it negates the need to modify the input manifest to add a type to differentiate plugins and native plugins. As @saratvemulapalli stated, they should be treated the same.

@prudhvigodithi
Collaborator

Thanks Josh. In that case, can you please coordinate from the core side to onboard a task that can install the native plugins? Then we can come back to the build side and update the install.sh file with that Gradle task.
Thank you

@bbarani
Member

bbarani commented Sep 18, 2023

@prudhvigodithi will look into this issue and see if it can be integrated into the Gradle build for the 3.x version.

@prudhvigodithi
Collaborator

prudhvigodithi commented Feb 13, 2024

This issue remains on hold, as the proposal and implementation for migrating job-scheduler to a native plugin are still under review. This migration is also a breaking change targeted for the 3.0.0 release, which has moved to Feb 18, 2025 based on the release schedule opensearch-project/.github#186.

It's also worth exploring the use of the core PersistentTaskPlugin with added scheduler capabilities, allowing plugins to use the core feature instead of job-scheduler. Adding @peterzhuamazon opensearch-project/job-scheduler#147 (comment).

Thanks
