Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: introduces conditional control plane installation #10

Merged

Conversation

bartoszmajsak
Copy link

Overview

When using KfDef such as:

apiVersion: kfdef.apps.kubeflow.org/v1
kind: KfDef
metadata:
  name: odh-mesh
spec:
  plugins:
    - kind: KfOssmPlugin
      spec:
        mesh:
          name: minimal
          namespace: istio-system
          mode: Managed
          certificate:
            generate: true
        auth:
          name: authorino
          namespace: auth-provider
          authorino:
            label: authorino/topic=odh
  applications:
    # your components
  repos:
    - name: manifests
      uri: https://github.com/maistra/odh-manifests/tarball/v0.0.6
  version: v0.0.6

the operator will install a minimal version of the control plane.

Changes

  • new feature of conditional installation of control plane based on Mesh.Mode value
  • reworked plugin integration in the KfApp call chain (see docs around Apply() and Delete())
  • fixed issue of leaving KfApp in an odd state when errors on Apply occurs (raised as separated PR too fix(reconcile): corrects KfDef status in case of any error opendatahub-io/opendatahub-operator#501)
  • reworks SMCP readiness check - it now checks not status, but a list of components in defined states
  • simplifies CRD existence check call

and reuse it in smcp-related funcs
in certain cases this information can only be determined at runtime, for example based on user-defined values of the plugin spec.
coupled with wait condition
It relies on owner reference so can be cleaned up as other resources
Additionally moves OssmResourceTracker creation to .Apply func
This way we can cleanup the resources in case deploying service mesh manifests because creating those in Generate phase leaves them hanging due to failing Delete hook
Additionally propagates the error through reconcile to keep trying, as it was ignored.
Copy link

@cam-garrison cam-garrison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requesting changes (or explanations) for two things, otherwise LGTM!

pkg/kfapp/ossm/feature/verification.go Outdated Show resolved Hide resolved
@cam-garrison
Copy link

Note from testing. If the mesh.mode field is not set, eg:

mesh:
          name: basic
          namespace: istio-system
          certificate:
            generate: true

The operator enters a CrashLoopBackOff state, with the error

2023-08-31T20:04:22.269Z INFO KfOssmPlugin Initializing {"plugin": "KfOssmPlugin"}
panic: reflect.Set: value of type string is not assignable to type ossmplugin.InstallationMode
goroutine 813 [running]:
reflect.Value.assignTo({0x1e65ba0?, 0xc0038441c0?, 0x7f992d423f18?}, {0x22e0a70, 0xb}, 0x1e5fee0, 0x0)
/usr/lib/golang/src/reflect/value.go:3064 +0x2ac
reflect.Value.Set({0x1e5fee0?, 0xc005a70180?, 0x22d975a?}, {0x1e65ba0?, 0xc0038441c0?, 0x1e5fee0?})
/usr/lib/golang/src/reflect/value.go:2090 +0xeb
github.com/opendatahub-io/opendatahub-operator/pkg/kfconfig/ossmplugin.setDefaults({0x1f9bea0?, 0xc005a70160?})
/workspace/pkg/kfconfig/ossmplugin/types.go:89 +0x2fe
github.com/opendatahub-io/opendatahub-operator/pkg/kfconfig/ossmplugin.setDefaults({0x202bea0?, 0xc005a70160?})
/workspace/pkg/kfconfig/ossmplugin/types.go:93 +0x33b
github.com/opendatahub-io/opendatahub-operator/pkg/kfconfig/ossmplugin.(*OssmPluginSpec).SetDefaults(...)
..................

To get out of it, have to delete KfDef and remove finalizer manually.

for example `type InstallationMode string` was previously failing when
converting the default value defined in a custom tag. Now there's a
check performed and if value can be convereted to a given type it will.

In all other cases it will return an error.
@bartoszmajsak
Copy link
Author

bartoszmajsak commented Sep 1, 2023

The operator enters a CrashLoopBackOff state, with the error

Thanks for spotting it! It's now fixed in ca7e9d8

in YAML: mode -> installationMode

values:

  • pre-installed
  • minimal

prometheus is gone

installation

Proposed installation modes are:

- minimal
- pre-installed
Copy link

@cam-garrison cam-garrison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tested some more, lgtm

@bartoszmajsak bartoszmajsak merged commit 38a59a5 into maistra:kf_ossm_plugin Sep 1, 2023
bartoszmajsak added a commit that referenced this pull request Sep 6, 2023
* chore: extracts smcp GVR to a var

and reuse it in smcp-related funcs

* fix: returns err on oauth creation failure

instead of nil by mistake

* feat: adds ability to enable feature based on predicates

in certain cases this information can only be determined at runtime, for example based on user-defined values of the plugin spec.

* feat: introduces control plane installation

coupled with wait condition

* fix: makes feature enabled by default

* chore: reworks smcp creation

It relies on owner reference so can be cleaned up as other resources

* fix: applies cleanups only if feature is enabled

Additionally moves OssmResourceTracker creation to .Apply func

* feat: reworks plugin integration to use Apply and Delete funcs of KfApp

This way we can cleanup the resources in case deploying service mesh manifests because creating those in Generate phase leaves them hanging due to failing Delete hook

* fix(reconcile): corrects KfDef status in case of any error

Additionally propagates the error through reconcile to keep trying, as it was ignored.

* fix: reworks SMCP component readiness check

* chore: improves KfApp func docs

* chore: reworks logging in verifiers

* chore: simplifies CRD existence check

* chore: adds tests for feature enablement

* fix: sets interval for SMCP polling to 5s

* fix: handles types which are derived for built-in ones

for example `type InstallationMode string` was previously failing when
converting the default value defined in a custom tag. Now there's a
check performed and if value can be convereted to a given type it will.

In all other cases it will return an error.

* fix: sets default control plane installation mode rely on existing
installation

Proposed installation modes are:

- minimal
- pre-installed
bartoszmajsak added a commit that referenced this pull request Sep 6, 2023
* chore: extracts smcp GVR to a var

and reuse it in smcp-related funcs

* fix: returns err on oauth creation failure

instead of nil by mistake

* feat: adds ability to enable feature based on predicates

in certain cases this information can only be determined at runtime, for example based on user-defined values of the plugin spec.

* feat: introduces control plane installation

coupled with wait condition

* fix: makes feature enabled by default

* chore: reworks smcp creation

It relies on owner reference so can be cleaned up as other resources

* fix: applies cleanups only if feature is enabled

Additionally moves OssmResourceTracker creation to .Apply func

* feat: reworks plugin integration to use Apply and Delete funcs of KfApp

This way we can cleanup the resources in case deploying service mesh manifests because creating those in Generate phase leaves them hanging due to failing Delete hook

* fix(reconcile): corrects KfDef status in case of any error

Additionally propagates the error through reconcile to keep trying, as it was ignored.

* fix: reworks SMCP component readiness check

* chore: improves KfApp func docs

* chore: reworks logging in verifiers

* chore: simplifies CRD existence check

* chore: adds tests for feature enablement

* fix: sets interval for SMCP polling to 5s

* fix: handles types which are derived for built-in ones

for example `type InstallationMode string` was previously failing when
converting the default value defined in a custom tag. Now there's a
check performed and if value can be convereted to a given type it will.

In all other cases it will return an error.

* fix: sets default control plane installation mode rely on existing
installation

Proposed installation modes are:

- minimal
- pre-installed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants