Skip to content

Commit

Permalink
kserve: configure servicemesh before deploying manifests (opendatahu…
Browse files Browse the repository at this point in the history
…b-io#1019)

* kserve: configure servicemesh before deploying manifests

Jira: https://issues.redhat.com/browse/RHOAIENG-7312

kserve depends on odh-model-controller which it starts by deploying
manifests of Dependent Operator. The controller behaviour depends of
configuration (authorino) which is later deployed by configuring
servicemesh features. Here is a race, there are 2 checks in the
odh-model-controller for the presence of AuthorizationPolicy (which
is deployed by servicemesh configuration):

1) to add a type to the schema
2) to watch the objects of that type.

If the object appears in between odh-model-controller complains:

```
2024-05-16T06:46:03Z ERROR Reconciler error {"controller": "inferenceservice", "controllerGroup": "serving.kserve.io", "controllerKind": "InferenceService", "InferenceService": {"name":"xf","namespace":"single-model-test"}, "namespace": "single-model-test", "name": "xf", "reconcileID": "e6f42f44-1866-45d4-836a-69e4e93edef4", "error": "1 error occurred:\n\t* could not GET authconfig single-model-test/xf. cause no kind is registered for the type v1beta2.AuthConfig in scheme \"pkg/runtime/scheme.go:100\"\n\n"}   sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler    /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:329   sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem    /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:274   sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
 /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235

```

Move servicemesh configuration before deploying the manifests to
narrow the race window.

odh-model-controller will change their check in favor of checking
Authorino CRD to avoid the race completely.

Checking AuthorizationPolicy existance in operator/kserve would fix
it as well, but it's a delay for the reconcile loop (vs
odh-model-controller where it's done only on startup), so since
odh-model-controller is going to reimplement the check, keep it as
it is.

Also modelmeshserving component can deploy
odh-model-controller (thanks Vedant for pointing) if it is
enabled. The order is unspecified by due to implementation it will
happen before kserve configuration (order of field in the Components
structure).

Signed-off-by: Yauheni Kaliuta <[email protected]>

* kserve: get rid of extra enabled check for setupKserveConfig()

To avoid linter report:

components/kserve/kserve.go:97:1: cyclomatic complexity 31 of func `(*Kserve).ReconcileComponent` is high (> 30) (gocyclo)

move setupKserveConfig() call under "if enabled" branch below.

Signed-off-by: Yauheni Kaliuta <[email protected]>

---------

Signed-off-by: Yauheni Kaliuta <[email protected]>
  • Loading branch information
ykaliuta authored May 24, 2024
1 parent 20a8c53 commit df469bb
Showing 1 changed file with 9 additions and 5 deletions.
14 changes: 9 additions & 5 deletions components/kserve/kserve.go
Original file line number Diff line number Diff line change
Expand Up @@ -136,18 +136,22 @@ func (k *Kserve) ReconcileComponent(ctx context.Context, cli client.Client,
}
}

if err = k.configureServiceMesh(cli, dscispec); err != nil {
return fmt.Errorf("failed configuring service mesh while reconciling kserve component. cause: %w", err)
}

if err := deploy.DeployManifestsFromPath(cli, owner, Path, dscispec.ApplicationsNamespace, ComponentName, enabled); err != nil {
return fmt.Errorf("failed to apply manifests from %s : %w", Path, err)
}

l.WithValues("Path", Path).Info("apply manifests done for kserve")

if enabled {
if err := k.setupKserveConfig(ctx, cli, dscispec); err != nil {
return err
}
}
l.WithValues("Path", Path).Info("apply manifests done for kserve")
// For odh-model-controller
if enabled {

// For odh-model-controller
if err := cluster.UpdatePodSecurityRolebinding(cli, dscispec.ApplicationsNamespace, "odh-model-controller"); err != nil {
return err
}
Expand Down Expand Up @@ -182,7 +186,7 @@ func (k *Kserve) ReconcileComponent(ctx context.Context, cli client.Client,
l.Info("updating SRE monitoring done")
}

return k.configureServiceMesh(cli, dscispec)
return nil
}

func (k *Kserve) Cleanup(cli client.Client, instance *dsciv1.DSCInitializationSpec) error {
Expand Down

0 comments on commit df469bb

Please sign in to comment.