Skip to content

Commit

Permalink
Sync from odh to rhoai 2.11 (#287)
Browse files Browse the repository at this point in the history
* feat: increase QPS and Burst for client (opendatahub-io#1031)

- we might see throttling in some cluster, this is just to uplift the
default value

Signed-off-by: Wen Zhou <[email protected]>
(cherry picked from commit 54ee87d)

* kserve: configure servicemesh before deploying manifests (opendatahub-io#1019)

* kserve: configure servicemesh before deploying manifests

Jira: https://issues.redhat.com/browse/RHOAIENG-7312

kserve depends on odh-model-controller which it starts by deploying
manifests of Dependent Operator. The controller behaviour depends of
configuration (authorino) which is later deployed by configuring
servicemesh features. Here is a race, there are 2 checks in the
odh-model-controller for the presence of AuthorizationPolicy (which
is deployed by servicemesh configuration):

1) to add a type to the schema
2) to watch the objects of that type.

If the object appears in between odh-model-controller complains:

```
2024-05-16T06:46:03Z ERROR Reconciler error {"controller": "inferenceservice", "controllerGroup": "serving.kserve.io", "controllerKind": "InferenceService", "InferenceService": {"name":"xf","namespace":"single-model-test"}, "namespace": "single-model-test", "name": "xf", "reconcileID": "e6f42f44-1866-45d4-836a-69e4e93edef4", "error": "1 error occurred:\n\t* could not GET authconfig single-model-test/xf. cause no kind is registered for the type v1beta2.AuthConfig in scheme \"pkg/runtime/scheme.go:100\"\n\n"}   sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler    /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:329   sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem    /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:274   sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
 /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235

```

Move servicemesh configuration before deploying the manifests to
narrow the race window.

odh-model-controller will change their check in favor of checking
Authorino CRD to avoid the race completely.

Checking AuthorizationPolicy existance in operator/kserve would fix
it as well, but it's a delay for the reconcile loop (vs
odh-model-controller where it's done only on startup), so since
odh-model-controller is going to reimplement the check, keep it as
it is.

Also modelmeshserving component can deploy
odh-model-controller (thanks Vedant for pointing) if it is
enabled. The order is unspecified by due to implementation it will
happen before kserve configuration (order of field in the Components
structure).

Signed-off-by: Yauheni Kaliuta <[email protected]>

* kserve: get rid of extra enabled check for setupKserveConfig()

To avoid linter report:

components/kserve/kserve.go:97:1: cyclomatic complexity 31 of func `(*Kserve).ReconcileComponent` is high (> 30) (gocyclo)

move setupKserveConfig() call under "if enabled" branch below.

Signed-off-by: Yauheni Kaliuta <[email protected]>

---------

Signed-off-by: Yauheni Kaliuta <[email protected]>

* chore: adds godoc to Feature builder (opendatahub-io#1013)

As the first step of improving Feature DSL this PR brings godoc explaining purpose of each method in the builder chain.

(cherry picked from commit 5ce6306)

* fix: wrong path when use devFlag + wrong default value + special name in (opendatahub-io#1024)

trustyai

Signed-off-by: Wen Zhou <[email protected]>
(cherry picked from commit d9e78b4)

* chore: moves operator/subscriptions operations to pkg/cluster (opendatahub-io#1027)

They do not belong to pkg/deploy as they are about reading/writing cluster resources rather than deploying resources.

(cherry picked from commit d90983b)

* chore: Open up util functions for context propagation (opendatahub-io#1033)

context should be determined by the caller and propagated
down the call chain.

(cherry picked from commit 105adae)

* fix: missing label "opendatahub.io/generated-namespace" on auth (opendatahub-io#1038)

Signed-off-by: Wen Zhou <[email protected]>
(cherry picked from commit d6b108b)

* chore: append ownerRef to resources owned by Features (opendatahub-io#1039)

* rename / chg FeatureOwner func

* remove unused old ownerref func

(cherry picked from commit 6583645)

* Revert "chore: append ownerRef to resources owned by Features (opendatahub-io#1039)"

This reverts commit 6583645.

opendatahub-io#1039 (comment)

Signed-off-by: Yauheni Kaliuta <[email protected]>
(cherry picked from commit 952795a)

* Update Owners-aliases list (opendatahub-io#1040)

(cherry picked from commit cc9aecb)

* RHOAIENG-5426: Updated pull request template with prerequisites (opendatahub-io#1042)

* RHOAIENG-5426: Updated pull request template with prerequisites

* PR template changes

(cherry picked from commit 244ca13)

* chore: renames manifests source to location (opendatahub-io#1050)

- Source is already used elsewhere in Feature DSL, so we might want to
  avoid confusion
- Location fits the purpose of this field better

(cherry picked from commit 8921839)

* Fix trustyai changes

* cluster: GetPlatform: replace CSV list with OperatorExists calls (opendatahub-io#1051)

* tests: envtest: Add OperatorCondition CRD

Add OLM's[1] OperatorCondition/OperatorConditionList to the external
CRDs. Will be used in future patches.

[1] https://github.com/operator-framework/operator-lifecycle-manager/blob/master/deploy/upstream/manifests/0.18.3/0000_50_olm_00-operatorconditions.crd.yaml

Signed-off-by: Yauheni Kaliuta <[email protected]>

* tests: dscinitialization: add OperatorCondition CRD to schema

Following patches will change initialization to list the objects.

Signed-off-by: Yauheni Kaliuta <[email protected]>

* cluster: GetPlatform: replace CSV list with OperatorExists calls

Jira: https://issues.redhat.com/browse/RHOAIENG-8483

Depending of the way operators installed CSVs can be seen in all
namespace so listing of them causes producing N * M results (where N
is number of namespaces and M is number of CSVs) which is not
scalable in general and in practice causes timeouts on such large
clusters.

The function basically checks if ODH or RHOAI operator installed and
there is already such function in the package, OperatorExists(). So,
reuse it.

Signed-off-by: Yauheni Kaliuta <[email protected]>

---------

Signed-off-by: Yauheni Kaliuta <[email protected]>
(cherry picked from commit 261bbab)

* chore: remove duplicated platform call in each component (opendatahub-io#1055)

- get in DSC and pass into compoment

Signed-off-by: Wen Zhou <[email protected]>
(cherry picked from commit 1b04761)

* chore: update toolbox sdk version and remove duplicated addtoschema (opendatahub-io#1061)

Signed-off-by: Wen Zhou <[email protected]>
(cherry picked from commit 1b23c9f)

* chore: funcs to create kustomize plugins (opendatahub-io#1062)

The actual use with resMap is inlined and moved to the caller.

This way we can construct plugins on demand and use them as building blocks instead.

(cherry picked from commit 7bf56a4)

* Fix linter errors

* Update scheme

---------

Signed-off-by: Yauheni Kaliuta <[email protected]>
Co-authored-by: Wen Zhou <[email protected]>
Co-authored-by: Yauheni Kaliuta <[email protected]>
Co-authored-by: Bartosz Majsak <[email protected]>
Co-authored-by: Aslak Knutsen <[email protected]>
Co-authored-by: Cameron Garrison <[email protected]>
Co-authored-by: Saravana Balaji Srinivasan <[email protected]>
  • Loading branch information
7 people authored Jun 19, 2024
1 parent 797f83e commit 7127164
Show file tree
Hide file tree
Showing 50 changed files with 755 additions and 373 deletions.
32 changes: 32 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
<!--- Provide a general summary of your changes in the Title above -->

Many thanks for submitting your Pull Request ❤️!

Please make sure that your PR meets the following requirements:

- [ ] You have read the [contributors guide](https://github.com/opendatahub-io/opendatahub-operator/blob/incubation/CONTRIBUTING.md)
- [ ] Pull Request contains description of the issue
- [ ] Pull Request contains link to the JIRA issue
- [ ] Pull Request contains link to any dependent or related Pull Request

## Description
<!--- Describe your changes in detail -->

## JIRA issue:
<!--- Link your JIRA and related links here for reference. -->

## How Has This Been Tested?
<!--- Please describe in detail how you tested your changes. -->
<!--- Include details of your testing environment, and the tests you ran to -->
<!--- see how your change affects other areas of the code, etc. -->

## Screenshot or short clip:
<!--- Attach a screenshot or a short clip demonstrating the feature. -->

## Merge criteria:
<!--- This PR will be merged by any repository approver when it meets all the points in the checklist -->
<!--- Go over all the following points, and put an `x` in all the boxes that apply. -->

- [ ] Have a meaningful commit messages.
- [ ] Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
- [ ] The developer has manually tested the changes and verified that the changes work
2 changes: 1 addition & 1 deletion Dockerfiles/toolbox.Dockerfile
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
FROM registry.fedoraproject.org/fedora-toolbox:38

ARG GOLANG_VERSION=1.20
ARG OPERATOR_SDK_VERSION=1.24.1
ARG OPERATOR_SDK_VERSION=1.31.0

ENV GOLANG_VERSION=$GOLANG_VERSION \
OPERATOR_SDK_VERSION=$OPERATOR_SDK_VERSION
Expand Down
3 changes: 1 addition & 2 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -341,14 +341,13 @@ catalog-push: ## Push a catalog image.
$(MAKE) image-push IMG=$(CATALOG_IMG)

TOOLBOX_GOLANG_VERSION := 1.20
TOOLBOX_OPERATOR_SDK_VERSION := 1.24.1

# Generate a Toolbox container for locally testing changes easily
.PHONY: toolbox
toolbox: ## Create a toolbox instance with the proper Golang and Operator SDK versions
$(IMAGE_BUILDER) build \
--build-arg GOLANG_VERSION=$(TOOLBOX_GOLANG_VERSION) \
--build-arg OPERATOR_SDK_VERSION=$(TOOLBOX_OPERATOR_SDK_VERSION) \
--build-arg OPERATOR_SDK_VERSION=$(OPERATOR_SDK_VERSION) \
-f Dockerfiles/toolbox.Dockerfile -t opendatahub-toolbox .
$(IMAGE_BUILDER) stop opendatahub-toolbox ||:
toolbox rm opendatahub-toolbox ||:
Expand Down
8 changes: 0 additions & 8 deletions OWNERS
Original file line number Diff line number Diff line change
Expand Up @@ -2,13 +2,5 @@ approvers:
- platform
reviewers:
- platform
- serving
- ide
- dashboard
- datasciencepipelines
- modelregistry
- servicemesh
- distributedworkloads
- aiexplainability


40 changes: 9 additions & 31 deletions OWNERS_ALIASES
Original file line number Diff line number Diff line change
@@ -1,41 +1,19 @@
aliases:
platform:
- adelton
- AjayJagan
- ajaypratap003
- asanzgom
- biswassri
- CFSNM
- devguyio
- etirelli
- grdryn
- Jackdelahunt
- LaVLaS
- MarianMacik
- mattmahoneyrh
- Sara4994
- VaishnaviHire
- ykaliuta
- zdtsw
aiexplainability:
- RobGeada
- ruivieira
dashboard:
- andrewballantyne
- lucferbux
datasciencepipelines:
- gmfrasca
- HumairAK
distributedworkloads:
- astefanutti
- dimakis
ide:
- atheo89
- harshad16
modelregistry:
- dhirajsb
- rareddy
- tarilabs
servicemesh:
- aslakknutsen
- bartoszmajsak
- cam-garrison
serving:
- israel-hdez
- VedantMahabaleshwarkar




- zdtsw
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,7 @@ spec:
items:
properties:
contextDir:
default: ""
default: manifests
description: contextDir is the relative path to
the folder containing manifests in a repository
type: string
Expand Down Expand Up @@ -96,7 +96,7 @@ spec:
items:
properties:
contextDir:
default: ""
default: manifests
description: contextDir is the relative path to
the folder containing manifests in a repository
type: string
Expand Down Expand Up @@ -140,7 +140,7 @@ spec:
items:
properties:
contextDir:
default: ""
default: manifests
description: contextDir is the relative path to
the folder containing manifests in a repository
type: string
Expand Down Expand Up @@ -198,7 +198,7 @@ spec:
items:
properties:
contextDir:
default: ""
default: manifests
description: contextDir is the relative path to
the folder containing manifests in a repository
type: string
Expand Down Expand Up @@ -301,7 +301,7 @@ spec:
items:
properties:
contextDir:
default: ""
default: manifests
description: contextDir is the relative path to
the folder containing manifests in a repository
type: string
Expand Down Expand Up @@ -345,7 +345,7 @@ spec:
items:
properties:
contextDir:
default: ""
default: manifests
description: contextDir is the relative path to
the folder containing manifests in a repository
type: string
Expand Down Expand Up @@ -388,7 +388,7 @@ spec:
items:
properties:
contextDir:
default: ""
default: manifests
description: contextDir is the relative path to
the folder containing manifests in a repository
type: string
Expand Down Expand Up @@ -431,7 +431,7 @@ spec:
items:
properties:
contextDir:
default: ""
default: manifests
description: contextDir is the relative path to
the folder containing manifests in a repository
type: string
Expand Down Expand Up @@ -474,7 +474,7 @@ spec:
items:
properties:
contextDir:
default: ""
default: manifests
description: contextDir is the relative path to
the folder containing manifests in a repository
type: string
Expand Down Expand Up @@ -517,7 +517,7 @@ spec:
items:
properties:
contextDir:
default: ""
default: manifests
description: contextDir is the relative path to
the folder containing manifests in a repository
type: string
Expand Down
2 changes: 1 addition & 1 deletion bundle/tests/scorecard/config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ stages:
- entrypoint:
- scorecard-test
- basic-check-spec
image: quay.io/operator-framework/scorecard-test:v1.24.1
image: quay.io/operator-framework/scorecard-test:v1.31.0
labels:
suite: basic
test: basic-check-spec-test
Expand Down
21 changes: 12 additions & 9 deletions components/codeflare/codeflare.go
Original file line number Diff line number Diff line change
Expand Up @@ -57,7 +57,13 @@ func (c *CodeFlare) GetComponentName() string {
return ComponentName
}

func (c *CodeFlare) ReconcileComponent(ctx context.Context, cli client.Client, logger logr.Logger, owner metav1.Object, dscispec *dsciv1.DSCInitializationSpec, _ bool) error {
func (c *CodeFlare) ReconcileComponent(ctx context.Context,
cli client.Client,
logger logr.Logger,
owner metav1.Object,
dscispec *dsciv1.DSCInitializationSpec,
platform cluster.Platform,
_ bool) error {
l := c.ConfigComponentLogger(logger, ComponentName, dscispec)
var imageParamMap = map[string]string{
"codeflare-operator-controller-image": "RELATED_IMAGE_ODH_CODEFLARE_OPERATOR_IMAGE", // no need mcad, embedded in cfo
Expand All @@ -66,22 +72,19 @@ func (c *CodeFlare) ReconcileComponent(ctx context.Context, cli client.Client, l

enabled := c.GetManagementState() == operatorv1.Managed
monitoringEnabled := dscispec.Monitoring.ManagementState == operatorv1.Managed
platform, err := cluster.GetPlatform(cli)
if err != nil {
return err
}

if enabled {
if c.DevFlags != nil {
// Download manifests and update paths
if err = c.OverrideManifests(string(platform)); err != nil {
if err := c.OverrideManifests(string(platform)); err != nil {
return err
}
}
// check if the CodeFlare operator is installed: it should not be installed
// Both ODH and RHOAI should have the same operator name
dependentOperator := CodeflareOperator

if found, err := deploy.OperatorExists(cli, dependentOperator); err != nil {
if found, err := cluster.OperatorExists(cli, dependentOperator); err != nil {
return fmt.Errorf("operator exists throws error %w", err)
} else if found {
return fmt.Errorf("operator %s is found. Please uninstall the operator before enabling %s component",
Expand Down Expand Up @@ -115,10 +118,10 @@ func (c *CodeFlare) ReconcileComponent(ctx context.Context, cli client.Client, l
}

// inject prometheus codeflare*.rules in to /opt/manifests/monitoring/prometheus/prometheus-configs.yaml
if err = c.UpdatePrometheusConfig(cli, enabled && monitoringEnabled, ComponentName); err != nil {
if err := c.UpdatePrometheusConfig(cli, enabled && monitoringEnabled, ComponentName); err != nil {
return err
}
if err = deploy.DeployManifestsFromPath(cli, owner,
if err := deploy.DeployManifestsFromPath(cli, owner,
filepath.Join(deploy.DefaultManifestPath, "monitoring", "prometheus", "apps"),
dscispec.Monitoring.Namespace,
"prometheus", true); err != nil {
Expand Down
7 changes: 4 additions & 3 deletions components/component.go
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@ import (
"sigs.k8s.io/controller-runtime/pkg/client"

dsciv1 "github.com/opendatahub-io/opendatahub-operator/v2/apis/dscinitialization/v1"
"github.com/opendatahub-io/opendatahub-operator/v2/pkg/cluster"
"github.com/opendatahub-io/opendatahub-operator/v2/pkg/common"
)

Expand Down Expand Up @@ -68,9 +69,9 @@ type ManifestsConfig struct {
// +operator-sdk:csv:customresourcedefinitions:type=spec,order=1
URI string `json:"uri,omitempty"`

// contextDir is the relative path to the folder containing manifests in a repository
// contextDir is the relative path to the folder containing manifests in a repository, default value "manifests"
// +optional
// +kubebuilder:default:=""
// +kubebuilder:default:="manifests"
// +operator-sdk:csv:customresourcedefinitions:type=spec,order=2
ContextDir string `json:"contextDir,omitempty"`

Expand All @@ -83,7 +84,7 @@ type ManifestsConfig struct {

type ComponentInterface interface {
ReconcileComponent(ctx context.Context, cli client.Client, logger logr.Logger,
owner metav1.Object, DSCISpec *dsciv1.DSCInitializationSpec, currentComponentStatus bool) error
owner metav1.Object, DSCISpec *dsciv1.DSCInitializationSpec, platform cluster.Platform, currentComponentStatus bool) error
Cleanup(cli client.Client, DSCISpec *dsciv1.DSCInitializationSpec) error
GetComponentName() string
GetManagementState() operatorv1.ManagementState
Expand Down
18 changes: 8 additions & 10 deletions components/dashboard/dashboard.go
Original file line number Diff line number Diff line change
Expand Up @@ -87,13 +87,11 @@ func (d *Dashboard) ReconcileComponent(ctx context.Context,
logger logr.Logger,
owner metav1.Object,
dscispec *dsciv1.DSCInitializationSpec,
platform cluster.Platform,
currentComponentExist bool,
) error {
var l logr.Logger
platform, err := cluster.GetPlatform(cli)
if err != nil {
return err
}

if platform == cluster.SelfManagedRhods || platform == cluster.ManagedRhods {
l = d.ConfigComponentLogger(logger, ComponentNameSupported, dscispec)
} else {
Expand Down Expand Up @@ -126,14 +124,14 @@ func (d *Dashboard) ReconcileComponent(ctx context.Context,

// 2. platform specific RBAC
if platform == cluster.OpenDataHub || platform == "" {
err := cluster.UpdatePodSecurityRolebinding(cli, dscispec.ApplicationsNamespace, "odh-dashboard")
err := cluster.UpdatePodSecurityRolebinding(ctx, cli, dscispec.ApplicationsNamespace, "odh-dashboard")
if err != nil {
return err
}
}

if platform == cluster.SelfManagedRhods || platform == cluster.ManagedRhods {
err := cluster.UpdatePodSecurityRolebinding(cli, dscispec.ApplicationsNamespace, "rhods-dashboard")
err := cluster.UpdatePodSecurityRolebinding(ctx, cli, dscispec.ApplicationsNamespace, "rhods-dashboard")
if err != nil {
return err
}
Expand All @@ -152,7 +150,7 @@ func (d *Dashboard) ReconcileComponent(ctx context.Context,
switch platform {
case cluster.SelfManagedRhods, cluster.ManagedRhods:
// anaconda
if err := cluster.CreateSecret(cli, "anaconda-ce-access", dscispec.ApplicationsNamespace); err != nil {
if err := cluster.CreateSecret(ctx, cli, "anaconda-ce-access", dscispec.ApplicationsNamespace); err != nil {
return fmt.Errorf("failed to create access-secret for anaconda: %w", err)
}
// overlay which including ../../base + anaconda-ce-validator
Expand Down Expand Up @@ -183,7 +181,7 @@ func (d *Dashboard) ReconcileComponent(ctx context.Context,
if err := d.UpdatePrometheusConfig(cli, enabled && monitoringEnabled, ComponentNameSupported); err != nil {
return err
}
if err = deploy.DeployManifestsFromPath(cli, owner,
if err := deploy.DeployManifestsFromPath(cli, owner,
filepath.Join(deploy.DefaultManifestPath, "monitoring", "prometheus", "apps"),
dscispec.Monitoring.Namespace,
"prometheus", true); err != nil {
Expand All @@ -194,11 +192,11 @@ func (d *Dashboard) ReconcileComponent(ctx context.Context,
return nil
default:
// base
if err = deploy.DeployManifestsFromPath(cli, owner, Path, dscispec.ApplicationsNamespace, ComponentName, enabled); err != nil {
if err := deploy.DeployManifestsFromPath(cli, owner, Path, dscispec.ApplicationsNamespace, ComponentName, enabled); err != nil {
return err
}
// ISV
if err = deploy.DeployManifestsFromPath(cli, owner, PathISV, dscispec.ApplicationsNamespace, ComponentName, enabled); err != nil {
if err := deploy.DeployManifestsFromPath(cli, owner, PathISV, dscispec.ApplicationsNamespace, ComponentName, enabled); err != nil {
return err
}
// consolelink
Expand Down
Loading

0 comments on commit 7127164

Please sign in to comment.