add mpi-operator(v1) to the unified operator #1457

Merged: 3 commits, Nov 19, 2021
3 changes: 2 additions & 1 deletion cmd/training-operator.v1/main.go
@@ -33,6 +33,7 @@ import (
	"sigs.k8s.io/controller-runtime/pkg/healthz"
	"sigs.k8s.io/controller-runtime/pkg/log/zap"

	mpiv1 "github.com/kubeflow/training-operator/pkg/apis/mpi/v1"
	mxnetv1 "github.com/kubeflow/training-operator/pkg/apis/mxnet/v1"
	pytorchv1 "github.com/kubeflow/training-operator/pkg/apis/pytorch/v1"
	tensorflowv1 "github.com/kubeflow/training-operator/pkg/apis/tensorflow/v1"
@@ -47,11 +48,11 @@ var (

func init() {
	utilruntime.Must(clientgoscheme.AddToScheme(scheme))

	utilruntime.Must(xgboostv1.AddToScheme(scheme))
	utilruntime.Must(pytorchv1.AddToScheme(scheme))
	utilruntime.Must(tensorflowv1.AddToScheme(scheme))
	utilruntime.Must(mxnetv1.AddToScheme(scheme))
	utilruntime.Must(mpiv1.AddToScheme(scheme))
	//+kubebuilder:scaffold:scheme
}
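
Editorial note: the `init` function above follows a register-or-panic pattern, where each API package contributes an `AddToScheme` function and `utilruntime.Must` panics on a non-nil error. The following is a minimal sketch of that pattern using only the standard library. `toyScheme`, `must`, `addKind`, and `registerAll` are hypothetical stand-ins, not the real `runtime.Scheme` API, and treating a duplicate registration as an error is an assumption made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// toyScheme is a hypothetical stand-in for runtime.Scheme: it only records
// which kinds have been registered.
type toyScheme map[string]bool

// must mirrors the behaviour of utilruntime.Must: panic on a non-nil error.
func must(err error) {
	if err != nil {
		panic(err)
	}
}

// addKind returns a registration function for one kind, playing the role of
// a per-API AddToScheme function. Registering the same kind twice is treated
// as an error here (an assumption made for illustration).
func addKind(kind string) func(toyScheme) error {
	return func(s toyScheme) error {
		if s[kind] {
			return errors.New("kind already registered: " + kind)
		}
		s[kind] = true
		return nil
	}
}

// registerAll registers every job kind served by the unified operator,
// with MPIJob added last, as in the diff above.
func registerAll(s toyScheme) {
	for _, kind := range []string{"XGBoostJob", "PyTorchJob", "TFJob", "MXJob", "MPIJob"} {
		must(addKind(kind)(s))
	}
}

func main() {
	s := toyScheme{}
	registerAll(s)
	fmt.Println("MPIJob registered:", s["MPIJob"])
}
```

Because registration happens in `init`, a failure would surface at process start rather than on the first reconcile of an MPIJob.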

62 changes: 62 additions & 0 deletions manifests/base/cluster-role.yaml
@@ -9,10 +9,12 @@ rules:
- apiGroups:
  - kubeflow.org
  resources:
  - mpijobs
  - tfjobs
  - mxjobs
  - pytorchjobs
  - xgboostjobs
  - mpijobs/status
  - tfjobs/status
  - pytorchjobs/status
  - mxjobs/status
@@ -41,6 +43,66 @@ rules:
  - deployments
  verbs:
  - "*"
# This is needed for the launcher Role.
hackerboy01 marked this conversation as resolved.
- apiGroups:
  - ""
  resources:
  - pods/exec
  verbs:
  - create
- apiGroups:
  - ""
  resources:
hackerboy01 marked this conversation as resolved.
  - endpoints
  verbs:
  - create
  - get
  - update
- apiGroups:
  - ""
  resources:
  - events
  verbs:
  - create
  - patch
- apiGroups:
  - rbac.authorization.k8s.io
  resources:
  - roles
  - rolebindings
  verbs:
  - create
  - list
  - watch
  - update
- apiGroups:
  - ""
  resources:
  - configmaps
  - secrets
  - services
  - serviceaccounts
  verbs:
  - create
  - list
  - watch
  - update
- apiGroups:
  - policy
  resources:
  - poddisruptionbudgets

I don't recall a need for this either

+1

  verbs:
  - create
  - list
  - update
  - watch
- apiGroups:
  - apiextensions.k8s.io
  resources:
  - customresourcedefinitions

Could you remind me what CRD needs to be created?

Uhm... I wouldn't expect the controller to create CRDs

  verbs:
  - create
  - get
- apiGroups:
  - scheduling.volcano.sh
  resources:
6,951 changes: 6,951 additions & 0 deletions manifests/base/crds/kubeflow.org_mpijobs.yaml

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions manifests/base/crds/kustomization.yaml
@@ -5,3 +5,4 @@ resources:
- kubeflow.org_mxjobs.yaml
- kubeflow.org_pytorchjobs.yaml
- kubeflow.org_xgboostjobs.yaml
- kubeflow.org_mpijobs.yaml
38 changes: 38 additions & 0 deletions pkg/apis/mpi/v1/constants.go
@@ -0,0 +1,38 @@
// Copyright 2019 The Kubeflow Authors

2021 here and other files

//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package v1

import common "github.com/kubeflow/common/pkg/apis/common/v1"

const (
	// EnvKubeflowNamespace is ENV for kubeflow namespace specified by user.
	EnvKubeflowNamespace = "KUBEFLOW_NAMESPACE"
	// DefaultPortName is name of the port used to communicate between Launcher and Workers.
	DefaultPortName = "mpi-port"
	// DefaultContainerName is the name of the MPIJob container.
	DefaultContainerName = "mpi"
	// DefaultPort is default value of the port.
	DefaultPort = 9999
	// DefaultRestartPolicy is default RestartPolicy for ReplicaSpec.
	DefaultRestartPolicy = common.RestartPolicyNever
	// Kind is the kind name.
	Kind = "MPIJob"
	// Plural is the plural for MPIJob.
	Plural = "mpijobs"
	// Singular is the singular for MPIJob.
	Singular = "mpijob"
	// FrameworkName is the name of the ML Framework
	FrameworkName = "mpi"
)
58 changes: 58 additions & 0 deletions pkg/apis/mpi/v1/default.go
@@ -0,0 +1,58 @@
// Copyright 2019 The Kubeflow Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package v1

import (
	common "github.com/kubeflow/common/pkg/apis/common/v1"
	"k8s.io/apimachinery/pkg/runtime"
)

// Int32 is a helper routine that allocates a new int32 value
// to store v and returns a pointer to it.
func Int32(v int32) *int32 {
	return &v
}
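
Editorial note: a helper like `Int32` exists because Go does not allow taking the address of a literal or a conversion (`&int32(1)` does not compile), while optional API fields are commonly pointers so that "unset" can be told apart from zero. A small sketch follows; `replicaSpec` is a hypothetical stand-in type, not the real `common.ReplicaSpec`.

```go
package main

import "fmt"

// Int32 allocates a new int32 holding v and returns a pointer to it,
// matching the helper in default.go.
func Int32(v int32) *int32 {
	return &v
}

// replicaSpec is a hypothetical stand-in with one optional field.
type replicaSpec struct {
	Replicas *int32 // nil means "not set by the user"
}

func main() {
	// The helper makes it possible to set a pointer field from a literal.
	set := replicaSpec{Replicas: Int32(1)}
	unset := replicaSpec{}
	fmt.Println(*set.Replicas, unset.Replicas == nil)
}
```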

func addDefaultingFuncs(scheme *runtime.Scheme) error {
	return RegisterDefaults(scheme)
}

// setDefaultsTypeLauncher sets the default value to launcher.
func setDefaultsTypeLauncher(spec *common.ReplicaSpec) {
	if spec != nil && spec.RestartPolicy == "" {
		spec.RestartPolicy = DefaultRestartPolicy
	}
}

// setDefaultsTypeWorker sets the default value to worker.
func setDefaultsTypeWorker(spec *common.ReplicaSpec) {
	if spec != nil && spec.RestartPolicy == "" {
		spec.RestartPolicy = DefaultRestartPolicy
	}
}

// SetDefaults_MPIJob sets any unspecified values to defaults.
func SetDefaults_MPIJob(mpiJob *MPIJob) {
	// Set default cleanpod policy to None.
	if mpiJob.Spec.CleanPodPolicy == nil {
		none := common.CleanPodPolicyNone
		mpiJob.Spec.CleanPodPolicy = &none
	}

	// set default to Launcher
	setDefaultsTypeLauncher(mpiJob.Spec.MPIReplicaSpecs[MPIReplicaTypeLauncher])

	// set default to Worker
	setDefaultsTypeWorker(mpiJob.Spec.MPIReplicaSpecs[MPIReplicaTypeWorker])
}
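
Editorial note: the defaulting logic above relies on two Go behaviours. Indexing a map with a missing key (or indexing a nil map) yields the zero value, here a nil pointer, and the per-replica helpers guard against nil before writing. The sketch below reproduces that shape on simplified stand-in types. The type names, the string values of the constants, and the merged helper `setDefaultRestartPolicy` are assumptions for illustration, not the real `kubeflow/common` definitions.

```go
package main

import "fmt"

// Simplified stand-ins for the kubeflow/common and MPIJob API types;
// only the fields used by the defaulting logic are kept.
type RestartPolicy string
type CleanPodPolicy string
type ReplicaType string

const (
	RestartPolicyNever  RestartPolicy  = "Never"
	CleanPodPolicyNone  CleanPodPolicy = "None"
	ReplicaTypeLauncher ReplicaType    = "Launcher"
	ReplicaTypeWorker   ReplicaType    = "Worker"
)

type ReplicaSpec struct{ RestartPolicy RestartPolicy }

type MPIJobSpec struct {
	CleanPodPolicy  *CleanPodPolicy
	MPIReplicaSpecs map[ReplicaType]*ReplicaSpec
}

// setDefaultRestartPolicy fills in the restart policy only when it is unset.
// A nil spec (replica type absent from the map) is left alone.
func setDefaultRestartPolicy(spec *ReplicaSpec) {
	if spec != nil && spec.RestartPolicy == "" {
		spec.RestartPolicy = RestartPolicyNever
	}
}

// setDefaults mirrors the shape of SetDefaults_MPIJob on the stand-in types.
func setDefaults(spec *MPIJobSpec) {
	if spec.CleanPodPolicy == nil {
		none := CleanPodPolicyNone
		spec.CleanPodPolicy = &none
	}
	// A missing key yields a nil pointer, which the helper tolerates.
	setDefaultRestartPolicy(spec.MPIReplicaSpecs[ReplicaTypeLauncher])
	setDefaultRestartPolicy(spec.MPIReplicaSpecs[ReplicaTypeWorker])
}

func main() {
	spec := MPIJobSpec{MPIReplicaSpecs: map[ReplicaType]*ReplicaSpec{
		ReplicaTypeLauncher: {},
		ReplicaTypeWorker:   {RestartPolicy: "OnFailure"},
	}}
	setDefaults(&spec)
	fmt.Println(*spec.CleanPodPolicy,
		spec.MPIReplicaSpecs[ReplicaTypeLauncher].RestartPolicy,
		spec.MPIReplicaSpecs[ReplicaTypeWorker].RestartPolicy)
}
```

In the PR, the launcher and worker helpers have identical bodies; the sketch merges them into one function only to keep the example short.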
20 changes: 20 additions & 0 deletions pkg/apis/mpi/v1/doc.go
@@ -0,0 +1,20 @@
// Copyright 2018 The Kubeflow Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

// +k8s:defaulter-gen=TypeMeta
// +k8s:openapi-gen=true

// Package v1 is the v1 version of the API.
// +groupName=kubeflow.org
package v1
38 changes: 38 additions & 0 deletions pkg/apis/mpi/v1/groupversion_info.go
@@ -0,0 +1,38 @@
/*
Copyright 2021.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

// Package v1 contains API Schema definitions for the kubeflow.org v1 API group
//+kubebuilder:object:generate=true
//+groupName=kubeflow.org
package v1

why do we have a doc.go file if there is already this comment here?


import (
	"k8s.io/apimachinery/pkg/runtime/schema"
	"sigs.k8s.io/controller-runtime/pkg/scheme"
)

var (
	// GroupVersion is group version used to register these objects
	GroupVersion = schema.GroupVersion{Group: "kubeflow.org", Version: "v1"}

	// SchemeGroupVersionKind is the group, version, and kind of the MPIJob resource.
	SchemeGroupVersionKind = schema.GroupVersionKind{Group: "kubeflow.org", Version: "v1", Kind: Kind}

	// SchemeBuilder is used to add go types to the GroupVersionKind scheme
	SchemeBuilder = &scheme.Builder{GroupVersion: GroupVersion}

	// AddToScheme adds the types in this group-version to the given scheme.
	AddToScheme = SchemeBuilder.AddToScheme
)
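
Editorial note: the group, version, and kind declared above are what end up in the `apiVersion` and `kind` fields of an MPIJob manifest. The sketch below shows that relationship on simplified stand-in types. `groupVersion`, `groupVersionKind`, and `withKind` are hypothetical local names; the real `schema.GroupVersion` type offers a comparable `WithKind` method, so `SchemeGroupVersionKind` could plausibly be derived from `GroupVersion` instead of repeating the group and version strings.

```go
package main

import "fmt"

// groupVersion is a simplified stand-in for schema.GroupVersion.
type groupVersion struct{ Group, Version string }

// String renders the value used in a manifest's apiVersion field.
// The core group (empty Group) is rendered as the bare version.
func (gv groupVersion) String() string {
	if gv.Group == "" {
		return gv.Version
	}
	return gv.Group + "/" + gv.Version
}

// groupVersionKind is a simplified stand-in for schema.GroupVersionKind.
type groupVersionKind struct{ Group, Version, Kind string }

// withKind combines a group-version with a kind.
func (gv groupVersion) withKind(kind string) groupVersionKind {
	return groupVersionKind{gv.Group, gv.Version, kind}
}

func main() {
	gv := groupVersion{Group: "kubeflow.org", Version: "v1"}
	gvk := gv.withKind("MPIJob")
	fmt.Printf("apiVersion: %s\nkind: %s\n", gv, gvk.Kind)
}
```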