fix: Limit parallel builds on operator
- Avoid many parallel integration builds
- Monitor all builds started by the operator instance and limit max number of running builds according to given setting
- By default use max running builds limit = 3 for build strategy routine
- By default use max running builds limit = 10 for build strategy pod
- Add max running builds setting to IntegrationPlatform
- Add some documentation on build strategy and build queues
christophd committed Apr 6, 2023
1 parent 7747ea6 commit 5e66e20
Showing 20 changed files with 501 additions and 48 deletions.
5 changes: 5 additions & 0 deletions config/crd/bases/camel.apache.org_builds.yaml
@@ -79,6 +79,11 @@ spec:
spec:
  description: BuildSpec defines the Build operation to be executed
  properties:
    maxRunningBuilds:
      description: the maximum amount of parallel running builds started
        by this operator instance
      format: int32
      type: integer
    operatorNamespace:
      description: The namespace where to run the builder Pod (must be the
        same of the operator in charge of this Build reconciliation).
10 changes: 10 additions & 0 deletions config/crd/bases/camel.apache.org_integrationplatforms.yaml
@@ -238,6 +238,11 @@ spec:
type: object
type: object
type: object
maxRunningBuilds:
  description: the maximum amount of parallel running builds started
    by this operator instance
  format: int32
  type: integer
publishStrategy:
  description: the strategy to adopt for publishing an Integration
    base image
@@ -1784,6 +1789,11 @@ spec:
type: object
type: object
type: object
maxRunningBuilds:
  description: the maximum amount of parallel running builds started
    by this operator instance
  format: int32
  type: integer
publishStrategy:
  description: the strategy to adopt for publishing an Integration
    base image
31 changes: 31 additions & 0 deletions docs/modules/ROOT/pages/architecture/cr/build.adoc
@@ -3,6 +3,8 @@

A *Build* resource describes the process of assembling a container image that meets the requirements of an xref:architecture/cr/integration.adoc[Integration] or xref:architecture/cr/integration-kit.adoc[IntegrationKit].

The result of a build is an xref:architecture/cr/integration-kit.adoc[IntegrationKit] that can and should be reused for multiple xref:architecture/cr/integration.adoc[Integrations].

[source,go]
----
type Build struct {
@@ -25,3 +27,32 @@ the full go definition can be found https://github.com/apache/camel-k/blob/main/

image::architecture/camel-k-state-machine-build.png[life cycle]

[[build-strategy]]
= Build strategy

You can choose from different build strategies. The build strategy defines how a build should be executed.
At the moment the available strategies are:

- buildStrategy: pod (each build is run in a separate pod, the operator monitors the pod state)
- buildStrategy: routine (each build is run as a goroutine inside the operator pod)

[[build-queue]]
= Build queues

IntegrationKits and their base images should be reused for multiple Integrations in order to
manage resources efficiently and to optimize build and startup times for Camel K Integrations.

To reuse images, the operator queues builds in sequential order.
This way the operator is able to use efficient image layering for Integrations.

By default, builds are queued sequentially based on their layout (e.g. native, fast-jar) and the build namespace.
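
The queuing rule described here can be illustrated with a small, self-contained sketch. It is not the operator's code: the `build` struct, the `mustWait` function and the plain string values for layouts and phases are hypothetical stand-ins for the Camel K API types, and the logic only mirrors what the `canSchedule` function in this commit appears to do (native builds are never queued, other builds wait while a build with the same layout in the same namespace is pending or running).

```go
package main

import "fmt"

// build is a hypothetical, simplified stand-in for the Camel K Build resource.
type build struct {
	namespace string
	layout    string // e.g. "native" or "fast-jar"
	phase     string // e.g. "Pending" or "Running"
}

// mustWait reports whether candidate has to stay queued because another
// build with the same layout in the same namespace is pending or running.
// Native builds are assumed never to wait on each other.
func mustWait(candidate build, others []build) bool {
	if candidate.layout == "native" {
		return false
	}
	for _, b := range others {
		if b.namespace != candidate.namespace || b.layout != candidate.layout {
			continue
		}
		if b.phase == "Pending" || b.phase == "Running" {
			return true
		}
	}
	return false
}

func main() {
	others := []build{{namespace: "team-a", layout: "fast-jar", phase: "Running"}}

	fmt.Println(mustWait(build{namespace: "team-a", layout: "fast-jar"}, others)) // same queue
	fmt.Println(mustWait(build{namespace: "team-b", layout: "fast-jar"}, others)) // other namespace
	fmt.Println(mustWait(build{namespace: "team-a", layout: "native"}, others))   // native layout
}
```

Only the first call reports that the build has to wait; the other two fall into a different queue or are exempt from queuing.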

To avoid running many builds in parallel, the operator applies a maximum number of running builds setting that limits the
number of builds running at the same time.

You can set this limit in the xref:architecture/cr/integration-platform.adoc[IntegrationPlatform] settings.

The default value for this limit depends on the build strategy:

- buildStrategy: pod (MaxRunningBuilds=10)
- buildStrategy: routine (MaxRunningBuilds=3)
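
As a rough illustration of how these defaults could combine with an explicit setting, the sketch below picks the effective limit. The helper name `effectiveMaxRunningBuilds` and the string constants are hypothetical; the "values greater than zero count as explicitly configured" rule is an assumption borrowed from the `if o.MaxRunningBuilds > 0` check in `pkg/cmd/install.go`, and treating an unknown strategy like `pod` is purely an assumption of this sketch.

```go
package main

import "fmt"

// Local stand-ins for the documented strategy names; these are not the
// Camel K API constants.
const (
	strategyPod     = "pod"
	strategyRoutine = "routine"
)

// effectiveMaxRunningBuilds is a hypothetical helper. A configured value
// greater than zero wins; otherwise the documented per-strategy default
// applies (10 for pod, 3 for routine).
func effectiveMaxRunningBuilds(strategy string, configured int32) int32 {
	if configured > 0 {
		return configured
	}
	switch strategy {
	case strategyRoutine:
		return 3
	default:
		// Assumption: anything that is not "routine" is treated like "pod".
		return 10
	}
}

func main() {
	fmt.Println(effectiveMaxRunningBuilds(strategyPod, 0))     // documented pod default
	fmt.Println(effectiveMaxRunningBuilds(strategyRoutine, 0)) // documented routine default
	fmt.Println(effectiveMaxRunningBuilds(strategyRoutine, 5)) // explicit setting wins
}
```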
14 changes: 14 additions & 0 deletions docs/modules/ROOT/partials/apis/camel-k-crds.adoc
@@ -453,6 +453,13 @@ The Build deadline is set to the Build start time plus the Timeout duration.
If the Build deadline is exceeded, the Build context is canceled,
and its phase set to BuildPhaseFailed.
|`maxRunningBuilds` +
int32
|
the maximum amount of parallel running builds started by this operator instance
|===
@@ -1898,6 +1905,13 @@ map[string]string
Generic options that can be used by each publish strategy
|`maxRunningBuilds` +
int32
|
the maximum amount of parallel running builds started by this operator instance
|===
5 changes: 5 additions & 0 deletions helm/camel-k/crds/crd-build.yaml
@@ -79,6 +79,11 @@ spec:
spec:
  description: BuildSpec defines the Build operation to be executed
  properties:
    maxRunningBuilds:
      description: the maximum amount of parallel running builds started
        by this operator instance
      format: int32
      type: integer
    operatorNamespace:
      description: The namespace where to run the builder Pod (must be the
        same of the operator in charge of this Build reconciliation).
10 changes: 10 additions & 0 deletions helm/camel-k/crds/crd-integration-platform.yaml
@@ -238,6 +238,11 @@ spec:
type: object
type: object
type: object
maxRunningBuilds:
  description: the maximum amount of parallel running builds started
    by this operator instance
  format: int32
  type: integer
publishStrategy:
  description: the strategy to adopt for publishing an Integration
    base image
@@ -1784,6 +1789,11 @@ spec:
type: object
type: object
type: object
maxRunningBuilds:
  description: the maximum amount of parallel running builds started
    by this operator instance
  format: int32
  type: integer
publishStrategy:
  description: the strategy to adopt for publishing an Integration
    base image
2 changes: 2 additions & 0 deletions pkg/apis/camel/v1/build_types.go
@@ -41,6 +41,8 @@ type BuildSpec struct {
	// and its phase set to BuildPhaseFailed.
	// +kubebuilder:validation:Format=duration
	Timeout metav1.Duration `json:"timeout,omitempty"`
	// the maximum amount of parallel running builds started by this operator instance
	MaxRunningBuilds int32 `json:"maxRunningBuilds,omitempty"`
}

// Task represents the abstract task. Only one of the tasks should be configured to represent the specific task chosen.
2 changes: 2 additions & 0 deletions pkg/apis/camel/v1/integrationplatform_types.go
@@ -126,6 +126,8 @@ type IntegrationPlatformBuildSpec struct {
	Maven MavenSpec `json:"maven,omitempty"`
	// Generic options that can be used by each publish strategy
	PublishStrategyOptions map[string]string `json:"PublishStrategyOptions,omitempty"`
	// the maximum amount of parallel running builds started by this operator instance
	MaxRunningBuilds int32 `json:"maxRunningBuilds,omitempty"`
}

// IntegrationPlatformKameletSpec defines the behavior for all the Kamelets controlled by the IntegrationPlatform
9 changes: 9 additions & 0 deletions pkg/client/camel/applyconfiguration/camel/v1/buildspec.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.


5 changes: 5 additions & 0 deletions pkg/cmd/install.go
@@ -197,6 +197,7 @@ type installCmdOptions struct {
	MavenCASecret    string   `mapstructure:"maven-ca-secret"`
	MavenCLIOptions  []string `mapstructure:"maven-cli-options"`
	HealthPort       int32    `mapstructure:"health-port"`
	MaxRunningBuilds int32    `mapstructure:"max-running-builds"`
	Monitoring       bool     `mapstructure:"monitoring"`
	MonitoringPort   int32    `mapstructure:"monitoring-port"`
	TraitProfile     string   `mapstructure:"trait-profile"`
@@ -539,6 +540,10 @@ func (o *installCmdOptions) setupIntegrationPlatform(
			Duration: d,
		}
	}
	if o.MaxRunningBuilds > 0 {
		platform.Spec.Build.MaxRunningBuilds = o.MaxRunningBuilds
	}

	if o.TraitProfile != "" {
		platform.Spec.Profile = v1.TraitProfileByName(o.TraitProfile)
	}
8 changes: 6 additions & 2 deletions pkg/controller/build/build_controller.go
@@ -142,19 +142,23 @@ func (r *reconcileBuild) Reconcile(ctx context.Context, request reconcile.Reques

 	var actions []Action
 
+	buildMonitor := Monitor{
+		maxRunningBuilds: instance.Spec.MaxRunningBuilds,
+	}
+
 	switch instance.Spec.Strategy {
 	case v1.BuildStrategyPod:
 		actions = []Action{
 			newInitializePodAction(r.reader),
-			newScheduleAction(r.reader),
+			newScheduleAction(r.reader, buildMonitor),
 			newMonitorPodAction(r.reader),
 			newErrorRecoveryAction(),
 			newErrorAction(),
 		}
 	case v1.BuildStrategyRoutine:
 		actions = []Action{
 			newInitializeRoutineAction(),
-			newScheduleAction(r.reader),
+			newScheduleAction(r.reader, buildMonitor),
 			newMonitorRoutineAction(),
 			newErrorRecoveryAction(),
 			newErrorAction(),
107 changes: 107 additions & 0 deletions pkg/controller/build/build_monitor.go
@@ -0,0 +1,107 @@
/*
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package build

import (
	"context"
	"sync"

	"k8s.io/apimachinery/pkg/labels"
	"k8s.io/apimachinery/pkg/selection"
	"k8s.io/apimachinery/pkg/types"
	ctrl "sigs.k8s.io/controller-runtime/pkg/client"

	v1 "github.com/apache/camel-k/v2/pkg/apis/camel/v1"
	"github.com/apache/camel-k/v2/pkg/util/kubernetes"
)

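// runningBuilds tracks the builds started by this operator instance,
// keyed by their namespaced name.
var runningBuilds sync.Map

// Monitor limits the number of builds that may run at the same time.
type Monitor struct {
	maxRunningBuilds int32
}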

func (bm *Monitor) canSchedule(ctx context.Context, c ctrl.Reader, build *v1.Build) (bool, error) {
	// Count the builds currently tracked as running by this operator instance.
	var runningBuildsTotal int32
	runningBuilds.Range(func(_, _ interface{}) bool {
		runningBuildsTotal++
		return true
	})

	if runningBuildsTotal >= bm.maxRunningBuilds {
		requestName := build.Name
		requestNamespace := build.Namespace
		buildCreator := kubernetes.GetCamelCreator(build)
		if buildCreator != nil {
			requestName = buildCreator.Name
			requestNamespace = buildCreator.Namespace
		}

		// Report the configured limit rather than the current count.
		Log.WithValues("request-namespace", requestNamespace, "request-name", requestName, "max-running-builds-limit", bm.maxRunningBuilds).
			ForBuild(build).Infof("Maximum number of running builds (%d) reached - the build gets enqueued", bm.maxRunningBuilds)

		// max number of running builds limit reached
		return false, nil
	}

	layout := build.Labels[v1.IntegrationKitLayoutLabel]

	// Native builds can be run in parallel, as incremental images are not applicable.
	if layout == v1.IntegrationKitLayoutNative {
		return true, nil
	}

	// We assume incremental images are only applicable across images whose layout is identical
	withCompatibleLayout, err := labels.NewRequirement(v1.IntegrationKitLayoutLabel, selection.Equals, []string{layout})
	if err != nil {
		return false, err
	}

	builds := &v1.BuildList{}
	// We use the non-caching client as informers cache is not invalidated nor updated
	// atomically by write operations
	err = c.List(ctx, builds,
		ctrl.InNamespace(build.Namespace),
		ctrl.MatchingLabelsSelector{
			Selector: labels.NewSelector().Add(*withCompatibleLayout),
		})
	if err != nil {
		return false, err
	}

	// Emulate a serialized working queue to only allow one build to run at a given time.
	// This is currently necessary for the incremental build to work as expected.
	// We may want to explicitly manage build priority as opposed to relying on
	// the reconciliation loop to handle the queuing.
	for _, b := range builds.Items {
		if b.Status.Phase == v1.BuildPhasePending || b.Status.Phase == v1.BuildPhaseRunning {
			// Let's requeue the build in case one is already running
			return false, nil
		}
	}

	return true, nil
}

// monitorRunningBuild records the build as running.
func monitorRunningBuild(build *v1.Build) {
	runningBuilds.Store(types.NamespacedName{Namespace: build.Namespace, Name: build.Name}.String(), true)
}

// monitorFinishedBuild removes the build from the set of running builds.
func monitorFinishedBuild(build *v1.Build) {
	runningBuilds.Delete(types.NamespacedName{Namespace: build.Namespace, Name: build.Name}.String())
}
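
The bookkeeping in this file (a `sync.Map` of running builds, counted with `Range` and compared against a limit) can be tried out in isolation with the sketch below. The `registry` type and its method names are hypothetical and exist only for this example; the operator itself uses the package-level `runningBuilds` map together with `Monitor`, `monitorRunningBuild` and `monitorFinishedBuild`.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a hypothetical stand-in that bundles the running-builds map
// and the limit into one value so the example is self-contained.
type registry struct {
	running sync.Map
	max     int32
}

// count walks the map, because sync.Map offers no length method.
func (r *registry) count() int32 {
	var n int32
	r.running.Range(func(_, _ any) bool {
		n++
		return true
	})
	return n
}

// canSchedule reports whether another build may start under the limit.
func (r *registry) canSchedule() bool { return r.count() < r.max }

// started and finished mirror the store and delete bookkeeping.
func (r *registry) started(key string)  { r.running.Store(key, true) }
func (r *registry) finished(key string) { r.running.Delete(key) }

func main() {
	r := &registry{max: 2}
	r.started("team-a/build-1")
	r.started("team-a/build-2")
	fmt.Println(r.canSchedule()) // false: the limit of 2 is reached

	r.finished("team-a/build-1")
	fmt.Println(r.canSchedule()) // true: one slot is free again
}
```

In this sketch the check and the store are separate steps, so two concurrent callers could both pass `canSchedule` before either calls `started`. Whether that matters depends on how the callers are serialized, which this example does not model.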