Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ScaledJob: introduce MultipleScalersCalculation #2016

Merged
merged 23 commits into from
Sep 13, 2021

Conversation

TsuyoshiUshio
Copy link
Contributor

@TsuyoshiUshio TsuyoshiUshio commented Aug 6, 2021

This PR fix issue of multi scalers on Scaled Job. And introduce a new option.
The current login assume only one scaler is active and available.
The most of the cases, it is ok for that. That is why we don't get any issues about it.

However @zroubalik pointed out this issue, I come up with the fix and do some refactoring to make it clear.

I'm not writing documentation yet, before that, I'd like to confirm this logic if fine for you guys.
Once It looks, ok, I'll go ahead full testing and write the documentation and make this PR as not draft.

Spec

Introduce a new option MultipleScalersCalculation on ScaledJob.

apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: {scaled-job-name}
spec:
  jobTargetRef:
         :
  scalingStrategy:
    strategy: "custom"                        # Optional. Default: default. Which Scaling Strategy to use. 
    customScalingQueueLengthDeduction: 1      # Optional. A parameter to optimize custom ScalingStrategy.
    customScalingRunningJobPercentage: "0.5"  # Optional. A parameter to optimize custom ScalingStrategy.
    pendingPodConditions:                     # Optional. A parameter to calculate pending job count per the specified pod conditions
      - "Ready"
      - "PodScheduled"
      - "AnyOtherCustomPodCondition"
    MultipleScalersCalculation: "max" # Optional. Default: max. If multiple scaler is exists, how to calculate the queueLength and maxValue (that is the expected number of pods that will be created)
  triggers:

Available Options

The option is how keda calculate queueLength and maxValue that is the number of pod will be created.

max : default, pick a scaler that has the max number of queueLength.
min: pick a scaler that has the min number of queueLength.
avg: sum up all the active scalers metrics and devided by the number of the active scalers.
sum: sum up all the active scalers metrics

Checklist

  • Commits are signed with Developer Certificate of Origin (DCO - learn more)
  • Tests have been added
  • A PR is opened to update the documentation on (repo) (if applicable)
  • Changelog has been updated

Fixes #

Document #
kedacore/keda-docs#505

@TsuyoshiUshio TsuyoshiUshio requested a review from zroubalik August 6, 2021 18:40
Signed-off-by: Tsuyoshi Ushio <[email protected]>
@TsuyoshiUshio TsuyoshiUshio marked this pull request as ready for review August 9, 2021 03:56
@TsuyoshiUshio
Copy link
Contributor Author

I tested, and I add test case and improve logging for clarifying on the multiple scalers scenario.
One question is, the change of controller-gen.kubebuilder.io/version: v0.5.0 from v0.3.0 Should I keep it to v0.3.0 ? It is auto generated, so that I keep it for now.

config/crd/bases/keda.sh_scaledjobs.yaml Outdated Show resolved Hide resolved
pkg/scaling/scale_handler.go Outdated Show resolved Hide resolved
pkg/scaling/scale_handler.go Outdated Show resolved Hide resolved
pkg/scaling/scale_handler.go Outdated Show resolved Hide resolved
pkg/scaling/scale_handler.go Outdated Show resolved Hide resolved
pkg/scaling/scale_handler.go Outdated Show resolved Hide resolved
TsuyoshiUshio and others added 5 commits September 4, 2021 09:50
@TsuyoshiUshio TsuyoshiUshio force-pushed the tsushi/fixscaledjobmultitrigger branch from 141bd85 to 784bc7c Compare September 4, 2021 19:52
@TsuyoshiUshio
Copy link
Contributor Author

Hi @zroubalik
I update this PR as you requested. Could you have a look?
I also update the documentation according to the name suggestion.
kedacore/keda-docs#505

CHANGELOG.md Outdated Show resolved Hide resolved
pkg/metrics/scaledjob_prometheus_metrics.go Outdated Show resolved Hide resolved
pkg/scaling/scaledjob/scale_metrics.go Outdated Show resolved Hide resolved
pkg/scaling/scaledjob/scale_metrics.go Show resolved Hide resolved
pkg/scaling/scaledjob/scale_metrics.go Outdated Show resolved Hide resolved
pkg/scaling/scaledjob/scale_metrics.go Outdated Show resolved Hide resolved
@zroubalik zroubalik changed the title Fix issue of ScaledJob scaling and introduce MultiScalersOption ScaledJob: introduce MultipleScalersCalculation Sep 7, 2021
@TsuyoshiUshio TsuyoshiUshio force-pushed the tsushi/fixscaledjobmultitrigger branch from e5b9da9 to cddf760 Compare September 8, 2021 03:35
Signed-off-by: Tsuyoshi Ushio <[email protected]>
Signed-off-by: Tsuyoshi Ushio <[email protected]>
Signed-off-by: Tsuyoshi Ushio <[email protected]>
@TsuyoshiUshio TsuyoshiUshio force-pushed the tsushi/fixscaledjobmultitrigger branch from cddf760 to 96116f8 Compare September 8, 2021 03:45
@TsuyoshiUshio
Copy link
Contributor Author

Hi @zroubalik
I updated the PR. Could you have a look?

Copy link
Member

@zroubalik zroubalik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! I have a few very minor nits. Thanks @TsuyoshiUshio !

pkg/scaling/scaledjob/scale_metrics.go Outdated Show resolved Hide resolved
pkg/scaling/scaledjob/scale_metrics.go Outdated Show resolved Hide resolved
pkg/scaling/scaledjob/scale_metrics.go Outdated Show resolved Hide resolved
pkg/scaling/scaledjob/scale_metrics_test.go Outdated Show resolved Hide resolved
pkg/scaling/scaledjob/scale_metrics.go Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
@zroubalik zroubalik added this to the v2.5.0 milestone Sep 10, 2021
TsuyoshiUshio and others added 7 commits September 11, 2021 10:43
Co-authored-by: Zbynek Roubalik <[email protected]>
Signed-off-by: Tsuyoshi Ushio <[email protected]>
Co-authored-by: Zbynek Roubalik <[email protected]>
Signed-off-by: Tsuyoshi Ushio <[email protected]>
Co-authored-by: Zbynek Roubalik <[email protected]>
Signed-off-by: Tsuyoshi Ushio <[email protected]>
Co-authored-by: Zbynek Roubalik <[email protected]>
Signed-off-by: Tsuyoshi Ushio <[email protected]>
Co-authored-by: Zbynek Roubalik <[email protected]>
Signed-off-by: Tsuyoshi Ushio <[email protected]>
Co-authored-by: Zbynek Roubalik <[email protected]>
Signed-off-by: Tsuyoshi Ushio <[email protected]>
Signed-off-by: Tsuyoshi Ushio <[email protected]>
@TsuyoshiUshio TsuyoshiUshio force-pushed the tsushi/fixscaledjobmultitrigger branch from 41b2c15 to 6b7e6da Compare September 11, 2021 17:49
Signed-off-by: Tsuyoshi Ushio <[email protected]>
@TsuyoshiUshio
Copy link
Contributor Author

Hi @zroubalik I accept all your suggestion, however, CI / Validate PR looks start to fail. I looked the code change again, however, I just changed ScaledJob related part. The CI fails with the ScaledObjectController. Do you know this issue?

ScaledObjectController
/__w/keda/keda/controllers/keda/scaledobject_controller_test.go:50
  functional tests
  /__w/keda/keda/controllers/keda/scaledobject_controller_test.go:204
    deploys ScaledObject and creates HPA, when IdleReplicaCount, MinReplicaCount and MaxReplicaCount is defined [It]
    /__w/keda/keda/controllers/keda/scaledobject_controller_test.go:269

    Timed out after 20.001s.
    Expected
        <v1.ConditionStatus>: Unknown
    to equal
        <v1.ConditionStatus>: True

@zroubalik
Copy link
Member

Hi @zroubalik I accept all your suggestion, however, CI / Validate PR looks start to fail. I looked the code change again, however, I just changed ScaledJob related part. The CI fails with the ScaledObjectController. Do you know this issue?

ScaledObjectController
/__w/keda/keda/controllers/keda/scaledobject_controller_test.go:50
  functional tests
  /__w/keda/keda/controllers/keda/scaledobject_controller_test.go:204
    deploys ScaledObject and creates HPA, when IdleReplicaCount, MinReplicaCount and MaxReplicaCount is defined [It]
    /__w/keda/keda/controllers/keda/scaledobject_controller_test.go:269

    Timed out after 20.001s.
    Expected
        <v1.ConditionStatus>: Unknown
    to equal
        <v1.ConditionStatus>: True

That is just a flaky test :( I rerun the job and it fine now.

Copy link
Member

@zroubalik zroubalik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, great job @TsuyoshiUshio !

@TsuyoshiUshio TsuyoshiUshio merged commit af5981a into main Sep 13, 2021
@TsuyoshiUshio TsuyoshiUshio deleted the tsushi/fixscaledjobmultitrigger branch September 13, 2021 17:15
nilayasiktoprak pushed a commit to nilayasiktoprak/keda that referenced this pull request Oct 23, 2021
* Fix bug of ScaledJob multiple triggers

Signed-off-by: Tsuyoshi Ushio <[email protected]>

* fix pre-commit issue

Signed-off-by: Tsuyoshi Ushio <[email protected]>

* Update logs on scale handler and add crds

Signed-off-by: Tsuyoshi Ushio <[email protected]>

* update change log

Signed-off-by: Tsuyoshi Ushio <[email protected]>

* Add test case for exceeding MaxReplicaCount in sum

Signed-off-by: Tsuyoshi Ushio <[email protected]>

* fix pre-commit

Signed-off-by: Tsuyoshi Ushio <[email protected]>

* Update pkg/scaling/scale_handler.go

Co-authored-by: Zbynek Roubalik <[email protected]>
Signed-off-by: Tsuyoshi Ushio <[email protected]>

* Update pkg/scaling/scale_handler.go

Co-authored-by: Zbynek Roubalik <[email protected]>
Signed-off-by: Tsuyoshi Ushio <[email protected]>

* revert version and refactor scaling logic

Signed-off-by: Tsuyoshi Ushio <[email protected]>

* Change option name from MultipleScalersOption to MultipleScalersCalculation

Signed-off-by: Tsuyoshi Ushio <[email protected]>

* update changelog for multiple triggers section

Signed-off-by: Tsuyoshi Ushio <[email protected]>

* Error Handling

Signed-off-by: Tsuyoshi Ushio <[email protected]>

* Refactor logging

Signed-off-by: Tsuyoshi Ushio <[email protected]>

* remove scaledjob prometheus

Signed-off-by: Tsuyoshi Ushio <[email protected]>

* Update pkg/scaling/scaledjob/scale_metrics.go

Co-authored-by: Zbynek Roubalik <[email protected]>
Signed-off-by: Tsuyoshi Ushio <[email protected]>

* Update pkg/scaling/scaledjob/scale_metrics.go

Co-authored-by: Zbynek Roubalik <[email protected]>
Signed-off-by: Tsuyoshi Ushio <[email protected]>

* Update pkg/scaling/scaledjob/scale_metrics.go

Co-authored-by: Zbynek Roubalik <[email protected]>
Signed-off-by: Tsuyoshi Ushio <[email protected]>

* Update pkg/scaling/scaledjob/scale_metrics_test.go

Co-authored-by: Zbynek Roubalik <[email protected]>
Signed-off-by: Tsuyoshi Ushio <[email protected]>

* Update pkg/scaling/scaledjob/scale_metrics.go

Co-authored-by: Zbynek Roubalik <[email protected]>
Signed-off-by: Tsuyoshi Ushio <[email protected]>

* Update CHANGELOG.md

Co-authored-by: Zbynek Roubalik <[email protected]>
Signed-off-by: Tsuyoshi Ushio <[email protected]>

* Move section to new section

Signed-off-by: Tsuyoshi Ushio <[email protected]>

* fix pre-commit white space issue

Signed-off-by: Tsuyoshi Ushio <[email protected]>

Co-authored-by: Zbynek Roubalik <[email protected]>
Signed-off-by: nilayasiktoprak <[email protected]>
@taylorchu
Copy link

is there a reason why we use queuelength instead of maxscale in max/min/avg calculation?

For example consider 2 scalers:

  1. queuelength=10, target=4 will produce maxscale=3
  2. queuelength=5, target=1 will produce maxscale=5

Right now max in this implementation, will use maxscale=3. To me it makes more sense to use maxscale=5. Similarly min seems more intuitive to pick the one that will produce maxscale=3.

@taylorchu
Copy link

@TsuyoshiUshio @zroubalik thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants