Applying an update (typically a new container image tag from our CI/CD pipeline) to a ScaledObject with `scaleType: job` terminates all running jobs.
This does not fit well with the run-to-completion nature of jobs: we have to make sure that deploying new code does not interrupt our long-running simulations (the main reason for choosing jobs over deployments).
Expected Behavior
Already started jobs run to completion with the configuration as it was when started.
New jobs triggered (e.g. by new incoming queue messages) should run with the new configuration.
Actual Behavior
Already running jobs and associated pods are terminated and deleted.
Steps to Reproduce the Problem
Define a long-running, queue-triggered ScaledObject with `scaleType: job`:
```yaml
apiVersion: keda.k8s.io/v1alpha1
kind: ScaledObject
metadata:
  name: my-long-running-scaled-job
  namespace: default
spec:
  scaleType: job
  pollingInterval: 10   # Optional. Default: 30 seconds
  maxReplicaCount: 15   # Optional. Default: 100
  minReplicaCount: 0    # Optional. Default: 0
  cooldownPeriod: 30    # Optional. Default: 300 seconds
  jobTargetRef:
    parallelism: 1      # max number of desired pods: https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/#controlling-parallelism
    completions: 1      # desired number of successfully finished pods
    activeDeadlineSeconds: 900  # duration in seconds relative to the startTime that the job may be active before the system tries to terminate it; must be a positive integer
    backoffLimit: 6     # number of retries before marking this job failed. Defaults to 6
    template:
      # describes the job template: https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/
      metadata:
        labels:
          jobgroup: somejobgroupthing
      spec:
        containers:
          - name: busybox-looping
            image: busybox
            command: ['sh', '-c', 'x=1; while [ $x -le 100 ]; do let y=x*2; let z=x*3; let a=x*4; echo $x $y $z $a; sleep 1; let x=x+1; done']
            env:
              - name: THE_QUEUE
                value: mytestqueuethatijustaddamessageto
              - name: STORAGE_ACCOUNT_CONNECTION_STRING
                valueFrom:
                  secretKeyRef:
                    name: my-secrets
                    key: STORAGE_ACCOUNT_CONNECTION_STRING
        restartPolicy: Never
  triggers:
    - type: azure-queue
      metadata:
        queueName: mytestqueuethatijustaddamessageto
        queueLength: '20'   # Optional. Queue length target for HPA. Default: 5 messages
        connection: STORAGE_ACCOUNT_CONNECTION_STRING
```
Save the file and apply it to the cluster with `kubectl apply -f my-busybox-job-test.yaml`.
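For completeness, the `my-secrets` secret referenced by the manifest has to exist first; a minimal sketch (the connection-string value is a placeholder you supply):

```sh
# Create the secret the ScaledObject's env block references (placeholder value).
kubectl create secret generic my-secrets \
  --from-literal=STORAGE_ACCOUNT_CONNECTION_STRING="<your-storage-connection-string>"

# Then apply the ScaledObject itself.
kubectl apply -f my-busybox-job-test.yaml
```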
Push a message to the queue.
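If you are reproducing this with the Azure CLI, something like the following works (queue name taken from the manifest above; the message content is arbitrary):

```sh
az storage message put \
  --queue-name mytestqueuethatijustaddamessageto \
  --content "trigger one job" \
  --connection-string "<your-storage-connection-string>"
```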
Observe pods being created and starting to calculate.
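One way to watch this is by the pod label from the job template (the job names themselves are generated by KEDA):

```sh
# Watch the pods KEDA's jobs create; the jobgroup label comes from the template above.
kubectl get pods -l jobgroup=somejobgroupthing -w
```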
Make a simple update to the YAML, e.g. change `spec.jobTargetRef.template.spec.containers.image` or `command`, and re-apply.
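Any one-line change reproduces it; for example, pinning the image tag (the `1.32` tag here is purely illustrative) and re-applying:

```sh
# GNU sed; on macOS use `sed -i ''`. The tag change is arbitrary; any edit triggers the behavior.
sed -i 's/image: busybox$/image: busybox:1.32/' my-busybox-job-test.yaml
kubectl apply -f my-busybox-job-test.yaml
```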
The running jobs/pods are terminated
Specifications
KEDA Version: 1.4.1
Platform & Version: Azure AKS
Kubernetes Version: v1.16.9
Scaler(s): job
We are seeing this as well with our long-running jobs, and it does not play nicely with the continuous-delivery nature of our codebases whose containers are scaled by KEDA.
The other alternative, of course, is to ensure that all of your batch jobs running via KEDA use some kind of saga pattern: if they are driven off a queue with a visibility window, then when they do get interrupted the message becomes visible again, the job is kicked off again, and you can resume close to where you left off (see the sketch below). However, this depends on the nature of the work being done and is not always possible.
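As a rough illustration of what that looks like for the busybox loop from the repro above (a sketch only; it assumes a durable volume mounted at a hypothetical `/checkpoint` path so a replacement pod can pick up the counter):

```sh
# Resume from the last checkpoint if one exists, otherwise start at 1.
x=$(cat /checkpoint/x 2>/dev/null || echo 1)
while [ $x -le 100 ]; do
  let y=x*2; let z=x*3; let a=x*4
  echo $x $y $z $a
  echo $x > /checkpoint/x   # persist progress after each unit of work
  sleep 1
  let x=x+1
done
# Only delete/ack the queue message once the loop completes; if the pod is
# killed mid-run, the message reappears after its visibility timeout and the
# next job resumes from /checkpoint/x instead of starting over.
```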