
Kueue deletes workloads for jobs with limits, but no requests #590

Closed · mimowo opened this issue Feb 21, 2023 · 3 comments · Fixed by #600

Labels: kind/bug Categorizes issue or PR as related to a bug.

Comments
@mimowo (Contributor) commented Feb 21, 2023

What happened:

Kueue repeatedly creates and deletes (in a loop with a cycle of <1s) the workload for a job that specifies resource limits but no requests.

What you expected to happen:

Kueue does not delete the workloads and allows the jobs to complete.

How to reproduce it (as minimally and precisely as possible):

  1. Set up Kueue with the main local queue.
  2. Create a job from the following YAML:
apiVersion: batch/v1
kind: Job
metadata:
  name: job-longrun-only-limits
  annotations:
    kueue.x-k8s.io/queue-name: main
spec:
  suspend: true
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: job-longrun
        image: centos:7
        resources:
          limits:
            cpu: 100m
            memory: "200Mi"
        command: ["bash"]
        args: ["-c", 'sleep 120 && echo "Hello world"']
        imagePullPolicy: IfNotPresent
  backoffLimit: 0
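
A plausible mechanism for the loop, inferred from the fix discussion below rather than stated in the report itself: when requests are omitted, Kubernetes defaulting fills container requests from limits, so the defaulted pod spec no longer deep-equals the template as written, and the Workload is deleted and recreated. A minimal Go sketch of that mismatch using the upstream API types (not Kueue code):

package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/equality"
	"k8s.io/apimachinery/pkg/api/resource"
)

func main() {
	// Resources as written in the Job above: limits only, no requests.
	asSubmitted := corev1.ResourceRequirements{
		Limits: corev1.ResourceList{
			corev1.ResourceCPU:    resource.MustParse("100m"),
			corev1.ResourceMemory: resource.MustParse("200Mi"),
		},
	}

	// After defaulting, requests are copied from limits.
	defaulted := *asSubmitted.DeepCopy()
	defaulted.Requests = defaulted.Limits

	// Prints false: a strict semantic comparison of the two specs fails,
	// so a Workload built from one side never matches the other.
	fmt.Println(equality.Semantic.DeepEqual(asSubmitted, defaulted))
}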

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version): master (before 1.27)
  • Kueue version (use git describe --tags --dirty --always): master (before 0.3.0)
mimowo added the kind/bug label on Feb 21, 2023
@alculquicondor (Contributor):
This was introduced in #317.

We need to fix this by changing the logic that matches a Workload to a Job. This logic should be included in the job integration library, maybe at the pod template level. cc @kerthcet

We should also have an integration test.

@kerthcet (Contributor):
/assign
Let me take a first pass at fixing this.

@kerthcet (Contributor):
I'd like to default the containers again and then compare with DeepEqual. See #597
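
A minimal Go sketch of that approach, with hypothetical names (defaultRequests and EquivalentPodSpecs are illustrative, not the helper actually added in #597/#600): re-apply the limits-to-requests defaulting to both pod specs, then compare them semantically.

package podtemplate

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/equality"
)

// defaultRequests mirrors the API-server behavior of filling a container's
// requests from its limits wherever a request is unset.
func defaultRequests(spec *corev1.PodSpec) {
	for i := range spec.Containers {
		c := &spec.Containers[i]
		if c.Resources.Requests == nil {
			c.Resources.Requests = corev1.ResourceList{}
		}
		for name, qty := range c.Resources.Limits {
			if _, ok := c.Resources.Requests[name]; !ok {
				c.Resources.Requests[name] = qty.DeepCopy()
			}
		}
	}
}

// EquivalentPodSpecs reports whether two pod specs match once both have been
// defaulted, so a Job with limits-only resources matches its Workload.
func EquivalentPodSpecs(a, b corev1.PodSpec) bool {
	ac, bc := a.DeepCopy(), b.DeepCopy()
	defaultRequests(ac)
	defaultRequests(bc)
	return equality.Semantic.DeepEqual(ac, bc)
}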
