The scheduler unit test "only one workload can borrow one resources from the same flavor..." flakes #1455

Closed
mimowo opened this issue Dec 14, 2023 · 4 comments · Fixed by #1456
Labels: kind/bug, kind/flake

Comments


mimowo commented Dec 14, 2023

/kind bug
/kind flake

What happened:

Failed unit test: sigs.k8s.io/kueue/pkg/scheduler: TestSchedule/only_one_workload_can_borrow_one_resources_from_the_same_flavor_in_the_same_cycle_if_cohort_quota_cannot_fit with output:

    scheduler_test.go:1120: Unexpected scheduled workloads (-want,+got):
          map[string]v1beta1.Admission{
        - 	"sales/wl1": {
        - 		ClusterQueue: "cq1",
        - 		PodSetAssignments: []v1beta1.PodSetAssignment{
        - 			{
        - 				Name:          "main",
        - 				Flavors:       map[v1.ResourceName]v1beta1.ResourceFlavorReference{...},
        - 				ResourceUsage: v1.ResourceList{...},
        - 				Count:         &1,
        - 			},
        - 		},
        - 	},
        + 	"sales/wl2": {
        + 		ClusterQueue: "cq2",
        + 		PodSetAssignments: []v1beta1.PodSetAssignment{
        + 			{
        + 				Name:          "main",
        + 				Flavors:       map[v1.ResourceName]v1beta1.ResourceFlavorReference{...},
        + 				ResourceUsage: v1.ResourceList{...},
        + 				Count:         &1,
        + 			},
        + 		},
        + 	},
          }
    scheduler_test.go:1145: Unexpected assigned clusterQueues in cache (-want,+got):
          map[string]v1beta1.Admission{
        - 	"sales/wl1": {
        - 		ClusterQueue: "cq1",
        - 		PodSetAssignments: []v1beta1.PodSetAssignment{
        - 			{
        - 				Name:          "main",
        - 				Flavors:       map[v1.ResourceName]v1beta1.ResourceFlavorReference{...},
        - 				ResourceUsage: v1.ResourceList{...},
        - 				Count:         &1,
        - 			},
        - 		},
        - 	},
        + 	"sales/wl2": {
        + 		ClusterQueue: "cq2",
        + 		PodSetAssignments: []v1beta1.PodSetAssignment{
        + 			{
        + 				Name:          "main",
        + 				Flavors:       map[v1.ResourceName]v1beta1.ResourceFlavorReference{...},
        + 				ResourceUsage: v1.ResourceList{...},
        + 				Count:         &1,
        + 			},
        + 		},
        + 	},
          }
    scheduler_test.go:1150: Unexpected elements left in the queue (-want,+got):
          map[string]sets.Set[string]{
        + 	"cq1": {"sales/wl1": {}},
        - 	"cq2": {"sales/wl2": {}},
          }
--- FAIL: TestSchedule/only_one_workload_can_borrow_one_resources_from_the_same_flavor_in_the_same_cycle_if_cohort_quota_cannot_fit (0.07s)

What you expected to happen:

No failure

How to reproduce it (as minimally and precisely as possible):

https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/kubernetes-sigs_kueue/1405/pull-kueue-test-unit-main/1735227794121560064

Anything else we need to know?:

I think this is the second time I have seen this failure.

Environment:

  • Kubernetes version (use kubectl version):
  • Kueue version (use git describe --tags --dirty --always):
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

mimowo added the kind/bug label Dec 14, 2023
k8s-ci-robot added the kind/flake label Dec 14, 2023


mimowo commented Dec 14, 2023

cc @alculquicondor @tenzen-y @trasc

@alculquicondor
Contributor

/assign @trasc

@alculquicondor
Contributor

The dates point to #1406 as the culprit. We just need to enable the feature.

/assign
/unassign trasc
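
Editorial sketch (not part of the original thread): one reading of the comment above is that the test's expected result depends on a feature gate that the test case does not enable explicitly, so the outcome varies with whatever state the gate happens to be in. The self-contained Go sketch below illustrates the general pattern of enabling a gate for the duration of a test and restoring it afterwards. Everything in it is hypothetical: the `gates` type, `SetDuringTest`, `pickResult`, and the gate name `HypotheticalFeature` are invented for illustration and are not Kueue's API or the change made in #1456.

```go
package main

import (
	"fmt"
	"sync"
)

// gates is a hypothetical stand-in for a process-wide feature-gate registry.
type gates struct {
	mu      sync.Mutex
	enabled map[string]bool
}

func newGates() *gates { return &gates{enabled: map[string]bool{}} }

// Enabled reports whether the named gate is currently on.
func (g *gates) Enabled(name string) bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	return g.enabled[name]
}

// SetDuringTest sets the gate and returns a function that restores the
// previous state, so one test case cannot leak gate state into another.
func (g *gates) SetDuringTest(name string, value bool) (restore func()) {
	g.mu.Lock()
	defer g.mu.Unlock()
	prev, had := g.enabled[name]
	g.enabled[name] = value
	return func() {
		g.mu.Lock()
		defer g.mu.Unlock()
		if had {
			g.enabled[name] = prev
		} else {
			delete(g.enabled, name)
		}
	}
}

// pickResult is a hypothetical decision whose outcome depends on the gate.
func pickResult(g *gates) string {
	if g.Enabled("HypotheticalFeature") {
		return "with-feature"
	}
	return "without-feature"
}

func main() {
	g := newGates()

	// A test case that expects the gated behavior enables the gate itself.
	restore := g.SetDuringTest("HypotheticalFeature", true)
	fmt.Println(pickResult(g))

	// Restoring returns the registry to its previous state.
	restore()
	fmt.Println(pickResult(g))
}
```

In a real Go test, the restore function would typically be registered with `t.Cleanup` so that it runs even when the test case fails.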
