Scheduler crashes if node.Others[api.GPUSharingDevice] is nil #2738

Closed
igormish opened this issue Mar 15, 2023 · 3 comments · Fixed by #2751
Labels: kind/bug, priority/important-soon

Comments

igormish commented Mar 15, 2023

In pkg/scheduler/plugins/predicates/predicates.go

What happened:
fit, err = node.Others[api.GPUSharingDevice].(api.Devices).FilterNode(task.Pod)
If node.Others[api.GPUSharingDevice] is nil, Volcano crashes: the single-value type assertion to api.Devices panics on a nil value, so FilterNode(task.Pod) is never reached.
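
To illustrate the failure mode, here is a minimal, self-contained sketch. It does not use Volcano's real types: the Devices interface, the string pod argument, and the "GpuShare" key are hypothetical stand-ins for api.Devices, task.Pod, and api.GPUSharingDevice. It only shows that a single-value type assertion on a missing map entry panics, which is what the reported line appears to hit.

```go
package main

import "fmt"

// Devices is a hypothetical stand-in for Volcano's api.Devices interface.
type Devices interface {
	FilterNode(pod string) (bool, error)
}

// assertDevices performs the same kind of single-value type assertion as the
// reported line and converts any resulting panic into an error.
func assertDevices(others map[string]interface{}, key string) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered panic: %v", r)
		}
	}()
	// A missing key yields a nil interface value; asserting on it panics.
	d := others[key].(Devices)
	_ = d
	return nil
}

func main() {
	others := map[string]interface{}{} // no entry for the device key
	// "GpuShare" is a placeholder key chosen for this sketch.
	fmt.Println(assertDevices(others, "GpuShare"))
}
```

In the scheduler there is no recover around this call, so the panic would take the process down rather than being returned as an error.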

What you expected to happen:
The code should check whether node.Others[api.GPUSharingDevice] is nil before calling FilterNode(task.Pod).

Environment:

  • Volcano Version: master
  • Kubernetes version: v1.22.17
@igormish igormish added the kind/bug Categorizes issue or PR as related to a bug. label Mar 15, 2023

igormish commented Mar 15, 2023

For example, we could add a check before the call. A nil value should be treated as an error, since the predicate should fail if the device does not exist:

// Comma-ok assertion: ok is false when the entry is missing or nil.
if d, ok := node.Others[api.GPUSharingDevice].(api.Devices); ok {
  // d is already an api.Devices, so no second assertion is needed.
  fit, err = d.FilterNode(task.Pod)
  if err != nil {
    return err
  }
} else {
  return fmt.Errorf("predicates failed because node %s does not have label %s to enable device sharing", node.Name, api.GPUSharingDevice)
}
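
For anyone who wants to try the guard outside the scheduler, the following is a runnable sketch of the same idea under simplifying assumptions. Devices, fakeGPU, gpuKey, and checkGPUSharing are hypothetical names invented for this example, and the pod is reduced to a string; none of this is Volcano's actual code or the change that was eventually merged.

```go
package main

import (
	"errors"
	"fmt"
)

// Devices is a hypothetical stand-in for api.Devices.
type Devices interface {
	FilterNode(pod string) (bool, error)
}

// fakeGPU is a hypothetical device implementation used only for this demo.
type fakeGPU struct{ free int }

func (g fakeGPU) FilterNode(pod string) (bool, error) {
	if g.free <= 0 {
		return false, errors.New("no free GPU memory")
	}
	return true, nil
}

// gpuKey is a placeholder for api.GPUSharingDevice.
const gpuKey = "GpuShare"

// checkGPUSharing returns an error instead of panicking when the node has
// no usable device entry under gpuKey.
func checkGPUSharing(nodeName string, others map[string]interface{}, pod string) (bool, error) {
	d, ok := others[gpuKey].(Devices)
	if !ok {
		return false, fmt.Errorf("node %s has no %s device entry", nodeName, gpuKey)
	}
	return d.FilterNode(pod)
}

func main() {
	// Node without a device entry: the predicate fails with an error.
	fmt.Println(checkGPUSharing("node-a", map[string]interface{}{}, "pod-1"))
	// Node with a device entry: the device decides whether the pod fits.
	fmt.Println(checkGPUSharing("node-b", map[string]interface{}{gpuKey: fakeGPU{free: 1}}, "pod-1"))
}
```

One caveat with this approach: the comma-ok form only catches a missing entry, a nil interface value, or a value of the wrong type. A typed nil pointer stored in the map would still pass the check, so the device implementation may need its own nil handling.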

wangyang0616 (Member) commented

/assign

wangyang0616 (Member) commented

/priority important-soon

@volcano-sh-bot volcano-sh-bot added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Apr 18, 2023
3 participants