Process only one task of jobs in reclaim action #629

Closed
zionwu opened this issue Dec 19, 2019 · 7 comments
Labels
area/scheduling kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
Milestone

Comments

@zionwu
Contributor

zionwu commented Dec 19, 2019

I am using the reclaim action and noticed that it processes only one task per job:
https://github.com/volcano-sh/volcano/blob/master/pkg/scheduler/actions/reclaim/reclaim.go#L107

		// Found "high" priority job
		if jobs, found := preemptorsMap[queue.UID]; !found || jobs.Empty() {
			continue
		} else {
			job = jobs.Pop().(*api.JobInfo)
		}

		// Found "high" priority task to reclaim others
		if tasks, found := preemptorTasks[job.UID]; !found || tasks.Empty() {
			continue
		} else {
			task = tasks.Pop().(*api.TaskInfo)
		}

The code above picks one task from the job and uses it to reclaim resources. However, the job is not pushed back to jobs (the priority queue), so it will not be processed again and its remaining tasks are skipped.

Is this behavior by design? If so, why?
Or is it a bug? If it is, we should push the job back to jobs at this line: https://github.com/volcano-sh/volcano/blob/master/pkg/scheduler/actions/reclaim/reclaim.go#L198
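
To make the behavior concrete, here is a minimal Go sketch of the pop-without-push-back pattern. It is not Volcano's code: the `job` type and `reclaimAll` function are hypothetical names for illustration, a FIFO slice stands in for the real priority queue, and every task is assumed to reclaim successfully.

```go
package main

import "fmt"

// job is a hypothetical stand-in for api.JobInfo: a name plus pending task names.
type job struct {
	name  string
	tasks []string
}

// reclaimAll simulates the inner loop. When pushBack is false, a job is
// dropped after its first task, which is the behavior described above.
func reclaimAll(jobs []*job, pushBack bool) []string {
	var processed []string
	for len(jobs) > 0 {
		// Pop the next job (FIFO here; the real code uses a priority queue).
		j := jobs[0]
		jobs = jobs[1:]
		if len(j.tasks) == 0 {
			continue // nothing left for this job, so drop it
		}
		// Pop one task and pretend to reclaim resources for it.
		t := j.tasks[0]
		j.tasks = j.tasks[1:]
		processed = append(processed, j.name+"/"+t)
		if pushBack {
			// Re-queue the job so its remaining tasks get a turn.
			jobs = append(jobs, j)
		}
	}
	return processed
}

// newJobs builds a fresh set of jobs for each run.
func newJobs() []*job {
	return []*job{
		{name: "job1", tasks: []string{"t1", "t2"}},
		{name: "job2", tasks: []string{"t1"}},
	}
}

func main() {
	fmt.Println(reclaimAll(newJobs(), false)) // job1/t2 is never processed
	fmt.Println(reclaimAll(newJobs(), true))  // all three tasks are processed
}
```

Without the push-back, `job1/t2` is never visited; with it, the job is revisited until its task list is empty.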

@k82cn
Member

k82cn commented Dec 20, 2019

I think we should put the task back in the queue :)

btw, what's your use case for reclaim?

@k82cn
Member

k82cn commented Dec 20, 2019

/area scheduling
/kind bug
/priority important-soon

@volcano-sh-bot volcano-sh-bot added area/scheduling kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Dec 20, 2019
@k82cn k82cn added this to the v0.4 milestone Dec 20, 2019
@zionwu
Contributor Author

zionwu commented Dec 20, 2019

@k82cn, thank you for the quick reply.
My use case is as follows:
We have two queues: one for high-priority jobs such as production training jobs, and another for low-priority jobs such as testing training jobs. If the production queue is running out of resources, we want to reclaim resources from the testing queue.

Another question about reclaim is why we push the queue back to queues only when assigned is true? I think we should always push the queue back:
https://github.com/volcano-sh/volcano/blob/master/pkg/scheduler/actions/reclaim/reclaim.go#L197

		if assigned {
			queues.Push(queue)
		}

@k82cn
Member

k82cn commented Dec 20, 2019

> Another question about reclaim is why we push the queue back to queues only when assigned is true?

If we did not assign any resource in this loop, that means there's no resource for it, so we will not consider it in the next loop.

@zionwu
Contributor Author

zionwu commented Dec 20, 2019

> > Another question about reclaim is why we push the queue back to queues only when assigned is true?
>
> If we did not assign any resource in this loop, that means there's no resource for it, so we will not consider it in the next loop.

If assigned is false, it means there is no resource for the task of that job, but it doesn't mean there is no resource for tasks of other jobs in the same queue.
For example, suppose queue A has job1 and job2. If the task of job1 fails to reclaim, we should not consider job1 again, but we should still consider job2. Therefore, we need to put queue A back into the priority queue.
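
The queue A example can be sketched in Go as follows. This is a simplified simulation, not the actual reclaim action: `queueState`, `run`, and the `canReclaim` map are hypothetical names, each job is considered at most once, and a FIFO slice stands in for the priority queue of queues.

```go
package main

import "fmt"

// queueState is a hypothetical stand-in for a scheduler queue with pending jobs.
type queueState struct {
	name string
	jobs []string
}

// run simulates the outer loop over queues. canReclaim says whether a job's
// task obtains resources. If alwaysPush is false, a queue is re-queued only
// when the job just considered was assigned resources.
func run(queues []*queueState, canReclaim map[string]bool, alwaysPush bool) []string {
	var considered []string
	for len(queues) > 0 {
		// Pop the next queue.
		q := queues[0]
		queues = queues[1:]
		if len(q.jobs) == 0 {
			continue // no jobs left, so the queue is dropped
		}
		// Pop one job from the queue and try to reclaim for it.
		j := q.jobs[0]
		q.jobs = q.jobs[1:]
		considered = append(considered, j)
		assigned := canReclaim[j]
		if assigned || alwaysPush {
			// Re-queue so the remaining jobs in this queue are considered.
			queues = append(queues, q)
		}
	}
	return considered
}

// newQueues builds queue A with job1 and job2 for each run.
func newQueues() []*queueState {
	return []*queueState{{name: "A", jobs: []string{"job1", "job2"}}}
}

func main() {
	// job1 fails to reclaim; job2 would succeed.
	canReclaim := map[string]bool{"job1": false, "job2": true}
	fmt.Println(run(newQueues(), canReclaim, false)) // job2 is never considered
	fmt.Println(run(newQueues(), canReclaim, true))  // job1, then job2
}
```

When the queue is pushed back only if assigned is true, job1's failure removes queue A from the loop and job2 never gets a chance.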

Does it make sense to you?

@k82cn
Member

k82cn commented Dec 23, 2019

yes, that makes more sense :)

@k82cn
Member

k82cn commented Dec 23, 2019

fixed by #631

@k82cn k82cn closed this as completed Dec 23, 2019