Respect ctx cancel
I think there's a race condition between the controller and receptor.

The controller sees that the job has finished and tries to cancel the work unit.

Receptor does not respect the cancel and continues to issue GET requests to the kube apiserver with a dead ctx, which causes a very misleading error message:
```
client rate limiter Wait returned an error: context canceled
```
TheRealHaoLiu authored and shanemcd committed May 10, 2023
1 parent aa4d0c7 commit 9e19a84
Showing 1 changed file with 9 additions and 0 deletions.
9 changes: 9 additions & 0 deletions pkg/workceptor/kubernetes.go
@@ -321,6 +321,15 @@ func (kw *kubeUnit) runWorkUsingLogger() {
 	// resuming from a previously created pod
 	var err error
 	for retries := 5; retries > 0; retries-- {
+		// check if the kw.ctx is already cancel
+		select {
+		case <-kw.ctx.Done():
+			errMsg := fmt.Sprintf("Context Done while getting pod %s/%s. Error: %s", podNamespace, podName, kw.ctx.Err())
+			kw.Warning(errMsg)
+			return
+		default:
+		}
+
 		kw.pod, err = kw.clientset.CoreV1().Pods(podNamespace).Get(kw.ctx, podName, metav1.GetOptions{})
 		if err == nil {
 			break
