
Allow multiple containers in worker pod in case for ErrPodCompleted #717

Merged: 1 commit into ansible:devel from kurokobo:multiple-container on May 10, 2023

Conversation

kurokobo
Contributor

Closes #710

Put simply, this PR makes Receptor find the worker container and invoke the same validation as was implemented before, in the case of ErrPodCompleted.
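
For illustration, here is a minimal sketch of the idea in Go. This is not the actual diff; the helper name checkWorkerExit and its exact structure are assumptions, and it assumes the worker container is named worker, as in the pod specs below:

package main

import (
    "fmt"

    corev1 "k8s.io/api/core/v1"
)

// checkWorkerExit scans the pod's container statuses for the container
// named workerName and returns an error only if that container terminated
// with a non-zero exit code; all other containers are ignored.
func checkWorkerExit(pod *corev1.Pod, workerName string) error {
    for _, cstat := range pod.Status.ContainerStatuses {
        if cstat.Name != workerName {
            continue // sidecar or injected container: its exit code is irrelevant
        }
        if t := cstat.State.Terminated; t != nil && t.ExitCode != 0 {
            return fmt.Errorf("container failed with exit code %d: %s", t.ExitCode, t.Message)
        }
        return nil
    }
    return fmt.Errorf("worker container %q not found in pod status", workerName)
}

func main() {
    pod := &corev1.Pod{Status: corev1.PodStatus{ContainerStatuses: []corev1.ContainerStatus{
        {Name: "worker", State: corev1.ContainerState{Terminated: &corev1.ContainerStateTerminated{ExitCode: 0}}},
        {Name: "sidecar", State: corev1.ContainerState{Terminated: &corev1.ContainerStateTerminated{ExitCode: 1}}},
    }}}
    fmt.Println(checkWorkerExit(pod, "worker")) // prints <nil>: the sidecar's failure is ignored
}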

The same tests as described in #710 pass:

$ receptorctl --socket /tmp/receptor.sock work submit --rm --follow --no-payload --param secret_kube_config=@/etc/rancher/k3s/k3s.yaml --param secret_kube_pod=@./pod-single-5.yml demo
WORKER DONE
(RlqiVXB3, released)

$ receptorctl --socket /tmp/receptor.sock work submit --rm --follow --no-payload --param secret_kube_config=@/etc/rancher/k3s/k3s.yaml --param secret_kube_pod=@./pod-single-0.yml demo
WORKER DONE
(SQ1ZviRQ, released)

$ receptorctl --socket /tmp/receptor.sock work submit --rm --follow --no-payload --param secret_kube_config=@/etc/rancher/k3s/k3s.yaml --param secret_kube_pod=@./pod-multiple-5.yml demo
WORKER DONE
(x16GudEc, released)

$ receptorctl --socket /tmp/receptor.sock work submit --rm --follow --no-payload --param secret_kube_config=@/etc/rancher/k3s/k3s.yaml --param secret_kube_pod=@./pod-multiple-0.yml demo
WORKER DONE
(e44l72jZ, released)

In addition to the above, patterns where a container exits with a non-zero exit code were also tested:

# The exit code of the sidecar is ignored and the work succeeds
$ cat ./pod-multiple-0-sidecar-failed.yml
apiVersion: v1
kind: Pod
metadata:
  name: demo
  namespace: receptor
spec:
  containers:
    - name: worker
      image: busybox
      command: ["sh", "-c", "sleep 0; echo WORKER DONE"]
    - name: sidecar
      image: busybox
      command: ["sh", "-c", "sleep 0; echo SIDECAR DONE; exit 1"]
  restartPolicy: Never

$ receptorctl --socket /tmp/receptor.sock work submit --rm --follow --no-payload --param secret_kube_config=@/etc/rancher/k3s/k3s.yaml --param secret_kube_pod=@./pod-multiple-0-sidecar-failed.yml demo
WORKER DONE
(fhGshOSJ, released)

# The exit code of the worker is validated and the work fails
$ cat ./pod-multiple-0-worker-failed.yml
apiVersion: v1
kind: Pod
metadata:
  name: demo
  namespace: receptor
spec:
  containers:
    - name: worker
      image: busybox
      command: ["sh", "-c", "sleep 0; echo WORKER DONE; exit 1"]
    - name: sidecar
      image: busybox
      command: ["sh", "-c", "sleep 0; echo SIDECAR DONE"]
  restartPolicy: Never

$ receptorctl --socket /tmp/receptor.sock work submit --rm --follow --no-payload --param secret_kube_config=@/etc/rancher/k3s/k3s.yaml --param secret_kube_pod=@./pod-multiple-0-worker-failed.yml demo
ERROR: Remote unit failed: Error creating pod: container failed with exit code 1: 

(qucjq6N2, released)

# ErrImagePull
$ cat ./pod-multiple-0-errimagepull.yml
apiVersion: v1
kind: Pod
metadata:
  name: demo
  namespace: receptor
spec:
  containers:
    - name: worker
      image: dummy-nonexisting-image
      command: ["sh", "-c", "sleep 0; echo WORKER DONE"]
    - name: sidecar
      image: busybox
      command: ["sh", "-c", "sleep 0; echo SIDECAR DONE"]
  restartPolicy: Never

$ receptorctl --socket /tmp/receptor.sock work submit --rm --follow --no-payload --param secret_kube_config=@/etc/rancher/k3s/k3s.yaml --param secret_kube_pod=@./pod-multiple-0-errimagepull.yml demo
Warning: receptorctl and receptor are different versions, they may not be compatible
ERROR: Remote unit failed: Error creating pod: container failed to start, ImagePullBackOff

(3yW7l5fr, released)

@kurokobo
Contributor Author

To begin with, in the current upstream implementation the exit codes of all containers, including the worker, are ignored in the non-ErrPodCompleted pattern:

  • sleep 0; echo WORKER DONE; exit 1 for the worker container causes the work to fail
    • This is the ErrPodCompleted pattern, where the pod exits before becoming Running.
  • sleep 5; echo WORKER DONE; exit 1 for the worker container does not cause the work to fail
    • This is the non-ErrPodCompleted pattern, where the pod exits after becoming Running.

This PR keeps the exit-code validation for the worker container in the ErrPodCompleted case. However, it might also be reasonable to skip verifying the worker container's exit code in all cases in the first place; see the toy sketch below.
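
To make this asymmetry concrete, here is a toy Go model of the control flow. All names here are hypothetical; this paraphrases the behavior described above, not the actual Receptor code:

package main

import (
    "errors"
    "fmt"
)

// errPodCompleted stands in for the sentinel error returned when log
// streaming fails because the pod finished before reaching Running.
var errPodCompleted = errors.New("pod ran to completion")

// followLogs is a stand-in for streaming pod logs: a pod that exits
// immediately (sleep 0) fails with errPodCompleted, while a pod that
// runs long enough (sleep 5) streams its logs normally.
func followLogs(fastExit bool) error {
    if fastExit {
        return errPodCompleted
    }
    return nil
}

// runWork mirrors the described behavior: exit codes are validated only
// on the ErrPodCompleted path; otherwise they are never inspected.
func runWork(fastExit bool, workerExitCode int) error {
    if err := followLogs(fastExit); err != nil {
        if errors.Is(err, errPodCompleted) {
            if workerExitCode != 0 {
                return fmt.Errorf("container failed with exit code %d", workerExitCode)
            }
            return nil
        }
        return err
    }
    return nil // non-ErrPodCompleted: work succeeds regardless of exit codes
}

func main() {
    fmt.Println(runWork(true, 1))  // error: "sleep 0; ...; exit 1" fails the work
    fmt.Println(runWork(false, 1)) // <nil>: "sleep 5; ...; exit 1" does not
}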

@kurokobo
Contributor Author

kurokobo commented Feb 2, 2023

@shanemcd
Thanks for moving this PR forward. I have updated my branch to fix the lint error reported by CI.

@AaronH88
Contributor

AaronH88 commented Mar 6, 2023

I like the approach. Can more tests be added to pkg/workceptor/kubernetes_test.go to show this PR working, please?

@kurokobo
Contributor Author

@AaronH88
Thanks for your review, and sorry for the delay. I will look into whether I can add tests and investigate your comments (but I can't secure a contiguous block of time, so it will take a while).

A reviewer commented on the previous single-container check:

if len(kw.pod.Status.ContainerStatuses) != 1 {
    return fmt.Errorf("expected 1 container in pod but there were %d", len(kw.pod.Status.ContainerStatuses))
}
for _, cstat := range kw.pod.Status.ContainerStatuses {


I would love to see this change integrated, as it fixes another problem as well: Running in a cluster with sidecar injection (e.g. istio) enabled. In such a case the 1-container check constantly fails.

shanemcd changed the title from "fix: allow multiple container in worker pod in case for ErrPodCompleted" to "Allow multiple containers in worker pod in case for ErrPodCompleted" on May 10, 2023.
@AaronH88
Contributor

PR #718 added the required tests for this; happy to merge.

@AaronH88 (Contributor) left a comment


lgtm

AaronH88 merged commit aa4d0c7 into ansible:devel on May 10, 2023.
kurokobo deleted the multiple-container branch on May 10, 2023 at 23:07.
@kurokobo
Contributor Author

@AaronH88 @shanemcd
Huge thanks for working on this! And sorry for my lack of activity on the tests 😢

Successfully merging this pull request may close issue #710: Multiple containers in worker pod is not allowed under certain circumstances.