Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

post-install hooks do not get ran until all chart resources are ready/healthy #766

Closed
corang opened this issue Sep 22, 2022 · 11 comments · Fixed by #816
Closed

post-install hooks do not get ran until all chart resources are ready/healthy #766

corang opened this issue Sep 22, 2022 · 11 comments · Fixed by #816
Assignees

Comments

@corang
Copy link
Contributor

corang commented Sep 22, 2022

Environment

Device and OS: All
App version: 0.21.3
Kubernetes distro being used: k3d/k3s
Other:

Steps to reproduce

  1. Get a helm chart that utilizes a post-install hook for a resource
  2. Modify helm chart that a pod doesn't become healthy until that post-install hook is ran
  3. Test using just helm - post-install hook runs once all resources are created and all pods end up healthy
  4. Create zarf package using modified helm chart
  5. Deploy package

Expected result

Post-install hook resources are created once all other resources have been created, pod that depends on post-install resource becomes healthy.

Actual Result

Post-install hook never runs so the zarf package never succeeds.

Severity/Priority

Medium - Blocking but has a workaround.

Additional Context

This can be worked around by adding a dataInjection since zarf doesn't wait for resources to be ready before running post-install hook resources.

@jeff-mccoy
Copy link
Contributor

That's a clever workaround (and fairly new 😂). I actually think we should just expose this as an option in the zarf.yaml for when it's needed. Thoughts @defenseunicorns/zarf?

@Racer159
Copy link
Contributor

It would be a relatively simple change in src/internal/helm/chart.go Is there a reason we don't flip the behavior though? The default in helm is opposite the default in Zarf currently (which may cause issues with folks trying something in Zarf and Helm and wondering why one doesn't work and the other does as in this case).

@jeff-mccoy
Copy link
Contributor

what line are you referring to?

@jeff-mccoy
Copy link
Contributor

If you're referring to our use of --wait, that's because the expectation in how all zarf components works is that the next component won't run until the previous is ready. Changing that default behavior would kind of break * everything *. That doesn't matter for helm where it doesn't know about multiple helm charts chained together, it does matter for zarf. Think the easy answer is to just expose the option in zarf.yaml. This is actually just another poor helm design from older versions that seems to be carried over: helm/helm#5118.

@brandtkeller
Copy link
Contributor

brandtkeller commented Sep 22, 2022

Shouldn't the use of --wait in zarf still allow the post-install hook to run?

The library loads the resulting resources into Kubernetes. Note that if the --wait flag is set, the library will wait until all resources are in a ready state and will not run the post-install hook until they are ready.

link

@jeff-mccoy
Copy link
Contributor

Not according to the linked issue above. And that's all we are doing is setting wait: true, this is helm internal behavior.

@brandtkeller
Copy link
Contributor

This should be investigated further, I created/deployed this package: (helm chart created to try and reproduce the issue)

kind: ZarfPackageConfig
metadata:
  name: "init-package-test"

components:
  - name: test
    description: "testing hooks"
    images:
      - docker.io/nginx:1.16.0
      - docker.io/busybox:latest
    charts:
      - name: hookchart
        url: https://github.com/brandtkeller/testchart.git
        version: 0.1.0
        namespace: test

Zarf properly waits for both the deployment pod to be ready AND the completion of the post-install hook job.

? Deploy the test component? Yes
                                                                                       
  📦 TEST COMPONENT                                                                    
                                                                                       

  ✔  Loading the Zarf State from the Kubernetes cluster                                                                                                                                                                                               
  ✔  Creating port forwarding tunnel at http://127.0.0.1:59316                                                                                                                                                                                        
  ✔  Storing images in the zarf registry                                                                                                                                                                                                              
  ✔  Processing helm chart hookchart:0.1.0 from https://github.com/brandtkeller/testchart.git                                                                                                                                                         
  ✔  Zarf deployment complete

@brandtkeller
Copy link
Contributor

output when running with a post-install job (before it finishes):

  ✔  Loading the Zarf State from the Kubernetes cluster                                                                                                                                                                                               
  ✔  Creating port forwarding tunnel at http://127.0.0.1:60538                                                                                                                                                                                        
  ✔  Storing images in the zarf registry                                                                                                                                                                                                              
  ⠸  testjob: Jobs active: 1, jobs failed: 0, jobs succeeded: 0 (19s)   

@Racer159
Copy link
Contributor

Racer159 commented Sep 23, 2022

If you're referring to our use of --wait, that's because the expectation in how all zarf components works is that the next component won't run until the previous is ready. Changing that default behavior would kind of break * everything *. That doesn't matter for helm where it doesn't know about multiple helm charts chained together, it does matter for zarf. Think the easy answer is to just expose the option in zarf.yaml. This is actually just another poor helm design from older versions that seems to be carried over: helm/helm#5118.

We could add another mechanism to Zarf to wait for the chart between components in Zarf-land instead of helm itself without breaking everything no?

@jeff-mccoy
Copy link
Contributor

No I don't think so, it's a really non-trivial problem. I spent weeks debugging/analyzing various deployment paradigms and helm's was the most stable for ensuring reconciliation/completion prior to moving on. Helm hooks are gross and what you use when you don't have smarter orchestration such as flux/argo/fleet/knative or some operator framework. Changing this would be a very significant change to Zarf that I am not comfortable for what is an edge case we haven't seen before.

I do support exposing this control to the user though as that seems to easily meet the needs here without changing behavior for every system. Keep in mind even raw manifests/kustomizations are actually turned into helm charts for apply.

Separately, I also see value in more advanced wait logic independent of Helm for things that say are controlled by flux & friends. Something like a wait condition in a component to wait for some arbitrary K8s resource to exist or be ready before proceeding. e.g. "wait for deployment xx to be healthy" or "wait for crd xx to exist". However, I see that as an advanced feature that is separate from this particular concern.

@JasonvanBrackel JasonvanBrackel moved this from New Requests to Backlog in Zarf Project Board Sep 23, 2022
@corang corang self-assigned this Sep 29, 2022
@corang corang moved this from Backlog to Doing Now in Zarf Project Board Sep 29, 2022
@corang
Copy link
Contributor Author

corang commented Sep 29, 2022

Going to try to implement another field into the zarf.yaml to conditionalize the wait flag to helm

corang added a commit that referenced this issue Oct 5, 2022
## Description

Add field to zarf.yaml to support not waiting for charts and manifests.

Fixes #766 

## Type of change

<!-- Please delete options that are not relevant -->

- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)

## Checklist before merging

<!-- Please delete options that are not relevant -->

- [x] Tests have been added/updated as necessary (add the `needs-tests`
label)
- [x] Documentation has been updated as necessary (add the `needs-docs`
label)
- [ ] (Optional) Changes have been linted locally with
[golangci-lint](https://github.com/golangci/golangci-lint). (NOTE: We
haven't turned on lint checks in the pipeline yet so linting may be hard
if it shows a lot of lint errors in places that weren't touched by
changes. Thus, linting is optional right now.)

Co-authored-by: Megamind <[email protected]>
Co-authored-by: Wayne Starr <[email protected]>
Repository owner moved this from Doing Now to Done in Zarf Project Board Oct 5, 2022
Noxsios pushed a commit that referenced this issue Mar 8, 2023
## Description

Add field to zarf.yaml to support not waiting for charts and manifests.

Fixes #766 

## Type of change

<!-- Please delete options that are not relevant -->

- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)

## Checklist before merging

<!-- Please delete options that are not relevant -->

- [x] Tests have been added/updated as necessary (add the `needs-tests`
label)
- [x] Documentation has been updated as necessary (add the `needs-docs`
label)
- [ ] (Optional) Changes have been linted locally with
[golangci-lint](https://github.com/golangci/golangci-lint). (NOTE: We
haven't turned on lint checks in the pipeline yet so linting may be hard
if it shows a lot of lint errors in places that weren't touched by
changes. Thus, linting is optional right now.)

Co-authored-by: Megamind <[email protected]>
Co-authored-by: Wayne Starr <[email protected]>
@Racer159 Racer159 moved this to Done in Zarf Project Board Apr 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants