Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add health checking from NGINX to individual endpoints. #146

Closed
wants to merge 1 commit into from

Conversation

chrismoos
Copy link
Contributor

Overview

This change adds support for the nginx_upstream_check_module and configures upstreams with health checks based on the readinessProbe in the service's pod(s).

It is configurable with ConfigMap (default disabled):

use-upstream-health-checks

If the upstream health checking is enabled there is also an additional endpoint at /upstream_status.

Notes

This was originally proposed in the old NGINX ingress controller @ kubernetes-retired/contrib#2072

This change depends on a new nginx-slim image that would target version 0.13. The PR for that is kubernetes-retired/contrib#2329

A request for this was also raised in #140.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 20, 2017
@k8s-reviewable
Copy link

This change is Reviewable

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.6%) to 42.723% when pulling 5a3a373 on chrismoos:nginx_health_check into fbcedc0 on kubernetes:master.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.6%) to 43.645% when pulling 7d1eac3 on chrismoos:nginx_health_check into b9d272a on kubernetes:master.

@chrismoos
Copy link
Contributor Author

@bprashanth I've added in the changes here to nginx that were made on kubernetes-retired/contrib#2329.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.6%) to 43.777% when pulling 9867801 on chrismoos:nginx_health_check into 0b1fc38 on kubernetes:master.

@cmluciano
Copy link
Contributor

Is there any sort of integration or unit tests that could be added here? Not sure if it is plugin specific.

Copy link
Contributor

@bprashanth bprashanth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the pr, overall looks good, just a request for an example and a few minor code review comments

@@ -365,6 +372,12 @@ http {
access_log off;
return 200;
}

{{ if $cfg.UseUpstreamHealthChecks }}
location /upstream_status {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add an example, as in this pr: #235, there's already an examples/health-checks directory, just create a gce/ and nginx/ under that and add it under nginx/.

Please also mention in that example what this endpoint exposes.

@@ -302,6 +308,19 @@ func newIngressController(config *Configuration) *GenericController {
glog.Warning("Update of ingress status is disabled (flag --update-status=false was specified)")
}

ic.podLister.Indexer, ic.podController = cache.NewIndexerInformer(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fyi this will increase the memory footprint of the controller in large clusters. Somethig to keep in mind, not much we can do about it at this point I guess. We can think about optimizing that when the time comes, however we might have to update examples that have request/limits set (unsure if there are any).

if err != nil {
return nil, fmt.Errorf("Failed to get pod listing: %v", err)
}
for _, pod := range pods {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you want this to deterministically give you the same health check, even if there are 3 endpoint pods behind the service with all different readiness probes, we should sort based on creation timestamp and always take the first one.

if servicePort.Type == intstr.Int && int(port.ContainerPort) == servicePort.IntValue() ||
servicePort.Type == intstr.String && port.Name == servicePort.String() {
if container.ReadinessProbe != nil {
if container.ReadinessProbe.HTTPGet == nil || container.ReadinessProbe.HTTPGet.Scheme != "HTTP" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to not support https health checks? (I'm fine with the HTTP caveat, just curious)

@@ -102,6 +102,16 @@ type BackendInfo struct {
Repository string `json:"repository"`
}

// UpstreamCheck is used to configure ingress health checks
type UpstreamCheck struct {
HttpSend string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add a godoc comment for each public field, especially HttpSend, Fall and Rise which are non-obvious.
Also please s/Http/HTTP in conformance with https://github.com/golang/go/wiki/CodeReviewComments#initialisms

@bprashanth
Copy link
Contributor

Also please add unittests

@bprashanth
Copy link
Contributor

Ideally a unittest might seed the pod cache with some fake pods and check to see that the right health check is retrieved with int/named ports. Let me know if you need me to dig up example unittests where we do something similar.

@cmluciano integration tests are a little harder, we're working on something generally applicable cross kube-repo, but till then we mostly rely on e2es and unittests. The ingress e2es already point at an rc with a readiness probe, so we can probably modify that to fail if this feature doesn't work once the image is published: https://github.com/kubernetes/kubernetes/blob/master/test/e2e/testing-manifests/ingress/http/rc.yaml#L18

@bprashanth
Copy link
Contributor

@cmluciano integration tests are a little harder

#5

@abstrctn
Copy link

Is this still under active development? Would be very useful for us, I'd be happy to lend a hand if it's helpful.

@chrismoos
Copy link
Contributor Author

See #981, this is probably preferable to this as unfortunately this duplicates the readinessProbe checking already built into Kubernetes.

@chrismoos chrismoos closed this Jul 16, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants