
akker docs #1

Merged: 1 commit, Jan 26, 2023
23 changes: 23 additions & 0 deletions docs/NginxAPItests.md
## NginxLB API testing:

List upstreams in block "nginx-lb-https":

curl http://10.1.1.4:9000/api/8/stream/upstreams/nginx-lb-https | jq

Add upstream with JSON:

curl -X POST "http://10.1.1.4:9000/api/8/stream/upstreams/nginx-lb-https/servers/" -H "accept: application/json" -H "Content-Type: application/json" -d "{ \"server\": \"10.1.1.8:31269\" }"

curl -X POST "http://10.1.1.4:9000/api/8/stream/upstreams/nginx-lb-https/servers/" -H "accept: application/json" -H "Content-Type: application/json" -d "{ \"server\": \"10.1.1.10:31269\" }"

Disable upstream server with id=2 (id required):

curl -X PATCH -d "{ \"down\": true }" -s 'http://10.1.1.4:9000/api/8/stream/upstreams/nginx-lb-https/servers/2' -H "accept: application/json" -H "Content-Type: application/json"

Enable upstream server with id=2 (id required):

curl -X PATCH -d "{ \"down\": false }" -s 'http://10.1.1.4:9000/api/8/stream/upstreams/nginx-lb-https/servers/2' -H "accept: application/json" -H "Content-Type: application/json"

Delete upstream server with id=0 (id required):

curl -X DELETE -s 'http://10.1.1.4:9000/api/8/stream/upstreams/nginx-lb-https/servers/0' -H "accept: application/json" -H "Content-Type: application/json"
253 changes: 253 additions & 0 deletions docs/NginxKubernetesLoadbalancer.md
# New Nginx LB Solution - "Nginx Kubernetes Loadbalancer"

<br/>

- Build an Nginx Kubernetes Loadbalancer Controller for MVP
- Provide a functional replacement for the "Loadbalancer Service Type" external to an On Premise K8s cluster.
- Chris Akker / Jan 2023 / Initial draft

<br/>

## Abstract:

Create a new K8s Controller that will monitor specified K8s Service Endpoints, and then send API calls to an external NginxPlus server to manage Nginx Upstream server blocks.
This will keep the K8s Service Endpoint list synchronized with the Nginx LB server's Upstream block server list.
The primary use case is tracking the NodePort IP:Port definitions for the Nginx Ingress Controller's `nginx-ingress Service`.
With the NginxPlus Server located external to the K8s cluster, this new controller LB function provides an alternative TCP "Load Balancer Service" for On Premises k8s clusters, which do not have access to a Cloud Provider's "Service Type LoadBalancer".

<br/>

## Solution Description:

When running a K8s cluster On Premise, there is no equivalent to a Cloud Provider's Loadbalancer Service Type. This solution and its new software provide a functional replacement for the TCP load balancing portion of that service.

A Cloud Provider's Loadbalancer Service Type provides 3 basic functions for external access to the k8s pods/services running inside the cluster:

1. Public IP address allocation, visible from the Internet
2. DNS record management for this Public IP (usually A records for FQDNs)
3. TCP load balancing, from the PublicIP's well-known ports to the high-numbered NodePorts on the cluster nodes

This is often called "NLB", the AWS term for Network Load Balancer, but it functions nearly identically in all Public Cloud Provider networks. It is not actually a component of K8s; rather, it is a service provided by the Cloud Provider's SDN (Software Defined Network), managed by the user with K8s Service Type LoadBalancer definitions/declarations.

**This Solution uses NGINX to provide an alternative to #3, the TCP loadbalancing from PublicIP to k8s NodePort.**

Note: This solution is not for Cloud-based K8s clusters, only On-Premise K8s clusters.

## Reference Diagram:

<br/>

![NGINX LB Server](media/nginxlb-nklv1.png)

<br/>

## Business Case

- Every On Premise Kubernetes cluster needs this Solution, for external clients to access pods/services running inside the cluster.
- Market opportunity is at least one NginxPlus license for every k8s cluster. Two licenses if you agree that High Availability is a requirement.
- Exposing Pods and Services with NodePort requires the use of high-numbered TCP ports (greater than 30000 by default). Lower, well-known TCP port numbers (less than 1024) are NOT allowed to bind to the k8s Nodes' IP address. This contradicts the ephemeral, dynamic nature of k8s itself, and mandates that all HTTP URLs must contain unfamiliar port numbers.
- There is a finite number of NodePorts available: 30000-32767 is the default range, leaving 2768 usable ports.
- Tracking and allocating which pod/service is using which TCP port is manual and tedious for app dev and devops teams.

Alternatives:
- CIS with BIG-IP
- MetalLB
- AVI Networks / VMWare
- Many Other HW Vendors

However, most of these alternatives are proprietary, unsupported open source, directly competitive, or raise other customer concerns.

>**`NGINX PLUS is a viable alternative for most customers.`**

Why not Nginx Open Source? Because Nginx Open Source does not have the API endpoint and service for managing Upstream Server block configurations.

<br/>

## Definition of Terms

- NKL - Nginx Kubernetes Loadbalancer - the name of this Solution
- NEC - Nginx LB Controller - k8s controller / watcher
- Nginx LB - An NginxPlus server external to the k8s cluster
- NIC - Nginx Ingress Controller pod
- Nginx Ingress Endpoints - the list of IP:Ports for the nginx-ingress Service defined in K8s.
- Nginx-lb-http - the Nginx LB Server Upstream block that represents the mapped Nginx Ingress Controller(s) `Host:NodePort` Endpoints for http
- Nginx-lb-https - the Nginx LB Server Upstream block that represents the mapped Nginx Ingress Controller(s) `Host:NodePort` Endpoints for https
- NodePort nginx-ingress Service - exposes the Nginx Ingress Controller(s) on Host:Port
- Plus API - the standard Nginx Plus API service that is running on the Nginx LB Server
- Upstream - the IP:Port list of servers that Nginx will Load Balance traffic to at Layer 4 TCP using the stream configuration

<br/>

## Development requirements for the Nginx K8s LB controller

<br/>

Preface - Define access parameters for the NKL Controller to communicate with the NginxPlus instance (a config sketch follows this list):
- IP address:port or FQDN of the target Nginx LB Server
- Optional auth: SSL certificate/key
- Optional auth: IP allow list
- Optional auth: HTTP Auth userid/password
- Optional auth: JWT Token
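
As a point of reference, these parameters might look like the following Go struct; the `nkl` package and all field names are hypothetical, sketched for this draft rather than taken from any existing code:

```go
// Hypothetical package for the sketches in this document; not an existing codebase.
package nkl

// NginxLBConfig holds the parameters the NKL Controller needs to reach
// the NginxPlus instance. Every field name is illustrative.
type NginxLBConfig struct {
	Host          string // IP address:port or FQDN of the target Nginx LB Server
	UpstreamHTTP  string // stream upstream block for http, e.g. "nginx-lb-http"
	UpstreamHTTPS string // stream upstream block for https, e.g. "nginx-lb-https"

	// Optional authentication, mirroring the list above.
	TLSCertFile string   // SSL client certificate
	TLSKeyFile  string   // SSL client key
	AllowedIPs  []string // IP allow list (enforced on the Nginx side)
	Username    string   // HTTP Auth userid
	Password    string   // HTTP Auth password
	JWT         string   // JWT Token
}
```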

1. Initialization:
- Define the name of the target Upstream Server Block
- "nginx-lb-http" or "nginx-lb-https" should be the default server block names; return an error if the block does not exist
- API query to the NginxPlus LB server for the current Upstream list (see the query sketch below)
- API query to the K8s apiserver for the list of Ingress Controller Endpoints
- Reconcile the two lists, making changes to Nginx Upstreams to match the Ingress Endpoints (add / delete Upstreams as needed to converge the two lists)
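
A minimal Go sketch of the initialization query, assuming `apiBase` points at the Plus API root (e.g. `http://10.1.1.4:9000/api/8`); the endpoint path matches the cURL examples in this PR, but the function and package names are hypothetical:

```go
package nkl

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// UpstreamServer mirrors the fields the Plus API returns for one server
// in a stream upstream block (compare the example responses later in this doc).
type UpstreamServer struct {
	ID     int    `json:"id"`
	Server string `json:"server"` // "IP:Port"
	Down   bool   `json:"down"`
}

// listUpstreamServers performs the initialization query, e.g.
// GET http://10.1.1.4:9000/api/8/stream/upstreams/nginx-lb-https/servers
func listUpstreamServers(apiBase, upstream string) ([]UpstreamServer, error) {
	url := fmt.Sprintf("%s/stream/upstreams/%s/servers", apiBase, upstream)
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		// covers the "server block does not exist" error case from step 1
		return nil, fmt.Errorf("upstream %q: unexpected status %s", upstream, resp.Status)
	}
	var servers []UpstreamServer
	if err := json.NewDecoder(resp.Body).Decode(&servers); err != nil {
		return nil, err
	}
	return servers, nil
}
```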

2. Runtime:
- Periodic check - API query for the list of Servers in the Upstream block, using the NginxPlus API (query interval TBD; see the polling sketch below)
- IP:port definition
- other possible metadata: status, connections, response_time, etc.
- Keep a copy of this list in memory, if state is required
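
A sketch of that periodic check, in the same hypothetical `nkl` package, reusing `listUpstreamServers` and `UpstreamServer` from the previous snippet; the interval and the `store` callback stand in for the TBD query time and the optional in-memory state:

```go
package nkl

import (
	"log"
	"time"
)

// pollUpstreams is the runtime periodic check: every interval it asks the
// Plus API for the current Upstream server list and hands a copy to store,
// which can keep it in memory if state is required.
func pollUpstreams(apiBase, upstream string, interval time.Duration, store func([]UpstreamServer)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for range ticker.C {
		servers, err := listUpstreamServers(apiBase, upstream)
		if err != nil {
			log.Printf("periodic upstream query failed: %v", err)
			continue
		}
		store(servers)
	}
}
```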

3. Modify Upstream server entries, based on K8s NodePort Service Endpoint "Notification" changes (see the watcher sketch below)
- Register the LB Controller with the K8s watcher Service, subscribing to Notifications for changes to the nginx-ingress Service Endpoints
- Add new Endpoints to the Upstream Server list on a k8s Notify
- Remove deleted Endpoints from the Upstream list using the Nginx Plus "Drain" function, leaving existing TCP connections to close gracefully on a K8s Notify delete
- Create and set a Drain_wait timer on Draining Upstream servers
- Remove Draining Upstream servers after the Drain_wait timer expires
- Log changes to debug, nginx error.log, or a custom access.log as appropriate
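
One possible shape for that registration, sketched with the standard client-go informer machinery; the `nginx-ingress` namespace, function name, and handler wiring are assumptions, and the real controller might inherit this plumbing from the NIC framework instead (see the PM/PD suggestion below):

```go
package nkl

import (
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
)

// watchIngressEndpoints subscribes to Notifications for the nginx-ingress
// Service Endpoints. onChange receives the new Endpoints state; the caller
// runs the returned informer with informer.Run(stopCh).
func watchIngressEndpoints(clientset kubernetes.Interface, onChange func(*corev1.Endpoints)) cache.SharedIndexInformer {
	factory := informers.NewSharedInformerFactoryWithOptions(
		clientset, 30*time.Second, informers.WithNamespace("nginx-ingress"))
	informer := factory.Core().V1().Endpoints().Informer()
	informer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc:    func(obj interface{}) { onChange(obj.(*corev1.Endpoints)) },
		UpdateFunc: func(_, newObj interface{}) { onChange(newObj.(*corev1.Endpoints)) },
		DeleteFunc: func(obj interface{}) {
			// All Upstream servers should be drained ("drain": true),
			// then removed after the Drain_wait timer expires.
			onChange(&corev1.Endpoints{})
		},
	})
	return informer
}
```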

4. Query the K8s apiserver for the list of Endpoints for the "nginx-ingress" Service object. This is the list of NodePorts where the Nginx Ingress Controller is listening (flattened into IP:Port strings in the sketch below).
- Keep a copy of this list in memory, if state is desired
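
A sketch of flattening the returned Endpoints object into the same `IP:Port` strings used for the Upstream servers (same hypothetical package as above):

```go
package nkl

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// endpointAddresses flattens a K8s Endpoints object into "IP:Port" strings,
// one per address/port combination, matching the Nginx Upstream server format.
func endpointAddresses(ep *corev1.Endpoints) []string {
	var out []string
	for _, subset := range ep.Subsets {
		for _, addr := range subset.Addresses {
			for _, port := range subset.Ports {
				out = append(out, fmt.Sprintf("%s:%d", addr.IP, port.Port))
			}
		}
	}
	return out
}
```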

5. Main program
- Compare the list of Upstream servers from the Nginx API call with the list of nginx-ingress Service Endpoints from the K8s API call
- Calculate the difference between the two lists, and create new Nginx API calls to update the Upstream list, adding or removing entries as needed to mirror the nginx-ingress Service Endpoints list (see the reconcile sketch below)
- Log these changes
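
A sketch of that list comparison; the returned removals would go through the Drain path from step 3 before being deleted:

```go
package nkl

// reconcile computes the difference between the Nginx Upstream server list
// and the nginx-ingress Service Endpoints list, both as "IP:Port" strings.
func reconcile(nginxServers, k8sEndpoints []string) (toAdd, toRemove []string) {
	inNginx := make(map[string]bool, len(nginxServers))
	for _, s := range nginxServers {
		inNginx[s] = true
	}
	inK8s := make(map[string]bool, len(k8sEndpoints))
	for _, e := range k8sEndpoints {
		inK8s[e] = true
		if !inNginx[e] {
			toAdd = append(toAdd, e) // POST to .../servers
		}
	}
	for _, s := range nginxServers {
		if !inK8s[s] {
			toRemove = append(toRemove, s) // PATCH drain, then DELETE
		}
	}
	return toAdd, toRemove
}
```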

6. Optional: Make Nginx API calls to update the entire Upstream list, regardless of what the existing list contains. *It is unclear how NginxPlus responds when you try to add a duplicate server entry via the API - the expectation is that it simply fails, with no effect on the existing server entry and established connections - this needs to be tested.*

<br/>

## PM/PD Suggestion - build this new k8s Controller on the existing Nginx Ingress Controller framework/code, leveraging the enterprise-class, supportable code Nginx already has on hand.

<br/>

## Example Nginx Plus API request for Upstream block changes

<br/>

Here are some examples of using cURL against the NginxPlus API to control Upstream server blocks. In these examples, the Nginx LB Server is at 172.16.1.15, with the API listening on port 9000. To enable the Nginx Plus API, refer to these instructions:
https://docs.nginx.com/nginx/admin-guide/monitoring/live-activity-monitoring/

(`jq` is used to format the JSON responses into easy-to-read output.)

<br/>

To `ADD` a new server to the Upstream group "nginx-lb-http":

curl -X POST -d '{ "server": "172.16.1.81:32080" }' -s 'http://172.16.1.15:9000/api/4/stream/upstreams/nginx-lb-http/servers'


To `LIST` the Servers in an Upstream group called "nginx-lb-http":
curl http://172.16.1.15:9000/api/4/stream/upstreams/nginx-lb-http/ | jq

To `DISABLE` the existing Upstream Server with ID = 0:
curl -X PATCH -d '{ "down": true }' -s 'http://172.16.1.15:9000/api/4/stream/upstreams/nginx-lb-http/servers/0'

Response is:
{"id":0,"server":"127.0.0.1:8081","weight":1,"max_conns":0,"max_fails":1,"fail_timeout":"10s","slow_start":"0s","route":"","backup":false,"down":true}

To `ENABLE` an existing Upstream Server with ID = 0 that is down:
curl -X PATCH -d '{ "down": false }' -s 'http://172.16.1.15:9000/api/4/stream/upstreams/nginx-lb-http/servers/0'

Response is:
{"id":0,"server":"127.0.0.1:8081","weight":1,"max_conns":0,"max_fails":1,"fail_timeout":"10s","slow_start":"0s","route":"","backup":false,"down":false}

To `ADD` a new server to the Upstream group "nginx-lb-http", with 60 seconds slow start:
curl -X POST -d '{ "server": "127.0.0.1:8085", "slow_start": "60s" }' -s 'http://172.16.1.15:9000/api/4/stream/upstreams/nginx-lb-http/servers'

Response is:
{"id":9,"server":"127.0.0.1:8085","weight":1,"max_conns":0,"max_fails":1,"fail_timeout":"10s","slow_start":"60s","route":"","backup":false,"down":false}

To `DRAIN` connections off an existing Upstream Server with ID = 2:
curl -X PATCH -d '{ "drain": true }' -s 'http://172.16.1.15:9000/api/4/stream/upstreams/nginx-lb-http/servers/2'

Response is:
{"id":2,"server":"127.0.0.1:8083","weight":1,"max_conns":0,"max_fails":1,"fail_timeout":"10s","slow_start":"0s","route":"","backup":false,"down":false,"drain":true}

Note: During recent testing with R28 and API version 8, the Drain command returned 404 Not Found.

To `CHANGE the LB WEIGHT` of an Upstream Server with ID = 2:
curl -X PATCH -d '{ "weight": 3 }' -s 'http://172.16.1.15:9000/api/4/stream/upstreams/nginx-lb-http/servers/2'

Response is:
{"id":2,"server":"127.0.0.1:8083","weight":3,"max_conns":0,"max_fails":1,"fail_timeout":"10s","slow_start":"0s","route":"","backup":false,"down":false}

Add upstream with JSON:

curl -X POST "http://10.1.1.4:9000/api/8/stream/upstreams/nginx-lb-https/servers/" -H "accept: application/json" -H "Content-Type: application/json" -d "{ \"server\": \"10.1.1.99:31269\" }"

<br/>

## References:

Cloud Provider's K8s Loadbalancer Service Type:

https://kubernetes.io/docs/tasks/access-application-cluster/create-external-load-balancer/

- AWS: https://aws.amazon.com/premiumsupport/knowledge-center/eks-kubernetes-services-cluster/
- GCP: https://cloud.google.com/kubernetes-engine/docs/concepts/service-load-balancer
- Azure: https://learn.microsoft.com/en-us/azure/load-balancer/components#frontend-ip-configurations
- Digital Ocean: https://docs.digitalocean.com/products/kubernetes/how-to/add-load-balancers/
- You get the point - this Service does not exist in private data centers

Kubernetes controllers:

https://kubernetes.io/docs/concepts/architecture/controller/

Nginx Ingress Controller, how it works:

https://docs.nginx.com/nginx-ingress-controller/intro/how-nginx-ingress-controller-works/

Nginx API: http://nginx.org/en/docs/http/ngx_http_api_module.html

Example: http://nginx.org/en/docs/http/ngx_http_api_module.html#example

Nginx Upstream API examples: http://nginx.org/en/docs/http/ngx_http_api_module.html#stream_upstreams_stream_upstream_name_servers_stream_upstream_server_id

<br/>

## Sample NginxPlus LB Server configuration ( server and upstream blocks )

```nginx
# NginxLB Stream configuration, for TCP load balancing
# Chris Akker, Jan 2023
# TCP Proxy and load balancing block
# Nginx Kubernetes Loadbalancer
# backup servers allow Nginx to start
#
#### nginxlb.conf

upstream nginx-lb-http {
    zone nginx_lb_http 256k;
    #placeholder
    server 1.1.1.1:32080 backup;
}

upstream nginx-lb-https {
    zone nginx_lb_https 256k;
    #placeholder
    server 1.1.1.1:32443 backup;
}

server {
    listen 80;
    status_zone nginx_lb_http;
    proxy_pass nginx-lb-http;
}

server {
    listen 443;
    status_zone nginx_lb_https;
    proxy_pass nginx-lb-https;
}

```
17 changes: 17 additions & 0 deletions docs/dashboard.conf
server {
    listen 9000;

    location /api {
        api write=on;
    }

    location = /dashboard.html {
        root /usr/share/nginx/html;
    }

    # Redirect requests for "/" to "/dashboard.html"
    location / {
        return 301 /dashboard.html;
    }
}
Binary file added docs/media/nginxlb-nklv1.png
32 changes: 32 additions & 0 deletions docs/nginxlb.conf
# NginxLB Stream configuration, for TCP load balancing
# Chris Akker, Jan 2023
# TCP Proxy and load balancing block
# Nginx Kubernetes Loadbalancer
### backup servers allow Nginx to start
#
#### nginxlb.conf

upstream nginx-lb-http {
    zone nginx-lb-http 256k;
    #placeholder
    server 1.1.1.1:32080 backup;
}

upstream nginx-lb-https {
    zone nginx-lb-https 256k;
    #placeholder
    server 1.1.1.1:32443 backup;
}

server {
    listen 80;
    status_zone nginx-lb-http;
    proxy_pass nginx-lb-http;
}

server {
    listen 443;
    status_zone nginx-lb-https;
    proxy_pass nginx-lb-https;
}

41 changes: 41 additions & 0 deletions docs/udf-loadtests.md
## WRK load tests from Ubuntu Jumphost

Load tests run from the jumphost to the Nginx LB server, and directly to each k8s NodePort, using WRK in a container.

### To the Nginx LB Server (10.1.1.4)

    docker run --rm williamyeh/wrk -t4 -c50 -d2m -H 'Host: cafe.example.com' --timeout 2s https://10.1.1.4/coffee
    Running 2m test @ https://10.1.1.4/coffee
      4 threads and 50 connections
      Thread Stats   Avg      Stdev     Max   +/- Stdev
        Latency    19.73ms   11.26ms  172.76ms   81.04%
        Req/Sec   626.50    103.68     1.03k    75.60%
      299460 requests in 2.00m, 481.54MB read
    Requests/sec:   2493.52
    Transfer/sec:   4.01MB

### To knode1

    ubuntu@k8-jumphost:~$ docker run --rm williamyeh/wrk -t4 -c50 -d2m -H 'Host: cafe.example.com' --timeout 2s https://10.1.1.8:31269/coffee
    Running 2m test @ https://10.1.1.8:31269/coffee
      4 threads and 50 connections
      Thread Stats   Avg      Stdev     Max   +/- Stdev
        Latency    17.87ms   10.63ms  151.45ms   80.16%
        Req/Sec   698.98    113.22     1.05k    75.67%
      334080 requests in 2.00m, 537.22MB read
    Requests/sec:   2782.35
    Transfer/sec:   4.47MB

### To knode2

    ubuntu@k8-jumphost:~$ docker run --rm williamyeh/wrk -t4 -c50 -d2m -H 'Host: cafe.example.com' --timeout 2s https://10.1.1.10:31269/coffee
    Running 2m test @ https://10.1.1.10:31269/coffee
      4 threads and 50 connections
      Thread Stats   Avg      Stdev     Max   +/- Stdev
        Latency    17.62ms   10.01ms  170.99ms   80.32%
        Req/Sec   703.96    115.07     1.09k    74.17%
      336484 requests in 2.00m, 541.41MB read
    Requests/sec:   2801.89
    Transfer/sec:   4.51MB