Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ability to set upstream zone size and keepalive settings #483

Closed
4 tasks done
Tracked by #2162
kate-osborn opened this issue Mar 17, 2023 · 8 comments
Closed
4 tasks done
Tracked by #2162

Ability to set upstream zone size and keepalive settings #483

kate-osborn opened this issue Mar 17, 2023 · 8 comments
Assignees
Labels
area/nginx-configuration Relates to nginx configuration enhancement New feature or request refined Requirements are refined and the issue is ready to be implemented. size/large Estimated to be completed within two weeks
Milestone

Comments

@kate-osborn
Copy link
Contributor

kate-osborn commented Mar 17, 2023

As a user of NGF
I want NGF to update the upstream zone size for NGINX
So that if I run into errors when due to exceeding my zone size, I can fix them.

As a user of NGF
I want NGF to enable keepalive connections on my route
So that I can optimize the performance of my application.

Acceptance

  • The user is able to set NGINX's upstream zone size.
  • The user is able to enable keepalive connections as defined by the design.
  • When possible, configuration updates with NGINX Plus should be made using the NGINX Plus API so NGINX is not reloaded.
    • zone size
    • keepalives connections

Dev Notes:

Tasks

Preview Give feedback
  1. bjee19 kate-osborn
  2. refined size/medium stale
    kate-osborn
  3. refined size/medium
    bjee19
  4. blocked enhancement refined
@kate-osborn kate-osborn added the enhancement New feature or request label Mar 17, 2023
@kate-osborn kate-osborn added the area/nginx-configuration Relates to nginx configuration label Mar 21, 2023
@kate-osborn kate-osborn added this to the v1.0.0 milestone Mar 21, 2023
@mpstefan
Copy link
Collaborator

Would the user expect the system to do this? What is the impact on the user's system if the zone size is not dynamically updated?

@brianehlert
Copy link

Dynamic calculation of upstream zone accomplishes one thing, as the size of an upstream service grows it ensure that NGINX can handle that.
The positive impact is that the customer can scale upstream services at will to 1000s of pods and the system will dynamically adapt.
The negative impact is the growth in memory utilization as the possibility exists to high limits.

I have an entire write-up around this for NIC to do the same.
This would be a later and advanced capability that has the impact of optimizing the system.

If not dynamically calculated, it needs to be exposed in a configmap for example or however system tuning is exposed.
Whether auto-magic is a requirement for v1 should be discussed.

@mpstefan mpstefan modified the milestones: v1.0.0, v1.0.1 Aug 11, 2023
@mpstefan mpstefan added refined Requirements are refined and the issue is ready to be implemented. size/small Estimated to be completed within ~2 days labels Sep 11, 2023
@mpstefan
Copy link
Collaborator

Today we discussed on how this would be valuable for NGINX+ as it does not require a reload when upstreams are added or removed.

@mpstefan mpstefan modified the milestones: v1.0.1, v1.1.0 Sep 26, 2023
@brianehlert
Copy link

brianehlert commented Oct 13, 2023

NIC has run into a number of customer situations where customers set their limits so lean as even a back end service scaling event can cause OOM or CPUThrottling situations as a result of the configuration change. Without conscious memory consumption increases that are introduced by a feature like this.

While I think this capability is highly valuable,
As I have learned more about how customers are leaning into using Quality of Service and other K8s platform requirements that force the setting of limits - I am hesitant at introducing something like this due to the feared impact on the system as a whole and a situation where the Gateway is unable to start because the configuration alone is forcing the pod into an OOM state.

@mpstefan mpstefan changed the title Dynamically calculate upstream zone size Ability to set upstream zone size Oct 23, 2023
@mpstefan mpstefan added the blocked Blocked by other issue label Oct 24, 2023
@mpstefan
Copy link
Collaborator

blocked by #929

@pleshakov
Copy link
Contributor

@brianehlert

NIC has run into a number of customer situations where customers set their limits so lean as even a back end service scaling event can cause OOM or CPUThrottling situations as a result of the configuration change. Without conscious memory consumption increases that are introduced by a feature like this.

While I think this capability is highly valuable,
As I have learned more about how customers are leaning into using Quality of Service and other K8s platform requirements that force the setting of limits - I am hesitant at introducing something like this due to the feared impact on the system as a whole and a situation where the Gateway is unable to start because the configuration alone is forcing the pod into an OOM state.

The amount of configuration does affect memory consumptions of NGINX - more config you have (including TLS secrets), more memory it will consume.

Also note that our architecture includes running the control plane along the data plane, where the control plane has a cache of resources in the cluster in memory. This means that the number of those resources (including HTTPRoutes, Secrets, Endpoints... ) also directly affect memory consumption of NGF pod, without even considering the data plane.

However, traffic will much greater affect memory -- as each connection requires memory.

Additionally, configuration changes (reloading NGINX) temporarily increases memory consumption, as during a reload both old worker processes and new worker processes coexist.

Supporting dynamic calculation of zone sizes will reduce overall memory of NGINX -- because each upstream will use the amount tuned to the number of upstream servers, not some large value that will hold any amount for most cases.

Considering all that, I think dynamic calculations of zone sizes will be beneficial and it will not lead to OOMs - other things will lead to OOMs first.

@ciarams87 ciarams87 removed the blocked Blocked by other issue label Nov 6, 2023
@bjee19 bjee19 self-assigned this Nov 13, 2023
@bjee19 bjee19 moved this from 🆕 New to 🏗 In Progress in NGINX Gateway Fabric Nov 13, 2023
@bjee19 bjee19 moved this from 🏗 In Progress to 🆕 New in NGINX Gateway Fabric Nov 15, 2023
@kate-osborn kate-osborn added the blocked Blocked by other issue label Nov 16, 2023
@bjee19 bjee19 removed their assignment Nov 20, 2023
@ja20222 ja20222 added the backlog Currently unprioritized work. May change with user feedback or as the product progresses. label Nov 20, 2023
@ja20222 ja20222 removed this from the v1.1.0 milestone Nov 20, 2023
@mpstefan mpstefan removed the backlog Currently unprioritized work. May change with user feedback or as the product progresses. label Feb 6, 2024
@mpstefan mpstefan added this to the v1.2.0 milestone Feb 6, 2024
@mpstefan mpstefan removed refined Requirements are refined and the issue is ready to be implemented. size/small Estimated to be completed within ~2 days labels Feb 6, 2024
@mpstefan mpstefan modified the milestones: v1.2.0, v2.0.0 Feb 21, 2024
@mpstefan mpstefan removed the blocked Blocked by other issue label Mar 11, 2024
@mpstefan mpstefan modified the milestones: v1.3.0, v2.1.0 Mar 29, 2024
@mpstefan mpstefan changed the title Ability to set upstream zone size Ability to set upstream zone size and keepalive settings Sep 3, 2024
@mpstefan mpstefan added refined Requirements are refined and the issue is ready to be implemented. size/large Estimated to be completed within two weeks labels Sep 3, 2024
@mpstefan mpstefan modified the milestones: v1.5.0, v2.0.0 Oct 14, 2024
@sindhushiv sindhushiv modified the milestones: v2.0.0, v1.5.0 Oct 30, 2024
@kate-osborn
Copy link
Contributor Author

kate-osborn commented Oct 31, 2024

When possible, configuration updates with NGINX Plus should be made using the NGINX Plus API so NGINX is not reloaded.

  • zone size
  • keepalives connections

It doesn't look like it is possible to set zone size or keepalive connections using the N+ API. The API doesn't support updating directives for an upstream group. You can only add/modify/delete servers from upstreams: https://demo.nginx.com/swagger-ui/?_ga=2.44370660.1560926404.1730133990-1687392834.1727393286#/

@kate-osborn
Copy link
Contributor Author

Also note:

When using load balancing methods other than the default round-robin method, it is necessary to activate them before the keepalive directive.

@kate-osborn kate-osborn self-assigned this Oct 31, 2024
@kate-osborn kate-osborn moved this from 🆕 New to 🏗 In Progress in NGINX Gateway Fabric Oct 31, 2024
@sjberman sjberman modified the milestones: v1.5.0, v1.6.0 Nov 20, 2024
@kate-osborn kate-osborn moved this from 🏗 In Progress to 👀 In Review in NGINX Gateway Fabric Dec 11, 2024
@kate-osborn kate-osborn moved this from 👀 In Review to ✅ Done in NGINX Gateway Fabric Dec 13, 2024
@kate-osborn kate-osborn closed this as completed by moving to ✅ Done in NGINX Gateway Fabric Dec 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/nginx-configuration Relates to nginx configuration enhancement New feature or request refined Requirements are refined and the issue is ready to be implemented. size/large Estimated to be completed within two weeks
Projects
Status: Done
Development

No branches or pull requests

9 participants