Preserve private endpoint when stop and starts an AKS cluster #2745

Closed
arcezd opened this issue Jan 20, 2022 · 69 comments
@arcezd

arcezd commented Jan 20, 2022

What happened:
We need to stop and start our development AKS cluster daily to avoid costs when it is not in use. It is a private cluster, so we access the Kubernetes API server through a private endpoint.
We have a third-party network virtual appliance (NVA) firewall deployed in our tenant to inspect, allow, or block traffic between VNets and from on-premises to Azure.

According to the private endpoints documentation (and our tests), we need to add a route for every private endpoint IP to the Azure route table to force its traffic through the firewall.
See: Use Azure Firewall to inspect traffic destined to a private endpoint
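
The route registration described above could be scripted roughly as follows. This is only a sketch: it builds the argument list for `az network route-table route create` for a single /32 route and does not execute anything. The function name, the route naming scheme, and all resource names and IPs are hypothetical placeholders, not values from this issue.

```python
from ipaddress import IPv4Address


def build_pe_route_command(resource_group, route_table, pe_ip, firewall_ip):
    """Return the az CLI argv for a /32 route that sends traffic for one
    private endpoint IP through a network virtual appliance (firewall)."""
    # Validate both addresses early; IPv4Address raises ValueError on bad input.
    pe = IPv4Address(pe_ip)
    fw = IPv4Address(firewall_ip)
    # Hypothetical naming scheme: "pe-" plus the IP with dots replaced by dashes.
    route_name = "pe-" + str(pe).replace(".", "-")
    return [
        "az", "network", "route-table", "route", "create",
        "--resource-group", resource_group,
        "--route-table-name", route_table,
        "--name", route_name,
        "--address-prefix", f"{pe}/32",
        "--next-hop-type", "VirtualAppliance",
        "--next-hop-ip-address", str(fw),
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    cmd = build_pe_route_command("rg-network", "rt-spoke", "10.1.2.4", "10.0.0.4")
    print(" ".join(cmd))
```

Because the route is keyed on the endpoint's IP, a route like this would presumably need to be replaced whenever the private endpoint is recreated with a different address, which is the pain point of this issue.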

We know that, according to the documentation (Stop and Start an Azure Kubernetes Service (AKS) cluster):

  • The customer provisioned PrivateEndpoints linked to private cluster need to be deleted and recreated again when you start a stopped AKS cluster.

Is there any workaround, or any chance of a feature request to preserve the private endpoint, so that we don't need to recreate it every time we stop and start the cluster?
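
For reference, the manual workflow that the documentation implies today might look something like the sketch below. It only assembles the Azure CLI commands as argument lists and does not run them. The step order, the `management` group ID, the connection name suffix, and every resource name are assumptions made for illustration.

```python
def cluster_resource_id(subscription_id, resource_group, cluster_name):
    """Build the ARM resource ID of an AKS managed cluster."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        "/providers/Microsoft.ContainerService"
        f"/managedClusters/{cluster_name}"
    )


def build_start_plan(subscription_id, cluster_rg, cluster_name,
                     pe_rg, pe_name, vnet_name, subnet_name):
    """Return ordered az CLI commands: start the cluster, delete the stale
    customer-provisioned private endpoint, then create a new one."""
    target = cluster_resource_id(subscription_id, cluster_rg, cluster_name)
    return [
        ["az", "aks", "start",
         "--resource-group", cluster_rg, "--name", cluster_name],
        ["az", "network", "private-endpoint", "delete",
         "--resource-group", pe_rg, "--name", pe_name],
        ["az", "network", "private-endpoint", "create",
         "--resource-group", pe_rg, "--name", pe_name,
         "--vnet-name", vnet_name, "--subnet", subnet_name,
         "--private-connection-resource-id", target,
         # Assumption: "management" is the private link group ID for AKS.
         "--group-id", "management",
         # Hypothetical connection name derived from the endpoint name.
         "--connection-name", f"{pe_name}-connection"],
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    plan = build_start_plan(
        "00000000-0000-0000-0000-000000000000",
        "rg-aks", "aks-dev", "rg-network", "pe-aks-dev",
        "vnet-hub", "snet-endpoints",
    )
    for cmd in plan:
        print(" ".join(cmd))
```

Even when scripted, the recreated endpoint is not guaranteed to keep its previous IP, so any route table entries that point at it would still have to be refreshed afterwards.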

What you expected to happen:
We expect an option to preserve the IP of the private endpoint when we stop and start the cluster.

How to reproduce it (as minimally and precisely as possible):
Deploy an AKS cluster with a private endpoint, then stop and start the cluster.

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.4", GitCommit:"b695d79d4f967c403a96986f1750a35eb75e75f1", GitTreeState:"clean", BuildDate:"2021-11-17T15:48:33Z", GoVersion:"go1.16.10", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.2", GitCommit:"802eff1fe87ad2dd737ebbe891f30500b88beb00", GitTreeState:"clean", BuildDate:"2021-11-15T08:35:41Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"linux/amd64"}
  • Size of cluster (how many worker nodes are in the cluster?)
    4 nodes
  • General description of workloads in the cluster (e.g. HTTP microservices, Java app, Ruby on Rails, machine learning, etc.)
  • Others:
@ghost ghost added the triage label Jan 20, 2022
@ghost

ghost commented Jan 20, 2022

Hi arcezd, AKS bot here 👋
Thank you for posting on the AKS Repo, I'll do my best to get a kind human from the AKS team to assist you.

I might be just a bot, but I'm told my suggestions are normally quite good, as such:

  1. If this case is urgent, please open a Support Request so that our 24/7 support team may help you faster.
  2. Please abide by the AKS repo Guidelines and Code of Conduct.
  3. If you're having an issue, could it be described on the AKS Troubleshooting guides or AKS Diagnostics?
  4. Make sure you're subscribed to the AKS Release Notes to keep up to date with all that's new on AKS.
  5. Make sure there isn't a duplicate of this issue already reported. If there is, feel free to close this one and '+1' the existing issue.
  6. If you have a question, do take a look at our AKS FAQ. We place the most common ones there!

@ghost ghost added the action-required label Jan 23, 2022
@ghost

ghost commented Jan 23, 2022

Triage required from @Azure/aks-pm

@olsenme
Contributor

olsenme commented Jan 25, 2022

@phealy

@ghost ghost added action-required and removed action-required labels Jan 25, 2022
@ghost

ghost commented Jan 27, 2022

Triage required from @Azure/aks-pm

@ghost

ghost commented Feb 27, 2022

Action required from @Azure/aks-pm

@ghost ghost added the Needs Attention 👋 Issues needs attention/assignee/owner label Feb 27, 2022
@ghost

ghost commented Mar 14, 2022

Issue needing attention of @Azure/aks-leads

@ghost

ghost commented Mar 29, 2022

Issue needing attention of @Azure/aks-leads

@asaf-upstream

Any updates on this?
It would help our case as well.

@ghost

ghost commented Apr 13, 2022

Issue needing attention of @Azure/aks-leads

@ghost

ghost commented Apr 29, 2022

Issue needing attention of @Azure/aks-leads

@ghost

ghost commented May 14, 2022

Issue needing attention of @Azure/aks-leads

@ghost

ghost commented May 29, 2022

Issue needing attention of @Azure/aks-leads

@arcezd
Author

arcezd commented May 31, 2022

Any chance we could get a response here? This is becoming a major problem for our non-production workloads.

@kethahel99

Hello! Please can you provide an update on this? It is a much needed feature

@kethahel99

@andyzhangx , @djdongjin , @raghulmsft, @Azure apologies, but is anyone working on/considering this request?

@ghost

ghost commented Jun 30, 2022

Issue needing attention of @Azure/aks-leads

@ghost

ghost commented Jul 15, 2022

Issue needing attention of @Azure/aks-leads

@teeroddesigns

This also caused us several hours of work to retrace and reconfigure things

@ghost

ghost commented Jul 31, 2022

Issue needing attention of @Azure/aks-leads

@ghost

ghost commented Aug 15, 2022

Issue needing attention of @Azure/aks-leads

@ghost

ghost commented Aug 30, 2022

Issue needing attention of @Azure/aks-leads

@ghost

ghost commented May 30, 2023

Issue needing attention of @Azure/aks-leads

@ghost

ghost commented Jun 14, 2023

Issue needing attention of @Azure/aks-leads

@ghost

ghost commented Jun 29, 2023

Issue needing attention of @Azure/aks-leads

@ghost

ghost commented Jul 14, 2023

Issue needing attention of @Azure/aks-leads

@ghost

ghost commented Jul 30, 2023

Issue needing attention of @Azure/aks-leads

@maartengo

The need to recreate the private endpoint is the main reason why we don't stop our AKS clusters. We could of course put effort into creating our own version of the reconfiguration script, but I would much rather have a supported/built-in solution from Microsoft.

My use cases:

  • Saving costs for customers who only use their clusters during working hours
  • Saving costs for our own development environments by only turning the clusters on when something needs to be developed (sort of a JIT cluster provisioning).

@levimm

levimm commented Aug 17, 2023

You can use a private cluster with API Server VNet Integration. The private IP will be reserved during stop/start.
https://learn.microsoft.com/en-us/azure/aks/api-server-vnet-integration#deploy-a-private-cluster
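
A rough sketch of what creating such a cluster could look like, assembled as an Azure CLI argument list without running it. The `--enable-private-cluster` and `--enable-apiserver-vnet-integration` flags are taken from the linked documentation as best recalled; depending on the CLI version, a preview extension may be required. The function name and all resource names are hypothetical.

```python
def build_vnet_integration_create(resource_group, cluster_name, location):
    """Return the az CLI argv for creating a private AKS cluster that uses
    API Server VNet Integration instead of a private link endpoint."""
    return [
        "az", "aks", "create",
        "--resource-group", resource_group,
        "--name", cluster_name,
        "--location", location,
        "--network-plugin", "azure",
        "--enable-private-cluster",
        "--enable-apiserver-vnet-integration",
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(" ".join(build_vnet_integration_create("rg-aks", "aks-dev", "westeurope")))
```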

@rouke-broersma

you can use private cluster with APIServer Vnet Integration. The private ip will be reserved during stop/start.
https://learn.microsoft.com/en-us/azure/aks/api-server-vnet-integration#deploy-a-private-cluster

We connect from remote vnets without line of sight so that doesn't solve our problem.

@orenzp

orenzp commented Sep 6, 2023

Just open a ticket about it to Azure support to see if they have something planned to solve this issue.

@kethahel99

you can use private cluster with APIServer Vnet Integration. The private ip will be reserved during stop/start.
https://learn.microsoft.com/en-us/azure/aks/api-server-vnet-integration#deploy-a-private-cluster

We connect from remote vnets without line of sight so that doesn't solve our problem.

Maybe you can try a UDR when ingressing to the VNet/local firewall from the remote (I'm guessing peered?) VNet?
For example:
Source address (remote) 10.34.0.0/24
Next hop = firewall IP in local VNet
Hop type = network virtual appliance (firewall)
Then in the firewall you allow traffic from that source CIDR to the AKS API server private IP?

@rouke-broersma

you can use private cluster with APIServer Vnet Integration. The private ip will be reserved during stop/start.
https://learn.microsoft.com/en-us/azure/aks/api-server-vnet-integration#deploy-a-private-cluster

We connect from remote vnets without line of sight so that doesn't solve our problem.

Maybe you can try a UDR when ingressing to the VNet/local firewall from the remote (I'm guessing peered?) VNet?
For example:
Source address (remote) 10.34.0.0/24
Next hop = firewall IP in local VNet
Hop type = network virtual appliance (firewall)
Then in the firewall you allow traffic from that source CIDR to the AKS API server private IP?

Nope, no peering. No line of sight. You can create private endpoints in any VNet anywhere in Azure, including totally disconnected networks.

@dtu-ruth

What is the status of this issue?

@amar-mandai

What's the status of this issue?

Because of this, the start/stop feature is almost unusable for private clusters.

Contributor

Issue needing attention of @Azure/aks-leads

Contributor

Issue needing attention of @Azure/aks-leads

Contributor

Issue needing attention of @Azure/aks-leads

Contributor

Issue needing attention of @Azure/aks-leads

Contributor

Issue needing attention of @Azure/aks-leads

Contributor

Issue needing attention of @Azure/aks-leads

Contributor

Issue needing attention of @Azure/aks-leads

Contributor

Issue needing attention of @Azure/aks-leads

Contributor

Issue needing attention of @Azure/aks-leads

@victormartinsantiago

Any updates on this issue? Having to re-provision the private endpoint every time the cluster stops is a pain in the ass.

@levimm

levimm commented Jul 2, 2024

From the AKS perspective, we have to recreate the private endpoint during stop/start in order to build the private connection. Right now, we don't have any plans for private v1 clusters (implemented with Private Link).

@microsoft-github-policy-service microsoft-github-policy-service bot removed action-required Needs Attention 👋 Issues needs attention/assignee/owner labels Jul 2, 2024
@microsoft-github-policy-service microsoft-github-policy-service bot added the stale Stale issue label Jul 23, 2024
Contributor

This issue has been automatically marked as stale because it has not had any activity for 21 days. It will be closed if no further activity occurs within 7 days of this comment.

Contributor

This issue will now be closed because it hasn't had any activity for 7 days after being marked stale. arcezd, feel free to comment again within the next 7 days to reopen it, or open a new issue after that time if you still have a question, issue, or suggestion.
