[Feature] AKS Support for Ubuntu 18.04 #1487

Closed
palma21 opened this issue Mar 11, 2020 · 37 comments

@palma21
Member

palma21 commented Mar 11, 2020

New AKS base image based on Ubuntu 18.04

@palma21 palma21 self-assigned this Mar 11, 2020
@jluk jluk changed the title from "[Feature] AKS Support for 18.04" to "[Feature] AKS Support for Ubuntu 18.04" Mar 11, 2020
@mangesh-silicus

When will the GA version be available for production workloads?

@ams0

ams0 commented Mar 27, 2020

How does one enable it?

@palma21
Member Author

palma21 commented Mar 31, 2020

@mangesh-silicus we're trying to get a good signal and to provide a bit more flexibility for the upgrade. We're preliminarily targeting the end of H1 if all continues to go well.

@annerajb

Not sure if this is a bug or not (I kind of suspect it is...).
If my AKS cluster was created with the 18.04 preview, should a node pool I add to it also be created with 18.04?

Right now the az aks nodepool add command has no argument for selecting 18.04 as the preview image of the node pool either (unless there is some bleeding-edge preview I have to select or install to get it to show up on the help page).

@palma21
Member Author

palma21 commented May 19, 2020

There is one in the latest preview CLI. Let me add that to the docs. Thanks for the callout.

It's the same syntax as cluster create.
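For reference, a minimal sketch of what that could look like. The --aks-custom-headers CustomizedUbuntu=aks-ubuntu-1804 value is recalled from the preview documentation of the time and is not stated anywhere in this thread, so treat it as an assumption and confirm it with az aks nodepool add --help on the preview CLI. The resource group, cluster and pool names are placeholders. The script only prints the command unless DRY_RUN=0 is set.

```shell
# Placeholders (assumptions): replace with your own values.
RESOURCE_GROUP="${RESOURCE_GROUP:-myResourceGroup}"
CLUSTER_NAME="${CLUSTER_NAME:-myAKSCluster}"
POOL_NAME="${POOL_NAME:-ubuntu1804}"

# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Add a node pool that opts in to the Ubuntu 18.04 preview image.
# ASSUMPTION: the custom header below is the preview opt-in; it is not
# confirmed by this thread.
add_1804_preview_pool() {
  run az aks nodepool add \
    --resource-group "$RESOURCE_GROUP" \
    --cluster-name "$CLUSTER_NAME" \
    --name "$POOL_NAME" \
    --aks-custom-headers CustomizedUbuntu=aks-ubuntu-1804
}

add_1804_preview_pool
```

If the header is right, the same --aks-custom-headers argument would apply to az aks create.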

@annerajb

Thanks, an update did it. I was able to create the node pool using az aks nodepool add.

@mayank-saggar

@palma21 Will this update be a single global rollout or a phased one? Could you shed some light on how the older clusters will receive this update to 18.04?

@palma21
Member Author

palma21 commented Jul 22, 2020

https://github.com/Azure/AKS/releases

AKS will default to AKS Ubuntu 18.04 with the upcoming GA of Kubernetes 1.18, once AKS Ubuntu 18.04 is GA as well. We recommend testing existing workloads on AKS Ubuntu 18.04 node pools prior to GA. See how here: https://aka.ms/aks/Ubuntu1804

This means that only clusters with Kubernetes v1.18+ will receive AKS Ubuntu 18.04 by default. Older clusters will receive 18.04 when they upgrade from their current version to a 1.18 version.
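As a rough illustration of that rule, the sketch below checks whether a given Kubernetes version falls in the 1.18+ range. The function name defaults_to_ubuntu_1804 is hypothetical; the version string itself could come from az aks show --query kubernetesVersion -o tsv.

```shell
# Succeeds (exit status 0) when the given Kubernetes version is 1.18 or newer.
# Hypothetical helper; it only compares the major and minor components.
defaults_to_ubuntu_1804() (
  version="${1#v}"          # tolerate a leading "v", e.g. v1.18.8
  major="${version%%.*}"    # text before the first dot
  rest="${version#*.}"      # text after the first dot
  minor="${rest%%.*}"       # text before the next dot
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 18 ]; }
)

for v in 1.17.9 1.18.8; do
  if defaults_to_ubuntu_1804 "$v"; then
    echo "$v: Ubuntu 18.04 by default"
  else
    echo "$v: gets 18.04 only after upgrading to 1.18+"
  fi
done
```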

@paolopiaggio

Is there a way to update nodes to Ubuntu 18.04 for clusters that do not support multiple node pools?

@palma21
Member Author

palma21 commented Aug 11, 2020

Yes, when this feature is GA, every cluster that upgrades to 1.18+ will get this node version too.

@paolopiaggio

Thanks @palma21!
So I guess that's the only option, isn't it?
From what I understood from here, you can now test Ubuntu 18.04 nodes on an existing AKS cluster, regardless of the Kubernetes version.

But that's not possible with existing clusters that do not support multiple node pools, so the only option in that case is to update to Kubernetes 1.18+ (GA, or with the Ubuntu 18.04 preview enabled)... am I right?

@palma21
Member Author

palma21 commented Aug 11, 2020

That's correct.
Most infra features can only be enabled safely on a running cluster either with another node pool or through an upgrade.

@palma21 palma21 added the feature-request Requested Features label Aug 12, 2020
@mrowken

mrowken commented Sep 24, 2020

From the release notes it looks like Ubuntu 18.04 should be used by default with AKS 1.18.8, but it isn't.
I have created a cluster several times in the last week (West Europe), but it still uses the old version: microsoft-aks / aks / aks-ubuntu-1604-2020-q3 / 2020.09.03
How do I get a cluster with Ubuntu 18.04, other than by using the preview feature?
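A small sketch for telling the two base images apart from an image string like the one above. The helper name ubuntu_release_of is hypothetical, and the matching is a plain substring check, so it is only as reliable as the naming convention of the image.

```shell
# Map an image name or OS image string to an Ubuntu release.
# Hypothetical helper: matches "1604"/"16.04" and "1804"/"18.04" substrings.
ubuntu_release_of() {
  case "$1" in
    *1804*|*18.04*) echo "18.04" ;;
    *1604*|*16.04*) echo "16.04" ;;
    *)              echo "unknown" ;;
  esac
}

# Image SKU quoted in the comment above.
ubuntu_release_of "aks-ubuntu-1604-2020-q3"   # prints 16.04

# On a live cluster, the OS image of each node is also visible in the
# OS-IMAGE column of:
#   kubectl get nodes -o wide
```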

@palma21
Member Author

palma21 commented Sep 25, 2020

Unfortunately the release hasn't finished yet, so you can't create 18.04 in another way right now. I'll update this issue as soon as it finishes and all 1.18 clusters start receiving it by default.

@palma21
Member Author

palma21 commented Oct 1, 2020

The release has arrived in the Australia East, Canada Central and West Central US regions.

Will continue to update until it's worldwide.

@nidiculageorge

nidiculageorge commented Oct 2, 2020

Is the release available in North Europe? I have tried setting up a cluster in NE and it is still showing the old Ubuntu version.

@nidiculageorge

@palma21 Are there any expected dates for the rollout to the rest of the regions (North Europe and Southeast Asia)?

@aristosvo

aristosvo commented Oct 5, 2020

In the post before yours it's stated that the post will be updated until it's released worldwide. EDIT: In a different issue (#1625 (comment)) @palma21 predicted that they would conclude the rollout last week or at the start of this week.

@nidiculageorge

nidiculageorge commented Oct 5, 2020

We have deployed clusters in the NE and SEA regions, and our application needs to be tested against the updated node image version. Our testing is on hold, and I wanted to communicate that to our stakeholders. That's why I asked for an ETA.

Thank you

@aristosvo

aristosvo commented Oct 5, 2020

@nidiculageorge Sorry, I wasn't trying to offend you; I was too quick with my comment. Personally, I'd edit my first post instead of making two separate posts, but that's a minor detail.

In a different issue (#1625 (comment)) @palma21 predicted that they would conclude the rollout last week or at the start of this week.

@palma21
Member Author

palma21 commented Oct 5, 2020

Rollout is now on

  • australiacentral
  • australiasoutheast
  • canadaeast
  • eastasia
  • southeastasia
  • uksouth
  • ukwest

Europe regions should follow. Will continue to update.

@palma21
Member Author

palma21 commented Oct 6, 2020

Rollout is now completed on:

  • westeurope
  • northeurope
  • japaneast
  • japanwest

We expect to complete worldwide by tomorrow.

@nidiculageorge

@palma21 Thanks for the updates.

Will the node image version be updated automatically to Ubuntu 18.04 for an already upgraded cluster (AKS version 1.18.8)?

@nidiculageorge

Hello Team,

I have done the upgrade in our environment.

Points that I came across during the upgrade:

  1. For a cluster that is already upgraded to 1.18.8, the old node image version of an existing node pool can't be changed. We have to add a new node pool, which will take the correct node image version.

@devteng

devteng commented Oct 7, 2020

Hello Team,

I have done the upgrade in our environment.

Points that I came across during the upgrade:

  1. For a cluster that is already upgraded to 1.18.8, the old node image version of an existing node pool can't be changed. We have to add a new node pool, which will take the correct node image version.

Is this done with the --node-image-only option? (https://docs.microsoft.com/en-us/azure/aks/node-image-upgrade)
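For context, a minimal sketch of the node-image-only upgrade that the linked page describes. Replies in this thread state that it refreshes the image within the same base OS and does not move a pool from 16.04 to 18.04. The resource group, cluster and pool names are placeholders, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
# Placeholders (assumptions): replace with your own values.
RESOURCE_GROUP="${RESOURCE_GROUP:-myResourceGroup}"
CLUSTER_NAME="${CLUSTER_NAME:-myAKSCluster}"
POOL_NAME="${POOL_NAME:-nodepool1}"

# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

upgrade_node_image() {
  # Show the node image version the pool currently runs.
  run az aks nodepool show \
    --resource-group "$RESOURCE_GROUP" \
    --cluster-name "$CLUSTER_NAME" \
    --name "$POOL_NAME" \
    --query nodeImageVersion -o tsv

  # Upgrade only the node image; the Kubernetes version is left unchanged.
  run az aks nodepool upgrade \
    --resource-group "$RESOURCE_GROUP" \
    --cluster-name "$CLUSTER_NAME" \
    --name "$POOL_NAME" \
    --node-image-only
}

upgrade_node_image
```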

@ksandermann

@nidiculageorge @devteng
I am experiencing the same issue - can't find a way to move an AKS 1.18.8 cluster with 16.04 to 18.04.

According to MS support, the --node-image-only option only works if you already are on 1.18.8 with 16.04.

Is there really no other option to upgrade other than waiting for a newer version than 1.18.8 (and would this even work?) or adding a new nodepool?

@palma21
Member Author

palma21 commented Oct 7, 2020

The rollout has now reached all public regions.

A node image upgrade will never change your underlying base image; doing so could cause unexpected behavior on prod clusters where apps might have some untested kernel dependency.

For clusters already on 1.18.8, you can add a new pool and delete the old one if you want 18.04. When you delete a pool we cordon and drain as well, so it's a process similar to an upgrade.

We are also about to release 1.18.9 next week. But for prod clusters we normally recommend node pool blue/green upgrades with testing in between, rather than in-place upgrades.
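A minimal sketch of that blue/green flow, split into two steps so that testing can happen in between. The pool names, resource group and cluster name are placeholders, and the function names are hypothetical. The script only prints the commands unless DRY_RUN=0 is set.

```shell
# Placeholders (assumptions): replace with your own values.
RESOURCE_GROUP="${RESOURCE_GROUP:-myResourceGroup}"
CLUSTER_NAME="${CLUSTER_NAME:-myAKSCluster}"
OLD_POOL="${OLD_POOL:-nodepool1}"   # existing 16.04 pool ("blue")
NEW_POOL="${NEW_POOL:-pool1804}"    # replacement pool ("green")

# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Step 1: add the new pool and list its nodes with their OS image.
add_green_pool() {
  run az aks nodepool add \
    --resource-group "$RESOURCE_GROUP" \
    --cluster-name "$CLUSTER_NAME" \
    --name "$NEW_POOL" \
    --mode System
  run kubectl get nodes -l "agentpool=$NEW_POOL" -o wide
}

# Step 2: only after workloads have been tested on the new pool,
# delete the old one.
remove_blue_pool() {
  run az aks nodepool delete \
    --resource-group "$RESOURCE_GROUP" \
    --cluster-name "$CLUSTER_NAME" \
    --name "$OLD_POOL"
}

add_green_pool
# ...test workloads on the new pool here before continuing...
remove_blue_pool
```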

@nidiculageorge

@ksandermann

  1. Is the cluster upgraded to 1.18.8?
  2. If yes, please add a new node pool to the existing cluster; it will automatically take the new node image version (18.04).

@mrowken

mrowken commented Oct 8, 2020

I can confirm that @palma21's instructions work fine: add another system node pool, change the previous pool to mode "User" and delete it.
I had to be patient while waiting for the rollout to reach my region, but finally I was able to get new Ubuntu nodes on a running cluster.
I have to admit that using az aks nodepool upgrade did nothing for me. I have no idea what the expected behaviour and use case of this command are.

@ksandermann

@nidiculageorge Yes, it's upgraded to 1.18.8 but still running 16.04.

Adding another node pool and replacing the existing one is not a feasible option for us, and probably not for a lot of other users either, because we are using Terraform and this is not possible natively in the Terraform azurerm provider (hashicorp/terraform-provider-azurerm#7093).

However, if moving from 1.18.8 to 1.18.9 means that 16.04 will be replaced by 18.04, and it will be released within the next weeks, that's fine for us :)

@mrowken - from what I understand, az aks nodepool upgrade will only work if you are already on 18.04 and 1.18.8 and a newer image version for the 18.04 nodes is available.

@mrowken

mrowken commented Oct 8, 2020

@ksandermann I'm also using Terraform and was able to achieve it, but in an ugly way. I hope MS listens to us and makes it easier in the future. In particular, it would be easier if there were no single default_node_pool in azurerm_kubernetes_cluster, and you could instead specify a list or use only a map of azurerm_kubernetes_cluster_node_pool.
My procedure:

  1. If you have limited network capacity, scale down your node pool to at most half (0.5x) of your capacity if possible.
  2. In the Terraform config, add another node pool (azurerm_kubernetes_cluster_node_pool) in System mode.
  3. Use az aks nodepool update --mode User and az aks nodepool delete on the initial node pool.
  4. Remove the extra node pool resource from the Terraform config and from the state using
    terraform state rm azurerm_kubernetes_cluster_node_pool.this[\"mynodepool\"].
  5. Change default_node_pool.name = "mynodepool" in the Terraform config, because the new node pool has become the default.
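Steps 3 and 4 of the procedure above can be sketched as follows. The resource group, cluster name and initial pool name are placeholders; the Terraform address is the one quoted in step 4. The script only prints the commands unless DRY_RUN=0 is set.

```shell
# Placeholders (assumptions): replace with your own values.
RESOURCE_GROUP="${RESOURCE_GROUP:-myResourceGroup}"
CLUSTER_NAME="${CLUSTER_NAME:-myAKSCluster}"
OLD_POOL="${OLD_POOL:-default}"      # initial node pool (hypothetical name)
NEW_POOL="${NEW_POOL:-mynodepool}"   # pool added through Terraform in step 2

# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

retire_initial_pool() {
  # Step 3: switch the initial pool to User mode, then delete it.
  run az aks nodepool update \
    --resource-group "$RESOURCE_GROUP" \
    --cluster-name "$CLUSTER_NAME" \
    --name "$OLD_POOL" \
    --mode User
  run az aks nodepool delete \
    --resource-group "$RESOURCE_GROUP" \
    --cluster-name "$CLUSTER_NAME" \
    --name "$OLD_POOL"

  # Step 4: drop the extra node pool resource from the Terraform state.
  # The pool itself stays in Azure; only Terraform stops tracking it
  # under this address.
  run terraform state rm "azurerm_kubernetes_cluster_node_pool.this[\"$NEW_POOL\"]"
}

retire_initial_pool
```

Removing the resource block from the Terraform config (the other half of step 4) and renaming default_node_pool (step 5) remain manual edits.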

@ksandermann

@mrowken Thanks for the hint - I also read that approach on the terraform github issue.
Unfortunately it's not a feasible option for us running 60 production clusters :/

@EPinci

EPinci commented Oct 8, 2020

@ksandermann Maybe you can try building a script that does:

az aks nodepool add --name temppool [...] --mode system
az aks nodepool delete --name oldpoolname [...]
az aks nodepool add --name oldpoolname [...] --mode system
az aks nodepool delete --name temppool [...]

The tricky bit would be passing the right parameters to the last add so that the pool matches exactly what you defined in Terraform (the default pool parameters); that way it won't be detected as new or require redeployment on the next run.
You could code in some delays or more elaborate cross-scaling, and then apply the script to all your clusters.
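A rough sketch of how such a script might be applied across several clusters. The cluster list, the function name swap_pools and the temporary pool name are hypothetical, and the pool parameters elided as [...] above (VM size, node count and so on) are deliberately left out; they would have to match the Terraform definition. The script only prints the commands unless DRY_RUN=0 is set.

```shell
# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Swap one pool for a fresh one of the same name via a temporary pool.
# Arguments: resource group, cluster name, existing pool name.
# NOTE: pool sizing parameters are omitted (assumption: defaults are used).
swap_pools() {
  rg="$1"; cluster="$2"; pool="$3"
  run az aks nodepool add    --resource-group "$rg" --cluster-name "$cluster" --name temppool --mode System
  run az aks nodepool delete --resource-group "$rg" --cluster-name "$cluster" --name "$pool"
  run az aks nodepool add    --resource-group "$rg" --cluster-name "$cluster" --name "$pool" --mode System
  run az aks nodepool delete --resource-group "$rg" --cluster-name "$cluster" --name temppool
}

# Hypothetical inventory: one "resource-group cluster pool" triple per line.
# The list is read on file descriptor 3 so the commands keep their own stdin.
while read -r rg cluster pool <&3; do
  [ -n "$rg" ] || continue
  swap_pools "$rg" "$cluster" "$pool"
done 3<<'EOF'
rg-dev-1 aks-dev-1 default
rg-dev-2 aks-dev-2 default
EOF
```

Running it in dry-run mode against development clusters first gives a chance to review the exact commands before anything is deleted.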

@ksandermann

@EPinci Much appreciated, that's definitely something that might work after some fine-tuning on development clusters.

Anyway, as 1.18.9 will completely fix all our clusters out of the box, we decided to wait for it :)

@nidiculageorge

Hi Team,

I upgraded the cluster to 1.18.8. One of the Windows nodes is in the NotReady state.


When I did a describe node:

Type     Reason                   Age                  From                      Message
Normal   NodeReady                36m (x2 over 3h9m)   kubelet, aksnodwin000000  Node aksnodwin000000 status is now: NodeReady
Normal   NodeHasSufficientMemory  24m (x9 over 3h9m)   kubelet, aksnodwin000000  Node aksnodwin000000 status is now: NodeHasSufficientMemory
Warning  ContainerGCFailed        13m (x2 over 37m)    kubelet, aksnodwin000000  rpc error: code = Unknown desc = Cannot connect to the Docker daemon at npipe:////./pipe/docker_engine. Is the docker daemon running?
Warning  ContainerGCFailed        3m59s (x6 over 53m)  kubelet, aksnodwin000000  rpc error: code = DeadlineExceeded desc = context deadline exceeded
Normal   NodeNotReady             48s (x12 over 55m)   kubelet, aksnodwin000000  Node aksnodwin000000 status is now: NodeNotReady

Any help !!!!

@palma21
Member Author

palma21 commented Nov 11, 2020

@nidiculageorge that is a Windows node, so it's not related to this feature. Has your upgrade concluded? It's normal for nodes to go through a NotReady phase during an upgrade. In any case, I'd recommend opening a support ticket if the node didn't recover.
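One way to check whether the node recovers once the upgrade has finished is sketched below. The node name is the one from the events above; the 15 minute timeout is an arbitrary assumption. The script only prints the commands unless DRY_RUN=0 is set.

```shell
NODE_NAME="${NODE_NAME:-aksnodwin000000}"

# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

check_node_recovery() {
  # Block until the node reports the Ready condition, or give up after
  # the timeout (assumed value).
  run kubectl wait --for=condition=Ready "node/$1" --timeout=15m
  # Show the latest conditions and events for the node.
  run kubectl describe node "$1"
}

check_node_recovery "$NODE_NAME"
```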

@palma21 palma21 closed this as completed Nov 11, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 11, 2020
@aritraghosh aritraghosh moved this to Archive (GA older than 1 month) in Azure Kubernetes Service Roadmap (Public) Jul 10, 2024