This repository has been archived by the owner on Oct 24, 2023. It is now read-only.
Windows microsoft-aks VHD images not available #3700
Labels
bug
Something isn't working
Comments
When will aks-engine v0.54.0 be hotfixed? Our upstream pipelines all depend on this version, thanks.
Hi @andyzhangx, v0.54.1 has been released: https://github.com/Azure/aks-engine/releases/tag/v0.54.1
Patch releases have been provided.
Updated: this issue is mitigated for new cluster creates w/ the following AKS Engine versions:
Also, for historical purposes, note that any version of AKS Engine after v0.54.1 will not be affected.
This issue describes error outcomes due to an Azure incident beginning approximately Thursday August 13 at 10 p.m. PST:
All clusters built with a reference to the microsoft-aks Windows VHD image reference were unable to scale, and all new clusters with Windows nodes referring to that VHD could not be created. This was the result of all Windows VHD images being deleted. Replacement Windows VHD images were built to enable new clusters w/ Windows node pools.
How do I know if I'm affected?
If you're running a Kubernetes cluster created by any version of AKS Engine with a Windows node pool ("osType": "Windows" in an agentPoolProfiles configuration in your api model), then your cluster may have been affected by this incident. If you're running vanilla VMs (in other words, VMs in an availability set, and not VMSS), then the guidance is to wait until a future AKS Engine release before performing any scale operations (see above list to determine if a suitable AKS Engine patch version is available for you). If you're running a VMSS Windows node pool, AKS Engine engineers have updated your VMSS model to ensure that future scale operations refer to a working VHD reference suitable for your cluster.
How do I know if my VMSS is affected?
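Whether a cluster has Windows pools at all, and whether they are VMSS or availability set pools, can be read from the api model first. The following is a minimal sketch and not part of the original issue: the helper name windows_pools and the apimodel.json path are hypothetical, and it assumes the usual api model layout with agentPoolProfiles nested under properties.

```shell
# Hypothetical helper (not from the original issue): reads an AKS Engine
# api model on stdin and prints one line per Windows agent pool:
#   <pool name><TAB><availabilityProfile, or "unset" if the field is absent>
windows_pools() {
  jq -r '.properties.agentPoolProfiles[]
         | select(.osType == "Windows")
         | "\(.name)\t\(.availabilityProfile // "unset")"'
}

# Usage (the path is an assumption; point it at your own api model):
#   windows_pools < ./apimodel.json
```

A pool reported as AvailabilitySet corresponds to the "vanilla VMs" case described above; a pool reported as VirtualMachineScaleSets corresponds to the VMSS case.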
Updated: All existing VMSS clusters have been patched in the backend to ensure that VMSS models point to a working VHD image reference. Scale operations using the VMSS API will work as expected.
The following example commands assume that you have the az CLI, and that you have the open source jq tool to perform JSON queries against the JSON output from az. Also, we assume you have exported the subscription ID and resource group of the cluster as the SUBSCRIPTION_ID and RESOURCE_GROUP environment variables, respectively.
On any candidate cluster that you suspect may be affected, you can query all VMSS in the resource group and look for those that are using the affected "aks-windows" image reference:
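The original command did not survive in this copy of the issue. A minimal sketch of such a query follows, under two assumptions: that "aks-windows" appears as the offer field of the VMSS image reference, and that the helper name affected_vmss_names (hypothetical) is acceptable in your shell. It uses only az vmss list and a jq filter.

```shell
# Hypothetical helper (not from the original issue): reads the JSON array
# produced by `az vmss list` on stdin and prints the name of every scale set
# whose image reference offer is "aks-windows" (assumed field).
affected_vmss_names() {
  jq -r '.[]
         | select(.virtualMachineProfile.storageProfile.imageReference.offer == "aks-windows")
         | .name'
}

# Usage (requires a logged-in az CLI and the two exported variables):
#   az vmss list --subscription "$SUBSCRIPTION_ID" \
#     --resource-group "$RESOURCE_GROUP" -o json | affected_vmss_names
```

Scale sets built from a custom image have no offer field and are skipped by the filter.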
If you get any VMSS names listed from the above command, then you are running Kubernetes nodes affected by the above incident. Future scale out operations will work as a result of a backend update to the VMSS model. For your next cluster deployment, you must use one of the above listed patch versions to create your cluster using aks-engine.
How do I know if my VMAS (availability set or "vanilla" VMs) is affected?
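The original command for this case is also missing from this copy of the issue. A minimal sketch of an equivalent check for standalone VMs follows, again assuming that "aks-windows" is the offer field of the image reference; the helper name affected_vm_names is hypothetical. It lists the individual VMs in the resource group, not the availability set resources themselves.

```shell
# Hypothetical helper (not from the original issue): reads the JSON array
# produced by `az vm list` on stdin and prints the name of every VM whose
# image reference offer is "aks-windows" (assumed field).
affected_vm_names() {
  jq -r '.[]
         | select(.storageProfile.imageReference.offer == "aks-windows")
         | .name'
}

# Usage (requires a logged-in az CLI and the two exported variables):
#   az vm list --subscription "$SUBSCRIPTION_ID" \
#     --resource-group "$RESOURCE_GROUP" -o json | affected_vm_names
```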
Again, if any VMAS are listed from the above command, you were affected.
What's the current guidance?
We are in the process of publishing patch releases for every affected AKS Engine version. Status of patches:
For new cluster create operations, we recommend using the patch version that corresponds with the "known-working" AKS Engine version used in your environment to bootstrap Kubernetes. For example, if you have had success creating Kubernetes clusters using AKS Engine v0.54.0, use v0.54.1 for your next cluster create operation.
For scaling existing clusters, you will want to replace your existing version of the aks-engine binary with its corresponding patch release. You can run aks-engine version to discover which version you're using to run scale operations against your cluster.
Again: for clusters running VMSS Windows node pools, those VMSS have been automatically updated to reflect the new, working image references. If you use the Azure VMSS interface (via UI, or CLI, or SDK), or cluster-autoscaler, to scale Windows nodes in your cluster, you do not need to take any action. However, if you are using vanilla Windows VMs (VM Availability Sets, or VMAS), then you'll need to get the patched aks-engine binary to continue using that to scale your clusters.