Implement gMSA for Windows upstream kubernetes tests #2208

Closed
wants to merge 2 commits into from
4 changes: 4 additions & 0 deletions .gitignore
@@ -80,3 +80,7 @@ _releasenotes

# calico manifests archive
release-*/manifests/calico-*.yaml

# gmsa files
scripts/gmsa/domain.init
gmsa-spec-writer-output.txt
12 changes: 12 additions & 0 deletions docs/book/src/developers/development.md
@@ -37,6 +37,7 @@
- [Mocks](#mocks)
- [E2E Testing](#e2e-testing)
- [Conformance Testing](#conformance-testing)
- [Windows gMSA conformance tests](#windows-gmsa-conformance-tests)
- [Running custom test suites on CAPZ clusters](#running-custom-test-suites-on-capz-clusters)

<!-- /TOC -->
@@ -537,6 +538,17 @@ With the following environment variables defined, CAPZ runs `./scripts/ci-build-
| `REGISTRY` | Your Registry |
| `TEST_K8S` | `true` |

##### Windows gMSA conformance tests

The Windows gMSA tests use the [KeyVault gMSA CCG plugin](https://github.com/microsoft/Azure-Key-Vault-Plugin-gMSA). They require additional setup to run:

- A VM image with the [Key Vault plugin installed](https://github.com/kubernetes-sigs/image-builder/pull/835)
- A [one-time script](../../../../scripts/gmsa/setup-gmsa.sh), run against the subscription, that provisions a Key Vault and Azure Managed Identities and configures access to the Key Vault.
- On each run, `./scripts/ci-conformance.sh` [provisions a VM](../../../../scripts/gmsa/ci-gmsa.sh) to act as the Domain Controller. The Domain Controller initializes itself and sets the required values in the Key Vault.

After the cluster is created, the e2e suite does some additional setup on the cluster. It ensures the appropriate secrets are set in the Key Vault, then makes sure the required files are on the cluster worker nodes for the test. These requirements are documented in the [gMSA e2e test](https://github.com/kubernetes/kubernetes/blob/885f14d162471dfc9a3f8d4c46430805cf6be828/test/e2e/windows/gmsa_full.go#L17-L37). More details on requirements and implementation are in the [gMSA issue](https://github.com/kubernetes-sigs/cluster-api-provider-azure/issues/1860). At the end of a test run, the secrets and the Domain Controller VM are removed unless `SKIP_CLEANUP` is set to `true`.
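
As a rough sketch of how the gMSA path might be enabled for a local run, the snippet below sets the variables that `./scripts/ci-conformance.sh` checks in this change. The config file name is an assumption (the script only requires that `KUBETEST_WINDOWS_CONFIG` contains `windows-serial-slow`), and Azure credentials are assumed to be exported already.

```bash
# Sketch only: the config file name below is hypothetical.
export WINDOWS="true"
# The gMSA setup runs only when this value contains "windows-serial-slow".
export KUBETEST_WINDOWS_CONFIG="upstream-windows-serial-slow.yaml"
# Keep the Key Vault secrets and the Domain Controller VM for debugging.
export SKIP_CLEANUP="true"
# Resource group that holds the "${CI_RG}-gmsa" Key Vault.
export CI_RG="capz-ci"

# Guarded so the snippet is harmless outside a CAPZ checkout.
if [ -x ./scripts/ci-conformance.sh ]; then
  ./scripts/ci-conformance.sh
else
  echo "run this from the root of a cluster-api-provider-azure checkout"
fi
```

Leaving `SKIP_CLEANUP` unset restores the default behavior, where the secrets and the Domain Controller resource group are deleted when the script exits.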

#### Running custom test suites on CAPZ clusters

To run a custom test suite on a CAPZ cluster locally, set `AZURE_CLIENT_ID`, `AZURE_CLIENT_SECRET`, `AZURE_SUBSCRIPTION_ID`, `AZURE_TENANT_ID` and run:
23 changes: 23 additions & 0 deletions scripts/ci-conformance.sh
@@ -92,12 +92,35 @@ AZURE_SSH_PUBLIC_KEY=$(< "${AZURE_SSH_PUBLIC_KEY_FILE}" tr -d '\r\n')
export AZURE_SSH_PUBLIC_KEY

cleanup() {
# clean up GMSA NODE RG
if [[ "${SKIP_CLEANUP:-}" != "true" && -n ${GMSA_ID:-} ]]; then
echo "Cleaning up gMSA resources $GMSA_NODE_RG with keyvault $CI_RG-gmsa"
az keyvault secret list --vault-name "$CI_RG"-gmsa --query "[? contains(name, '${GMSA_ID}')].name" -o tsv | while read -r secret ; do
az keyvault secret delete -n "$secret" --vault-name "$CI_RG"-gmsa
done

az group delete --name "$GMSA_NODE_RG" --no-wait -y --force-deletion-types=Microsoft.Compute/virtualMachines,Microsoft.Compute/virtualMachineScaleSets
fi

"${REPO_ROOT}/hack/log/redact.sh" || true
}

trap cleanup EXIT

if [[ "${WINDOWS}" == "true" ]]; then
if [[ $KUBETEST_WINDOWS_CONFIG =~ "windows-serial-slow" ]]; then
Review comment from @marosset (Contributor), Apr 11, 2022:

Is this bit documented anywhere?
I can imagine someone having a bad day trying to troubleshoot why GMSA tests aren't running correctly somewhere in PROW, only to find this conditional...
Maybe we could at least add a log line like 'Skipping GMSA configuration' if we aren't performing the config, to help debugging?

Reply from the PR author (Contributor Author):

It isn't. We currently run the GMSA tests in the serial/slow jobs. There isn't a strict requirement for this, and in fact the tests are pretty fast; only the initial setup of the cluster and domain is slow.

I used this because I didn't want to introduce yet another environment variable, but maybe it would be better to have it as additional setup? It would make your suggestion and debugging simpler.
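
One way the suggested log line could look is sketched below. `gmsa_enabled` is a hypothetical helper that is not part of this PR; it assumes the existing `KUBETEST_WINDOWS_CONFIG` check stays as the gate and only adds a message when the gMSA setup is skipped.

```bash
# Sketch only: gmsa_enabled is a hypothetical helper, not part of this PR.
# Returns 0 when the kubetest config selects the serial/slow job,
# otherwise logs why the gMSA setup is skipped and returns 1.
gmsa_enabled() {
  case "${KUBETEST_WINDOWS_CONFIG:-}" in
    *windows-serial-slow*)
      return 0
      ;;
    *)
      echo "Skipping GMSA configuration: KUBETEST_WINDOWS_CONFIG='${KUBETEST_WINDOWS_CONFIG:-}' does not contain 'windows-serial-slow'"
      return 1
      ;;
  esac
}

if gmsa_enabled; then
  echo "gMSA domain setup would run here"
fi
```

Keeping the check in one function means the skip message and the condition cannot drift apart if the gate later moves to a dedicated variable.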

export CI_RG="${CI_RG:-capz-ci}"
export GMSA_ID="${RANDOM}"
export GMSA_NODE_RG="gmsa-dc-${GMSA_ID}"

echo "setting up domain vm in $GMSA_NODE_RG with keyvault $CI_RG-gmsa"
"${REPO_ROOT}/scripts/gmsa/ci-gmsa.sh"

# export the IP address so it can be used in the e2e test
vmname="dc-${GMSA_ID}"
vmip=$(az vm list-ip-addresses -n "${vmname}" -g "${GMSA_NODE_RG}" --query "[?virtualMachine.name=='${vmname}'].virtualMachine.network.privateIpAddresses" -o tsv)
export GMSA_DNS_IP="${vmip}"
fi
make test-windows-upstream
else
make test-conformance
211 changes: 211 additions & 0 deletions scripts/gmsa/ci-gmsa.sh
@@ -0,0 +1,211 @@
#!/bin/bash

# Copyright 2022 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

###############################################################################

set -o errexit
set -o nounset
set -o pipefail

REPO_ROOT=$(dirname "${BASH_SOURCE[0]}")/../..
cd "${REPO_ROOT}" || exit 1

# shellcheck source=hack/ensure-azcli.sh
source "${REPO_ROOT}/hack/ensure-azcli.sh"
# shellcheck source=hack/parse-prow-creds.sh
source "${REPO_ROOT}/hack/parse-prow-creds.sh"
# shellcheck source=hack/ensure-tags.sh
source "${REPO_ROOT}/hack/ensure-tags.sh"

ENVSUBST="${REPO_ROOT}/hack/tools/bin/envsubst"
cd "${REPO_ROOT}" && make "${ENVSUBST##*/}"

CI_RG="${CI_RG:-capz-ci}"
GMSA_NODE_RG="${GMSA_NODE_RG:-gmsa-dc}"
AZURE_LOCATION="${AZURE_LOCATION:-westus2}"
GMSA_KEYVAULT="${CI_RG}-gmsa"

# The VM requires setup that needs Role Assignment permissions
# This script checks that all that has been configured properly before creating the Azure VM
main() {
if [[ "$(az group exists --name "${CI_RG}")" == "false" ]]; then
echo "Requires pre-requisite that resource group ${CI_RG} exists"
exit 1
fi

keyvaultid=$(az keyvault show --name "${GMSA_KEYVAULT}" -g "$CI_RG" --query "id" || true)
if [[ -z $keyvaultid ]]; then
echo "Requires pre-requisite that keyvault ${GMSA_KEYVAULT} exists"
exit 1
fi

# Give permissions to write to keyvault during the domain creation to create secrets that will be used during test
domainPricipalId=$(az identity show --name domain-vm-identity --resource-group "$CI_RG" --query 'principalId' -o tsv || true)
domainId=$(az identity show --name domain-vm-identity --resource-group "$CI_RG" --query 'id' -o tsv || true)
if [[ -z $domainPricipalId ]]; then
echo "Requires pre-requisite that user identity 'domain-vm-identity' exists"
exit 1
fi

# the PowerShell cmdlet Get-AzUserAssignedIdentity needs to read the subscription ID, which is granted via this custom role
# see setup-gmsa.sh for the custom role creation
customSubRole=$(az role assignment list --assignee "$domainPricipalId" --query "[?roleDefinitionName=='gMSA']" || true)
if [[ $customSubRole == "[]" ]]; then
echo "The domain-vm-identity must have custom role 'gMSA'"
exit 1
fi

# this identity needs to be assigned to the Worker nodes that are labeled during e2e setup.
userId=$(az identity show --name gmsa-user-identity --resource-group "$CI_RG" --query 'principalId' -o tsv || true)
if [[ -z $userId ]]; then
echo "Requires pre-requisite that user identity 'gmsa-user-identity' exists"
exit 1
fi

echo "Pre-reqs are met for creating Domain vm"
# the custom-data contains scripts to
# - turn this VM into a domain controller
# - create a domain admin and gMSA users
# - upload secrets to the Key Vault
# - create a gMSA YAML spec at c:\gmsa\gmsa-cred-spec-gmsa-e2e.yml
# the VM is created in a vnet that doesn't overlap with the default CAPZ cluster vnets
if [[ "$(az group exists --name "${GMSA_NODE_RG}")" == "false" ]]; then
az group create --name "$GMSA_NODE_RG" --location "$AZURE_LOCATION" --tags creationTimestamp="$TIMESTAMP"
fi

# this is a random temporary password which gets replaced by cloudbase-init
winpass=$(openssl rand -base64 32)
vmname="dc-${GMSA_ID}"
vmid=$(az vm show -n "$vmname" -g "$GMSA_NODE_RG" --query "id" || true)
if [[ -z $vmid ]]; then
echo "Creating Domain vm"
GMSA_DOMAIN_ENVSUBST="${REPO_ROOT}/scripts/gmsa/domain.init"
GMSA_DOMAIN_FILE="${REPO_ROOT}/scripts/gmsa/domain.init.tmpl"
$ENVSUBST < "$GMSA_DOMAIN_FILE" > "$GMSA_DOMAIN_ENVSUBST"
az vm create -l "$AZURE_LOCATION" -g "$GMSA_NODE_RG" -n "$vmname" \
Review comment from a Contributor:

Did you consider using the Go SDK to run these prerequisites directly in the test suite instead of using the az CLI in a script? Similar to what we do for the private cluster custom vnet setup.

Reply from the PR author (Contributor Author):

I did. Keeping the domain creation outside the test suite means it can be used across test entry-point scripts, or by someone outside the project for their own testing. It still requires a few additional steps before the tests can run, so maybe bringing it into the test suite would be fine now.

The other aspect is that testing the domain creation was cumbersome, and having it outside the test suite made it easier to iterate on without having to modify the rest of the tests to be skipped while working on it.

I listed a bunch of ideas in #1860 (comment)
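
To illustrate the reuse described above, here is a minimal sketch of the variables someone might set before calling `scripts/gmsa/ci-gmsa.sh` on its own. The values mirror what `ci-conformance.sh` exports in this PR; running it this way is an assumption, and it still depends on the one-time setup (Key Vault, identities, custom role) already existing in `CI_RG`.

```bash
# Sketch only: mirrors the exports in ci-conformance.sh from this PR.
export CI_RG="${CI_RG:-capz-ci}"
# Fall back to the shell PID where $RANDOM is unavailable.
export GMSA_ID="${GMSA_ID:-${RANDOM:-$$}}"
export GMSA_NODE_RG="gmsa-dc-${GMSA_ID}"

echo "domain VM dc-${GMSA_ID} in ${GMSA_NODE_RG}, Key Vault ${CI_RG}-gmsa"

# Uncomment inside a CAPZ checkout with the az CLI logged in:
# ./scripts/gmsa/ci-gmsa.sh
```

Setting `GMSA_BASTION=true` as well would additionally create the bastion and its NSG rules, per the script below.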

--image cncf-upstream:capi-windows:k8s-1dot23dot5-windows-2019-containerd:2022.03.30 \
--admin-user 'azureuser' \
--admin-password "$winpass" \
--custom-data "${GMSA_DOMAIN_ENVSUBST}" \
--assign-identity "$domainId" \
--public-ip-address "" \
--subnet-address-prefix 172.16.0.0/24 \
--vnet-address-prefix 172.16.0.0/16 \
--vnet-name "${vmname}-vnet" \
--nsg "${vmname}-nsg" \
--size Standard_D4s_v3
fi

bastionId=$(az network bastion show -n gmsa-bastion -g "$GMSA_NODE_RG" --query "id" || true)
if [[ -z $bastionId && ${GMSA_BASTION:-} == "true" ]]; then
echo "Create bastion for Domain vm"
# Required inbound rules for AzureBastionSubnet
# https://docs.microsoft.com/en-us/azure/bastion/bastion-nsg
az network nsg rule create -g "$GMSA_NODE_RG" \
-n Allow-HttpsInbound \
--access allow \
--destination-address-prefix '*' \
--destination-port-range 443 \
--direction inbound \
--nsg-name "${vmname}-nsg" \
--protocol tcp \
--source-address-prefix Internet \
--source-port-range '*' \
--priority 120

az network nsg rule create -g "$GMSA_NODE_RG" \
-n Allow-GatewayManagerInbound \
--access allow \
--destination-address-prefix '*' \
--destination-port-range 443 \
--direction inbound \
--nsg-name "${vmname}-nsg" \
--protocol tcp \
--source-address-prefix GatewayManager \
--source-port-range '*' \
--priority 130

az network nsg rule create -g "$GMSA_NODE_RG" \
-n Allow-BastionHostCommunication \
--access allow \
--destination-address-prefix VirtualNetwork \
--destination-port-range 8080 5701 \
--direction inbound \
--nsg-name "${vmname}-nsg" \
--protocol '*' \
--source-address-prefix VirtualNetwork \
--source-port-range '*' \
--priority 150

# Required outbound rules for AzureBastionSubnet
# https://docs.microsoft.com/en-us/azure/bastion/bastion-nsg
az network nsg rule create -g "$GMSA_NODE_RG" \
-n Allow-SshRdpOutbound \
--access allow \
--destination-address-prefix VirtualNetwork \
--destination-port-range 22 3389 \
--direction outbound \
--nsg-name "${vmname}-nsg" \
--protocol '*' \
--source-address-prefix '*' \
--source-port-range '*' \
--priority 100

az network nsg rule create -g "$GMSA_NODE_RG" \
-n Allow-AzureCloudoutbound \
--access allow \
--destination-address-prefix AzureCloud \
--destination-port-range 443 \
--direction outbound \
--nsg-name "${vmname}-nsg" \
--protocol 'tcp' \
--source-address-prefix '*' \
--source-port-range '*' \
--priority 150

az network nsg rule create -g "$GMSA_NODE_RG" \
-n Allow-BastionCommunication \
--access allow \
--destination-address-prefix VirtualNetwork \
--destination-port-range 8080 5701 \
--direction outbound \
--nsg-name "${vmname}-nsg" \
--protocol '*' \
--source-address-prefix VirtualNetwork \
--source-port-range '*' \
--priority 120

az network nsg rule create -g "$GMSA_NODE_RG" \
-n Allow-GetSessionInformation \
--access allow \
--destination-address-prefix Internet \
--destination-port-range 80 \
--direction outbound \
--nsg-name "${vmname}-nsg" \
--protocol '*' \
--source-address-prefix '*' \
--source-port-range '*' \
--priority 130

az network vnet subnet create -g "$GMSA_NODE_RG" --vnet-name "${vmname}-vnet" -n AzureBastionSubnet \
--address-prefixes 172.16.1.0/24 --network-security-group "${vmname}-nsg"

az network public-ip create --resource-group "$GMSA_NODE_RG" --name bastion-gmsa --sku Standard
az network bastion create --name gmsa-bastion --public-ip-address bastion-gmsa --resource-group "$GMSA_NODE_RG" --vnet-name "${vmname}-vnet"
fi
}

main