Make the E2E supporting VKS data mover environment. #8371

Merged
merged 3 commits into from
Nov 15, 2024
19 changes: 18 additions & 1 deletion test/Makefile
@@ -41,25 +41,41 @@ help: ## Display this help

TOOLS_DIR := $(REPO_ROOT)/hack/tools
BIN_DIR := bin

# Try to not modify PATH if possible
GOBIN := $(REPO_ROOT)/.go/bin

TOOLS_BIN_DIR := $(TOOLS_DIR)/$(BIN_DIR)

GINKGO := $(GOBIN)/ginkgo

KUSTOMIZE := $(TOOLS_BIN_DIR)/kustomize

OUTPUT_DIR := _output/$(GOOS)/$(GOARCH)/bin

# Refer to this document for the Ginkgo label spec format.
# https://onsi.github.io/ginkgo/#spec-labels
GINKGO_LABELS ?=

# When --fail-fast is set, the entire suite will stop when the first failure occurs.
# --fail-fast is disabled by default here (FAIL_FAST defaults to false).
# https://onsi.github.io/ginkgo/#mental-model-how-ginkgo-handles-failure
FAIL_FAST ?= false

VELERO_CLI ?=$$(pwd)/../_output/bin/$(GOOS)/$(GOARCH)/velero

VELERO_IMAGE ?= velero/velero:main

PLUGINS ?=

# Flag used to tell E2E whether the Velero vSphere plugin is installed.
HAS_VSPHERE_PLUGIN ?= false

RESTORE_HELPER_IMAGE ?=

#Released version only
UPGRADE_FROM_VELERO_VERSION ?= v1.13.2,v1.14.1

# UPGRADE_FROM_VELERO_CLI can have the same format (a comma-separated list) as UPGRADE_FROM_VELERO_VERSION
# Upgrade tests will be executed sequentially according to the list in UPGRADE_FROM_VELERO_VERSION
# So although length of UPGRADE_FROM_VELERO_CLI list is not equal with UPGRADE_FROM_VELERO_VERSION
@@ -150,7 +166,8 @@ COMMON_ARGS := --velerocli=$(VELERO_CLI) \
--velero-server-debug-mode=$(VELERO_SERVER_DEBUG_MODE) \
--uploader-type=$(UPLOADER_TYPE) \
--debug-velero-pod-restart=$(DEBUG_VELERO_POD_RESTART) \
--fail-fast=$(FAIL_FAST)
--fail-fast=$(FAIL_FAST) \
--has-vsphere-plugin=$(HAS_VSPHERE_PLUGIN)

# Make sure ginkgo is in $GOBIN
.PHONY:ginkgo
49 changes: 44 additions & 5 deletions test/e2e/README.md
@@ -78,6 +78,7 @@ These configuration parameters are expected as values to the following command l
1. `--standby-cluster-object-store-provider`: Object store provider for standby cluster.
1. `--debug-velero-pod-restart`: A switch for debugging velero pod restart.
1. `--fail-fast`: A switch for failing fast when an error occurs.
1. `--has-vsphere-plugin`: A switch indicating whether the Velero vSphere plugin is installed in the vSphere environment.

These configurations or parameters are used to generate install options for Velero for each test suite.
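
As a rough illustration of how two of these switches could be consumed, here is a minimal sketch using Go's standard `flag` package. The `e2eConfig` struct, the `parseE2EFlags` helper, and both default values are assumptions made for this example; they are not taken from the Velero E2E code, which may register its flags differently.

```go
package main

import (
	"flag"
	"fmt"
)

// e2eConfig is a hypothetical, trimmed-down stand-in for the suite's
// configuration. Only the two switches discussed above are modelled.
type e2eConfig struct {
	FailFast         bool
	HasVspherePlugin bool
}

// parseE2EFlags registers the switches on a private FlagSet and parses args.
// The defaults (false for both) are assumptions for this sketch.
func parseE2EFlags(args []string) (e2eConfig, error) {
	var cfg e2eConfig
	fs := flag.NewFlagSet("e2e", flag.ContinueOnError)
	fs.BoolVar(&cfg.FailFast, "fail-fast", false,
		"Stop the suite at the first failure.")
	fs.BoolVar(&cfg.HasVspherePlugin, "has-vsphere-plugin", false,
		"Whether the Velero vSphere plugin is installed.")
	if err := fs.Parse(args); err != nil {
		return e2eConfig{}, err
	}
	return cfg, nil
}

func main() {
	// Mirrors the shape of the arguments built in COMMON_ARGS.
	cfg, err := parseE2EFlags([]string{"--fail-fast=false", "--has-vsphere-plugin=true"})
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("fail-fast=%t has-vsphere-plugin=%t\n", cfg.FailFast, cfg.HasVspherePlugin)
}
```

Using a dedicated `FlagSet` instead of the global one keeps the sketch easy to test in isolation.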

@@ -129,12 +130,13 @@ Below is a mapping between `make` variables to E2E configuration flags.
1. `INSTALL_VELERO `: `-install-velero`. Optional.
1. `DEBUG_VELERO_POD_RESTART`: `-debug-velero-pod-restart`. Optional.
1. `FAIL_FAST`: `--fail-fast`. Optional.
1. `HAS_VSPHERE_PLUGIN`: `--has-vsphere-plugin`. Optional.



### Examples

Basic examples:
#### Basic examples:

1. Run Velero tests in a kind cluster with AWS (or MinIO) as the storage provider:

@@ -208,7 +210,7 @@ ADDITIONAL_CREDS_FILE=/path/to/azure-creds \
make test-e2e
```

Upgrade examples:
#### Upgrade examples:

1. Run Velero upgrade tests with pre-upgrade version:

@@ -234,7 +236,7 @@ UPGRADE_FROM_VELERO_VERSION=v1.10.2,v1.11.0 \
make test-e2e
```

Migration examples:
#### Migration examples:

1. Migration between 2 cluster of the same provider tests:

@@ -275,7 +277,7 @@ GINKGO_LABELS="Migration" \
make test-e2e
```

## 5. Filtering tests
#### Filtering tests

In release-1.15, Velero bumps the [Ginkgo](https://onsi.github.io/ginkgo/) version to [v2](https://onsi.github.io/ginkgo/MIGRATING_TO_V2).
Velero E2E now uses [labels](https://onsi.github.io/ginkgo/#spec-labels) to filter cases instead of the [`-focus` and `-skip`](https://onsi.github.io/ginkgo/#focused-specs) parameters.
@@ -285,7 +287,6 @@ Both `make run-e2e` and `make run-perf` CLI support using parameter `GINKGO_LABE
`GINKGO_LABELS` is interpreted into `ginkgo run` CLI's parameter [`--label-filter`](https://onsi.github.io/ginkgo/#spec-labels).


### Examples
E2E tests can be run with specific cases to be included and/or excluded using the commands below:

1. Run Velero tests with specific cases to be included:
@@ -316,6 +317,44 @@ In this example, cases are labelled as
* `Migration` and `Restic`
will be skipped.

#### VKS environment test
1. Run the CSI data mover test.

Set `HAS_VSPHERE_PLUGIN` to `false` so that the Velero vSphere plugin is not installed.
``` bash
CLOUD_PROVIDER=vsphere \
DEFAULT_CLUSTER=wl-antreav1301 \
STANDBY_CLUSTER=wl-antreav1311 \
DEFAULT_CLUSTER_NAME=192.168.0.4 \
STANDBY_CLUSTER_NAME=192.168.0.3 \
FEATURES=EnableCSI \
PLUGINS=gcr.io/velero-gcp/velero-plugin-for-aws:main \
HAS_VSPHERE_PLUGIN=false \
OBJECT_STORE_PROVIDER=aws \
CREDS_FILE=$HOME/aws-credential \
BSL_CONFIG=region=us-east-1 \
BSL_BUCKET=nightly-normal-account4-test \
BSL_PREFIX=nightly \
ADDITIONAL_BSL_PLUGINS=gcr.io/velero-gcp/velero-plugin-for-aws:main \
ADDITIONAL_OBJECT_STORE_PROVIDER=aws \
ADDITIONAL_BSL_CONFIG=region=us-east-1 \
ADDITIONAL_BSL_BUCKET=nightly-restrict-account-test \
ADDITIONAL_BSL_PREFIX=nightly \
ADDITIONAL_CREDS_FILE=$HOME/aws-credential \
VELERO_IMAGE=gcr.io/velero-gcp/velero:main \
RESTORE_HELPER_IMAGE=gcr.io/velero-gcp/velero-restore-helper:main \
VERSION=main \
SNAPSHOT_MOVE_DATA=true \
STANDBY_CLUSTER_CLOUD_PROVIDER=vsphere \
STANDBY_CLUSTER_OBJECT_STORE_PROVIDER=aws \
STANDBY_CLUSTER_PLUGINS=gcr.io/velero-gcp/velero-plugin-for-aws:main \
DISABLE_INFORMER_CACHE=true \
REGISTRY_CREDENTIAL_FILE=$HOME/.docker/config.json \
GINKGO_LABELS=Migration \
KIBISHII_DIRECTORY=$HOME/kibishii/kubernetes/yaml/ \
make test-e2e
```

## 6. Full Tests execution

As we provided several examples for E2E test execution, if no filter is involved and despite difference of test environment,
8 changes: 4 additions & 4 deletions test/e2e/backup/backup.go
@@ -197,9 +197,9 @@ func BackupRestoreTest(backupRestoreTestConfig BackupRestoreTestConfig) {
secretKey,
)).To(Succeed())

bsls := []string{"default", additionalBsl}
BSLs := []string{"default", additionalBsl}

for _, bsl := range bsls {
for _, bsl := range BSLs {
backupName = fmt.Sprintf("backup-%s", bsl)
restoreName = fmt.Sprintf("restore-%s", bsl)
// We limit the length of backup name here to avoid the issue of vsphere plugin https://github.com/vmware-tanzu/velero-plugin-for-vsphere/issues/370
@@ -209,8 +209,8 @@ func BackupRestoreTest(backupRestoreTestConfig BackupRestoreTestConfig) {
restoreName = fmt.Sprintf("%s-%s", restoreName, UUIDgen)
}
veleroCfg.ProvideSnapshotsVolumeParam = !provideSnapshotVolumesParmInBackup
workloadNmespace := kibishiiNamespace + bsl
Expect(RunKibishiiTests(veleroCfg, backupName, restoreName, bsl, workloadNmespace, useVolumeSnapshots, !useVolumeSnapshots)).To(Succeed(),
workloadNS := kibishiiNamespace + bsl
Expect(RunKibishiiTests(veleroCfg, backupName, restoreName, bsl, workloadNS, useVolumeSnapshots, !useVolumeSnapshots)).To(Succeed(),
"Failed to successfully backup and restore Kibishii namespace using BSL %s", bsl)
}
})
82 changes: 55 additions & 27 deletions test/e2e/backups/deletion.go
@@ -34,7 +34,7 @@ import (
. "github.com/vmware-tanzu/velero/test/util/velero"
)

// Test backup and restore of Kibishi using restic
// Test backup and restore of Kibishii using restic

func BackupDeletionWithSnapshots() {
backup_deletion_test(true)
@@ -99,8 +99,6 @@ func runBackupDeletionTests(client TestClient, veleroCfg VeleroConfig, backupLoc
providerName := veleroCfg.CloudProvider
veleroNamespace := veleroCfg.VeleroNamespace
registryCredentialFile := veleroCfg.RegistryCredentialFile
bslPrefix := veleroCfg.BSLPrefix
bslConfig := veleroCfg.BSLConfig
veleroFeatures := veleroCfg.Features
for _, ns := range workloadNamespaceList {
if err := CreateNamespace(oneHourTimeout, client, ns); err != nil {
@@ -143,7 +141,8 @@ func runBackupDeletionTests(client TestClient, veleroCfg VeleroConfig, backupLoc
})
})
for _, ns := range workloadNamespaceList {
if providerName == Vsphere && useVolumeSnapshots {
if useVolumeSnapshots &&
veleroCfg.HasVspherePlugin {
// Wait for uploads started by the Velero Plugin for vSphere to complete
// TODO - remove after upload progress monitoring is implemented
fmt.Println("Waiting for vSphere uploads to complete")
@@ -152,7 +151,7 @@ func runBackupDeletionTests(client TestClient, veleroCfg VeleroConfig, backupLoc
}
}
}
err = ObjectsShouldBeInBucket(veleroCfg.ObjectStoreProvider, veleroCfg.CloudCredentialsFile, veleroCfg.BSLBucket, bslPrefix, bslConfig, backupName, BackupObjectsPrefix)
err = ObjectsShouldBeInBucket(veleroCfg.ObjectStoreProvider, veleroCfg.CloudCredentialsFile, veleroCfg.BSLBucket, veleroCfg.BSLPrefix, veleroCfg.BSLConfig, backupName, BackupObjectsPrefix)
if err != nil {
return err
}
@@ -164,9 +163,12 @@ func runBackupDeletionTests(client TestClient, veleroCfg VeleroConfig, backupLoc
for _, ns := range workloadNamespaceList {
snapshotCheckPoint, err = GetSnapshotCheckPoint(client, veleroCfg, DefaultKibishiiWorkerCounts, ns, backupName, KibishiiPVCNameList)
Expect(err).NotTo(HaveOccurred(), "Fail to get Azure CSI snapshot checkpoint")
err = SnapshotsShouldBeCreatedInCloud(veleroCfg.CloudProvider,
veleroCfg.CloudCredentialsFile, veleroCfg.BSLBucket, bslConfig,
backupName, snapshotCheckPoint)
err = CheckSnapshotsInProvider(
veleroCfg,
backupName,
snapshotCheckPoint,
false,
)
if err != nil {
return errors.Wrap(err, "exceed waiting for snapshot created in cloud")
}
@@ -178,9 +180,12 @@ func runBackupDeletionTests(client TestClient, veleroCfg VeleroConfig, backupLoc
Expect(err).NotTo(HaveOccurred(), "Fail to get Azure CSI snapshot checkpoint")

// Get all snapshots base on backup name, regardless of namespaces
err = SnapshotsShouldBeCreatedInCloud(veleroCfg.CloudProvider,
veleroCfg.CloudCredentialsFile, veleroCfg.BSLBucket, bslConfig,
backupName, snapshotCheckPoint)
err = CheckSnapshotsInProvider(
veleroCfg,
backupName,
snapshotCheckPoint,
false,
)
if err != nil {
return errors.Wrap(err, "exceed waiting for snapshot created in cloud")
}
@@ -206,26 +211,34 @@ func runBackupDeletionTests(client TestClient, veleroCfg VeleroConfig, backupLoc
return err
}

// Verify snapshots are deleted after backup deletion.
if useVolumeSnapshots {
err = SnapshotsShouldNotExistInCloud(veleroCfg.CloudProvider,
veleroCfg.CloudCredentialsFile, veleroCfg.BSLBucket, veleroCfg.BSLConfig,
backupName, snapshotCheckPoint)
snapshotCheckPoint.ExpectCount = 0
err = CheckSnapshotsInProvider(
veleroCfg,
backupName,
snapshotCheckPoint,
false,
)
if err != nil {
return errors.Wrap(err, "exceed waiting for snapshot created in cloud")
return errors.Wrap(err, "fail to verify snapshots are deleted in provider.")
}
}

err = ObjectsShouldNotBeInBucket(veleroCfg.ObjectStoreProvider, veleroCfg.CloudCredentialsFile, veleroCfg.BSLBucket, bslPrefix, bslConfig, backupName, BackupObjectsPrefix, 5)
// Verify backup metadata files are deleted in OSS after backup deletion.
err = ObjectsShouldNotBeInBucket(
veleroCfg.ObjectStoreProvider,
veleroCfg.CloudCredentialsFile,
veleroCfg.BSLBucket,
veleroCfg.BSLPrefix,
veleroCfg.BSLConfig,
backupName,
BackupObjectsPrefix,
5,
)
if err != nil {
return err
}
if useVolumeSnapshots {
if err := SnapshotsShouldNotExistInCloud(veleroCfg.CloudProvider,
veleroCfg.CloudCredentialsFile, veleroCfg.BSLBucket,
bslConfig, backupName, snapshotCheckPoint); err != nil {
return errors.Wrap(err, "exceed waiting for snapshot created in cloud")
}
}

// Hit issue: https://docs.aws.amazon.com/AWSEC2/latest/APIReference/errors-overview.html#:~:text=SnapshotCreationPerVolumeRateExceeded
// Sleep for more than 15 seconds to avoid this issue.
@@ -242,13 +255,28 @@ func runBackupDeletionTests(client TestClient, veleroCfg VeleroConfig, backupLoc
})
})

err = DeleteObjectsInBucket(veleroCfg.ObjectStoreProvider, veleroCfg.CloudCredentialsFile, veleroCfg.BSLBucket, bslPrefix, bslConfig, backupName, BackupObjectsPrefix)
if err != nil {
if err := DeleteObjectsInBucket(
veleroCfg.ObjectStoreProvider,
veleroCfg.CloudCredentialsFile,
veleroCfg.BSLBucket,
veleroCfg.BSLPrefix,
veleroCfg.BSLConfig,
backupName,
BackupObjectsPrefix,
); err != nil {
return err
}

err = ObjectsShouldNotBeInBucket(veleroCfg.ObjectStoreProvider, veleroCfg.CloudCredentialsFile, veleroCfg.BSLBucket, bslPrefix, bslConfig, backupName, BackupObjectsPrefix, 1)
if err != nil {
if err := ObjectsShouldNotBeInBucket(
veleroCfg.ObjectStoreProvider,
veleroCfg.CloudCredentialsFile,
veleroCfg.BSLBucket,
veleroCfg.BSLPrefix,
veleroCfg.BSLConfig,
backupName,
BackupObjectsPrefix,
1,
); err != nil {
return err
}

36 changes: 26 additions & 10 deletions test/e2e/backups/ttl.go
@@ -122,19 +122,31 @@ func TTLTest() {

var snapshotCheckPoint SnapshotCheckPoint
if useVolumeSnapshots {
if veleroCfg.CloudProvider == Vsphere {
// TODO - remove after upload progress monitoring is implemented
if veleroCfg.HasVspherePlugin {
By("Waiting for vSphere uploads to complete", func() {
Expect(WaitForVSphereUploadCompletion(ctx, time.Hour,
test.testNS, 2)).To(Succeed())
})
}
snapshotCheckPoint, err = GetSnapshotCheckPoint(client, veleroCfg, 2, test.testNS, test.backupName, KibishiiPVCNameList)
Expect(err).NotTo(HaveOccurred(), "Fail to get Azure CSI snapshot checkpoint")

Expect(SnapshotsShouldBeCreatedInCloud(veleroCfg.CloudProvider,
veleroCfg.CloudCredentialsFile, veleroCfg.BSLBucket, veleroCfg.BSLConfig,
test.backupName, snapshotCheckPoint)).NotTo(HaveOccurred(), "Fail to get Azure CSI snapshot checkpoint")
snapshotCheckPoint, err = GetSnapshotCheckPoint(
client,
veleroCfg,
2,
test.testNS,
test.backupName,
KibishiiPVCNameList,
)
Expect(err).NotTo(HaveOccurred(), "Fail to get snapshot checkpoint")

Expect(
CheckSnapshotsInProvider(
veleroCfg,
test.backupName,
snapshotCheckPoint,
false,
),
).NotTo(HaveOccurred(), "Fail to verify the created snapshots")
}

By(fmt.Sprintf("Simulating a disaster by removing namespace %s\n", BackupCfg.BackupName), func() {
@@ -188,9 +200,13 @@ func TTLTest() {

By("PersistentVolume snapshots should be deleted", func() {
if useVolumeSnapshots {
Expect(SnapshotsShouldNotExistInCloud(veleroCfg.CloudProvider,
veleroCfg.CloudCredentialsFile, veleroCfg.BSLBucket, veleroCfg.BSLConfig,
test.backupName, snapshotCheckPoint)).NotTo(HaveOccurred(), "Fail to get Azure CSI snapshot checkpoint")
snapshotCheckPoint.ExpectCount = 0
Expect(CheckSnapshotsInProvider(
veleroCfg,
test.backupName,
snapshotCheckPoint,
false,
)).NotTo(HaveOccurred(), "Fail to get Azure CSI snapshot checkpoint")
}
})
