CAPV: Release v29.0.0. #1459

njuettner · 2024-10-22T08:38:05Z

Towards: giantswarm/roadmap#3710

Checklist

Roadmap issue created
Release uses latest stable Flatcar
Release uses latest Kubernetes patch version

Triggering E2E tests

To trigger the E2E test for each new Release added in this PR, add a comment with the following:

/run releases-test-suites

If you want to trigger conformance tests, you can do so by adding a comment similar to the following:

/run conformance-tests PROVIDER=capa RELEASE_VERSION=29.1.0

For more details see the README.md.

vsphere/v29.0.0/release.yaml

vsphere/v29.0.0/release.diff

vsphere/v29.0.0/README.md

TheoBrigitte

Can we have observability-bundle 1.7.0 as part of this release?

Gacko · 2024-10-24T22:27:17Z

@vxav: Tests are failing because the image hasn't been copied to vSphere yet. Can you make sure the image for Flatcar 3975.2.2, Kubernetes 1.29.10 and OS Tooling 1.20.1 is present? Thank you!

njuettner · 2024-10-25T08:05:53Z

> @vxav: Tests are failing because the image hasn't been copied to vSphere yet. Can you make sure the image for Flatcar 3975.2.2, Kubernetes 1.29.10 and OS Tooling 1.20.1 is present? Thank you!

Copied it 👍🏻

Gacko · 2024-10-31T15:20:31Z

The same observability-bundle version used in this PR is also being used in already released WC releases. They worked perfectly fine when they got released and we didn't have any issues with tests.

Now it seems like you recently introduced a change to the logging-operator which obviously affects the observability-bundle in existing WC releases and now needs to be fixed to make these releases work again.

As a customer I'd expect these releases to be tested and working, so cluster creation shouldn't break out of nowhere.

Can you please elaborate on what has been changed in logging-operator and how this affects existing WC releases? I'd expect these releases to be stable and immutable, and implementing changes in an MC operator that change the behavior of observability-bundle in existing releases definitely breaks this contract.

Gacko · 2024-10-31T18:09:13Z

/run releases-test-suites TARGET_SUITES=./providers/capv/standard PREVIOUS_RELEASE=28.0.1 TARGET_RELEASES=vsphere-29.0.0

tinkerers-ci · 2024-10-31T18:39:06Z

releases-test-suites

Run name	`pr-releases-1459-releases-test-suites9rnx2`
Commit SHA	`a9620c4`
Result	Succeeded ✅

📋 View full results in Tekton Dashboard

Rerun trigger:
/run releases-test-suites

Tip

To only re-run the failed test suites you can provide a TARGET_SUITES parameter with your trigger that points to the directory path of the test suites to run, e.g. /run releases-test-suites TARGET_SUITES=./providers/capa/standard to re-run the CAPA standard test suite. This supports multiple test suites with each path separated by a comma.

Alternatively, or in addition, you can specify TARGET_RELEASES to trigger tests for specific releases, e.g. /run releases-test-suites TARGET_SUITES=./providers/capa/standard TARGET_RELEASES=aws-25.0.0-test.1
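
The trigger comments described in the tip above can also be assembled programmatically. The following is a minimal sketch in Python and not part of the CI tooling: the helper name `build_trigger_comment` and its arguments are hypothetical. Only the `/run releases-test-suites` command, the `TARGET_SUITES`, `TARGET_RELEASES` and `PREVIOUS_RELEASE` parameters, and the comma separator for multiple suites are taken from this thread.

```python
from typing import Optional, Sequence


def build_trigger_comment(
    target_suites: Sequence[str] = (),
    target_releases: Optional[str] = None,
    previous_release: Optional[str] = None,
) -> str:
    """Assemble a `/run releases-test-suites` PR comment (hypothetical helper)."""
    parts = ["/run", "releases-test-suites"]
    if target_suites:
        # The tip states that multiple suite paths are separated by a comma.
        parts.append("TARGET_SUITES=" + ",".join(target_suites))
    if previous_release:
        # Used in this thread for the upgrade suite, e.g. PREVIOUS_RELEASE=28.0.1.
        parts.append("PREVIOUS_RELEASE=" + previous_release)
    if target_releases:
        parts.append("TARGET_RELEASES=" + target_releases)
    return " ".join(parts)


if __name__ == "__main__":
    # Reproduces the shape of the upgrade-test trigger used later in this PR.
    print(build_trigger_comment(
        target_suites=["./providers/capv/upgrade"],
        previous_release="28.0.1",
        target_releases="vsphere-29.0.0",
    ))
```

Called without arguments, the helper returns the plain rerun trigger, which runs the tests for each release added in the PR.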

njuettner · 2024-11-01T09:44:00Z

@giantswarm/team-rocket I think we can finally move on. Could you take another look?

Gacko · 2024-11-01T13:21:18Z

/run releases-test-suites TARGET_SUITES=./providers/capv/upgrade PREVIOUS_RELEASE=28.0.1 TARGET_RELEASES=vsphere-29.0.0

tinkerers-ci · 2024-11-01T13:58:16Z

releases-test-suites

Run name	`pr-releases-1459-releases-test-suitesr6lqz`
Commit SHA	`a9620c4`
Result	Failed ❌

njuettner · 2024-11-01T15:14:18Z

/run releases-test-suites TARGET_SUITES=./providers/capv/upgrade PREVIOUS_RELEASE=28.0.1 TARGET_RELEASES=vsphere-29.0.0

tinkerers-ci · 2024-11-01T15:54:19Z

releases-test-suites

Run name	`pr-releases-1459-releases-test-suites2cwrg`
Commit SHA	`a9620c4`
Result	Failed ❌

QuentinBisson · 2024-11-02T14:31:56Z

/run releases-test-suites TARGET_SUITES=./providers/capv/upgrade PREVIOUS_RELEASE=28.0.1 TARGET_RELEASES=vsphere-29.0.0

tinkerers-ci · 2024-11-02T15:11:42Z

releases-test-suites

Run name	`pr-releases-1459-releases-test-suitesg68nr`
Commit SHA	`a9620c4`
Result	Failed ❌

QuentinBisson · 2024-11-02T15:19:18Z

/run releases-test-suites TARGET_SUITES=./providers/capv/upgrade PREVIOUS_RELEASE=28.0.1 TARGET_RELEASES=vsphere-29.0.0

Gacko · 2024-11-02T15:43:04Z

@QuentinBisson or someone else from @giantswarm/team-atlas: Can you please reply to this? It would be very helpful, even if only for documentation. Also I'd be interested in what has changed between the different runs of Releases Test Suites as I'd prefer to see them reliably fixed instead of having them pass once out of ten. 🙂

tinkerers-ci · 2024-11-02T15:55:40Z

releases-test-suites

Run name	`pr-releases-1459-releases-test-suitesq2gvn`
Commit SHA	`a9620c4`
Result	Succeeded ✅

QuentinBisson · 2024-11-02T15:56:00Z

@Gacko I'll write something on Monday. I wanted to focus on fixing this first, but I did not forget your message :)

Gacko · 2024-11-02T16:05:57Z

Ok, thank you!

I'll run the standard tests one last time and merge this PR once they pass.

/run releases-test-suites TARGET_SUITES=./providers/capv/standard TARGET_RELEASES=vsphere-29.0.0

QuentinBisson · 2024-11-02T16:24:47Z

There's currently a fixed branch of the logging operator on gcapeverde so tests should work 🤞🏻

QuentinBisson · 2024-11-02T16:25:09Z

I was going to run them again anyway 😅

tinkerers-ci · 2024-11-02T16:35:23Z

releases-test-suites

Run name	`pr-releases-1459-releases-test-suitesp5rrg`
Commit SHA	`a9620c4`
Result	Succeeded ✅

QuentinBisson · 2024-11-04T09:56:15Z

> The same observability-bundle version used in this PR is also being used in already released WC releases. They worked perfectly fine when they got released and we didn't have any issues with tests.
>
> Now it seems like you recently introduced a change to the logging-operator which obviously affects the observability-bundle in existing WC releases and now needs to be fixed to make these releases work again.
>
> As a customer I'd expect these releases to be tested and working, so cluster creation shouldn't break out of nowhere.
>
> Can you please elaborate on what has been changed in logging-operator and how this affects existing WC releases? I'd expect these releases to be stable and immutable, and implementing changes in an MC operator that change the behavior of observability-bundle in existing releases definitely breaks this contract.

We are indeed configuring the observability-platform apps in releases via operators. We initially built the logging and observability operators as a safety mechanism that lets us change some of our apps' config on the fly (for prometheus-agent and so on). This was meant to counteract the lack or slowness of customer upgrades in the past: we were swarmed with day and night alerts, which was insufferable, and waiting for a customer to upgrade was a no-go.

We used this mechanism to build some features on our apps as well like:

- sharding of the monitoring agent based on the metrics on prometheus and Mimir (because the alternative would require someone to manage KEDA on WCs, and no one wants to)
- secret management
- dynamic enabling/disabling of logging and monitoring at the cluster level

This worked quite well in the past, but we recently enabled a feature flag on the logging-operator that replaces promtail with alloy in the observability platform (giantswarm/logging-operator#246), which caused issues last week.

This change had been manually tested in the past, but the testing missed that the grafana-agent application was failing to deploy (because of an issue with its CRD management in our currently deployed release of it), which broke the cluster creation test.

In the meantime, a breaking configuration change in the alloy secret management was introduced in the observability-bundle, and the change was not properly reflected in the logging-operator. This caused the upgrade test to fail: the secret for alloy was not created, so alloy in CAPV 28 did not actually deploy :(

Last week, this became problematic and we are definitely sorry about all of this :(. I am opening a PM today so we can investigate how to move forward with less operator work going on under the hood (we need some config coming from MCs, like the secret to talk to loki, but not as much as we have today). Our priority should be to find something that does not break any existing releases.
Would you be up for a discussion to find out how we could integrate better?

By the way, we've been having discussions about this topic for years now and I really thought everyone was aware of it. We really need to find a better way to move forward (cc @JosephSalisbury), and that will require improvements to the release and delivery process as well :)

Gacko changed the title ~~Release: CAPV v29.0.0.~~ CAPV: Release v29.0.0. Oct 22, 2024

Gacko force-pushed the capv-29 branch 2 times, most recently from 583767c to d5c9eba, on October 22, 2024 16:44

giantswarm deleted a comment from tinkerers-ci bot Oct 22, 2024

Gacko marked this pull request as ready for review October 22, 2024 18:15

Gacko requested a review from a team as a code owner October 22, 2024 18:15

giantswarm deleted a comment from tinkerers-ci bot Oct 22, 2024

giantswarm deleted a comment from tityosbot Oct 22, 2024

Gacko approved these changes Oct 22, 2024

vxav reviewed Oct 23, 2024

vsphere/v29.0.0/release.yaml (outdated, resolved)

vxav reviewed Oct 23, 2024

vsphere/v29.0.0/release.diff (outdated, resolved)

vsphere/v29.0.0/README.md (outdated, resolved)

Gacko force-pushed the capv-29 branch 2 times, most recently from 070d09e to 2ff2c65, on October 23, 2024 15:33

giantswarm deleted a comment from tinkerers-ci bot Oct 23, 2024

TheoBrigitte reviewed Oct 24, 2024

vsphere/v29.0.0/README.md (resolved)

vsphere/v29.0.0/README.md (resolved)

vsphere/v29.0.0/release.diff (resolved)

vsphere/v29.0.0/release.yaml (resolved)

Gacko force-pushed the capv-29 branch 3 times, most recently from b909de4 to 875a40f, on October 24, 2024 18:19

Gacko force-pushed the capv-29 branch from 875a40f to 664a3f3 on October 24, 2024 20:51

njuettner requested review from a team November 1, 2024 09:43

Gacko approved these changes Nov 2, 2024

CAPV: Release v29.0.0.

00d3c47

Gacko force-pushed the capv-29 branch from a9620c4 to 00d3c47 on November 2, 2024 16:09

Gacko added the skip/ci label (instructs PR Gatekeeper to ignore any required PR checks) Nov 2, 2024

Gacko merged commit 1329100 into master Nov 2, 2024
5 checks passed

Gacko deleted the capv-29 branch November 2, 2024 16:36

QuentinBisson mentioned this pull request Nov 4, 2024

Investigate how to integrate better with releases giantswarm/roadmap#3758

Open

4 tasks
