Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OCPCLOUD-2733: rebase on upstream 1.31.0 release #319

Merged
merged 447 commits into from
Oct 22, 2024

Conversation

elmiko
Copy link

@elmiko elmiko commented Oct 7, 2024

This commit rebases the autoscaler on top of the Kubernetes/Autoscaler 1.31.0 release. There are several commits that we carry on top of the upstream autoscaler and the rebase process allows us to preserve those. Here is a description of the process I used to create this PR.

(inspired by the commit description for the 1.18 rebase. pr #139)

Process

First we need to identify the carry commits that we currently have, this is done against our previous rebase to catch new changes. Once identified we will drop commits which have merged upstream and only carry unique commits. (see below for the carried and dropped commits).

Identify carry commits (run from the openshift/master branch), these are the commits that begin with UPSTREAM: up until the merge commit for the previous rebase commit (merge upstream/cluster-autoscaler-1.30.1)

git log -n 30 --oneline --no-merges

After identifying the carry commits, the next step is to create the new commit-tree that will be used for the rebase and then cherry pick the carry commits into the new branch. The following commands cover these steps:

$ git remote update # make sure we update our refs
$ git checkout cluster-autoscaler-1.31.0
$ git checkout -b merge-tmp # create a temporary branch for our merge commit
$ git checkout openshift/master # we want to be at the tip of the openshift master branch when we run the next command
$ echo 'merge upstream/cluster-autoscaler-1.31.0' | git commit-tree merge-tmp^{tree} -p HEAD -p merge-tmp -F - # create a new merge commit for our history
deadbeef12345678 # id of new merge commit
$ git branch merge-1.31 deadbeef12345678 # create a new branch for the cherry-pick work
$ git checkout merge-1.31
$ git cherry-pick <carry commits> # cherry pick the needed commits into the new branch

With the merge-1.31 branch in place, I cherry picked the carry commits which applied, resolved merge conflicts, and finally tested the resulting tree against the unit test and end-to-end suite.

Carried Commits

These commits are for features which have not yet been accepted upstream, are integral to our CI platform, or are specific to the releases we create for OpenShift.

e4674574d UPSTREAM: <carry>: Fix unstructured taint parsing in Cluster API provider
1e019705a UPSTREAM: <carry>: revert capacity annotations
d76e8cbba UPSTREAM: <carry>: add machine api label and taint functionality
2adad66c5 UPSTREAM: <carry>: Have VPA ignore phantom containers named "POD"
7e6ff29ac UPSTREAM: <carry>: Handle old Machine API specific machine delete annotation
79d8030d3 UPSTREAM: <carry>: Rename FailureMessage to ErrorMessage
7e97b6851 UPSTREAM: <carry>: configure repository for OpenShift releases

Squashed Commits

These commits were squashed into the carried commits to help reduce the length of our history. All these commits have been squashed into their topically related commits.

10db73691 UPSTREAM: <carry>: Updating atomic-openshift-cluster-autoscaler-container image to be consistent with ART for 4.18 Reconciling with https://github.com/openshift/ocp-build-data/tree/1c7e6e27058c4590e342ecabe8bc96feeb68e7e5/images/atomic-openshift-cluster-autoscaler.yml
72c134510 UPSTREAM: <carry>: Updating ose-vertical-pod-autoscaler-container image to be consistent with ART for 4.17 Reconciling with https://github.com/openshift/ocp-build-data/tree/39d776b80aa929c42b98e8c50c25162b97b14e06/images/vertical-pod-autoscaler.yml
e23c8d462 UPSTREAM: <carry>: Update VPA ocp/builder images to openshift/release images
9017412fb UPSTREAM: <carry>: Sync OWNERS file
994400fcd UPSTREAM: <carry>: update os::util::list_test_packages_under
4d9aeca00 UPSTREAM: <carry>: remove deprecated go test command

Dropped Commits

These commits were dropped.

62a8e8c24 UPSTREAM: <drop>: Update vpa deployments to 1.1.2
7ee3a7a14 UPSTREAM: <drop>: Bump vpa to 1.1.2
b2d2b7d81 UPSTREAM: <drop>: add unit test for issue 6808
bc091b9c7 UPSTREAM: <drop>: Fix 6808
a0390f8c3 UPSTREAM: <drop>: Bump vpa version to 1.1.1
fdbf883fd UPSTREAM: <drop>: Bump VPA version to 1.1.1

Of special note in this rebase is this commit

1e019705a UPSTREAM: <carry>: revert capacity annotations

due to the scale from zero changes being accepted upstream we can now drop our carried patch. but, the upstream implementation has differed slightly from our's (mainly around annotation names). we will need to carry this patch until we can fix all the providers to properly use the new annotations. This patch can be dropped once the epic contained in https://issues.redhat.com/browse/OCPCLOUD-2136 is completed.

k8s-ci-robot and others added 30 commits June 21, 2024 04:52
…viders

Remove deprecated k8s.io/legacy-cloud-providers
vpa-updater: Log the Pod namespace when evicting a Pod
…dification

Modify logic to look for ocid type prefix in OCI cloud provider builder
Add initial setup for cluster-autoscaler Azure e2e tests
…ample-manifests

Update example manifests with the latest RBAC permissions.
Add ability to the VPA admission-controller to reload it's certificate v2
Remove vendor directory, readded by mistake
…s-approver-no-more

CA - Restore aleksandra-malinowska as approver
Fix/aws asg unsafe decommission 5829
feat(clusterapi): per nodeGroup autoscaling options
…troller-logging

vpa-admission-controller: Log object's namespace
Use autoscaling.x-k8s.io rather than cluster-autoscaler.kubernetes.io
Update Azure cluster-autoscaler e2e cluster template
…ogging

vpa-recommender: Log object's namespace
@elmiko
Copy link
Author

elmiko commented Oct 14, 2024

added an upstream fix for the goimports, i have a feeling the deletion taint tests are still going to give us issues.

@elmiko
Copy link
Author

elmiko commented Oct 14, 2024

i'll add a few more ignores to the snyk config

elmiko and others added 8 commits October 14, 2024 12:32
This change carries files and modifications that are used by OpenShift
release infrastructure and related files.

* spec file
* dockerfiles
  * vertical-pod-autoscaler/Dockerfile.rhel
  * vertical-pod-autoscaler/Dockerfile.openshift
  * images/cluster-autoscaler/Dockerfile
  * images/cluster-autoscaler/Dockerfile.rhel
* hack scripts (ci and build related)
* Makefile
* JUnit tools
* update gitignore
* update/remove OWNERS files
* ci-operator config yaml
* add vendor folders for cluster-autoscaler and balancer
* add Snyk file to exclude vendor directories and problematic cloud
  providers on scan
…otation

The delete annotation upstream has a different format, but is now
inferred dynamically from the API group. If we update this in MAO to use
the new format, we can drop this old key
This change re-adds the machine api support for labels and taints on
node groups. The code was removed upstream as it is openshift specific,
see this pull request[0].

It also adds in the functionality of the upstream override annotation
for labels and taints[1] to support
https://issues.redhat.com/browse/MIXEDARCH-259

[0]: kubernetes#5249
[1]: kubernetes#5382
the upstream annotations for the scale from zero capacity resources is
slighty different than the openshift implementation. the largest
difference is the addition of a gpu type annotation. openshift does not
yet utilize this annotation and thus this patch should be carried until
the machineset controllers for the various providers on openshift have
been modified to use the new annotations.

another important change is the modification of the memory annotation.
previously in openshift we expected this value to be a count of memory
in Mebibytes. the conversion function and tests have been modified to
allow continued openshift operation.

this change can be dropped when the annotations in openshift have been
updated, the progress for this effort can be followed at
https://issues.redhat.com/browse/OCPCLOUD-944
…ider

This change corrects the behavior for parsing taints from the
unstructured scalable resource. This is required on OpenShift as our
implementation is slightly different from the upstream.
this change fixes the import order so that the goimports tool does not
complain about the ordering.
@JoelSpeed
Copy link

The security issues presented here may warrant a discussion with the VPA team, I can see some about unsanitised input, yet, I suspect the answer will be "we control the input so it's ok"

Otherwise changes here are looking good

Are the periodic issues known?

@elmiko
Copy link
Author

elmiko commented Oct 15, 2024

cc @joelsmith , happy for your input on the security warnings

@elmiko
Copy link
Author

elmiko commented Oct 15, 2024

Are the periodic issues known?

yes, these are not new. they are related to the deletion taint issue we are working on.

@joelsmith
Copy link

I went to the Snyk dashboard and I have told Snyk to ignore the VPA issues as they were all false alarms. There is still one in tools/import-verifier/import-verifier.go which I will leave for the Cloud team to ignore. I'll re-run the test to make sure that the VPA ones are properly ignored.

/test security

@elmiko
Copy link
Author

elmiko commented Oct 16, 2024

thanks Joel!

@elmiko
Copy link
Author

elmiko commented Oct 16, 2024

i did some research and have added an ignore for the import-verifier. it is only used during the build process to inspect the imports on go files, it currently is only used from the verify-imports.sh script, and it has an empty list for input restrictions. which means that currently it does not check anything.

while it is possible that an attacker could get into the supply chain and change the import restrictions, the result of this would most likely be limited as we currently do not have any restrictions. while they might be able to break part of the build process, it seems unlikely that there would be deleterious effects on the code that weren't merged in other patches.

/test security

@elmiko
Copy link
Author

elmiko commented Oct 16, 2024

/test security

1 similar comment
@elmiko
Copy link
Author

elmiko commented Oct 17, 2024

/test security

@elmiko
Copy link
Author

elmiko commented Oct 17, 2024

/retest

Copy link

openshift-ci bot commented Oct 17, 2024

@elmiko: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/git-history a1f5171 link false /test git-history
ci/prow/e2e-aws-periodic-pre a1f5171 link false /test e2e-aws-periodic-pre
ci/prow/e2e-azure-periodic-pre a1f5171 link false /test e2e-azure-periodic-pre

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@elmiko
Copy link
Author

elmiko commented Oct 22, 2024

@JoelSpeed looks like we are just down to the pathologically failing deletion taint tests

@JoelSpeed
Copy link

/lgtm
/approve

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 22, 2024
Copy link

openshift-ci bot commented Oct 22, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: JoelSpeed

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 22, 2024
@openshift-merge-bot openshift-merge-bot bot merged commit b533c8b into openshift:master Oct 22, 2024
12 of 15 checks passed
@openshift-bot
Copy link

[ART PR BUILD NOTIFIER]

Distgit: vertical-pod-autoscaler
This PR has been included in build ose-vertical-pod-autoscaler-container-v4.18.0-202410221510.p0.gb533c8b.assembly.stream.el9.
All builds following this will include this PR.

@openshift-bot
Copy link

[ART PR BUILD NOTIFIER]

Distgit: atomic-openshift-cluster-autoscaler
This PR has been included in build atomic-openshift-cluster-autoscaler-container-v4.18.0-202410221510.p0.gb533c8b.assembly.stream.el9.
All builds following this will include this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.