Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gate: disallow .status.loadBalancer on non-LB svc #119789

Conversation

thockin
Copy link
Member

@thockin thockin commented Aug 7, 2023

Fixes #119700

The fact that the .status.loadBalancer field can be set while .spec.type is not "LoadBalancer" is a flub. Any spec update will already clear .status.ingress, so it's hard to really rely on this. After this change, updates which try to set this combination will fail validation.

Existing cases of this will not be broken. Any spec/metadata update will clear it (no error) and this is the only stanza of status.

New gate "AllowServiceLBStatusOnNonLB" is off by default, but can be enabled if this change actually breaks someone, which seems exceeedingly unlikely.

/kind bug

Setting the `status.loadBalancer` of a Service whose `spec.type` is not `"LoadBalancer"` was previously allowed, but any update to the `metadata` or `spec` would wipe that field. Setting this field is no longer permitted unless `spec.type` is  `"LoadBalancer"`.  In the very unlikely event that this has unexpected impact, you can enable the `AllowServiceLBStatusOnNonLB` feature gate, which will restore the previous behavior.  If you do need to set this, please file an issue with the Kubernetes project to help contributors understand why you need it.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 7, 2023
@k8s-ci-robot
Copy link
Contributor

Please note that we're already in Test Freeze for the release-1.28 branch. This means every merged PR will be automatically fast-forwarded via the periodic ci-fast-forward job to the release branch of the upcoming v1.28.0 release.

Fast forwards are scheduled to happen every 6 hours, whereas the most recent run was: Sun Aug 6 22:30:55 UTC 2023.

@k8s-ci-robot k8s-ci-robot added sig/apps Categorizes an issue or PR as relevant to SIG Apps. approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Aug 7, 2023
@k8s-ci-robot
Copy link
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Aug 7, 2023
@thockin
Copy link
Member Author

thockin commented Aug 7, 2023

I'd like to make an argument for this to go into 1.28, but if it can't make the .0 then we should wait until 29 (and fix comments in this commit).

@kubernetes/release-managers

@pacoxu
Copy link
Member

pacoxu commented Aug 7, 2023

Unit test failed in TestServiceStatusStrategy for this change,
Othere2e failures can be fixed after #119454 is merged.

@sftim
Copy link
Contributor

sftim commented Aug 8, 2023

Changelog suggestion

-Setting the `status.loadBalancer` of a Service whose `spec.type` is not "LoadBalancer" was previously allowed, but any update to the `metadata` or `spec` would wipe that field.  Setting this field is no longer permitted unless `spec.type` is  "LoadBalancer".  In the very unlikely event that this has unexpected impact, the "AllowServiceLBStatusOnNonLB" feature gate may be set to true, which will restore the previous behavior.  If you need to set this, please file an issue with the Kubernetes project to help us understand why.
+Setting the `status.loadBalancer` of a Service whose `spec.type` is not `"LoadBalancer"` was previously allowed, but any update to the `metadata` or `spec` would wipe that field.
+Setting this field is no longer permitted unless `spec.type` is  `"LoadBalancer"`.  In the very unlikely event that this has unexpected impact, you can enable the `AllowServiceLBStatusOnNonLB` feature gate, which will restore the previous behavior.
+If you need to set this, please file an issue with the Kubernetes project to help contributors understand why you need it.

@@ -5441,7 +5441,7 @@ func ValidateServiceUpdate(service, oldService *core.Service) field.ErrorList {
// ValidateServiceStatusUpdate tests if required fields in the Service are set when updating status.
func ValidateServiceStatusUpdate(service, oldService *core.Service) field.ErrorList {
allErrs := ValidateObjectMetaUpdate(&service.ObjectMeta, &oldService.ObjectMeta, field.NewPath("metadata"))
allErrs = append(allErrs, ValidateLoadBalancerStatus(&service.Status.LoadBalancer, field.NewPath("status", "loadBalancer"))...)
allErrs = append(allErrs, ValidateLoadBalancerStatus(&service.Status.LoadBalancer, field.NewPath("status", "loadBalancer"), &service.Spec)...)
Copy link
Member

@aojea aojea Aug 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not check directly here?

Suggested change
allErrs = append(allErrs, ValidateLoadBalancerStatus(&service.Status.LoadBalancer, field.NewPath("status", "loadBalancer"), &service.Spec)...)
if !utilfeature.DefaultFeatureGate.Enabled(features.AllowServiceLBStatusOnNonLB) && spec.Type != core.ServiceTypeLoadBalancer && len(status.Ingress) != 0 {
allErrs = append(allErrs, field.Forbidden(ingrPath, "may only be used when `spec.type` is 'LoadBalancer'"))
} else {
allErrs = append(allErrs, ValidateLoadBalancerStatus(&service.Status.LoadBalancer, field.NewPath("status", "loadBalancer"))...)
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it felt more contained in the status-specific path. Other than avoiding one argument, is there a reason I am not seeing?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to avoid to plumb the spec on the status path

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my point is that if we make service.Spec an argument of ValidateLoadBalancerStatus we may open a door to all kind of validations there , in this case have ValidateLoadBalancerStatus that is can be called from any serviceType, I think that this is our problem, we allow it from any service type.

with my suggesting once we remove the feature gate the code seems easy to read, as ValidateLoadBalancerStatus only will validate the status for types LoadBalancer that is what you expect from the name

pkg/features/kube_features.go Outdated Show resolved Hide resolved
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 11, 2023
@thockin thockin force-pushed the deprecate_svc_lb_ingress_with_clusterip branch from 43af7ca to 0c9c6d8 Compare August 17, 2023 21:37
@thockin
Copy link
Member Author

thockin commented Aug 17, 2023

Rebased. Unit test updated.

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Aug 17, 2023
@thockin
Copy link
Member Author

thockin commented Aug 18, 2023

I fixed the integration test, I think. The rest need the e2e fix in #119454

@k8s-ci-robot k8s-ci-robot added area/test sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Aug 18, 2023
@aojea
Copy link
Member

aojea commented Aug 19, 2023

failed [FAILED] Could not patch service status%!(EXTRA *errors.StatusError=Service "test-service-7pblz" is invalid: status.loadBalancer.ingress: Forbidden: may only be used when spec.type is 'LoadBalancer'): Service "test-service-7pblz" is invalid: status.loadBalancer.ingress: Forbidden: may only be used when spec.type is 'LoadBalancer'
In [It] at: test/e2e/network/service.go:3406 @ 08/18/23 23:19:51.638

proof this fixes the problem

/retest

@@ -45,7 +45,7 @@ func GetEtcdStorageDataForNamespace(namespace string) map[schema.GroupVersionRes
ExpectedEtcdPath: "/registry/configmaps/" + namespace + "/cm1",
},
gvr("", "v1", "services"): {
Stub: `{"metadata": {"name": "service1"}, "spec": {"externalName": "service1name", "ports": [{"port": 10000, "targetPort": 11000}], "selector": {"test": "data"}}}`,
Stub: `{"metadata": {"name": "service1"}, "spec": {"type": "LoadBalancer", "ports": [{"port": 10000, "targetPort": 11000}], "selector": {"test": "data"}}}`,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this change, why do you have WIP in the commit?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a couple tests which interate all of the API types looking for /status subresources. There's a table of inputs for spec and another table for status, which it tries to apply to /status. This test tried to apply status.loadBalancer.ingress while the spec is type: ClusterIP. That fails now.

"WIP" in comment because I was not 100% confident in it.

@thockin
Copy link
Member Author

thockin commented Aug 19, 2023

I don't understand the remaining integration fail so I am rebasing and will see if it pops up again.

@thockin thockin force-pushed the deprecate_svc_lb_ingress_with_clusterip branch 2 times, most recently from 9782e72 to 962495f Compare August 19, 2023 20:23
@thockin
Copy link
Member Author

thockin commented Aug 19, 2023

I think I got it now. These tests are WAY too brittle.

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: thockin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

The fact that the .status.loadBalancer field can be set while .spec.type
is not "LoadBalancer" is a flub.  Any spec update will already clear
.status.ingress, so it's hard to really rely on this.  After this
change, updates which try to set this combination will fail validation.

Existing cases of this will not be broken.  Any spec/metadata update
will clear it (no error) and this is the only stanza of status.

New gate "AllowServiceLBStatusOnNonLB" is off by default, but can be
enabled if this change actually breaks someone, which seems exceeedingly
unlikely.
@thockin thockin force-pushed the deprecate_svc_lb_ingress_with_clusterip branch from 62267a0 to a930892 Compare August 20, 2023 23:41
@thockin
Copy link
Member Author

thockin commented Aug 21, 2023

/retest

@thockin
Copy link
Member Author

thockin commented Aug 21, 2023

is green now!

allErrs = append(allErrs, field.Invalid(idxPath.Child("hostname"), ingress.Hostname, msg))
}
if isIP := (netutils.ParseIPSloppy(ingress.Hostname) != nil); isIP {
allErrs = append(allErrs, field.Invalid(idxPath.Child("hostname"), ingress.Hostname, "must be a DNS name, not an IP address"))
Copy link
Member

@aojea aojea Aug 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

technically I think an IPv4 is a valid dns name, no?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

git blame says this was this way since forever so 🤷

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think DNS says a hostname which is all numeric is not valid, precisely because it is ambiguous.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not why I ask this things, now I had to spend some time going through the RFCs just to know the truth 😄

I found this in https://www.ietf.org/rfc/rfc1123.txt

2.  GENERAL ISSUES

   This section contains general requirements that may be applicable to
   all application-layer protocols.

   2.1  Host Names and Numbers

      The syntax of a legal Internet host name was specified in RFC-952
      [DNS:4].  One aspect of host name syntax is hereby changed: the
      restriction on the first character is relaxed to allow either a
      letter or a digit.  Host software MUST support this more liberal
      syntax.

      Host software MUST handle host names of up to 63 characters and
      SHOULD handle host names of up to 255 characters.

      Whenever a user inputs the identity of an Internet host, it SHOULD
      be possible to enter either (1) a host domain name or (2) an IP
      address in dotted-decimal ("#.#.#.#") form.  The host SHOULD check
      the string syntactically for a dotted-decimal number before
      looking it up in the Domain Name System.

      DISCUSSION:
           This last requirement is not intended to specify the complete
           syntactic form for entering a dotted-decimal host number;
           that is considered to be a user-interface issue.  For
           example, a dotted-decimal number must be enclosed within
           "[ ]" brackets for SMTP mail (see Section 5.2.17).  This
           notation could be made universal within a host system,
           simplifying the syntactic checking for a dotted-decimal
           number.

           If a dotted-decimal number can be entered without such
           identifying delimiters, then a full syntactic check must be
           made, because a segment of a host domain name is now allowed
           to begin with a digit and could legally be entirely numeric
           (see Section 6.1.2.4).  However, a valid host name can never
           have the dotted-decimal form #.#.#.#, since at least the
           highest-level component label will be alphabetic.

@aojea
Copy link
Member

aojea commented Aug 22, 2023

3 greens on the last runs of integration job

https://prow.k8s.io/pr-history/?org=kubernetes&repo=kubernetes&pr=119789

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 22, 2023
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: a997b249e6903a4f165466df50fbfba880c22373

@thockin
Copy link
Member Author

thockin commented Aug 22, 2023

/retest

@k8s-ci-robot
Copy link
Contributor

@thockin: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-kubernetes-node-e2e-containerd a930892 link true /test pull-kubernetes-node-e2e-containerd
pull-kubernetes-unit a930892 link unknown /test pull-kubernetes-unit
pull-kubernetes-integration a930892 link unknown /test pull-kubernetes-integration

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@thockin
Copy link
Member Author

thockin commented Aug 22, 2023

/retest

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Service status.loadBalancer.ingress can be set when type != LoadBalancer
5 participants