
Different protocols in the same Service definition with type=LoadBalancer #1438

Merged
merged 17 commits into kubernetes:master from janosi:mixedprotocollb on Jul 14, 2020

Conversation

@janosi (Contributor) commented Jan 8, 2020

This is the KEP for
kubernetes/kubernetes issue kubernetes/kubernetes#23880
kubernetes/kubernetes PR kubernetes/kubernetes#75831

kubernetes/enhancements issue: #1435

@k8s-ci-robot (Contributor):

Welcome @janosi!

It looks like this is your first PR to kubernetes/enhancements 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/enhancements has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot (Contributor):

Hi @janosi. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot added the needs-ok-to-test, cncf-cla: yes, and size/L labels on Jan 8, 2020
@k8s-ci-robot added the kind/kep and sig/network labels on Jan 8, 2020
@janosi (Contributor, Author) commented Jan 9, 2020

/assign @caseydavenport

@sftim (Contributor) left a comment:

@janosi here's some nit feedback that you might find useful (I hope you do)


Once that limit is removed those Service definitions will reach the Cloud Provider LB controller implementations. The logic of the particular Cloud Provider LB controller, and of course the actual capabilities and architecture of the backing Cloud Load Balancer services, determine how the exposure of the application actually manifests. For this reason it is important to understand the capabilities of those backing services and to design this feature accordingly.

### User Stories [optional]
Contributor:

Suggested change
### User Stories [optional]
### User Stories

Contributor Author:

Fixed.


As a Kubernetes cluster user I want to expose an application that provides its service on different protocols with a single cloud provider load balancer IP. In order to achieve this I want to define different `ports` with mixed protocols in my Service definition of `type: LoadBalancer`

### Implementation Details/Notes/Constraints [optional]
Contributor:

Suggested change
### Implementation Details/Notes/Constraints [optional]
### Implementation Details/Notes/Constraints

Contributor Author:

Fixed.


### Option Control Alternatives

#### Annotiation in the Service definition
Contributor:

Suggested change
#### Annotiation in the Service definition
#### Annotation in the Service definition

Contributor Author:

Fixed

Comment on lines 216 to 221
#### OpenStack

The OpenStack CPI supports TCP, UDP, HTTP(S) in Service definitions and can configure the Octavia listeners with the protocols defined in the Service.
OpenStack Octavia supports TCP, UDP and HTTP(S) on listeners; a separate listener must be configured for each protocol, and different listeners can be used on the same LB instance.

Summary: the OpenStack based clouds that use Octavia v2 as their LBaaS seem to support this feature once implemented. Pricing is up to their model.

@thinalai:

Octavia version should be >= 5.0.0, which was first released in OpenStack Train.

Contributor Author:

@thinalai Thank you for the comment! Indeed, I can see this bugfix in the 5.0.0 release notes:
"Fixed bug which prevented the creation of listeners for different protocols on the same port (i.e: tcp port 53, and udp port 53)."
I will add this note to the KEP.

@andrewsykim (Member):

For reference, the initial issue/PRs that added validation against mixed protocols:
Issue: kubernetes/kubernetes#20394
PR: kubernetes/kubernetes#20600

@andrewsykim (Member) left a comment:

Left some comments, overall this seems very reasonable to me.

Can we squash the commits into more meaningful commits please?


[GCP forwarding_rules_charges](https://cloud.google.com/compute/network-pricing#forwarding_rules_charges) suggests that the same K8s Service definition would result in the creation of 2 forwarding rules in GCP. This has the same fixed price up to 5 forwarding rule instances, and each additional rule results in extra cost.

A user can ask for an internal TCP/UDP Load Balancer via a K8s Service definition that also has the annotation `cloud.google.com/load-balancer-type: "Internal"`. Forwarding rules are also part of the GCE Internal TCP/UDP Load Balancer architecture, but in the case of the Internal TCP/UDP Load Balancer it is not supported to define different forwarding rules with different protocols for the same IP address. That is, for Services with type=LoadBalancer and with annotation `cloud.google.com/load-balancer-type: "Internal"` this feature would not be supported.
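Such an internal LB request is expressed purely through Service metadata; a minimal sketch, assuming an illustrative name, selector, and port (Kubernetes accepts JSON manifests as well as YAML):

```json
{
  "apiVersion": "v1",
  "kind": "Service",
  "metadata": {
    "name": "example-internal-lb",
    "annotations": {
      "cloud.google.com/load-balancer-type": "Internal"
    }
  },
  "spec": {
    "type": "LoadBalancer",
    "selector": {
      "app": "example"
    },
    "ports": [
      {
        "protocol": "TCP",
        "port": 80
      }
    ]
  }
}
```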
Member:

How would a user find out this feature isn't supported for this particular case? It sounds like either searching logs or realizing the LB is not serving the forwarding rules as expected?

@janosi (Contributor, Author) commented Jan 16, 2020:

This is an interesting point. The current GCP CPI implementation takes the first protocol value it finds in the Service definition and uses that for all ports in the Service when creating an internal LB. There is no check for violations of the "single protocol" rule, i.e. it assumes that the filter in the K8s API already blocked a mixed protocol setup.
If this filter is removed by this feature then, according to my understanding, the GCP CPI will set up an internal load balancer with the same protocol on all forwarding rules, even if the trigger was a Service with a mixed protocol setup. There is no error feedback to the user.

Member:

We do not have arbitrary status, but we could add it. status.loadBalancer could grow a field or two. Or we could use events.

Member:

Open question

Contributor Author:

Indeed. Thank you!

Which solution would you prefer: the new field(s) in `status.loadBalancer` or the events?

I do not see them as contradictory, though. Events are moved to external troubleshooting systems pretty fast, so maybe it is worth keeping the actual state of the Service in-cluster too.

Member:

I would love to see improvements to the LB status fields if possible.
At the very least, there should be some form of conditions to allow for better extensibility.

Member:

Agree - status for this would be good, and events are orthogonal (both are good). That said this limitation has been removed :)

Contributor Author:

Do you mean users can use different protocols on internal GCE LBs now?
I assume a documentation update is also coming? Here it still states that multiple forwarding rules for the same IP can have either UDP or TCP, but not both: https://cloud.google.com/load-balancing/docs/internal#multiple_forwarding_rule
And backend services also have a similar limitation: https://cloud.google.com/load-balancing/docs/internal#backend-service

Or is it only my bad English parser :)

Member:

Do you mean users can use different protocols on internal GCE LBs now?

Yes :)

I'll ping docs people. See https://cloud.google.com/kubernetes-engine/docs/how-to/internal-load-balancing#enabling_shared_ip

Contributor Author:

Fixed.


## Proposal

The first thing proposed here is to lift the hardcoded limitation in Kubernetes that currently rejects Service definitions with different protocols if their type is LoadBalancer. Kubernetes would not reject Service definitions like this from that point:
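A minimal sketch of a Service that would newly be accepted, assuming a DNS-style workload serving TCP and UDP on the same port (the name, selector, and ports here are illustrative, echoing the tcp/udp port 53 example mentioned elsewhere in this thread):

```json
{
  "apiVersion": "v1",
  "kind": "Service",
  "metadata": {
    "name": "example-dns"
  },
  "spec": {
    "type": "LoadBalancer",
    "selector": {
      "app": "dns"
    },
    "ports": [
      {
        "name": "dns-tcp",
        "protocol": "TCP",
        "port": 53
      },
      {
        "name": "dns-udp",
        "protocol": "UDP",
        "port": 53
      }
    ]
  }
}
```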
Member:

hardcoded limitation -> API validation

Contributor Author:

Fixed

@janosi (Contributor, Author) commented Jan 16, 2020:

@andrewsykim

Can we squash the commits into more meaningful commits please?

Fixed. Squashed my internal (i.e. before PR creation) commits into one.

@thockin (Member) left a comment:

I have come around on my willingness to enable this. I accept that "it might cost money" is less important than "do what I asked".

Some comments, but mostly LGTM



- if a user is happy with 2 NLB (Service) instances for TCP and UDP, the user still has two more forwarding rules to be billed - i.e. it has the same effect on pricing as if those TCP and UDP endpoints were behind the same NLB instance
- already now the bills of users are affected if they have more than 5 forwarding rules as the result of their current Service definitions (e.g. 6 or more port definitions in a single Service, or if the number of all port definitions in different Services is 6 or more, etc.)

That is, if we consider the "single Service with 6 ports" case, the user already has to pay more for that single Service ("single LB instance") than for another Service (another LB instance) with 5 or less ports. It is not the number of LBs that matters. This phenomenon is already there with the current practice, and the enabling of mixed protocols will not change it for the worse.
Member:

This is not right? We can (and do) set up a single Forwarding Rule for multiple ports.

Contributor Author:

You are right, I was wrong here. Excuse me for this, fixed.


The other risk is in the introduction of this feature without an option control mechanism, i.e. as a general change in Service handling. In that case there is the question whether this feature should be a part of the conformance test set, because it can affect the conformance of cloud providers.

A possible mitigation is to put the feature behind option control.
Member:

If we allow existing services to have this condition in 1.18 (e.g. implementing it and not rejecting updates to objects that already are in this state) but not allow new services (e.g. rejecting a create or an update that enters this state), then we can allow create in 1.19, knowing that rollback to 1.18 is safe.

I think? Am I missing anything?

Member:

Seems right to me -- we followed the same approach when we added finalizers for Service LBs. The only thing to consider is that the code path won't be vetted as much if we don't allow new services to use it at all; at the very least there should be an opt-in for mixed protocols so we can verify it works on v1.18, but we need to make sure users know the consequence of enabling it (i.e. no rollback). Maybe we do need a feature gate?

Member:

yeah, these changes are always nuanced. I think the right thing is to allow existing uses and allow new uses IFF gate is enabled. That puts anyone who uses it under the gate at rollback risk.

I have asked @liggitt for a consult, since he is involved in all of these :)

Member:

I think the right thing is to allow existing uses and allow new uses IFF gate is enabled.

That's correct. The essential thing is that default configurations (e.g. no opt-in to the alpha feature gate) of version N do not allow new uses (via creation or update of an object that doesn't currently use the relaxed validation) that would be considered invalid by version N-1.

That puts anyone who uses it under the gate at rollback risk.

That is why we put relaxed validation and new fields in GA objects under an alpha gate for one release, and limit API server rollback/skew to a single version.

Member:

Strictly speaking - it doesn't HAVE to be "alpha", it could just be off-by-default beta (not much difference practically).

Member:

To resolve this comment, change this to reference the definitive plan, as opposed to options.

Contributor Author:

Fixed. Let's see if it is better now. Please check.

The implementation of the basic logic is ready in this PR:
https://github.com/kubernetes/kubernetes/pull/75831

Currently a feature gate is used to control its activation status. Though if we want to keep this feature behind option control even after it reaches its GA state, we should come up with a different solution, as feature gates are used to control the status of a feature as it graduates from alpha to GA, and they are not meant for option control for features with GA status.
Member:

I'm not sure an alpha is warranted here, but maybe. It's not a huge change, except in the cloud providers it could be. Thoughts?

Member:

Yeah I think we skip alpha as long as rollbacks don't break LBs as you mentioned here #1438 (comment)

Contributor Author:

Do I understand right then that the agreed change is the following?

- in 1.18 creating a mixed Service is allowed if the feature flag is ON. We add a note to the documentation that in this case rollback is not possible, or at least can break things depending on the cloud provider implementation
- in 1.18 we allow update of such Services even if the feature flag is OFF
- in 1.19 we change the create path so that even if the feature flag is OFF, mixed Services can be created

Thank you!

@liggitt (Member) commented Jan 18, 2020:

in 1.18 creating a mixed Service is allowed if the feature flag is ON. We add a note to the documentation that in this case rollback is not possible, or at least can break things depending on the cloud provider implementation

That's correct, and that means the feature gate should be alpha and disabled by default (we should not enable something by default that breaks skew/rollback).

in 1.18 we allow update of such Services even if the feature flag is OFF

Correct. That either means someone enabled-then-disabled the feature flag, or someone created the service via a future release API server. In both cases, we want to preserve/allow update of that object.

in 1.19 we change the create path so that even if the feature flag is OFF, mixed Services can be created?

Typically, we would leave the create/update logic the same as in 1.18, and simply promote the feature flag to beta and enable it by default in 1.19. That gives a release to get broader feedback and bug reports, but allows someone to turn off the feature and stop allowing new uses if problems are found, without cutting a new Kubernetes 1.19.x patch release.

Once the feature graduates to GA and is locked on (e.g. in 1.20), then all feature-gate-conditional logic can be dropped.

Contributor Author:

@liggitt @thockin @andrewsykim I think I updated the document according to this logic. Please check and suggest how to continue.

@thockin (Member) commented Jan 16, 2020 via email

@janosi (Contributor, Author) commented Jan 21, 2020:

@caseydavenport @thockin @andrewsykim Do you think there is a chance to get it into 1.18 enhancements? Or do you see some gaps here? Thank you!

@thockin (Member) left a comment:

Otherwise LGTM

- a feature flag shall control whether loadbalancer Services with mixed protocol configuration are accepted or not at validation time
- we must add a note to the documentation that if such a Service is created then it may break things after a rollback - it depends on the cloud provider implementation
- the update of such services shall be possible even if the feature flag is OFF. This is to prepare for the next release when the feature flag is removed from the create path, too, and after a rollback to the first release the update of existing Service objects must be possible
- the CPI implementations shall be prepared to deal with Services with mixed protocol configurations. It does not mean that the CPI implementations have to accept such Services. It means that the CPI implementations shall be able to process such Services. The CPI implementation can still reject such a Service if it cannot manage the corresponding cloud load balancers for those Services, or for any other reasons. The point is that the system (Service - CPI implementation - cloud LB) must be in a consistent state in the end, and the user must have meaningful information about that state. For example, if the Service is not accepted by the CPI implementation this fact shall be available to the user.
Member:

What does it mean to "reject" a service?

Contributor Author:

Well, just like currently some CPIs do not accept a service if, for example, not all ports are TCP. My assumption was that that is an accepted way for a CPI to "reject" a service. Similarly, the CPI can still "reject" the service in the same way if it does not want to support this feature.

Contributor Author:

Reworded.

#### Proposed solution

In the first release:
- a feature flag shall control whether loadbalancer Services with mixed protocol configuration are accepted or not at validation time
Member:

The gate covers new uses.

Contributor Author:

Fixed.

@janosi (Contributor, Author) commented Jan 25, 2020:

@caseydavenport @thockin @andrewsykim Would you like to see some more updates on this KEP?

@janosi (Contributor, Author) commented Jan 29, 2020:

@thockin even though we missed the 1.18 deadline with this one, I would like to progress. Please advise what is next. If the KEP is good to go, I can update the necessary meta fields, and it can be merged then. If there are still things to update in it, I would like to know those.
Also I wonder if this KEP could be a candidate for an "exception request" so we could still have it in 1.18.


@andrewsykim (Member):

/ok-to-test

@k8s-ci-robot added the ok-to-test label and removed the needs-ok-to-test label on Feb 6, 2020
@k8s-ci-robot added the sig/architecture label on Jul 8, 2020
@janosi janosi requested a review from thockin July 8, 2020 18:21

In the first release:
- a feature flag shall control whether new loadbalancer Services with mixed protocol configuration can be created or not
- we must add a note to the documentation that if such a Service is created then it may break things after a rollback - it depends on the cloud provider implementation
- the update of such Services shall be possible even if the feature flag is OFF. This is to prepare for the next release when the feature flag is removed from the create path, too, and after a rollback to the first release the update of existing Service objects must be possible
- the CPI implementations shall be prepared to deal with Services with mixed protocol configurations. It does not mean that the CPI implementations have to accept such Services. It means that the CPI implementations shall be able to process such Services. The CPI implementation can still validate the Service objects to check whether any of those has mixed protocol configuration, and if the CPI cannot support such a Service for any reason the CPI shall indicate that clearly to the user, and the CPI should not create Cloud LB resources for the "accepted part" of the Service. The point is that the system (Service - CPI implementation - cloud LB) must be in a consistent state in the end, and the user must have meaningful information about that state. For example, if the Service is not accepted by the CPI implementation this fact shall be available to the user, either as a new field in `Service.status.loadBalancer` or as a new Event (or both).
- the CPI implementations shall be prepared to deal with Services with mixed protocol configurations. Either via supporting such Service definitions, or clearly indicating to the users that the Service could not be processed as specified. As we can see from our analysis some CPIs support other protocols than TCP and UPD in the Service definitions, while others support only TCP and UDP. That is the term "mixed protocol support" does not always mean that all possible protocol values are supported by a CPI. For this reason a nicely behaving CPI shall
Reviewer:

Typo: "UPD" instead of "UDP"

Contributor Author:

Fixed.



"description": "The protocol for this port",
"type": "string"
},
"name": {
Member:

I don't think we need name in status

Contributor Author:

Fixed.

"description": "The name of this port within the service.",
"type": "string"
},
"ready": {
Member:

Are there more states that a single port or set of ports might be in? I'm wondering if bool is right or enum or even Conditions. Bool is simple, maybe good enough...

Contributor Author:

I have been thinking about it, but could not find more states for the time being.

"description": "Specifies whether the port was configured for the load-balancer.",
"type": "boolean"
},
"message": {
Member:

We don't have a lot of precedent on this sort of thing, other than Conditions. Perhaps simply mapping ports here, and adding Conditions to status would be more expressive. E.g.:

conditions:
 - type: LoadBalancerMixedProtocolNotSupported

Contributor Author:

I created a "conditions" field right into "Service.status.loadBalancer", and defined the first condition with this name.
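A sketch of what that status shape could look like at this point in the design; the condition type comes from this thread, while the status and message values are illustrative assumptions:

```json
{
  "status": {
    "loadBalancer": {
      "ingress": [
        {
          "ip": "192.0.2.1"
        }
      ],
      "conditions": [
        {
          "type": "LoadBalancerMixedProtocolNotSupported",
          "status": "True",
          "message": "the cloud load balancer does not support mixed protocol Services"
        }
      ]
    }
  }
}
```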

Reviewer:

While we're at it, should there be a condition for just this protocol not being supported? I.e. for a CPI that can do mixed-protocol LBs, but where the protocol for this port cannot work, either because it's flat-out not supported (e.g., UDP on AWS ELB) or because for some reason the Load Balancer cannot be built with multiple protocols when this one is present.

At least in the AWS UDP-on-ELB case, this would be better than just an event saying 'We couldn't do that' as we get now. The event still carries details, but being able to set a condition would be nice too.

Contributor Author:

Do you think the new "portStatuses" could be used to indicate problems with the particular ports? I.e. the "Service.status.conditions" would show problem with the mixed protocol setup, and then the "ingress.portStatuses" could be used to provide more information.

},
"type": "array"
},
"portStatuses": {
Member:

I think we want ports under ingress - ingress is a plural on purpose, so ports should be tracked based on each ingress point.

Contributor Author:

Fixed, moved the portStatuses under ingress.
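With that move, each ingress point reports its own port statuses. A sketch of the resulting shape: protocol, ready, and message mirror the schema excerpts quoted above, while the port field and the concrete values are illustrative assumptions:

```json
{
  "status": {
    "loadBalancer": {
      "ingress": [
        {
          "ip": "192.0.2.1",
          "portStatuses": [
            {
              "port": 53,
              "protocol": "TCP",
              "ready": true
            },
            {
              "port": 53,
              "protocol": "UDP",
              "ready": false,
              "message": "UDP is not supported by this load balancer"
            }
          ]
        }
      ]
    }
  }
}
```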

"description": "LoadBalancerStatus represents the status of a load-balancer.",
"properties": {
"ingress": {
"description": "Ingress is a list containing ingress points for the load-balancer. Traffic intended for the service ld be sent to these ingress points.",
@TBBle commented Jul 11, 2020:

"ld"? Checking the docs for LoadBalancerStatus, that should be "should". Edit: I also recognise (now) that it's not intended to be changed here, as this is a pre-existing structure.

Contributor Author:

Hmmm, bad copy-paste. Fixed.

Move portStatus inside the ingress definition of an LB.
Add a conditions field to the Service.status.loadBalancer and add the first official condition "LoadBalancerMixedProtocolNotSupported" to it.
@k8s-ci-robot added the size/XL label and removed the size/L label on Jul 11, 2020
@thockin (Member) left a comment:

Almost done, IMO


```json
"io.k8s.api.core.v1.LoadBalancerCondition": {
```
Member:

I have to say, it's considerably harder to review this as openAPI than as Go code :)

Contributor Author:

Yep, it sounded like a good choice for API specification. 😄 +1 that Go would be easier 😄

],
"type": "object"
},
"io.k8s.api.core.v1.PortStatus": {
Member:

Is kube-proxy expected to consume this in any way into iptables/IPVS rules or is this just for reporting? Might be worth clarifying this

Member:

Good question. If IPVS and iptables set up captures for the LBVIP + proto + port, and the LB itself doesn't support mixed protocols, should kube-proxy also not capture them?

Contributor Author:

I wonder about the exact handling. I mean, no rules at all, or rather drop those packets that are anyway not handled by the LB on the external interface?

Contributor Author:

So, I added a kube-proxy chapter. Please check and if you are OK with that please remove the hold flag from this one.

@thockin added the tide/merge-method-squash label on Jul 13, 2020
@thockin (Member) commented Jul 13, 2020:

I'm going to Approve this now, but I like Andrew's question. We can either answer it as a followup or as one more rev on this. Remove the hold as you see fit.

Also, I set the "squash" merge option for this, so you don't need to manually squash.

/approve
/lgtm
/hold

@k8s-ci-robot added the do-not-merge/hold and lgtm labels on Jul 13, 2020
@k8s-ci-robot (Contributor):

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: janosi, thockin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved label on Jul 13, 2020
@k8s-ci-robot removed the lgtm label on Jul 14, 2020
@janosi (Contributor, Author) commented Jul 14, 2020:

...ahh, and a new lgtm if you are still OK with this one.

@thockin removed the do-not-merge/hold label on Jul 14, 2020
@thockin (Member) commented Jul 14, 2020:

/lgtm

hold removed

@k8s-ci-robot added the lgtm label on Jul 14, 2020
@k8s-ci-robot k8s-ci-robot merged commit 4ccf8f5 into kubernetes:master Jul 14, 2020
@k8s-ci-robot k8s-ci-robot added this to the v1.19 milestone Jul 14, 2020
@janosi janosi deleted the mixedprotocollb branch August 16, 2020 09:44