sig-network: Add egress-source-ip-support KEP #1105

Closed
176 changes: 176 additions & 0 deletions keps/sig-network/20190613-egress-source-ip-support.md
---
title: egress-source-ip-support
authors:
- "@mkimuram"
owning-sig: sig-network
participating-sigs:
reviewers:
- TBD
approvers:
- TBD
editor: "@mkimuram"
creation-date: 2019-06-13
last-updated: 2019-06-13
status: provisional
see-also:
- TBD
replaces:
superseded-by:
---

# egress-source-ip-support

## Table of Contents

- [egress-source-ip-support](#egress-source-ip-support)
- [Table of Contents](#table-of-contents)
- [Release Signoff Checklist](#release-signoff-checklist)
- [Summary](#summary)
- [Motivation](#motivation)
  - [Goals](#goals)
  - [Non-Goals](#non-goals)
- [Proposal](#proposal)
  - [User Stories [optional]](#user-stories-optional)
    - [Story 1](#story-1)
    - [Story 2](#story-2)
    - [Story 3](#story-3)
  - [Implementation Details/Notes/Constraints [optional]](#implementation-detailsnotesconstraints-optional)
  - [Risks and Mitigations](#risks-and-mitigations)
- [Design Details](#design-details)
  - [Test Plan](#test-plan)
  - [Graduation Criteria](#graduation-criteria)
    - [Examples](#examples)
      - [Alpha -> Beta Graduation](#alpha---beta-graduation)
      - [Beta -> GA Graduation](#beta---ga-graduation)
      - [Removing a deprecated flag](#removing-a-deprecated-flag)
  - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
  - [Version Skew Strategy](#version-skew-strategy)
- [Implementation History](#implementation-history)
- [Drawbacks [optional]](#drawbacks-optional)
- [Alternatives [optional]](#alternatives-optional)
- [Infrastructure Needed [optional]](#infrastructure-needed-optional)

[Tools for generating]: https://github.com/ekalinin/github-markdown-toc

## Release Signoff Checklist
- [ ] kubernetes/enhancements issue in release milestone, which links to KEP (this should be a link to the KEP location in kubernetes/enhancements, not the initial KEP PR)
- [ ] KEP approvers have set the KEP status to `implementable`
- [ ] Design details are appropriately documented
- [ ] Test plan is in place, giving consideration to SIG Architecture and SIG Testing input
- [ ] Graduation criteria is in place
- [ ] "Implementation History" section is up-to-date for milestone
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
- [ ] Supporting documentation e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes

## Summary

Egress source IP is a feature to assign a static egress source IP to packets sent from a pod to destinations outside the k8s cluster.
**Member:** Do we want to limit this to a pod, or something more stable like a label selector? Naming pods explicitly in resources can be fragile -- pod names are meant to be temporary.

**Member:** It would also be good to define what "outside the k8s cluster" means. Where is the boundary? Is it when the packet leaves the node, some notion of network, etc.?

**Contributor Author:** Good point. As I mentioned in the SIG meeting, I meant "within the private network"; networks spanning clouds were not within my scope. I will update the KEP.

Also, the label-based approach sounds good, because the use cases include mapping multiple pods to one IP.

**Comment:** (TL;DR: I'm a newbie, and you are allowed to ignore this comment.)

I understand the original term "outside of the k8s cluster" to hint that the destination is outside the k8s cluster's pod CIDR and service CIDR. If the destination were inside the Kubernetes cluster, this wouldn't make much sense IMO (i.e., the destination could use a NetworkPolicy to secure itself).

From my understanding, keepalived just provides yet another VIP that acts like a Service of type LoadBalancer; kube-egress needs to be able to bind it, and to use iptables and routing entries to do the right thing so that a pod's source IP is set to the keepalived VIP before leaving the node.

(Note: I'm new here and might be totally off; I just wanted to leave a note. Coming from a world where I used to pet servers in a datacenter, keepalived is a good friend for providing HA for a load-balancer IP that is put into DNS, since it uses software-defined VRRP to ensure the VIP is always available. Regarding ingress/egress towards such a keepalived VIP, it seems someone has already been on an adventure getting it to work together with cloud providers using Services of type LoadBalancer: https://github.com/munnerz/keepalived-cloud-provider)

Also, as mentioned during the sig-network meeting: if running in a cloud, you could probably achieve the stories by getting a dedicated source IP for the whole cluster, using platform/cloud-provider-dependent configuration of ingress/egress to/from the Kubernetes cluster. That would solve the use cases of reaching a destination that requires a whitelist of source IPs.

The KEP is targeting a set of pods, so using a label selector makes sense.

**Contributor Author:** I might have confused you by not including details on what should be done and how it is achieved in my PoC implementation. I will update the KEP to clarify them.

However, a short explanation is:

My goal is not to make the PodIP or ClusterIP visible to applications not running on the k8s cluster. Instead, the goal is to present a fixed source IP for certain pods to applications not running on the k8s cluster. To do that, I'm thinking about making an N:1 mapping of pods to a VIP when the PodIP is SNATed. (Actually, the VIP could be assigned in any way that a k8s LoadBalancer allows; my PoC implementation just used keepalived-vip.)

My intention in excluding use cases like cross-cloud above was to exclude scenarios where there is another SNAT, a VPN, and so on between the k8s cluster and the applications, which would require much more work than just doing such a mapping when leaving the k8s cluster.

**Contributor:** I think there are use cases for both "outside the cluster but within the private network" and "all the way out to the internet". (E.g., take any of the user stories below and assume that the Kubernetes cluster is in a public cloud but the database server is not.)

**Comment:** Some things popping into my mind:

I wonder if this could be done on service IPs; they are already VIPs inside the k8s cluster. So could the SNATing you already do in the PoC be applied in kube-proxy? How would this work for IPVS? That is, is it okay to only support iptables and not IPVS?

(I have a vague memory of maybe @thockin mentioning a customer who wanted similar support for egress on service VIPs during the sig-network call?)

**Contributor Author:**

> I think there are use cases for both "outside the cluster but within the private network" and "all the way out to the internet".

O.K. Let's also consider "all the way out to the internet", and if needed, set another milestone to achieve it.

> Is it okay to only support iptables and not IPVS?

I think it is a good idea to allow different implementations of forwarding packets for egress, as kube-proxy does. Also, we might be able to leverage Services as a mechanism to trace the PodIP. I will add a description of this to the "Design Details" section of this KEP to discuss it in detail.


## Motivation

In k8s, egress traffic has its source IP translated (SNAT) to appear as the node IP when it leaves the cluster. However, many devices and applications use IP-based ACLs to restrict incoming traffic for security reasons and bandwidth limitations. As a result, such ACLs outside the k8s cluster will block packets from pods, causing connectivity issues. To resolve this, we need a feature to assign a particular static egress source IP to a pod.

Related discussions can be found [here](https://github.com/kubernetes/kubernetes/issues/12265) and [here](https://github.com/cloudnativelabs/kube-router/issues/434).

### Goals

Provide users with an official and common way to assign a static egress source IP to packets sent from a pod to destinations outside the k8s cluster.
**Contributor:** "a static egress source IP for packets from one or more pods", isn't it? (E.g., Story 2 below.)

**Contributor Author:** I will fix it.


### Non-Goals

TBD

## Proposal

Expose an egress API to users, like the one below, to allow them to assign a static egress source IP to specific pod(s).
**Member:** Are you proposing a Custom Resource Definition, or an actual API resource here?

**Contributor Author:** My PoC implementation uses a CRD, since it is implemented as a k8s operator that reconciles the iptables rules and routing tables on all nodes. However, I think we still have a choice between defining a k8s API and keeping it as a CRD.
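
For context, registering such a CRD in a 2019-era cluster would look roughly like the sketch below; the plural/singular names and the namespaced scope are assumptions, while the group and version are taken from the Egress example that follows.

```
# Hypothetical registration of the Egress CRD (names and scope are assumptions).
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: egresses.egress.mapper.com
spec:
  group: egress.mapper.com
  version: v1alpha1
  scope: Namespaced
  names:
    kind: Egress
    plural: egresses
    singular: egress
```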


```
apiVersion: egress.mapper.com/v1alpha1
kind: Egress
metadata:
  name: example-pod1-egress
spec:
  ip: 192.168.122.222
  kind: pod
  namespace: default
  name: pod1
```

**Contributor:** What would the restrictions/instructions for this IP be? Any IP in the node CIDR? Also, is this meant to be sharable, or unique per Egress?

**Contributor Author:** It is any IP in the node CIDR, and it is sharable. I will add this to the KEP.
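
Since a label-based selector was favored in the discussion above, a variant of the example might look like the following sketch; the `podSelector` field name and the label are hypothetical illustrations, not part of the PoC API.

```
# Hypothetical label-selector form of the Egress resource (field names assumed).
apiVersion: egress.mapper.com/v1alpha1
kind: Egress
metadata:
  name: example-egress
spec:
  ip: 192.168.122.222
  namespace: default
  podSelector:
    matchLabels:
      app: needs-fixed-egress-ip
```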

PoC implementations are available at:

- https://github.com/mkimuram/egress-mapper
- https://github.com/steven-sheehy/kube-egress/pull/1

**Contributor:** So what is the expectation of how the KEP'ed version of the feature would differ from the PoC? More specifically, if this is something that can already be implemented entirely outside of Kubernetes, does it benefit from being moved into Kubernetes? Are you expecting that Kubernetes would adopt essentially the PoC implementation, or merely the API surrounding it? Would this be something core to Kubernetes, or would it be implemented by network plugins (which might be able to optimize in various ways that a generic implementation could not)?

**Contributor Author:**

> More specifically, if this is something that can already be implemented entirely outside of Kubernetes, does it benefit from being moved into Kubernetes?

Benefits that I expect are:

- Making it work with any CNI driver. (Some CNI drivers won't work well with my current PoC implementation.)
- Defining a stable API that can stay compatible with future k8s versions. (I won't insist on making it a core k8s API as long as it can stay compatible, just as the volume snapshot feature is implemented as a CRD.)
- Making use of existing k8s mechanisms like kube-proxy and Services, if possible and useful.

Then it will provide users with the same UX across any k8s cluster, and it will decrease developers' burden of maintaining compatibility for this feature.

### User Stories [optional]

#### Story 1
As a user of Kubernetes, I have a pod which requires access to a database that restricts access by source IP and exists outside the k8s cluster.
So, the pod needs a specific egress source IP when sending packets to the database.

#### Story 2
As a user of Kubernetes, I have multiple pods which require access to a database that restricts access by source IP and exists outside the k8s cluster.
So, those pods all need the same specific egress source IP when sending packets to the database.

#### Story 3
As a user of Kubernetes, I have some pods which require access to different databases that restrict access by source IP and exist outside the k8s cluster.
So, some pods need one specific egress source IP when sending packets to their database, and other pods need another specific egress source IP.

### Implementation Details/Notes/Constraints [optional]
**Member:** This should explain how the feature will work on the nodes themselves. For example (and I haven't looked too deeply into the implementation), does your implementation assign all the egress source IPs to the node that the pod lives on? If so, how would that work in cloud environments that have tighter constraints on the IPs available to nodes? That kind of thing.

**Contributor Author:** Thank you for your comment. It assigns each IP to one of the nodes by leveraging keepalived-vip (https://github.com/kubernetes-retired/contrib/tree/master/keepalived-vip), then forwards packets from the node that the pod lives on to the node that holds the specific IP, using iptables rules and routing tables.
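
As a rough illustration of that mechanism (not the PoC's exact rules; all addresses, CIDRs, and the routing-table number below are assumptions), the operator could reconcile something like:

```
# On the node currently holding the VIP 192.168.122.222 (placed there by
# keepalived-vip): SNAT traffic from the selected pod so it leaves the
# cluster with the static egress IP instead of the node IP.
iptables -t nat -A POSTROUTING -s 10.244.1.5/32 ! -d 10.244.0.0/16 \
  -j SNAT --to-source 192.168.122.222

# On the node where the pod runs: steer the pod's external traffic to the
# VIP-holding node (192.168.122.10 here) via a dedicated routing table,
# instead of masquerading it locally.
ip rule add from 10.244.1.5/32 table 100
ip route add default via 192.168.122.10 table 100
```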


TBD

### Risks and Mitigations

TBD

## Design Details

### Test Plan

**Note:** *Section not required until targeted at a release.*

TBD

### Graduation Criteria

**Note:** *Section not required until targeted at a release.*

TBD

#### Examples

TBD

##### Alpha -> Beta Graduation

TBD

##### Beta -> GA Graduation

TBD

##### Removing a deprecated flag

TBD

### Upgrade / Downgrade Strategy

TBD

### Version Skew Strategy

TBD

## Implementation History

TBD

## Drawbacks [optional]

TBD

## Alternatives [optional]

TBD

## Infrastructure Needed [optional]

TBD