Documentation: Best practices for ip pool setup #2679

Closed
askfongjojo opened this issue Mar 27, 2023 · 3 comments

Comments

@askfongjojo

askfongjojo commented Mar 27, 2023

The default IP pool is currently used for all SNAT IP address assignment, as well as for any ephemeral IP address assignment when the user does not provide an alternate pool name in the create-instance request payload. I notice that SNAT IPs are consumed at a rate of 1 for every 4 VMs (this is anecdotal: when I have 10 addresses in the default IP pool, the "no external IP address" error starts to show up after 40 VMs; when I have 15, it gets up to 60 VMs). In other words, the IP addresses are exhausted rather quickly.

For customers who run most of their applications behind the firewall and want to preserve their precious external IPv4 addresses, it seems that they should put something other than public IPv4 addresses in the default IP pool (e.g. a private VLAN that has routes to the internet, or a range of IPv6 addresses). The public IPv4 addresses would live in another pool that the operator advertises to end users strictly for apps that need inbound external access and cannot handle IPv6. Would these be the right recommendations? Does other system usage of the default pool have a dependency on having IPv4 public IP addresses in it?

@bnaecker
Collaborator

Your anecdata around the SNAT IP use is correct. We currently allocate 1/4 of the port-range for each IP address when creating a new VM with an SNAT IP address. That means we have 4 VMs per IP. To be clear, this is true of IPv6 as well, but you'd need 2 ^ 66 allocated addresses before you ran out.
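
The arithmetic behind the 4-VMs-per-IP figure can be sketched as follows. This is an illustration only: it assumes the full 16-bit port space (0 to 65535) is cut into four equal, contiguous slices, and the names `port_slices` and `snat_capacity` are hypothetical, not taken from Omicron.

```python
# Sketch only: assumes a 16-bit port space split into 4 equal slices per
# SNAT address. Constant and function names are hypothetical.
PORTS_PER_IP = 65536
SLICES_PER_IP = 4  # each VM gets 1/4 of the port range of one address


def port_slices(slices=SLICES_PER_IP, total_ports=PORTS_PER_IP):
    """Return (first_port, last_port) for each equal slice of the port space."""
    size = total_ports // slices
    return [(i * size, (i + 1) * size - 1) for i in range(slices)]


def snat_capacity(num_addresses, slices=SLICES_PER_IP):
    """Number of VMs that can share `num_addresses` SNAT addresses."""
    return num_addresses * slices


print(port_slices())
print(snat_capacity(10))  # 40, matching the first observation in the issue
print(snat_capacity(15))  # 60, matching the second observation
```

Under these assumptions, the 10-address and 15-address pools from the original report run out at exactly 40 and 60 VMs.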

I don't quite follow all the questions in your second paragraph. If the instances need some form of outbound network access, they have a few options:

  • Use IPv6, which has the same 4-VMs-per-IP situation, but the subnets are vastly larger.
  • Use IPv4 and some kind of proxy. By that, I mean something like: one instance has an external IP address (either SNAT or Ephemeral); all other instances communicate with it using VPC-private addressing; and it proxies all internal traffic out to the external network. A VLAN-based approach may work as well; I'd have to think about it more.
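
To put rough numbers on the two options, here is a small sketch using Python's standard `ipaddress` module. The prefixes are documentation ranges chosen for illustration, and `vm_capacity` is a hypothetical helper; it assumes every address in the prefix is usable for SNAT.

```python
import ipaddress

VMS_PER_SNAT_IP = 4  # from the thread: 4 VMs share one SNAT address


def vm_capacity(cidr):
    """VMs that could get SNAT access if every address in `cidr` were usable."""
    return ipaddress.ip_network(cidr).num_addresses * VMS_PER_SNAT_IP


ipv4_vms = vm_capacity("198.51.100.0/24")  # example IPv4 block
ipv6_vms = vm_capacity("2001:db8::/64")    # example IPv6 subnet

print(ipv4_vms)              # 1024
print(ipv6_vms == 2 ** 66)   # True

# With the proxy option, a single external address covers outbound traffic
# for any number of instances behind the proxy instance.
proxy_addresses_needed = 1
```

One possible reading of the 2^66 figure mentioned earlier is this product: a /64 holds 2^64 addresses, and four VMs per address gives 2^66 VMs.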

I think that's basically what you're getting at in your recommendations. IPv4 is limited, and customers will have to either separate out different networks using something like a VLAN or a proxy/bastion. They can also use IPv6. Making clear that we do reserve 4 VMs per IP is a good idea, so that they can decide for themselves how to handle it.

> Does other system usage of the default pool have a dependency on having IPv4 public IP addresses in it?

An IPv4 address in the pool would be needed for hosting external-facing services, such as Nexus, on the customer network. That's the only one I'm currently aware of, but future services such as routing daemons would also need them.

@askfongjojo
Author

askfongjojo commented Mar 28, 2023

> I think that's basically what you're getting at in your recommendations. IPv4 is limited, and customers will have to either separate out different networks using something like a VLAN or a proxy/bastion. They can also use IPv6. Making clear that we do reserve 4 VMs per IP is a good idea, so that they can decide for themselves how to handle it.

Yes, that's essentially my question. If we don't document how addresses in IP pools are consumed, an operator may put the entire public IPv4 block (those allocated to Oxide) into the default pool and be surprised when they are all consumed sooner than expected. However, if they instead assign a VLAN or some proxy/bastion addresses that support only outbound internet access, it may be a problem if Nexus gets its external IP from the default pool and needs inbound access.
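
A back-of-the-envelope estimate of default-pool demand might look like the sketch below. It assumes every VM takes one SNAT slice and that each ephemeral IP drawn from the default pool consumes one whole address on top of that. The function name and the workload numbers are hypothetical.

```python
import math

VMS_PER_SNAT_IP = 4  # from the thread: 4 VMs share one SNAT address


def default_pool_demand(num_vms, num_ephemeral=0):
    """Addresses the default pool must hold for a given workload.

    Assumption: one SNAT address per 4 VMs, plus one full address for
    each ephemeral IP allocated from the default pool.
    """
    return math.ceil(num_vms / VMS_PER_SNAT_IP) + num_ephemeral


pool_size = 256  # hypothetical: an operator loads a whole /24 into the pool
demand = default_pool_demand(num_vms=800, num_ephemeral=100)

print(demand)              # 300
print(demand > pool_size)  # True: the pool runs out before all VMs are placed
```

In this made-up workload, 800 VMs alone consume 200 addresses for SNAT, so only 56 addresses are left for ephemeral IPs.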

@askfongjojo
Author

The ip-pool-services work (omicron #1531) should resolve the conflict. RSS should use that IP pool for Nexus and any other future Oxide external-facing services. I've verified that SNAT doesn't take addresses from the service IP pool, so that pool can hold the handful of IPv4 addresses the customer provides for the control plane's exclusive use.
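
The separation described here can be illustrated with a toy model. This is not how Omicron implements pools: the `Pool` class, the pool names, and the addresses (taken from a documentation range) are all hypothetical. The only behaviour it borrows from this thread is that SNAT draws from the default pool and never from the service pool.

```python
class Pool:
    """Toy address pool that hands out addresses until none remain."""

    def __init__(self, name, addresses):
        self.name = name
        self.free = list(addresses)

    def allocate(self):
        if not self.free:
            raise RuntimeError(f"no external IP address left in pool {self.name!r}")
        return self.free.pop(0)


VMS_PER_SNAT_IP = 4

default_pool = Pool("default", ["203.0.113.10", "203.0.113.11"])
service_pool = Pool("services", ["203.0.113.1", "203.0.113.2"])

# The control plane (e.g. Nexus) takes its address from the service pool.
nexus_ip = service_pool.allocate()

# Instances take SNAT addresses from the default pool only, one per 4 VMs.
snat_ips = []
exhausted = None
try:
    for vm in range(12):
        if vm % VMS_PER_SNAT_IP == 0:
            snat_ips.append(default_pool.allocate())
except RuntimeError as err:
    exhausted = str(err)

print(snat_ips)           # both default-pool addresses are used up
print(exhausted)          # the ninth VM finds the default pool empty
print(service_pool.free)  # the service pool is untouched by SNAT
```

Even after the default pool is exhausted, the remaining service-pool address is still available for control-plane use.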
