IP pools data model and API rework (#4261)
Closes #2148
Closes #4002
Closes #4003 
Closes #4006

## Background

#3985 (and followups #3998 and #4007) made it possible to associate an
IP pool with a silo so that instances created in that silo would get
their ephemeral IPs from said pool by default (i.e., without the user
having to say anything other than "I want an ephemeral IP"). An IP pool
associated with a silo was not accessible for ephemeral IP allocation
from other silos — if a disallowed pool was specified by name at
instance create time, the request would 404.

However! That was the quick version, and the data model left much to be
desired. The relation was modeled by adding a nullable `silo_id` column
and a sort-of-not-really-nullable `is_default` column directly on the IP
pool table, which has the following limitations (and there are probably
more):

* A given IP pool could be associated with at most one silo, and could
not be shared across silos
* The concept of `default` was treated as a property of the pool itself,
rather than a property of the _association_ with another resource, which
is quite strange. Even if you could associate the pool with multiple
silos, you could not have it be the default for one and not for the
other
* There is no way to create an IP pool without associating it with
either the fleet or a silo
* Extending this model to allow association at the project level would
be inelegant — we'd have to add a `project_id` column (which I did in
#3981 before removing it in #3985)

More broadly (and vaguely), the idea of an IP pool "knowing" about silos
or projects doesn't really make sense. Entities aren't really supposed
to know about each other unless they have a parent-child relationship.

## Changes in this PR

### No such thing as fleet-scoped pool, only silo

Thanks to @zephraph for encouraging me to make this change. It is
dramatically easier to explain "link silo to IP pool" than it is to
explain "link resource (fleet or silo) to IP pool". The way to recreate
the behavior of a single default pool for the fleet is to simply
associate a pool with all silos. Data migrations ensure that existing
fleet-scoped pools will be associated with all silos. There can only be
one default pool for a silo, so in the rare case where pool A is the
fleet default and pool B is the default on silo S, we associate both A
and B with S, but only B becomes the silo default.

### API

These endpoints are added. They're pretty self-explanatory.

```
ip_pool_silo_link                        POST     /v1/system/ip-pools/{pool}/silos
ip_pool_silo_list                        GET      /v1/system/ip-pools/{pool}/silos
ip_pool_silo_unlink                      DELETE   /v1/system/ip-pools/{pool}/silos/{silo}
ip_pool_silo_update                      PUT      /v1/system/ip-pools/{pool}/silos/{silo}
```

The `silo_id` and `is_default` fields are removed from the `IpPool`
response as they are now a property of the `IpPoolLink`, not the pool
itself.
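
As a usage sketch, linking a pool to a silo and marking it that silo's
default looks like this with the generated `oxide_client` (this mirrors
the updated bootstrap code in the diff below; the client extension trait
that provides `ip_pool_silo_link` is left out here):

```rust
use oxide_client::types::{IpPoolSiloLink, NameOrId};

// Link an existing pool to a silo and make it the silo's default, so
// instances created there draw ephemeral IPs from it by default.
client
    .ip_pool_silo_link()
    .pool("default")
    .body(IpPoolSiloLink {
        silo: NameOrId::Name("my-silo".parse().unwrap()),
        is_default: true,
    })
    .send()
    .await?;
```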

I also fixed the silo-scoped IP pools list (`/v1/ip-pools`) and fetch
(`/v1/ip-pools/{pool}`) endpoints, which a) did not actually filter for
the current silo, allowing any user to fetch any pool, and b) took a
spurious `project` query param that didn't do anything.

### DB

The association between IP pools and fleet or silo (or eventually
projects, but not here) is now modeled through a polymorphic join table
called `ip_pool_resource`:

ip_pool_id | resource_type | resource_id | is_default
-- | -- | -- | --
123 | silo | 23 | true
123 | silo | 4 | false
~~65~~ | ~~fleet~~ | ~~FLEET_ID~~ | ~~true~~

Now, instead of setting the association with a silo or fleet at IP pool
create or update time, there are separate endpoints for adding and
removing an association. A pool can be associated with any number of
resources, but a unique index ensures that a given resource can only
have one default pool.
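
To make that concrete, here is a minimal sketch of creating a link row
with the new `IpPoolResource` model, assuming omicron's
`async_bb8_diesel` wrapper (module paths approximate; the real datastore
method adds authz checks and error conversion):

```rust
use async_bb8_diesel::AsyncRunQueryDsl;
use nexus_db_model::schema::ip_pool_resource;
use nexus_db_model::{IpPoolResource, IpPoolResourceType};

// Insert one row into the polymorphic join table. The unique index
// mentioned above is what rejects a second row with is_default = true
// for the same resource_id.
let link = IpPoolResource {
    ip_pool_id: pool_id,
    resource_type: IpPoolResourceType::Silo,
    resource_id: silo_id,
    is_default: true,
};
diesel::insert_into(ip_pool_resource::table)
    .values(link)
    .execute_async(&*conn)
    .await?;
```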

### Default IP pool logic

If an instance ephemeral IP or a floating IP is created **with a pool
specified**, we simply use that pool if it exists and is linked to the
user's silo. Otherwise, the request fails with a 404.

If an instance ephemeral IP or a floating IP is created **without a pool
specified**, we look for a default pool for the current silo. If there
is a pool linked with the current silo with `is_default=true`, use that.
Otherwise, there is no default pool for the given scope and IP
allocation will fail, which means the instance create or floating IP
create request will fail.

The difference introduced in this PR is that we do not fall back to
fleet default if there is no silo default because we have removed the
concept of a fleet-scoped pool.
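
At the datastore layer, the silo-default lookup reduces to a join
through `ip_pool_resource`. A minimal sketch under the same assumptions
as above (not the literal `ip_pools_fetch_default`):

```rust
use async_bb8_diesel::AsyncRunQueryDsl;
use diesel::prelude::*;
use nexus_db_model::schema::{ip_pool, ip_pool_resource};
use nexus_db_model::{IpPool, IpPoolResourceType};

// Find the default pool linked to the caller's silo, if any. If this
// returns no rows, ephemeral/floating IP allocation fails, since there
// is no longer a fleet-wide fallback to consult.
let default_pool: IpPool = ip_pool_resource::table
    .inner_join(ip_pool::table.on(ip_pool::id.eq(ip_pool_resource::ip_pool_id)))
    .filter(ip_pool_resource::resource_type.eq(IpPoolResourceType::Silo))
    .filter(ip_pool_resource::resource_id.eq(authz_silo_id))
    .filter(ip_pool_resource::is_default.eq(true))
    .filter(ip_pool::time_deleted.is_null())
    .select(IpPool::as_select())
    .first_async(&*conn)
    .await?;
```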

### Tests and test helpers

This is the source of a lot of noise in this PR. Because there can no
longer be a fleet default pool, we can no longer rely on that for tests.
The old test setup was confusing: we assumed a default IP pool existed,
but still had to populate it (add a range) before we could do anything
with it. Now we don't assume it exists; a single helper creates the
pool, adds a range, and links it to a silo, as sketched below.
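
A hypothetical shape for that helper (every name here is invented for
illustration, not the literal helper in this PR):

```rust
// Hypothetical consolidated test helper: create the pool, add a range,
// and link it to the test silo as that silo's default, in one call.
async fn create_default_ip_pool(client: &ClientTestContext) -> views::IpPool {
    let pool = create_pool(client, "default").await;
    add_ip_range(client, "default", "10.0.0.1", "10.0.0.5").await;
    link_pool_to_silo(client, "default", DEFAULT_SILO_ID, /* is_default */ true).await;
    pool
}
```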

## What do customers have to do when they upgrade?

They should not _have_ to do anything at upgrade time.

If they were relying on a single fleet default pool to automatically be
used by new silos, when they create silos in the future they will have
to manually associate each new silo with the desired pool. We are
working on ways to make that easier or more automatic, but that's not in
this change. It is less urgent because silo creation is an infrequent
operation.

If they are _not_ using the previously fleet-default IP pool named
`default` and do not want it to exist, they can delete any IP ranges it
contains, unlink it from all silos, and then delete the pool itself. If
the pool is unused, no IPs should be allocated from its ranges, so these
deletions will succeed. A sketch of that sequence follows.
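
Sketched against the generated client (`ip_pool_silo_unlink` is one of
the new endpoints above; `ip_pool_range_remove` and `ip_pool_delete`
are assumed to follow the same builder pattern):

```rust
// Cleanup sketch for an unused pool named "default". Range removal
// fails if any IPs are still allocated from the range, so the sequence
// only succeeds when the pool is genuinely unused.
client
    .ip_pool_range_remove()
    .pool("default")
    .body(IpRange::V4(Ipv4Range { first, last }))
    .send()
    .await?;
client
    .ip_pool_silo_unlink()
    .pool("default")
    .silo("my-silo")
    .send()
    .await?;
client.ip_pool_delete().pool("default").send().await?;
```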

---------

Co-authored-by: Justin Bennett <[email protected]>
david-crespo and zephraph authored Jan 5, 2024
1 parent 69733d8 commit bf49833
Showing 52 changed files with 2,633 additions and 1,120 deletions.
1 change: 1 addition & 0 deletions common/src/api/external/mod.rs
@@ -739,6 +739,7 @@ pub enum ResourceType {
LoopbackAddress,
SwitchPortSettings,
IpPool,
IpPoolResource,
InstanceNetworkInterface,
PhysicalDisk,
Rack,
23 changes: 21 additions & 2 deletions end-to-end-tests/src/bin/bootstrap.rs
@@ -4,7 +4,8 @@ use end_to_end_tests::helpers::{generate_name, get_system_ip_pool};
use omicron_test_utils::dev::poll::{wait_for_condition, CondCheckError};
use oxide_client::types::{
ByteCount, DeviceAccessTokenRequest, DeviceAuthRequest, DeviceAuthVerify,
DiskCreate, DiskSource, IpRange, Ipv4Range, SiloQuotasUpdate,
DiskCreate, DiskSource, IpPoolCreate, IpPoolSiloLink, IpRange, Ipv4Range,
NameOrId, SiloQuotasUpdate,
};
use oxide_client::{
ClientDisksExt, ClientHiddenExt, ClientProjectsExt,
@@ -38,9 +39,27 @@ async fn main() -> Result<()> {

// ===== CREATE IP POOL ===== //
eprintln!("creating IP pool... {:?} - {:?}", first, last);
let pool_name = "default";
client
.ip_pool_create()
.body(IpPoolCreate {
name: pool_name.parse().unwrap(),
description: "Default IP pool".to_string(),
})
.send()
.await?;
client
.ip_pool_silo_link()
.pool(pool_name)
.body(IpPoolSiloLink {
silo: NameOrId::Name(params.silo_name().parse().unwrap()),
is_default: true,
})
.send()
.await?;
client
.ip_pool_range_add()
.pool("default")
.pool(pool_name)
.body(IpRange::V4(Ipv4Range { first, last }))
.send()
.await?;
4 changes: 4 additions & 0 deletions end-to-end-tests/src/helpers/ctx.rs
@@ -287,6 +287,10 @@ impl ClientParams {
.build()?;
Ok(Client::new_with_client(&base_url, reqwest_client))
}

pub fn silo_name(&self) -> String {
self.rss_config.recovery_silo.silo_name.to_string()
}
}

async fn wait_for_records(
56 changes: 35 additions & 21 deletions nexus/db-model/src/ip_pool.rs
@@ -5,8 +5,10 @@
//! Model types for IP Pools and the CIDR blocks therein.
use crate::collection::DatastoreCollectionConfig;
use crate::impl_enum_type;
use crate::schema::ip_pool;
use crate::schema::ip_pool_range;
use crate::schema::ip_pool_resource;
use crate::Name;
use chrono::DateTime;
use chrono::Utc;
@@ -35,42 +37,23 @@ pub struct IpPool {
/// Child resource generation number, for optimistic concurrency control of
/// the contained ranges.
pub rcgen: i64,

/// Silo, if IP pool is associated with a particular silo. One special use
/// for this is associating a pool with the internal silo oxide-internal,
/// which is used for internal services. If there is no silo ID, the
/// pool is considered a fleet-wide pool and will be used for allocating
/// instance IPs in silos that don't have their own pool.
pub silo_id: Option<Uuid>,

pub is_default: bool,
}

impl IpPool {
pub fn new(
pool_identity: &external::IdentityMetadataCreateParams,
silo_id: Option<Uuid>,
is_default: bool,
) -> Self {
pub fn new(pool_identity: &external::IdentityMetadataCreateParams) -> Self {
Self {
identity: IpPoolIdentity::new(
Uuid::new_v4(),
pool_identity.clone(),
),
rcgen: 0,
silo_id,
is_default,
}
}
}

impl From<IpPool> for views::IpPool {
fn from(pool: IpPool) -> Self {
Self {
identity: pool.identity(),
silo_id: pool.silo_id,
is_default: pool.is_default,
}
Self { identity: pool.identity() }
}
}

@@ -93,6 +76,37 @@ impl From<params::IpPoolUpdate> for IpPoolUpdate {
}
}

impl_enum_type!(
#[derive(SqlType, Debug, Clone, Copy, QueryId)]
#[diesel(postgres_type(name = "ip_pool_resource_type"))]
pub struct IpPoolResourceTypeEnum;

#[derive(Clone, Copy, Debug, AsExpression, FromSqlRow, PartialEq)]
#[diesel(sql_type = IpPoolResourceTypeEnum)]
pub enum IpPoolResourceType;

Silo => b"silo"
);

#[derive(Queryable, Insertable, Selectable, Clone, Debug)]
#[diesel(table_name = ip_pool_resource)]
pub struct IpPoolResource {
pub ip_pool_id: Uuid,
pub resource_type: IpPoolResourceType,
pub resource_id: Uuid,
pub is_default: bool,
}

impl From<IpPoolResource> for views::IpPoolSilo {
fn from(assoc: IpPoolResource) -> Self {
Self {
ip_pool_id: assoc.ip_pool_id,
silo_id: assoc.resource_id,
is_default: assoc.is_default,
}
}
}

/// A range of IP addresses for an IP Pool.
#[derive(Queryable, Insertable, Selectable, Clone, Debug)]
#[diesel(table_name = ip_pool_range)]
19 changes: 16 additions & 3 deletions nexus/db-model/src/schema.rs
@@ -13,7 +13,7 @@ use omicron_common::api::external::SemverVersion;
///
/// This should be updated whenever the schema is changed. For more details,
/// refer to: schema/crdb/README.adoc
pub const SCHEMA_VERSION: SemverVersion = SemverVersion::new(22, 0, 0);
pub const SCHEMA_VERSION: SemverVersion = SemverVersion::new(23, 0, 1);

table! {
disk (id) {
@@ -504,7 +504,14 @@
time_modified -> Timestamptz,
time_deleted -> Nullable<Timestamptz>,
rcgen -> Int8,
silo_id -> Nullable<Uuid>,
}
}

table! {
ip_pool_resource (ip_pool_id, resource_type, resource_id) {
ip_pool_id -> Uuid,
resource_type -> crate::IpPoolResourceTypeEnum,
resource_id -> Uuid,
is_default -> Bool,
}
}
@@ -1426,8 +1433,9 @@
);
joinable!(system_update_component_update -> component_update (component_update_id));

allow_tables_to_appear_in_same_query!(ip_pool_range, ip_pool);
allow_tables_to_appear_in_same_query!(ip_pool_range, ip_pool, ip_pool_resource);
joinable!(ip_pool_range -> ip_pool (ip_pool_id));
joinable!(ip_pool_resource -> ip_pool (ip_pool_id));

allow_tables_to_appear_in_same_query!(inv_collection, inv_collection_error);
joinable!(inv_collection_error -> inv_collection (inv_collection_id));
@@ -1478,6 +1486,11 @@ allow_tables_to_appear_in_same_query!(
allow_tables_to_appear_in_same_query!(dns_zone, dns_version, dns_name);
allow_tables_to_appear_in_same_query!(external_ip, service);

// used for query to check whether an IP pool association has any allocated IPs before deleting
allow_tables_to_appear_in_same_query!(external_ip, instance);
allow_tables_to_appear_in_same_query!(external_ip, project);
allow_tables_to_appear_in_same_query!(external_ip, ip_pool_resource);

allow_tables_to_appear_in_same_query!(
switch_port,
switch_port_settings_route_config
65 changes: 27 additions & 38 deletions nexus/db-queries/src/db/datastore/external_ip.rs
@@ -76,22 +76,18 @@ impl DataStore {
.fetch_for(authz::Action::CreateChild)
.await?;

// If the named pool conflicts with user's current scope, i.e.,
// if it has a silo and it's different from the current silo,
// then as far as IP allocation is concerned, that pool doesn't
// exist. If the pool has no silo, it's fleet-scoped and can
// always be used.
let authz_silo_id = opctx.authn.silo_required()?.id();
if let Some(pool_silo_id) = pool.silo_id {
if pool_silo_id != authz_silo_id {
return Err(authz_pool.not_found());
}
// If this pool is not linked to the current silo, 404
if self.ip_pool_fetch_link(opctx, pool.id()).await.is_err() {
return Err(authz_pool.not_found());
}

pool
}
// If no name given, use the default logic
None => self.ip_pools_fetch_default(&opctx).await?,
None => {
let (.., pool) = self.ip_pools_fetch_default(&opctx).await?;
pool
}
};

let pool_id = pool.identity.id;
@@ -147,36 +143,29 @@
) -> CreateResult<ExternalIp> {
let ip_id = Uuid::new_v4();

// See `allocate_instance_ephemeral_ip`: we're replicating
// its structure to prevent cross-silo pool access.
let pool_id = if let Some(name_or_id) = params.pool {
let (.., authz_pool, pool) = match name_or_id {
NameOrId::Name(name) => {
LookupPath::new(opctx, self)
.ip_pool_name(&Name(name))
.fetch_for(authz::Action::CreateChild)
.await?
}
NameOrId::Id(id) => {
LookupPath::new(opctx, self)
.ip_pool_id(id)
.fetch_for(authz::Action::CreateChild)
.await?
}
};

let authz_silo_id = opctx.authn.silo_required()?.id();
if let Some(pool_silo_id) = pool.silo_id {
if pool_silo_id != authz_silo_id {
return Err(authz_pool.not_found());
}
// TODO: NameOrId resolution should happen a level higher, in the nexus function
let (.., authz_pool, pool) = match params.pool {
Some(NameOrId::Name(name)) => {
LookupPath::new(opctx, self)
.ip_pool_name(&Name(name))
.fetch_for(authz::Action::Read)
.await?
}
Some(NameOrId::Id(id)) => {
LookupPath::new(opctx, self)
.ip_pool_id(id)
.fetch_for(authz::Action::Read)
.await?
}
None => self.ip_pools_fetch_default(opctx).await?,
};

pool
} else {
self.ip_pools_fetch_default(opctx).await?
let pool_id = pool.id();

// If this pool is not linked to the current silo, 404
if self.ip_pool_fetch_link(opctx, pool_id).await.is_err() {
return Err(authz_pool.not_found());
}
.id();

let data = if let Some(ip) = params.address {
IncompleteExternalIp::for_floating_explicit(
