Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

silo admin cannot manage TLS certificates (but is supposed to be able to) #4532

Closed
davepacheco opened this issue Nov 20, 2023 · 2 comments · Fixed by #4669
Closed

silo admin cannot manage TLS certificates (but is supposed to be able to) #4532

davepacheco opened this issue Nov 20, 2023 · 2 comments · Fixed by #4669
Assignees

Comments

@davepacheco
Copy link
Collaborator

@augustuswm reported trying to update a rack's TLS certificates and getting an unepected 403. Details about the user:

oxide --insecure current-user view
success
CurrentUser {
    display_name: "[email protected]",
    id: 8f7df011-ef5f-41c1-8f7a-c87dd90f723a,
    silo_id: a705588a-3e62-4c19-a227-c073476c33fc,
    silo_name: Name(
        "sandbox",
    ),
}

oxide --insecure current-user groups
Group {
    display_name: "all",
    id: 61b8fdd3-b32a-4d16-be6b-81633ad9c657,
    silo_id: a705588a-3e62-4c19-a227-c073476c33fc,
}

oxide --insecure silo policy view --silo a705588a-3e62-4c19-a227-c073476c33fc
success
SiloRolePolicy {
    role_assignments: [
        SiloRoleRoleAssignment {
            identity_id: 61b8fdd3-b32a-4d16-be6b-81633ad9c657,
            identity_type: SiloGroup,
            role_name: Admin,
        },
        SiloRoleRoleAssignment {
            identity_id: 8f7df011-ef5f-41c1-8f7a-c87dd90f723a,
            identity_type: SiloUser,
            role_name: Admin,
        },
    ],
}

He expected (correctly) that Silo Administrators are supposed to be able to modify TLS certificates. And indeed, the authz check for creating a certificate checks the CreateChild action on the SiloCertificateList:

opctx.authorize(authz::Action::CreateChild, &authz_cert_list).await?;

and the Oso policy file grants the corresponding permission to Silo admins:

# Both Fleet and Silo administrators can see and modify the Silo's
# certificates.
"list_children" if "admin" on "parent_silo";
"list_children" if "admin" on "parent_fleet";
"create_child" if "admin" on "parent_silo";
"create_child" if "admin" on "parent_fleet";

The failure was actually here:

19:10:22.371Z DEBG 1cfdb5b6-e568-436a-a85f-7fecf1b8eef2 (dropshot_external): roles
    actor_id = 8f7df011-ef5f-41c1-8f7a-c87dd90f723a
    authenticated = true
    local_addr = 172.30.2.6:443
    method = POST
    remote_addr = 75.72.239.35:51658
    req_id = 1a1f3341-f246-49b2-9476-7de61e34a4ca
    roles = RoleSet { roles: {} }
    uri = //v1/certificates
19:10:22.374Z DEBG 1cfdb5b6-e568-436a-a85f-7fecf1b8eef2 (dropshot_external): authorize result
    action = Read
    actor = Some(Actor::SiloUser { silo_user_id: 8f7df011-ef5f-41c1-8f7a-c87dd90f723a, silo_id: a705588a-3e62-4c19-a227-c073476c33fc, .. })
    actor_id = 8f7df011-ef5f-41c1-8f7a-c87dd90f723a
    authenticated = true
    local_addr = 172.30.2.6:443
    method = POST
    remote_addr = 75.72.239.35:51658
    req_id = 1a1f3341-f246-49b2-9476-7de61e34a4ca
    resource = DnsConfig
    result = Err(Forbidden)
    uri = //v1/certificates
19:10:22.374Z INFO 1cfdb5b6-e568-436a-a85f-7fecf1b8eef2 (dropshot_external): request completed
    error_message_external = Forbidden
    error_message_internal = Forbidden
    file = /home/build/.cargo/git/checkouts/dropshot-a4a923d29dccc492/fa728d0/dropshot/src/server.rs:841
    latency_us = 43717
    local_addr = 172.30.2.6:443
    method = POST
    remote_addr = 75.72.239.35:51658
    req_id = 1a1f3341-f246-49b2-9476-7de61e34a4ca
    response_code = 403
    uri = //v1/certificates

It's a failure to read the DnsConfig. I expect this is because before we call into the datastore to create the certificate (which is the authz check I linked above), we first validate it, which requires fetching the list of external DNS zones to figure out what the silo's external DNS names are going to be:

let silo_fq_dns_names =
self.silo_fq_dns_names(opctx, authz_silo.id()).await?;
let kind = params.service;
let new_certificate = db::model::Certificate::new(
authz_silo.id(),
Uuid::new_v4(),
kind.into(),
params,
&silo_fq_dns_names,
)?;
let cert = self
.db_datastore
.certificate_create(opctx, new_certificate)
.await?;

that goes into silo_fq_dns_names(), which indeed winds up here, checking the "read" privilege on DNS_CONFIG:

opctx.authorize(authz::Action::Read, &authz::DNS_CONFIG).await?;

I suspect this was introduced among the recent changes to validate the TLS certificates better. @jgallagher does this sound plausible? It's hard because we don't have a lot of testing around minimum privileges -- see #1374 for more on that.

In terms of fixes: right now, we use one resource (DnsConfig) to cover all of the DNS data, including the list of external DNS zone names and all the DNS names in both internal and external DNS. While it seems reasonable to me to say that Silo Admins should be able to see the list of external DNS zone names (i.e., the DNS domains that have been delegated to the rack), it seems excessive to grant them permission to see all the DNS names in both internal and external DNS. So one option would be to split this authz resource into two parts, grant privileges on one of them to "Silo Admin", and use that when listing external DNS zones. That would solve this problem but doesn't feel particularly easy or clean.

Another option is to use the pattern we use elsewhere, where Nexus assumes a privileged, internal-only identity to fetch this information on behalf of the authenticated user. We'd want to be careful to make sure that we're not exposing anything we shouldn't by doing this. (For example, if we decide the list of external DNS zone names is privileged information, we probably wouldn't want to let you discover it by trying to upload certificates for various possible names and seeing if they get rejected?) But my intuition is that a Silo Admin already needs to know the list of external DNS zone names in order to know what TLS certificates they need to provide (or what DNS names those certificates are valid for).

@jgallagher
Copy link
Contributor

I suspect this was introduced among the recent changes to validate the TLS certificates better. @jgallagher does this sound plausible?

Yep, definitely - #4100 is indeed the PR where the call to silo_fq_dns_names was added.

But my intuition is that a Silo Admin already needs to know the list of external DNS zone names in order to know what TLS certificates they need to provide (or what DNS names those certificates are valid for).

We can't change external DNS zone names today, but if we could, what authz resource do you think that would fall under?

@davepacheco
Copy link
Collaborator Author

But my intuition is that a Silo Admin already needs to know the list of external DNS zone names in order to know what TLS certificates they need to provide (or what DNS names those certificates are valid for).

We can't change external DNS zone names today, but if we could, what authz resource do you think that would fall under?

To my earlier comment I think we could model it in two different ways:

  • DnsConfigExternallyDelegatedDomainList (probably want a better name): a new synthetic resource. We check this instead of DnsConfig when you're listing or trying to change the list of DNS domains delegated to the rack. The policy grants read privileges on this to Silo Admins and anybody who can read DnsConfig. The policy grants write access only to people who can write DnsConfig (fleet administrators).
  • Don't change anything about the resources here but instead use the "internal Nexus context on behalf of the user" pattern. This is similar to what we do during provisioning when Nexus wants to list what sleds are available to decide where to put an Instance. Most users can't see the sleds. So Nexus does it _as itself) in a narrow, controlled way.

I lean towards the second because I think it better reflects what we're trying to achieve. We don't really want to grant this person privileges to read part of the DNS config (although it'd be fine if we did do that). We just want Nexus to be able to validate the certificate.

jgallagher added a commit that referenced this issue Dec 11, 2023
The additional cert validation added in #4100 broke the ability for silo
admins to upload new certs, because it introduced a call to fetch the
rack DNS configuration (in order to assemble the FQDNs for the silo to
check that the cert is valid for them). This PR fixes that by using an
elevated internal privilege for that DNS config lookup.

Fixes #4532.
@jgallagher jgallagher self-assigned this Dec 11, 2023
jgallagher added a commit that referenced this issue Dec 12, 2023
The additional cert validation added in #4100 broke the ability for silo
admins to upload new certs, because it introduced a call to fetch the
rack DNS configuration (in order to assemble the FQDNs for the silo to
check that the cert is valid for them). This PR fixes that by using an
elevated internal privilege for that DNS config lookup.

Fixes #4532.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants