Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strong silo isolation #1340

Closed
Tracked by #849
mx-shift opened this issue Jul 1, 2022 · 4 comments
Closed
Tracked by #849

Strong silo isolation #1340

mx-shift opened this issue Jul 1, 2022 · 4 comments
Labels
mvp security Related to security.
Milestone

Comments

@mx-shift
Copy link

mx-shift commented Jul 1, 2022

Recent conversations have been asking whether a user from silo A would have any reason to look at users or resources in silo B. While general sentiment is that cross-silo access to resources (instances, VPCs, etc) can be omitted for v1, users isn't quite so clear.

At a minimum, we need to handle the initial setup case. Assume a fresh rack where RSS has created an initial silo R that uses local authentication only and has a single user provisioned, user R_admin (_ notation for clarity and conciseness). User R_admin is granted CRUD permissions on silos themselves with the intent that this silo is primarily used for initial setup and recovery purposes. User R_admin creates a new silo, A, with an IdP configuration to allow login via SSO and JIT account provisioning. At this point silo A contains no user accounts as no logins via SSO have occurred. So how does user R_admin designate a user in silo A as a Silo Administrator?

One option is to have someone login to silo A via SSO which causes their account to be created as user A_user1. User R_admin can now see that account when the list the user accounts in silo A (note the cross-silo access here). R_admin can now assign the Silo Administrator role to A_user1. This has a few disadvantages:

  • Users with Silo CRUD permissions are implicitly given access to at least list users and modify ACLs in all silos
  • If R_admin != A_user1, the process need to be passed back and forth between them
  • If A_user1 accidentally removes their Silo Admin role, only R_admin can recover it
  • Every role assignment would need to be made individually after each user has logged in

Alternatively, Silo configuration could specify a SAML assertion key that contains group membership provided by the IdP as well as a mapping of group identifiers to role assignments that define "default" role assignments. For example, a SAML assertion might say user B_user1 is part of IdP groups B_IDP_DOMAIN_ADMINS and B_IDP_IT while silo B's configuration says B_IDP_DOMAIN_ADMINS grants Silo Administrator and B_IDP_ENGINEERING grants Collaborator. This can all be set up before any users log into silo B. When a user does login to silo B via SSO, Nexus would extract the groups from the SAML assertion, translate them to roles, and apply those roles to the user. Since this happens on each login, even if B_user1 removes their Silo Administrator role by accident, logging out and back in would restore it.

@davepacheco
Copy link
Collaborator

For further reference, this was discussed at some length in the second half hour of today's "operator api" meeting. I think proposals included:

  • having privileges above the Silo (i.e., at fleet level) do not confer any privileges (even read privileges) on anything within the Silo -- only the ability to manipulate the config of the Silo (which might include IdP config, which might include which IdP groups get which roles within the Silo)
  • I think there was a second thing?

@zephraph
Copy link
Contributor

zephraph commented Jul 5, 2022

The second thing is somewhat articulated here: #1329 (comment)

The TL;DR is we don't have a clear definition of what is a global resource vs what is a silo scoped resource in the API. The separation is mildly muddled. This is just an ergonomic API thing, but I talked through the options in the comment above.

@david-crespo
Copy link
Contributor

I don’t think there’s anything left here that we want to to. Reopen if that’s not true.

@davepacheco
Copy link
Collaborator

davepacheco commented Mar 22, 2024

I think any remaining work is covered under #1681. (edit: wrong link the first time)

leftwo pushed a commit that referenced this issue Jun 26, 2024
Added a new package, crucible-dtrace that pulls from buildomat a package
that contains a set of DTrace scripts.  These scripts are extracted into
the global zone at /opt/oxide/crucible_dtrace/

Update Crucible to latest includes these updates:
Clean up dependency checking, fixing space leak (#1372)
Make a DTrace package (#1367)
Use a single context in all messages (#1363)
Remove `DownstairsWork`, because it's redundant (#1371)
Remove `WorkState`, because it's implicit (#1370)
Do work immediately upon receipt of a job, if possible (#1366)
Move 'do work for one job' into a helper function (#1365)
Remove `DownstairsWork` from map when handling it (#1361)
Using `block_in_place` for IO operations (#1357)
update omicron deps; use re-exported dropshot types in oximeter-producer configuration (#1369)
Parameterize more tests (#1364)
Misc cleanup, remove sqlite references. (#1360)
Fix `Extent::close` docstring (#1359)
Make many `Region` functions synchronous (#1356)
Remove `Workstate::Done` (unused) (#1355)
Return a sorted `VecDeque` directly (#1354)
Combine `proc_frame` and `do_work_for` (#1351)
Move `do_work_for` and `do_work` into `ActiveConnection` (#1350)
Support arbitrary Volumes during replace compare (#1349)
Remove the SQLite backend (#1352)
Add a custom timeout for buildomat tests (#1344)
Move `proc_frame` into `ActiveConnection` (#1348)
Remove `UpstairsConnection` from `DownstairsWork` (#1341)
Move Work into ConnectionState (#1340)
Make `ConnectionState` an enum type (#1339)
Parameterize `test_repair.sh` directories (#1345)
Remove `Arc<Mutex<Downstairs>>` (#1338)
Send message to Downstairs directly (#1336)
Consolidate `on_disconnected` and `remove_connection` (#1333)
Move disconnect logic to the Downstairs (#1332)
Remove invalid DTrace probes. (#1335)
Fix outdated comments (#1331)
Use message passing when a new connection starts (#1330)
Move cancellation into Downstairs, using a token to kill IO tasks (#1329)
Make the Downstairs own per-connection state (#1328)
Move remaining local state into a `struct ConnectionState` (#1327)
Consolidate negotiation + IO operations into one loop (#1322)
Allow replacement of a target in a read_only_parent (#1281)
Do all IO through IO tasks (#1321)
Make `reqwest_client` only present if it's used (#1326)
Move negotiation into Downstairs as well (#1320)
Update Rust crate clap to v4.5.4 (#1301)
Reuse a reqwest client when creating Nexus clients (#1317)
Reuse a reqwest client when creating repair client (#1324)
Add % to keep buildomat happy (#1323)
Downstairs task cleanup (#1313)
Update crutest replace test, and mismatch printing. (#1314)
Added more DTrace scripts. (#1309)
Update Rust crate async-trait to 0.1.80 (#1298)
leftwo added a commit that referenced this issue Jun 26, 2024
Update Crucible and Propolis to the latest

Added a new package, crucible-dtrace that pulls from buildomat a package
that contains a set of DTrace scripts. These scripts are extracted into the 
global zone at /opt/oxide/crucible_dtrace/

Crucible latest includes these updates:
Clean up dependency checking, fixing space leak (#1372) Make a DTrace
package (#1367)
Use a single context in all messages (#1363)
Remove `DownstairsWork`, because it's redundant (#1371) Remove
`WorkState`, because it's implicit (#1370)
Do work immediately upon receipt of a job, if possible (#1366) Move 'do
work for one job' into a helper function (#1365) Remove `DownstairsWork`
from map when handling it (#1361) Using `block_in_place` for IO
operations (#1357)
update omicron deps; use re-exported dropshot types in oximeter-producer
configuration (#1369) Parameterize more tests (#1364)
Misc cleanup, remove sqlite references. (#1360)
Fix `Extent::close` docstring (#1359)
Make many `Region` functions synchronous (#1356)
Remove `Workstate::Done` (unused) (#1355)
Return a sorted `VecDeque` directly (#1354)
Combine `proc_frame` and `do_work_for` (#1351)
Move `do_work_for` and `do_work` into `ActiveConnection` (#1350) Support
arbitrary Volumes during replace compare (#1349) Remove the SQLite
backend (#1352)
Add a custom timeout for buildomat tests (#1344)
Move `proc_frame` into `ActiveConnection` (#1348)
Remove `UpstairsConnection` from `DownstairsWork` (#1341) Move Work into
ConnectionState (#1340)
Make `ConnectionState` an enum type (#1339)
Parameterize `test_repair.sh` directories (#1345)
Remove `Arc<Mutex<Downstairs>>` (#1338)
Send message to Downstairs directly (#1336)
Consolidate `on_disconnected` and `remove_connection` (#1333) Move
disconnect logic to the Downstairs (#1332)
Remove invalid DTrace probes. (#1335)
Fix outdated comments (#1331)
Use message passing when a new connection starts (#1330) Move
cancellation into Downstairs, using a token to kill IO tasks (#1329)
Make the Downstairs own per-connection state (#1328) Move remaining
local state into a `struct ConnectionState` (#1327) Consolidate
negotiation + IO operations into one loop (#1322) Allow replacement of a
target in a read_only_parent (#1281) Do all IO through IO tasks (#1321)
Make `reqwest_client` only present if it's used (#1326) Move negotiation
into Downstairs as well (#1320)
Update Rust crate clap to v4.5.4 (#1301)
Reuse a reqwest client when creating Nexus clients (#1317) Reuse a
reqwest client when creating repair client (#1324) Add % to keep
buildomat happy (#1323)
Downstairs task cleanup (#1313)
Update crutest replace test, and mismatch printing. (#1314) Added more
DTrace scripts. (#1309)
Update Rust crate async-trait to 0.1.80 (#1298)

Propolis just has this one update:
Allow boot order config in propolis-standalone
---------

Co-authored-by: Alan Hanson <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
mvp security Related to security.
Projects
None yet
Development

No branches or pull requests

5 participants