Add resource limits #4605
Conversation
In RFD-427 we specify that, for the initial implementation, we want to require that all silos have a quota. In considering the API and implementation more carefully, I decided that, at least temporarily, we should drop create/delete, given that deletions should never happen and creations should only happen when the silo is created (or via some internal process that creates quotas for pre-existing silos).
There will be a follow-up PR to add capacity/utilization to the API at both the silo and system levels. Given that, and the general lack of actionability of quotas for silo users, I've just dropped the quota view from the API.
```rust
return external::Error::InvalidRequest { message: "Insufficient Capacity: Not enough CPUs to complete request. Either stop unused instances to free up resources or contact the rack operator to request a capacity increase.".to_string() }
}
```
Eventually it would be nice to use @sunshowers's `InsufficientCapacity` error from #4573.
On second thought, maybe not. I think we should distinguish between hitting up against human-set capacity limits vs. hardware limits.
@zephraph and I chatted about this, and this is what I suggested post-#4573:

- Have the `InsufficientCapacity` variant carry around an enum, e.g. `InsufficientCapacityKind::UserQuota` and `::System`, and have the `insufficient_capacity` constructor accept that as an argument, forcing callers to specify which `InsufficientCapacityKind` variant they want (see the sketch below).
- In the message, include the kind: "Insufficient capacity (user quota)" and "... (system)".
- Consider returning different HTTP error codes for each variant.
cc @askfongjojo who I think has some thoughts about this.
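A minimal sketch of that suggestion, assuming a simplified `Error` type; the actual error type in omicron's external API (and the `InsufficientCapacity` error from #4573) will differ:

```rust
/// Distinguishes which limit was hit when capacity runs out.
#[derive(Clone, Copy, Debug)]
pub enum InsufficientCapacityKind {
    /// A human-set silo quota was exceeded.
    UserQuota,
    /// The underlying hardware capacity was exhausted.
    System,
}

impl InsufficientCapacityKind {
    fn label(&self) -> &'static str {
        match self {
            InsufficientCapacityKind::UserQuota => "user quota",
            InsufficientCapacityKind::System => "system",
        }
    }
}

/// Simplified stand-in for the external error type.
#[derive(Debug)]
pub enum Error {
    InsufficientCapacity { message: String },
    // ... other variants elided
}

impl Error {
    /// Callers must say which kind of capacity limit they hit.
    pub fn insufficient_capacity(
        kind: InsufficientCapacityKind,
        detail: &str,
    ) -> Error {
        Error::InsufficientCapacity {
            message: format!("Insufficient capacity ({}): {}", kind.label(), detail),
        }
    }
}
```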
@sunshowers @zephraph: Thanks for looping me in. Once silo quotas are fully in place, users should rarely hit the `System` capacity error, assuming that the sum of all silo quotas does not exceed the total usable capacity of the rack. But I can think of two situations where the `System` capacity error would be hit before the `UserQuota` one:

- Some sleds are in maintenance or taken out of the provisioning pool.
- All sleds are relatively full and the user is trying to provision a very large VM.

As such, perhaps the `UserQuota` kind of error should be an HTTP 400 (the end user needs to free up some capacity themselves), whereas the `System` one should be a 507 (the operator needs to migrate some instances, check sled status, force some silos to reduce their usage temporarily, etc.)?
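A hedged sketch of that mapping, reusing the hypothetical `InsufficientCapacityKind` from the earlier sketch; whether Nexus actually wires status codes this way is left open in this thread:

```rust
use http::StatusCode;

// Hypothetical mapping from the capacity-limit kind to an HTTP status,
// following the 400-vs-507 suggestion above.
fn status_for_kind(kind: InsufficientCapacityKind) -> StatusCode {
    match kind {
        // The end user can free up capacity themselves.
        InsufficientCapacityKind::UserQuota => StatusCode::BAD_REQUEST,
        // Only an operator can address rack-level exhaustion.
        InsufficientCapacityKind::System => StatusCode::INSUFFICIENT_STORAGE,
    }
}
```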
Useful distinction. I think we should definitely make the difference clear in the error code. I don't know if they should have different status codes, though. I can imagine a situation where the system quota violation is also user-correctable by freeing up enough stuff. Or, like you say, if the instance they're trying to create is simply very large.
For now I'm just using the `InsufficientCapacity` error wholesale. This isn't the ideal state; it would be good to follow up with a better error, but it's sufficient for the time being.
I created an issue to denote the need for better error messages: #4680
```
silo_user_list       GET  /v1/system/users
silo_user_view       GET  /v1/system/users/{user_id}
silo_view            GET  /v1/system/silos/{silo}
system_quotas_list   GET  /v1/system/silo-quotas
```
I renamed this endpoint from `/v1/system/quotas` to `/v1/system/silo-quotas` just b/c it's going under the silos tag now (as per @ahl's suggestion) and it seemed to fit better. I will likely just delete this endpoint in the capacity and utilization branch in favor of an endpoint that returns both quotas and provisioned counts.
schema/crdb/20.0.0/up02.sql
```sql
-- ~70% of 128 threads leaving 30% for internal use
90 AS cpus,
-- 708 GiB (~70% of memory leaving 30% for internal use)
760209211392 AS memory_bytes,
-- 850 GiB (total storage / 3.5)
912680550400 AS storage_bytes
```
There are ongoing discussions about abandoning the use of "assumed" usable capacity figures. But if we still want to use them, these numbers should be multiplied by the number of sleds and physical disks (excluding the M.2s). So a half rack will provide 90 * 16 = 1440 threads, and so on.
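For concreteness, a tiny sketch of the scaling being described; the constant and function name are illustrative, not code from the PR:

```rust
// Illustrative arithmetic only: the per-sled "assumed usable" CPU figure
// from the migration above, scaled by the number of sleds in service.
// For a half rack this reproduces the 90 * 16 = 1440 thread figure.
const ASSUMED_USABLE_THREADS_PER_SLED: u64 = 90;

fn assumed_usable_threads(sleds_in_service: u64) -> u64 {
    ASSUMED_USABLE_THREADS_PER_SLED * sleds_in_service
}

fn main() {
    assert_eq!(assumed_usable_threads(16), 1440); // half rack
}
```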
This is replaced with an arbitrarily high limit, significantly more than can be provisioned on even a full rack.
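A hedged sketch of what such a default might look like; the `SiloQuotas::arbitrarily_high_default` name appears later in this conversation, but the struct fields, types, and values here are illustrative assumptions rather than the actual Nexus model:

```rust
use uuid::Uuid;

// Illustrative struct only; the real SiloQuotas model has additional
// fields (e.g. timestamps) and uses the crate's own count/byte types.
pub struct SiloQuotas {
    pub silo_id: Uuid,
    pub cpus: i64,
    pub memory_bytes: i64,
    pub storage_bytes: i64,
}

impl SiloQuotas {
    /// A limit far above anything even a full rack could provision, so
    /// pre-existing silos are effectively unlimited until an operator
    /// sets a real quota. The numbers below are placeholders.
    pub fn arbitrarily_high_default(silo_id: Uuid) -> Self {
        SiloQuotas {
            silo_id,
            cpus: 9_999_999_999,
            memory_bytes: 999_999_999_999_999_999,
            storage_bytes: 999_999_999_999_999_999,
        }
    }
}
```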
This looks great, thank you for taking on the heroic effort of tackling this. The CTE and DB migration code make sense to me -- this cut all the way across Nexus.
A few comments below, but the only thing I'd consider a "blocker" is the `from_sled_count` function, which is IMO kinda misleading. Everything else is just nits!
nexus/db-queries/src/db/queries/virtual_provisioning_collection_update.rs
…n_update.rs Co-authored-by: Sean Klein <[email protected]>
Unbelievable work. The core diesel bit is wild. Had a couple of comments, nothing important.
```rust
            storage: silo_quotas.storage.into(),
        }
    }
}
```
Why would you need to go from view to model? Is this used?
```sql
-- Adds quotas for any existing silos without them.
-- The selected quotas are based on the resources of a half rack
-- with 30% CPU and memory reserved for internal use and a 3.5x tax
-- on storage for replication, etc.
```
Comment no longer accurate I think 😂
```rust
.transaction_async(|conn| async move {
    diesel::insert_into(quotas_dsl::silo_quotas)
        .values(SiloQuotas::arbitrarily_high_default(
            DEFAULT_SILO.id(),
```
Could use a comment on the possibly-surprising-in-the-future fact that we do this for default silo but not internal silo.
This PR is a follow-up to #4605 which adds views into capacity and utilization at both the silo and system level.

API:

|op|method|url|
|--|--|--|
|silo_utilization_list | GET | /v1/system/utilization/silos |
|silo_utilization_view | GET | /v1/system/utilization/silos/{silo} |
|utilization_view | GET | /v1/utilization |

I'm not entirely satisfied w/ the silo utilization endpoints. They could be this instead:

|op|method|url|
|--|--|--|
|silo_utilization_list | GET | /v1/system/silos-utilization |
|silo_utilization_view | GET | /v1/system/silos/{silo}/utilization |

Also take special note of the views:

```rust
// For the eyes of end users
/// View of the current silo's resource utilization and capacity
#[derive(Clone, Debug, Deserialize, Serialize, JsonSchema)]
pub struct Utilization {
    /// Accounts for resources allocated to running instances or storage allocated via disks or snapshots.
    /// Note that CPU and memory resources associated with a stopped instance are not counted here,
    /// whereas associated disks will still be counted.
    pub provisioned: VirtualResourceCounts,
    /// The total amount of resources that can be provisioned in this silo.
    /// Actions that would exceed this limit will fail.
    pub capacity: VirtualResourceCounts,
}

// For the eyes of an operator
/// View of a silo's resource utilization and capacity
#[derive(Clone, Debug, Deserialize, Serialize, JsonSchema)]
pub struct SiloUtilization {
    pub silo_id: Uuid,
    pub silo_name: Name,
    /// Accounts for resources allocated in silos, like CPU or memory for running instances and storage for disks and snapshots.
    /// Note that CPU and memory resources associated with a stopped instance are not counted here.
    pub provisioned: VirtualResourceCounts,
    /// Accounts for the total amount of resources reserved for silos via their quotas.
    pub allocated: VirtualResourceCounts,
}
```

For users in the silo I use `provisioned` and `capacity` as the language. Their `capacity` is represented by the quota set by an operator. For the operator, `provisioned` is the same, but `allocated` is used to denote the amount of resources allotted via quotas.

---

Note: I had planned to add a full system utilization endpoint to this PR, but that would increase the scope. Instead we will ship that API as part of the next release. We can calculate some version of the full system utilization on the client by listing all the silos and their utilization.

---

Co-authored-by: Sean Klein <[email protected]>
This PR aims to introduce `quotas` as a concept into Nexus, allowing operators to enforce virtual resource limits at the silo level. The initial implementation is limited to checks during instance start, disk creation, and snapshot creation. We will not be doing advanced quota recalculation as system resources change, and we will not yet be enforcing intelligent quota caps where the sum of all quotas must be less than the theoretical available system virtual resources.

The implementation of this functionality is shaped by RFD-427, but some desired functionality will be deferred given time/complexity constraints.
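As a rough illustration of the kind of check this enables, here is a hypothetical in-memory version; the actual implementation does this accounting in the database (the `virtual_provisioning_collection_update` query discussed earlier in this conversation), and all names and types below are assumptions:

```rust
// Hypothetical, simplified quota check; the real check runs inside a
// CockroachDB CTE so the read and the provisioning update are atomic.
struct ResourceCounts {
    cpus: i64,
    memory_bytes: i64,
    storage_bytes: i64,
}

fn within_quota(
    provisioned: &ResourceCounts, // what the silo has already provisioned
    requested: &ResourceCounts,   // what this instance/disk/snapshot needs
    quota: &ResourceCounts,       // the silo's quota
) -> bool {
    provisioned.cpus + requested.cpus <= quota.cpus
        && provisioned.memory_bytes + requested.memory_bytes <= quota.memory_bytes
        && provisioned.storage_bytes + requested.storage_bytes <= quota.storage_bytes
}
```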
Longer term I believe the shape of quotas and perhaps even their relationship to silos may change. This PR implements a simplified version that matches closely to how the virtual resource provisioning tables are already built out. I know there's some oddness around the shape of the quotas table with it not having its own ID and otherwise being mildly divergent from other resources, but this was largely to ensure we could migrate to another solution and not overcomplicate the initial implementation.
TODO