Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[nexus] add sled provision state #4520

Merged
merged 10 commits into from
Nov 29, 2023

Conversation

sunshowers
Copy link
Contributor

@sunshowers sunshowers commented Nov 20, 2023

Add the notion of a sled provision state to Nexus. Currently, we will only use this to prevent new resources and regions from being provisioned to sleds.

This PR includes:

  1. Database updates and schema migrations.
  2. Database APIs in nexus-db-queries.
  3. An HTTP API.
  4. Tests for resource and region allocation.

Questions:

  • Verify transactionality -- I think this is good enough as is but we do look up the old sled value and update it in two separate txns.
  • I haven't touched this before, so I might have made some newbie mistakes! I feel like I got a decent sense.
TODO list

TODO:

Fixes (maybe partially) #2483.

Created using spr 1.3.5
@sunshowers sunshowers marked this pull request as draft November 20, 2023 01:58
@sunshowers sunshowers changed the title [nexus] add tracking for provision state [wip] [nexus] add tracking for provision state Nov 20, 2023
Created using spr 1.3.5
Created using spr 1.3.5
Created using spr 1.3.5
@sunshowers sunshowers changed the title [wip] [nexus] add tracking for provision state [nexus] add tracking for provision state Nov 23, 2023
@sunshowers sunshowers changed the title [nexus] add tracking for provision state [nexus] add sled provision state Nov 23, 2023
@sunshowers sunshowers marked this pull request as ready for review November 23, 2023 01:34
Copy link
Collaborator

@smklein smklein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! I have a smattering of comments, but my big two are:

  1. We gotta do a minor rename to make the schema change work
  2. The transactional semantics of altering the provisionable state need some tweaking to make the return value consistent -- but I have a small essay discussing some of our options there

schema/crdb/13.0.0/up5.sql Outdated Show resolved Hide resolved
nexus/db-queries/src/db/datastore/sled.rs Outdated Show resolved Hide resolved
nexus/db-queries/src/db/datastore/sled.rs Outdated Show resolved Hide resolved
nexus/tests/integration_tests/schema.rs Outdated Show resolved Hide resolved
nexus/types/src/external_api/params.rs Show resolved Hide resolved
Created using spr 1.3.5
Created using spr 1.3.5
Created using spr 1.3.5
schema/crdb/dbinit.sql Outdated Show resolved Hide resolved
Copy link
Collaborator

@smklein smklein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me -- thanks for the fixes!

Created using spr 1.3.5
Created using spr 1.3.5
Created using spr 1.3.5
@sunshowers sunshowers merged commit 67cd482 into main Nov 29, 2023
21 of 22 checks passed
@sunshowers sunshowers deleted the sunshowers/spr/nexus-add-tracking-for-provision-state branch November 29, 2023 07:57
Comment on lines +13178 to +13183
"description": "This is a state that isn't known yet.\n\nThis is defined to avoid API breakage.",
"type": "string",
"enum": [
"unknown"
]
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this how you intended this state to appear in the OpenAPI doc? I'm not clear on how this will prevent API breakage.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in #4608

sunshowers added a commit that referenced this pull request Dec 8, 2023
…4573)

While developing #4520, I observed that we were producing a number of
error messages that were:

* 503 Service Unavailable,
* With only an internal message attached
* But where the message is both safe and useful to display to clients.

This is my attempt to make the situation slightly better. To achieve
this, I made a few changes:

1. Make all the client errors carry a new `MessagePair` struct, which
consists of an external message. (Along the way, correct the definition
of e.g. the `Conflict` variant: it actually is an external message, not
an internal one.)
2. Define a new `InsufficientCapacity` variant that consists of both an
external and an internal message. This variant resolves to a 507
Insufficient Storage error, and has a more helpful message than just
"Service Unavailable".
3. Turn some current 503 errors into client errors so that the message
is available externally.

Looking for feedback on this approach!
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants