Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement global images #915

Merged
merged 21 commits into from
Apr 21, 2022
Merged

Conversation

jmpesp
Copy link
Contributor

@jmpesp jmpesp commented Apr 14, 2022

Fill in image creation code, adding a few fields to the image from RFD
4 (version and digest specifically). Also add block_size due to the
constraint that block_size must be present during any volume creation.

Store volume construction requests created from the parameters in
global_image_create. This commit does not implement global_image_delete
because that involves much work to figure out how to safely do this (are
any disks using it? are other volumes using this as a sub volume?).

Both "Image" and "GlobalImage" are required to separate authz policy, so
this commit separates the two. Project images will be implemented in
another PR.

Implement authz for global images. GlobalImagesList is a dummy authz
resource that was added to have a place to check list_children
permissions for.

Fill in image creation code, adding a few fields to the image from RFD
4 (version and digest specifically). Also add block_size due to the
constraint that block_size must be present during any volume creation.

Store volume construction requests created from the parameters in
global_image_create. This commit does not implement global_image_delete
because that involves much work to figure out how to safely do this (are
any disks using it? are other volumes using this as a sub volume?).

Both "Image" and "GlobalImage" are required to separate authz policy, so
this commit separates the two. Project images will be implemented in
another PR.

Implement authz for global images. GlobalImagesList is a dummy authz
resource that was added to have a place to check list_children
permissions for.
@jmpesp jmpesp requested review from davepacheco and smklein April 14, 2022 18:39
@jmpesp
Copy link
Contributor Author

jmpesp commented Apr 14, 2022

@davepacheco one unresolved part is that GlobalImage and GlobalImageList are both children of Fleet: https://github.com/oxidecomputer/omicron/pull/915/files#diff-f5f8df6872a0c2cc4cbc022eb186649c66c8aee85c51329b639736a82424d305R2776

I couldn't make Fleet -> GlobalImageList -> GlobalImage work, would appreciate a look

nexus/src/authz/api_resources.rs Outdated Show resolved Hide resolved
common/src/api/external/mod.rs Outdated Show resolved Hide resolved
nexus/src/db/datastore.rs Outdated Show resolved Hide resolved
common/src/sql/dbinit.sql Show resolved Hide resolved
nexus/src/authz/api_resources.rs Outdated Show resolved Hide resolved
}

params::ImageSource::Snapshot(_id) => {
unimplemented!()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about returning a 500 here instead of panicking? (You could even use unimplemented_todo, but it's not necessary.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

opted for 503 in 3b88391

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using 503 for this case makes it harder for us to identify operational issues (usually 503s) vs. bugs (usually 500s) in the field...but hopefully this will never ship so it won't matter!

nexus/src/sagas.rs Outdated Show resolved Hide resolved
nexus/src/sagas.rs Outdated Show resolved Hide resolved
.global_image_fetch(&opctx, image_id)
.await
.map_err(|e| {
ActionError::action_failed(Error::internal_error(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this and the later one an internal error instead of directly using e?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get

the trait `From<omicron_common::api::external::Error>` is not implemented for `ActionError`

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I meant .map_err(ActionError::action_failed)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done in ef02727

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. There's another one further down I think.

nexus/src/sagas.rs Outdated Show resolved Hide resolved
@davepacheco
Copy link
Collaborator

@davepacheco one unresolved part is that GlobalImage and GlobalImageList are both children of Fleet: https://github.com/oxidecomputer/omicron/pull/915/files#diff-f5f8df6872a0c2cc4cbc022eb186649c66c8aee85c51329b639736a82424d305R2776

I couldn't make Fleet -> GlobalImageList -> GlobalImage work, would appreciate a look

I did the same thing with ConsoleSession. I don't think I thought too hard about it. I can help dig in next week if you'd like to resolve it but I don't think it's a big deal either way.

@davepacheco
Copy link
Collaborator

Very nice! I had several suggestions above but none are critical.

Copy link
Collaborator

@smklein smklein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks awesome, only a couple minor comments.

common/src/sql/dbinit.sql Outdated Show resolved Hide resolved
nexus/src/external_api/views.rs Outdated Show resolved Hide resolved
nexus/src/sagas.rs Outdated Show resolved Hide resolved
nexus/src/sagas.rs Outdated Show resolved Hide resolved
nexus/tests/integration_tests/images.rs Show resolved Hide resolved
@@ -54,6 +55,24 @@ async fn test_unauthorized(cptestctx: &ControlPlaneTestContext) {
let client = &cptestctx.external_client;
let log = &cptestctx.logctx.log;

// Run a httptest server
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make this is more explicit, we do this for:

Clickhouse (bind to port zero):

let clickhouse = dev::clickhouse::ClickHouseInstance::new(0).await.unwrap();

Nexus (use ConfigDropshot's default, which is 127.0.0.1:0):

let server = omicron_nexus::Server::start(&config, rack_id, &logctx.log)
.await
.unwrap();

/// Dropshot configuration for external API server
pub dropshot_external: ConfigDropshot,
/// Dropshot configuration for internal API server
pub dropshot_internal: ConfigDropshot,

https://docs.rs/dropshot/0.6.0/src/dropshot/config.rs.html#54-61

Etc.

Dropshot's local_addr() method makes it easy to access the "real" port that was allocated after the fact:

server.http_server_internal.local_addr(),

Copy link
Collaborator

@davepacheco davepacheco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you decide not to do something similar with this mock that we do with CockroachDB and others?

@@ -442,11 +442,8 @@ CREATE TABLE omicron.public.image (
project_id UUID NOT NULL,
volume_id UUID NOT NULL,

/* Optional URL: Images may be backed by either a URL or a volume */
url STRING(8192),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you removed the "optional" comments here but the fields are still optional. Is that the intent or are they supposed to be NOT NULL?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are supposed to be optional

fleet: Fleet,
/// ConsoleSessionList is a synthetic resource used for modeling who has access
/// to create sessions.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like something about the way this was merged caused this stuff to show up as part of your change. That's weird.

Some(Box::new(serde_json::from_str(volume.data()).map_err(
|e| {
ActionError::action_failed(Error::internal_error(
&e.to_string(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe want a little more context in this message? ("failed to deserialize volume data"?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 added in 4752359

params::DiskSource::Blank { block_size } => {
db::model::BlockSize::try_from(*block_size).map_err(|e| {
ActionError::action_failed(Error::internal_error(
&e.to_string(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe want a little more context in this message? I'm not sure what should go here -- I'm just imagining we see this error in the log. We'll be able to tell it came from this saga action, at least.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we'll at least see invalid block size {} from the db::model::BlockSize try_from, though I'm not sure how we would ever hit this error because params::BlockSize try_from also validates that it's one of the (currently) three supported values?

@jmpesp jmpesp merged commit f9bc90e into oxidecomputer:main Apr 21, 2022
@jmpesp jmpesp deleted the create_some_images branch April 21, 2022 12:54
@smklein smklein mentioned this pull request Apr 24, 2022
5 tasks
leftwo pushed a commit that referenced this pull request Sep 11, 2023
Crucible:
update rust crate base64 to 0.21.3 (#913)
update rust crate slog-async to 2.8 (#915)
update rust crate async-recursion to 1.0.5 (#912)
Move active jobs into a separate data structure and optimize `ackable_work` (#908)
Check repair IDs correctly (#910)
update actions/checkout action to v4 (#903)
update rust crate tokio to 1.32 (#890)
Remove single-item Vecs (#898)
Move "extent under repair" into a helper function (#907)
offset_mod shouldn't be randomized (#905)
Only rehash if a write may have failed (#899)
Make negotiation state an enum (#901)
Test update for fast write ack and gather errors on test failure (#897)

Propolis:
Update cpuid-gen util for better coverage
Make storage backend config more flexible and consistent
Use correct register sizes for PIIX3 PM device
Update bitflags dependency
fix softnpu port order (#517)
Use hex formatting for unhandled MMIO/PIO/MSRs
Update deps for new crucible and oximeter
Update standalone-with-crucible docs (#514)
leftwo added a commit that referenced this pull request Sep 12, 2023
Crucible:
update rust crate base64 to 0.21.3 (#913)
update rust crate slog-async to 2.8 (#915)
update rust crate async-recursion to 1.0.5 (#912)
Move active jobs into a separate data structure and optimize
`ackable_work` (#908) Check repair IDs correctly (#910)
update actions/checkout action to v4 (#903)
update rust crate tokio to 1.32 (#890)
Remove single-item Vecs (#898)
Move "extent under repair" into a helper function (#907) offset_mod
shouldn't be randomized (#905)
Only rehash if a write may have failed (#899)
Make negotiation state an enum (#901)
Test update for fast write ack and gather errors on test failure (#897)

Propolis:
Update cpuid-gen util for better coverage
Make storage backend config more flexible and consistent Use correct
register sizes for PIIX3 PM device
Update bitflags dependency
fix softnpu port order (#517)
Use hex formatting for unhandled MMIO/PIO/MSRs
Update deps for new crucible and oximeter
Update standalone-with-crucible docs (#514)

Co-authored-by: Alan Hanson <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants