
Encode resource limits into schema #3864

Closed · zephraph wants to merge 17 commits from limits-in-schema

Conversation

@zephraph (Contributor) commented Aug 9, 2023

This PR aims to encode resource limits (max memory, max vCPU count, etc.) explicitly into the schema so that clients can auto-generate validation code for those checks.

As a consequence of taking on this work, I had to shuffle around where we define these constant limits. Schema definitions for things like vCPU count actually live in common, and most of the limits were in app. app depends on common but not vice versa, so it made sense to move these values over to common. An alternative would be to create a new top-level crate called omicron-limits or something similar. I chatted with @smklein about these options and we both think the addition to common is a lighter touch.

Comment on lines 9634 to 9638
"description": "The amount of memory to allocate to the instance, in bytes.\n\nMust be between 1 and 256 GiB.",
"type": "integer",
"format": "uint64",
"minimum": 1073741824,
"maximum": 274877906944
@zephraph (Contributor, Author):

This worked out nicely! The numbers are a bit hard to read, but I think the calculated description helps in that regard.
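
As a sketch of what "calculated" could mean here (the helper below is hypothetical, not the PR's code), the description string can be derived from the same constants that feed minimum and maximum, so the prose and the limits can't drift apart:

```rust
// Hypothetical constants; the PR defines MIN as u32, but u64 is used here
// for simplicity.
const MIN_MEMORY_BYTES_PER_INSTANCE: u64 = 1 << 30; // 1 GiB
const MAX_MEMORY_BYTES_PER_INSTANCE: u64 = 256 * (1 << 30); // 256 GiB

// Render the byte limits as GiB so the generated description stays readable.
fn memory_description() -> String {
    let gib = |bytes: u64| bytes >> 30;
    format!(
        "The amount of memory to allocate to the instance, in bytes.\n\n\
         Must be between {} and {} GiB.",
        gib(MIN_MEMORY_BYTES_PER_INSTANCE),
        gib(MAX_MEMORY_BYTES_PER_INSTANCE),
    )
}
```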

@zephraph zephraph marked this pull request as ready for review August 11, 2023 18:36
Review thread on nexus-client/src/lib.rs (outdated; resolved)
@zephraph zephraph linked an issue Aug 16, 2023 that may be closed by this pull request
@zephraph zephraph requested review from smklein and ahl January 23, 2024 15:50
Comment on lines +12090 to +12091
"minimum": 1073741824,
"maximum": 1098437885952
Collaborator:

Cool to see this in the API, I hope this can make our clients better!

@zephraph (Contributor, Author):

So, this actually isn't right. It's not the min/max that's wrong, it's the schema. We still want ByteCount to be preserved, but we also want the limits. It took me a little digging to figure out what to do here.

According to the spec, two schemas can be composed into a single object via allOf. So what we actually want here is an allOf combining the ByteCount reference with these limit definitions. My latest commit has something that should produce that, but typify pukes on it here:

https://github.com/oxidecomputer/typify/blob/1f97f167923f001818d461b1286f8a5242abf8b1/typify-impl/src/merge.rs#L690

I plan to look at this more tomorrow.
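
For reference, a minimal sketch (assuming serde_json; not the PR's actual code) of the composed schema shape described above, with the ByteCount reference preserved and the limits layered on via allOf:

```rust
use serde_json::json;

fn main() {
    // Compose the existing ByteCount model with the numeric limits via
    // allOf, rather than replacing the $ref with a bare integer schema.
    let memory_schema = json!({
        "description": "The amount of memory to allocate to the instance, in bytes.\n\nMust be between 1 and 256 GiB.",
        "allOf": [
            { "$ref": "#/components/schemas/ByteCount" },
            {
                "type": "integer",
                "format": "uint64",
                "minimum": 1_073_741_824u64,   // 1 GiB
                "maximum": 274_877_906_944u64  // 256 GiB
            }
        ]
    });
    println!("{}", serde_json::to_string_pretty(&memory_schema).unwrap());
}
```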

Contributor:

I mean... the unimplemented! isn't wrong...

Review thread on nexus-client/src/lib.rs (outdated; resolved)
Comment on lines +9 to +10
```rust
pub const MIN_MEMORY_BYTES_PER_INSTANCE: u32 = 1 << 30; // 1 GiB
pub const MAX_MEMORY_BYTES_PER_INSTANCE: u64 = 256 * (1 << 30); // 256 GiB
```
Collaborator:

Nitpick: For all these "_BYTES" constants, could we use u64?

@zephraph (Contributor, Author):

Actually, could we hold off on that until another PR? It has some fairly wide-ranging repercussions. There are a lot of conversions we'd have to move over to try_from, and there are places where we add two u32 values together, which would require both to be converted. We can do it; it's just going to touch a lot of code, and it'd be nice not to have all of that in this PR.
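
To illustrate the kind of churn being described (all names below are hypothetical, not from this PR): once a constant widens to u64, mixed-width arithmetic needs both operands converted, and any site that still needs a u32 becomes fallible:

```rust
use std::num::TryFromIntError;

// Hypothetical constant, widened from u32 to u64 as the nitpick suggests.
const MIN_MEMORY_BYTES_PER_INSTANCE: u64 = 1 << 30; // 1 GiB

// u32 + u32 arithmetic that previously compared against a u32 constant now
// has to widen both operands before the comparison.
fn meets_minimum(base_bytes: u32, extra_bytes: u32) -> bool {
    u64::from(base_bytes) + u64::from(extra_bytes) >= MIN_MEMORY_BYTES_PER_INSTANCE
}

// Anywhere that still needs the value as a u32 has to move to try_from and
// handle the (here statically impossible, but unprovable) overflow case.
fn minimum_as_u32() -> Result<u32, TryFromIntError> {
    u32::try_from(MIN_MEMORY_BYTES_PER_INSTANCE)
}
```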

Collaborator:

Sure, we can punt

@ahl (Contributor), Jan 25, 2024:

@smklein you and your love of u64s....

Collaborator:

I propose a simple plan: charge $0.005/hour (USD), effective February 1st, for all u32 usage until we migrate to a fully u64 world.

@ahl (Contributor) commented Jan 25, 2024

> This PR aims to encode resource limits (max memory, max vCPU count, etc.) explicitly into the schema so that clients can auto-generate validation code for those checks.

Is that valuable? Why is that valuable? How do you envision the user experience changing or improving, and for what kinds of (erroneous) operations? It seems that there are other situations where we might kick back a value as being too large, e.g., if the number of CPUs indicated is larger than what remains in one's allocation.

In other words, there is often additional validation that is done beyond what's expressed in the schema.

In addition, one might imagine limits that vary for different users. For example, customers might have policy limits that say "you can't have a VM larger than 4 vCPUs (i.e., because I'm cruel)" that I could imagine us supporting.

To be clear: I'm ambivalent about this change, and can see the clarity this provides in the SDKs. I also see the potential additional complexity and future stumbling blocks.

@zephraph (Contributor, Author):

> Is that valuable? Why is that valuable? How do you envision the user experience changing or improving, and for what kinds of (erroneous) operations? It seems that there are other situations where we might kick back a value as being too large, e.g., if the number of CPUs indicated is larger than what remains in one's allocation.

We hard-code upper and lower limits for resources in several places, and we should clearly communicate that those limits exist. Having a hard upper boundary doesn't mean that a submission in the acceptable range will always succeed, and I don't believe making these constraints explicit implies that it will. There are, of course, arbitrary computed constraints that shouldn't be in the schema.

I understand the relative value of this is low compared to other ongoing work, but I think sweating the details is what makes our product excellent. The goal here is to provide clarity at what I hope is only a marginal complexity cost. This PR can't be merged until there's a PR on typify that lets the schema be correct here, so I'll move it back to draft for the time being.

@zephraph zephraph marked this pull request as draft January 29, 2024 15:56
@zephraph (Contributor, Author):

I think, eventually, it would be good for someone to tackle something along similar lines. If we have hard-coded limits, showing them in the schema does seem to make sense. Also, wrangling all the consts into a semi-shared place seems like it would be an improvement.

@zephraph zephraph closed this Feb 23, 2024
@zephraph zephraph deleted the limits-in-schema branch February 23, 2024 03:38
Linked issue (may be closed by this PR): Reflect instance resource limits in schema

4 participants