-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[nexus] Populate IP pool, nexus service information, during rack setup #2358
Conversation
// <pool_restriction> AND | ||
// time_deleted IS NULL | ||
// SELECT * FROM ( | ||
// SELECT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're curious about how this works, there's an example fiddle which shows the logic:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking my understanding -- we're adding another nested subquery, so that we can filter the entire list of candidates by equality on the IP address. Is that accurate?
If so, I really wonder if we should restructure this. The main issue I have is that we're calling generate_series(0, last_address - first_address)
, and then filtering that. That range can be very big, and we're going to by definition select zero or one of them if the explicit IP is provided. As a example, if you change the IPs in your linked query to 10.0.0.1
and 10.0.255.255
, it runs in 25ms. If you change the last address to 10.10.255.255
, then it's 250ms. This is part of a larger issue with the performance of this query, but I think it'd make sense to improve it where we can.
This gets annoyingly more complicated, as any query with some conditional evaluation. But we could feasibly do something like this. That uses a case expression to select exactly the IP address we're asking for, if it's within the IP range, or NULL if it's not. I've not looked at how to embed that in the rest of the query, but there are similar queries in this file. I think the case where the user requested an IP that's not in the range would be handled in the same way as the error when the range is completely used. That could be good or bad, and we may need to have a different ELSE
clause in that conditional evaluation to distinguish between them. I'm not sure we care.
Anyway, that's a lot of work, so I'll let you decide if you want to deal with it now. If not, let's at least add a note to this issue that this part of the query could be improved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I admittedly struggled a bit with this query, because:
- On the one hand, I really want to have a "unified allocation flow" for IP allocation. I don't want to, for example, forget to update the collection rcgen for service IPs vs instance IPs...
- ... but on the other hand, yeah, scanning through addresses when we know we want to allocate one is kinda non-optimal.
I went ahead and used this optimized form in 1637370 , which more-or-less matches the fiddle you provided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work, lots of effort here! I have some questions about the details of the updated query, and a few nits, but overall looks excellent.
nexus/src/db/queries/external_ip.rs
Outdated
@@ -392,23 +393,34 @@ impl NextExternalIp { | |||
// each IP Pool, along with the pool/range IDs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This docstring isn't accurate anymore. We're selecting zero or one IP address (and other candidate information) or the entire sequence, depending on whether explicit_ip
is provided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated in bc87a47
// <pool_restriction> AND | ||
// time_deleted IS NULL | ||
// SELECT * FROM ( | ||
// SELECT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking my understanding -- we're adding another nested subquery, so that we can filter the entire list of candidates by equality on the IP address. Is that accurate?
If so, I really wonder if we should restructure this. The main issue I have is that we're calling generate_series(0, last_address - first_address)
, and then filtering that. That range can be very big, and we're going to by definition select zero or one of them if the explicit IP is provided. As a example, if you change the IPs in your linked query to 10.0.0.1
and 10.0.255.255
, it runs in 25ms. If you change the last address to 10.10.255.255
, then it's 250ms. This is part of a larger issue with the performance of this query, but I think it'd make sense to improve it where we can.
This gets annoyingly more complicated, as any query with some conditional evaluation. But we could feasibly do something like this. That uses a case expression to select exactly the IP address we're asking for, if it's within the IP range, or NULL if it's not. I've not looked at how to embed that in the rest of the query, but there are similar queries in this file. I think the case where the user requested an IP that's not in the range would be handled in the same way as the error when the range is completely used. That could be good or bad, and we may need to have a different ELSE
clause in that conditional evaluation to distinguish between them. I'm not sure we care.
Anyway, that's a lot of work, so I'll let you decide if you want to deal with it now. If not, let's at least add a note to this issue that this part of the query could be improved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This generally looks good to me. @bnaecker already did a thorough review and I'll let him take the lead here.
Thanks for all the rework @smklein, this all looks good to me! Feel free to merge at your discretion. |
Summary
My long-term goal is to have Nexus be in charge of provisioning all services.
For that to be possible, Nexus must be able to internalize all input during the handoff from RSS. This PR extends the RSS -> Nexus handoff to include:
Nexus Changes
Database Records
nexus_service
record, which just includes the information about the in-use external IP address.IP Address Allocation
explicit_ip
option, which lets callers perform an allocation with an explicit request for a single IP address. You might ask the question: "Why not just directly create a record with the IP address in question, if you want to create it?" We could! But we'd need to recreate all the logic which validates that the IP address exists within the known-to-the-DB IP ranges within the pool.nexus/src/db/queries/external_ip.rs
Rack Initialization
nexus/src/db/datastore/rack.rs
.Populate
omicron_nexus::db::datastore::datastore_test
.Sled Agent changes
Fixes: #1958
Unblocks: #732