-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
roachprod: rewrite roachprod start
#51790
roachprod: rewrite roachprod start
#51790
Conversation
I've tested this with a few versions manually, but I also did that before and things still slipped through the crack. How do I make sure this is not breaking things any further? |
Feel free to reject this PR, I'm more than happy to send out a more targeted patch and leave things as is if this is too controversial. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to reject this PR, I'm more than happy to send out a more targeted patch and leave things as is if this is too controversial.
I'm not going to have the time to properly review this patch this week. So if there is something that needs to be fixed soon, a more targeted patch would be useful. That said, start
has evolved over time and I can see the benefit in doing a clean slate rewrite.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz and @petermattis)
44ff1c6
to
72c9b0f
Compare
72c9b0f
to
5d14709
Compare
I'm good with reviewing this (even the complex form), just today I'm a bit swamped. Will get to this soon. |
This is pretty tough to review. Is there really not a more targetted patch that we could pull out that addresses the issue without rewriting everything at the same time? Or can we split the rewrite into a separate commit so that we aren't "sneaking in the fix for #51497"? I have a hard time even finding where that logic change is in this. |
Hm, ok, let me re-work this. I'll break this up so it's clearer where the code movement is happening. |
This will come in handy when understanding future commits. Release note: None
The ServerNodes API is a bit confusing to work with (see earlier commit improving it's documentation), and given it's iterable an index `n` into the returned list is either identifying the n-th node in the list, or node n within the cluster. To this end, using the name `nodeIdx` to refer to the former and `node` to refer to the latter, should hopefully clarify which one it is that we're referring to. Also, these loop bodies are pretty large, we shouldn't be using `i` so liberally anyway (especially when it's not a simple loop iteration index). Release note: None
5d14709
to
64bb390
Compare
PTAL, I'm keeping this PR focused on just the re-write, and outlining in comments what the current problems are without actually addressing them. I'll get to those after. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for splitting this up! It made it much easier to review.
Reviewed 1 of 2 files at r1, 1 of 1 files at r2, 1 of 1 files at r3, 1 of 1 files at r4, 1 of 1 files at r5, 1 of 1 files at r6, 1 of 1 files at r7, 1 of 1 files at r8, 1 of 1 files at r9, 1 of 1 files at r10, 1 of 1 files at r11.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @irfansharif, @jlinder, and @knz)
pkg/cmd/roachprod/install/cockroach.go, line 520 at r4 (raw file):
for i, vpc := range h.c.VPCs { if i > 0 && vpc != h.c.VPCs[0] { advertisePublicIP = true
return true
and return false
below?
And add a few helper methods to start thinning down `roachprod start` code (this commit pulls out `useStartSingleNode`, `shouldAdvertisePublicIP`, `getEnvVars`, `distributeCerts`). Release note: None
The code here is a bit duplicated with the generation of `encryptArgs`, but in future commits we'll be separating out the generation of args provided to `cockroach start`, and this, though related, does not apply. Release note: None
These helpers focus on generating the raw commands that are used when initializing a cluster, and setting cluster settings following. Release note: None
These helpers actually initialize the cluster, and set cluster settings following. Release note: None
This helper generates arguments provided to `cockroach start`. We improve comments while here. Release note: None
This helper, in the same style as `generateClusterSettingCmd` and `generateInitCmd`, generates the raw command used to start a given node. Release note: None
This helper is responsible for actually starting a given cockroach node. Release note: None
This will be addressed in follow-on PRs. Release note: None
64bb390
to
e769049
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the quick turn around! Going to just blaze ahead to catch nightly failures tonight (hopefully this adds nothing more than already present).
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jlinder, @knz, and @nvanbenschoten)
This PR was included in a batch that was canceled, it will be automatically retried |
bors r+ |
Already running a review |
Build succeeded: |
..and the setting of cluster settings for single node clusters. `roachprod start --sequential` was broken in cockroachdb#51329, and the broken-ness outlined in TODOs in cockroachdb#51790. This PR just addresses those TODOs. Fixes cockroachdb#51497 Fixes cockroachdb#51721 Fixes cockroachdb#51738 Fixes cockroachdb#51768 Fixes cockroachdb#51769 Fixes cockroachdb#51776 Release note: None
..and the setting of cluster settings for single node clusters. `roachprod start --sequential` was broken in cockroachdb#51329, and the broken-ness outlined in TODOs in cockroachdb#51790. This PR just addresses those TODOs. Fixes cockroachdb#51497 Fixes cockroachdb#51721 Fixes cockroachdb#51738 Fixes cockroachdb#51768 Fixes cockroachdb#51769 Fixes cockroachdb#51776 Release note: None
51893: roachprod: fixup `roachprod --sequential` r=irfansharif a=irfansharif ..and the setting of cluster settings for single node clusters. `roachprod start --sequential` was broken in #51329, and the broken-ness outlined in TODOs in #51790. This PR just addresses those TODOs. Fixes #51497 Fixes #51721 Fixes #51738 Fixes #51768 Fixes #51769 Fixes #51776 Release note: None Co-authored-by: irfan sharif <[email protected]>
This is quite the workhorse, and does a lot and has to be compatible
with a lot of existing CRDB versions. It's grown organically as a result
and I'm finding it a bit difficult to maintain, breaking it down a bit
makes it clearer what the structure of it all is and would've perhaps
prevented me introducing bugs like #51497.
Do scrutinize the PR closely, we use
roachprod start
everywhere. It'smostly mindless code movement but it's pretty fragile code.
Release note: None