Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

roachprod: enable roachprod to be configured for external use #120340

Merged
merged 5 commits into from
May 31, 2024

Conversation

ajwerner
Copy link
Contributor

@ajwerner ajwerner commented Mar 12, 2024

After this change, roachprod fully works on GCP at my company with the following configuration:

ROACHPROD_DNS_REQUIRED_PROVIDERS=gce
[email protected]
ROACHPROD_GCE_DEFAULT_PROJECT=dem-ephemeral
ROACHPROD_GCE_DEFAULT_SERVICE_ACCOUNT=roachprod@dem-ephemeral.iam.gserviceaccount.com
ROACHPROD_GCE_DNS_DOMAIN=roachprod.exray.cloud
ROACHPROD_GCE_DNS_MANAGED_DOMAIN=roachprod-managed.dataexmachina.dev
ROACHPROD_GCE_DNS_MANAGED_ZONE=roachprod-managed
ROACHPROD_GCE_DNS_PROJECT=dem-ephemeral
ROACHPROD_GCE_DNS_ZONE=roachprod

All of these configuration options can also be set via flags.

See individual commits.

Epic: none
Release note: none

@ajwerner ajwerner requested a review from a team as a code owner March 12, 2024 14:53
@ajwerner ajwerner requested review from srosenberg and renatolabs and removed request for a team March 12, 2024 14:53
Copy link

blathers-crl bot commented Mar 12, 2024

Thank you for contributing to CockroachDB. Please ensure you have followed the guidelines for creating a PR.

Before a member of our team reviews your PR, I have some potential action items for you:

  • We notice you have more than one commit in your PR. We try break logical changes into separate commits, but commits such as "fix typo" or "address review commits" should be squashed into one commit and pushed with --force
  • When CI has completed, please ensure no errors have appeared.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@blathers-crl blathers-crl bot added the O-community Originated from the community label Mar 12, 2024
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@srosenberg srosenberg requested a review from DarrylWong March 12, 2024 16:23
@srosenberg srosenberg added backport-23.1.x Flags PRs that need to be backported to 23.1 backport-23.2.x Flags PRs that need to be backported to 23.2. labels Mar 12, 2024
@@ -131,7 +133,7 @@ func initFlags() {
providerOptsContainer[providerName].ConfigureCreateFlags(createCmd.Flags())

for _, cmd := range []*cobra.Command{
destroyCmd, extendCmd, listCmd, syncCmd, gcCmd,
destroyCmd, extendCmd, listCmd, syncCmd, gcCmd, setupSSHCmd, startCmd, pgurlCmd, adminurlCmd,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Methinks only setupSSHCmd calls Sync, which in turn may invoke SyncDNS. The others don't need parameterized DNS flags.

Copy link
Member

@srosenberg srosenberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for refactoring hardcoded configs and making roachprod useful externally! I played around with the PR locally; things appear to work as they should. Still, it wouldn't hurt to do a full run of the nightlies as a sanity check.

@renatolabs @herkolategan I'd appreciate another pair of eyes.

Copy link

blathers-crl bot commented May 22, 2024

Thank you for updating your pull request.

Before a member of our team reviews your PR, I have some potential action items for you:

  • We notice you have more than one commit in your PR. We try break logical changes into separate commits, but commits such as "fix typo" or "address review commits" should be squashed into one commit and pushed with --force
  • When CI has completed, please ensure no errors have appeared.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@andreimatei
Copy link
Contributor

I rebased the patches. A review would be greatly appreciated; this PR will help us do other good things for cockroach :)
In particular, I want to propose some roachtest patches and getting roachprod working locally in an easier way would be great.

@srosenberg
Copy link
Member

I rebased the patches. A review would be greatly appreciated; this PR will help us do other good things for cockroach :) In particular, I want to propose some roachtest patches and getting roachprod working locally in an easier way would be great.

Thanks and sorry for the delay! I am happy with the changes. The reason it probably stalled is we didn't do any smoke testing. Lemme do some local testing as well as kick off a nightly run with SELECT_PROBABILITY.

@srosenberg
Copy link
Member

Will check the smoke test, SELECT_PROBABILITY=0.6, when it completes.

@andreimatei
Copy link
Contributor

Ugh, I think there might be something broken, so please hold on on the testing. I think things are good when roachprod is used as a binary, but not when roachtest uses it as a library. I'm working on roachtest myself, so I noticed.
I'll ping back when I've figured it out. Sorry for the noise.

@srosenberg
Copy link
Member

Ugh, I think there might be something broken, so please hold on on the testing. I think things are good when roachprod is used as a binary, but not when roachtest uses it as a library. I'm working on roachtest myself, so I noticed. I'll ping back when I've figured it out. Sorry for the noise.

No worries! Yep, seeing a lot of these in CI,

ERROR: (gcloud.compute.project-info.describe) The project property is set to the empty string, which is invalid.

@andreimatei
Copy link
Contributor

OK, I believe I've fixed the issue. roachtests run fine for me now.
I've also had to pile on one more commit to optionally disable the integration with the Prometheus registration service. PTAL
Thanks!

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 5 of 6 files at r6, 2 of 3 files at r8, 5 of 5 files at r9, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner, @DarrylWong, @herkolategan, and @renatolabs)

@srosenberg
Copy link
Member

Smoke test: SELECT_PROBABILITY=0.6.

@andreimatei
Copy link
Contributor

I've pushed the fix to the problem I was describing in #124678 (comment).
If you kick off another smoke test, I'll verify in the runner logs that the cluster reuse is working. (a more limited run of only a few tests, for example roachtest run acceptance, should do if that's easy).
Otherwise, the previous run looked fine, right?
Thanks!

@srosenberg
Copy link
Member

Kicked off another run of SELECT_PROBABILITY=0.6.

@andreimatei
Copy link
Contributor

Cluster reuse looks good now -- you see Using existing cluster in the runner's logs.

@srosenberg
Copy link
Member

Smoke test results look good! Resolving merge conflict and merging. Thanks for making roachprod ready for external use!

ajwerner added 4 commits May 30, 2024 14:24
This commit introduces the email domain flag persistently and sets its
default from the environment using ROACHPROD_EMAIL_DOMAIN.

Epic: None

Release note: None
This allows both flag and environment variable configuration for a
variety of critical pieces of functionality.

Epic: None

Release note: None
These subcommands also benefit from this configuration.

Epic: None

Release note: None
Before this change, DNS would only sync if the AWS provider was active.
This is now configurable.

Epic: None

Release note: None
@srosenberg srosenberg force-pushed the roachprod-hacking branch from 00db517 to 439daf3 Compare May 30, 2024 19:05
@srosenberg
Copy link
Member

srosenberg commented May 30, 2024

Final smoke test SELECT_PROBABILITY=0.25, since there were non-trivial merge conflicts after [1] got merged.

[1] #124604

This patch makes an empty value for the COCKROACH_PROM_HOST_URL env var
disable the roachprod's cluster registration with the Prometheus
registration service. This is useful for people who don't work at
CockroachLabs and don't have access to this service.

Release note: None
Epic: None
@srosenberg srosenberg force-pushed the roachprod-hacking branch from 439daf3 to ff46295 Compare May 30, 2024 20:45
@srosenberg srosenberg added backport-24.1.x Flags PRs that need to be backported to 24.1. and removed backport-23.1.x Flags PRs that need to be backported to 23.1 labels May 30, 2024
@srosenberg
Copy link
Member

Final smoke test looks good, let's 🚢

bors r+

@craig craig bot merged commit f2e7709 into cockroachdb:master May 31, 2024
22 checks passed
Copy link

blathers-crl bot commented May 31, 2024

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from 48f0dc8 to blathers/backport-release-23.2-120340: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 23.2.x failed. See errors above.


error creating merge commit from 48f0dc8 to blathers/backport-release-24.1-120340: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 24.1.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-23.2.x Flags PRs that need to be backported to 23.2. backport-24.1.x Flags PRs that need to be backported to 24.1. O-community Originated from the community
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants