Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

roachprod: fix update & get latest #124592

Merged
merged 1 commit into from
May 29, 2024
Merged

Conversation

srosenberg
Copy link
Member

@srosenberg srosenberg commented May 23, 2024

Previously, roachprod binary was fetched directly from
the artifacts of a corresponding TeamCity build.
After moving it to Github actions with Engflow, the
artifact is no longer mapped to a unique URL, thus
making it cumbersome to fetch via an API.

This change updates Roachtest Nightly (in TC for
the foreseeable future) to upload the roachprod binary
into GCS, where we already store other artifacts, e.g.,
(roachperf) stats.json. Consequently, we update
scripts/roachprod-get-latest.sh and roachprod update
to fetch the binary from the new location. Finally, we
add additional sanity checks around "staleness", including
a warning in roachprod (on startup), if the binary is older
than two weeks.

Note that this PR makes only linux/amd64 roachprod binaries
available. For mac and arm64 builds, we'd have to configure
CI to upload the corresponding binaries.

Informs: #120750
Epic: CRDB-8035
Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@srosenberg srosenberg force-pushed the sr/120750 branch 7 times, most recently from 7e1deb2 to f766f01 Compare May 28, 2024 18:55
@srosenberg srosenberg marked this pull request as ready for review May 28, 2024 18:55
@srosenberg srosenberg requested review from a team as code owners May 28, 2024 18:55
@srosenberg srosenberg requested review from DarrylWong, renatolabs, rickystewart and vidit-bhat and removed request for a team May 28, 2024 18:55
@srosenberg
Copy link
Member Author

Emulated "staleness" check which emits this warning in roachprod,

roachprod
WARN: roachprod binary is >= 2 weeks old (569h7m14.325s); latest sha: "abcdef"
WARN: Consider updating the binary: `roachprod update`

@srosenberg srosenberg requested a review from herkolategan May 28, 2024 18:57
@@ -2,7 +2,7 @@

set -euo pipefail

source "$(dirname $0)/roachtest_arch_util.sh"
source $root/build/teamcity/util/roachtest_arch_util.sh
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we depend on the variable $root, this won't work unless the person running the script has already executed teamcity-support.sh in this shell. (And then e.g. ran this script with . roachtest_compile_bits.sh) Maybe true in all situations where we use this script, but worth pointing out.

Copy link
Member Author

@srosenberg srosenberg May 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. It'd be nice to always export root or something similar. I find the other pattern (i.e., dirname) to be error-prone, especially when refactoring.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add a comment like ("must run build/teamcity-support.sh before executing this script") at the top here just to make the dependency clear, then.

--slack-token="${SLACK_TOKEN}" \
--suite nightly \
"${TESTS}"
#build/teamcity-roachtest-invoke.sh \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mistake?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch! Forgot to uncomment after streamlining the process of testing in CI :)

arch=$(get_host_arch)
os=linux
sha=${BUILD_VCS_NUMBER}
gsutil cp bin/roachprod gs://$bucket/binaries/$branch/$os/$arch/roachprod.$sha
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional: Would it be better if we instead just had one file $os/$arch/roachprod.$branch.latest? Or, even just one latest binary for master (per OS/arch combo)?

We only need the latest one, right? Realistically, people are not going to be downloading roachprod for older SHA's. This would also obviate the need for garbage collection on the bucket.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or, even just one latest binary for master (per OS/arch combo)?

That was my original thought too. However, I opted for keeping nightly builds in case of regressions.

Realistically, people are not going to be downloading roachprod for older SHA's.

In most cases, that's true. When we do encounter a regression, it would be nice to tell folks, grab the binary with this SHA or wait until a fix is merged. The latter may take time, so it's nice to have a quick "workaround".

Copy link
Member Author

@srosenberg srosenberg May 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a retention policy for binaries,

retention_policy

DEFAULT_BRANCH="master"
DEFAULT_OS=$(uname | tr '[:upper:]' '[:lower:]')
DEFAULT_ARCH=$(arch | tr '[:upper:]' '[:lower:]')
BUCKET="cockroach-nightly"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any garbage collection set up, or is the idea that we would be holding on to all of these roachprods indefinitely? Each is ~100MB. Storage costs are going to start adding up quickly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could configure a retainment policy on the bucket... but considering this is one binary per release, per night, then our annual cost is ~$28, using crude arithmetic (4* 100 * 365) / 1024) * 0.2. So, it's not that expensive :)

set -x

# Uploads roachprod and roachtest binaries to GCS.
function upload_binaries {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we only run the roachtest nightlies on Linux machines, wouldn't this mean we never publish Mac binaries?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. I don't think it's a deal breaker for this PR, but we may need to find/patch a job that already builds for mac, maybe "local roachtest"?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Local roachtests don't run in TC on master any more since the switch to GitHub Actions. We could add a new build configuration specifically for this.

Until this is resolved, let's remove "Fixes #120750" from the commit/PR description.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the commit msg. I'll do a quick poll to see how many would prefer that mac and arm64 is also available.

@rickystewart
Copy link
Collaborator

Separately, lots of lint failures. Consider using lintonbuild so you don't have to go back-and-forth with CI :)

@srosenberg
Copy link
Member Author

Consider using lintonbuild so you don't have to go back-and-forth with CI :)

I did for a while actually, until it got too slow :) Especially, when you're doing multiple builds (on different branches) back-to-back.

@srosenberg srosenberg force-pushed the sr/120750 branch 5 times, most recently from 5d634f0 to e18f1f4 Compare May 28, 2024 22:01
Previously, roachprod binary was fetched directly from
the artifacts of a corresponding TeamCity build.
After moving it to Github actions with Engflow, the
artifact is no longer mapped to a unique URL, thus
making it cumbersome to fetch via an API.

This change updates Roachtest Nightly (in TC for
the foreseeable future) to upload the roachprod binary
into GCS, where we already store other artifacts, e.g.,
(roachperf) stats.json. Consequently, we update
`scripts/roachprod-get-latest.sh` and `roachprod update`
to fetch the binary from the new location. Finally, we
add additional sanity checks around "staleness", including
a warning in roachprod (on startup), if the binary is older
than two weeks.

Note that this PR makes only linux/amd64 roachprod binaries
available. For mac and arm64 builds, we'd have to configure
CI to upload the corresponding binaries.

Informs: cockroachdb#120750
Epic: CRDB-8035
Release note: None
@srosenberg
Copy link
Member Author

Everything is now green and ready to go. PTAL!

@craig craig bot merged commit dc04a90 into cockroachdb:master May 29, 2024
22 checks passed
Copy link

blathers-crl bot commented May 29, 2024

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from 780b298 to blathers/backport-release-23.2-124592: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 23.2.x failed. See errors above.


error creating merge commit from 780b298 to blathers/backport-release-24.1-124592: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 24.1.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

herkolategan added a commit to herkolategan/cockroach that referenced this pull request Jun 1, 2024
The `roachtest_compile_bits.sh` now require `$root` to be defined, see cockroachdb#124592
for details. This change adds the required exports to the
`roachtest_weekly_impl.sh` script.

Release Note: None
Epic: None
craig bot pushed a commit that referenced this pull request Jun 1, 2024
124957: roachtest: export $root for roachtest weekly r=srosenberg a=herkolategan

The `roachtest_compile_bits.sh` now require `$root` to be defined, see #124592 for details. This change adds the required exports to the `roachtest_weekly_impl.sh` script.

Release Note: None
Epic: None

Co-authored-by: Herko Lategan <[email protected]>
herkolategan added a commit to herkolategan/cockroach that referenced this pull request Jun 7, 2024
The `roachtest_compile_bits.sh` now require `$root` to be defined, see cockroachdb#124592
for details. This change adds the required exports to the
`roachtest_weekly_impl.sh` script.

Release Note: None
Epic: None
herkolategan added a commit to herkolategan/cockroach that referenced this pull request Jun 7, 2024
The `roachtest_compile_bits.sh` now require `$root` to be defined, see cockroachdb#124592
for details. This change adds the required exports to the
`roachtest_weekly_impl.sh` script.

Release Note: None
Epic: None
Dhruv-Sachdev1313 pushed a commit to Dhruv-Sachdev1313/cockroach that referenced this pull request Jun 7, 2024
The `roachtest_compile_bits.sh` now require `$root` to be defined, see cockroachdb#124592
for details. This change adds the required exports to the
`roachtest_weekly_impl.sh` script.

Release Note: None
Epic: None
@renatolabs
Copy link
Contributor

We can close #120750 now, yes? Or was there a reason this PR was only marked as Informs?

@rickystewart
Copy link
Collaborator

@renatolabs Check out the discussion here.

srosenberg added a commit to srosenberg/cockroach that referenced this pull request Nov 21, 2024
It appears a change to TC wrapper scripts in [1]
may have caused the `root` var. to become unbound.
This PR adds the missing `source` statement.

[1] cockroachdb#124592

Epic: none

Release note: None
srosenberg added a commit to srosenberg/cockroach that referenced this pull request Nov 21, 2024
It appears a change to TC wrapper scripts in [1]
may have caused the `root` var. to become unbound.
This PR adds the missing `source` statement.

[1] cockroachdb#124592

Epic: none

Release note: None
craig bot pushed a commit that referenced this pull request Nov 21, 2024
135913: ci: fix private roachtest nightly r=darrylwong a=srosenberg

It appears a change to TC wrapper scripts in [1]
may have caused the `root` var. to become unbound. This PR adds the missing `source` statement.

[1] #124592

Epic: none

Release note: None

135921: kvcoord: Disable follower reads for TestProxyTracing r=iskettaneh a=iskettaneh

This commit disables follower reads for the test TestProxyTracing. We noticed that sometimes, the test gets slow, and by the time we issue a read request, it's served via follower reads instead of proxying it to the leaseholder.

Fixes: #135493

Release note: None

Co-authored-by: Stan Rosenberg <[email protected]>
Co-authored-by: Ibrahim Kettaneh <[email protected]>
blathers-crl bot pushed a commit that referenced this pull request Nov 21, 2024
It appears a change to TC wrapper scripts in [1]
may have caused the `root` var. to become unbound.
This PR adds the missing `source` statement.

[1] #124592

Epic: none

Release note: None
blathers-crl bot pushed a commit that referenced this pull request Nov 21, 2024
It appears a change to TC wrapper scripts in [1]
may have caused the `root` var. to become unbound.
This PR adds the missing `source` statement.

[1] #124592

Epic: none

Release note: None
srosenberg added a commit to srosenberg/cockroach that referenced this pull request Nov 22, 2024
It appears a change to TC wrapper scripts in [1]
may have caused the `root` var. to become unbound.
This PR adds the missing `source` statement.

[1] cockroachdb#124592

Epic: none

Release note: None
herkolategan added a commit to herkolategan/cockroach that referenced this pull request Nov 27, 2024
Previously, we included the `--crdb_test` build flag for `roachprod`. This
binary is only built to be used as part of the `roachprod update` logic. Adding
the flag causes roachprod to produce a log of metamorphic vars which is not
expected behaviour when using it as a CLI tool.

Informs: cockroachdb#124592

Epic: None
Release note: None
herkolategan added a commit to herkolategan/cockroach that referenced this pull request Nov 27, 2024
Previously, we included the `--crdb_test` build flag for `roachprod`. This
binary is only built to be used as part of the `roachprod update` logic. Adding
the flag causes roachprod to produce a log of metamorphic vars which is not
expected behaviour when using it as a CLI tool.

Informs: cockroachdb#124592

Epic: None
Release note: None
herkolategan added a commit to herkolategan/cockroach that referenced this pull request Nov 28, 2024
Previously, we included the `--crdb_test` build flag for `roachprod`. This
binary is only built to be used as part of the `roachprod update` logic. Adding
the flag causes roachprod to produce a log of metamorphic vars which is not
expected behaviour when using it as a CLI tool.

Informs: cockroachdb#124592

Epic: None
Release note: None
craig bot pushed a commit that referenced this pull request Nov 28, 2024
136229: kvserver: bring TestStoreRangeLease to a leader lease world r=arulajmani a=arulajmani

This patch renames  what was previously TestStoreRangeLease by adding a EpochLease suffix to it. Then, we rewrite TestStoreRangeLease to use leader leases instead.

References #133763

Release note: None

136269: kvserver: always use epoch based leases in TestRangeQuiescence r=arulajmani a=arulajmani

Only epoch based leases can be quiesced. Explicitly use these for TestRangeQuiescence.

References #133763

Release note: None

136281: roachprod: remove cdrb_test for nightly r=srosenberg a=herkolategan

Previously, we included the `--crdb_test` build flag for `roachprod`. This binary is only built to be used as part of the `roachprod update` logic. Adding the flag causes roachprod to produce a log of metamorphic vars which is not expected behaviour when using it as a CLI tool.

Informs: #124592

Epic: None
Release note: None

Co-authored-by: Arul Ajmani <[email protected]>
Co-authored-by: Herko Lategan <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-23.2.x Flags PRs that need to be backported to 23.2. backport-24.1.x Flags PRs that need to be backported to 24.1.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants