Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bazel: don't use different gotags for test and non-test scenarios #72838

Merged
merged 1 commit into from
Nov 18, 2021

Conversation

rickystewart
Copy link
Collaborator

@rickystewart rickystewart commented Nov 16, 2021

In a couple different places in-tree we use the build tags crdb_test
and crdb_test_off to support conditional compilation. Unfortunately,
this results in low cache utilization when swapping between builds and
tests, since recompiling with different gotags causes Bazel to
recompile EVERYTHING, including the entire standard library, with these
different tags. This means that you can't build cockroach and then run
a quick test using cached build artifacts, for example.

Fix this by adding a new configuration, //build/toolchains:crdb_test.
We use select() in pkg/util/buildutil/BUILD.bazel to choose between
crdb_test_on.go and crdb_test_off.go accordingly.

Also add logic to build/bazelutil/check.sh to detect new uses of these
build constraints and warn people that the logic also needs to be
Bazelfied.

Closes #71857.

Release note: None

@rickystewart rickystewart requested a review from rail November 16, 2021 21:39
@rickystewart rickystewart requested review from a team as code owners November 16, 2021 21:39
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! What will be the outcome caching-wise with this change? The files that (directly or indirectly) depend on the util package will be recompiled when we switch between builds and tests? It might be worthwhile to separate the code further into a tiny subpackage to reduce that surface.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rail and @rickystewart)


build/bazelutil/check.sh, line 103 at r1 (raw file):

git grep '//go:build' pkg | grep crdb_test | while read LINE; do
    if [[ "$EXISTING_CRDB_TEST_BUILD_CONSTRAINTS" == *"$LINE"* ]]; then
	# Grandfathered.

[nit] tabs

@rickystewart
Copy link
Collaborator Author

Cool! What will be the outcome caching-wise with this change? The files that (directly or indirectly) depend on the util package will be recompiled when we switch between builds and tests?

No, actually. The nice part about using x_defs for this is that nothing is recompiled, and instead it only requires the binary to be re-linked:

This means that changing a value results only in re-linking, not re-compilation and thus does not cause cascading changes.

[nit] tabs

Fixed.

@rickystewart
Copy link
Collaborator Author

Re-compilation would still be required for everything that depends on pkg/sql/opt/props or pkg/sql/opt/memo, though. But because the list of sources is conditional on the test configuration, we can't avoid that.

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah.. you are changing it from a const to a var..

The assumption has been that when you do if util.CrdbTestBuild { ... } then the code is statically elided altogether in non-test builds. It is meant to be a zero cost assertion.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rail and @rickystewart)

@rickystewart
Copy link
Collaborator Author

rickystewart commented Nov 16, 2021

The assumption has been that when you do if util.CrdbTestBuild { ... } then the code is statically elided altogether in non-test builds. It is meant to be a zero cost assertion.

Is that static elision guaranteed/validated anywhere? Or is it just an assumption that the compiler is smart enough to do this?

I can do conditional compilation for these files too, you'll just see much worse caching since now everything that depends on pkg/util will indeed have to be recompiled when swapping between build and test, as you hypothesized above. And my suspicion is that pkg/util is way too commonly used for that to be tolerable.

@jordanlewis
Copy link
Member

Is that static elision guaranteed/validated anywhere? Or is it just an assumption that the compiler is smart enough to do this?

It's not validated by a test, but the compiler does elide statically false things.

We could decide to move the CrdbTestBuild constant, conditionally compiled, into its own small package, like Radu suggests. That way, only packages that depend on it would need to be recompiled, rather than everyone who imports util.

@petermattis
Copy link
Collaborator

Is that static elision guaranteed/validated anywhere? Or is it just an assumption that the compiler is smart enough to do this?

Pretty much every reasonable compiler for the last 5-10 years has statically elided unreachable code. The Go compiler definitely does so.

We could decide to move the CrdbTestBuild constant, conditionally compiled, into its own small package, like Radu suggests. That way, only packages that depend on it would need to be recompiled, rather than everyone who imports util.

To make sure I understand: the bazel caching will cache both versions where the tag is set and where it is unset, right? So if you dev test; dev build; dev test, it isn't going to recompile again on the second dev test.

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rail and @rickystewart)


pkg/sql/opt/memo/BUILD.bazel, line 6 at r1 (raw file):

    name = "memo",
    srcs = [
        "check_expr.go",

The use of tags in these files predate the CrdbTestBuild constant. I think it'd be equivalent to keep just one file with no build tags and do if !util.CrdbTestBuild { return } first thing in CheckExpr() (and Verify() etc).

@rickystewart
Copy link
Collaborator Author

OK, I made the requested changes. Unfortunately this still thrashes the cache (when you swap between build and test, everything that depends on pkg/util/buildutil has to be recompiled, and Bazel can't cache between the two configurations). It just doesn't thrash the cache as badly as before, when EVERYTHING would have to be recompiled.

I still want to get this merged because it's an objective improvement, but there's still work to be done here. I think what we want to do is go with the original approach (where swapping between build and test only requires re-linking) for dev scenarios, but for release builds make sure CrdbTestBuild is set to a const. Should still be possible with Bazel, I'll just need to shuffle some stuff around. Would that address the concern?

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also just make all dev builds have it set to true, and only release builds set it to false. IMO that would be preferable actually.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rail and @rickystewart)

Copy link
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I made the requested changes. Unfortunately this still thrashes the cache (when you swap between build and test, everything that depends on pkg/util/buildutil has to be recompiled, and Bazel can't cache between the two configurations). It just doesn't thrash the cache as badly as before, when EVERYTHING would have to be recompiled.

@rickystewart how does bazel's cache work with regards to go build tags? With go build, if I build source using go build -tags foo and go build -tags bar, I'll end up recompiling everything, but the cache will be populated with both foo and bar variants. If I then do go build -tags foo nothing will be rebuilt. From experimentation it looks like bazel doesn't behave that way which is bad. It means switching between race (which is a build tag) and non-race builds will recompiling everything. Is there some way to teach bazel to put the build tag in the cache key such that it doesn't evict builds with a different cache key? I'm not sure I'm using the correct language here. I'm also quite surprised if this hasn't affected other Go users of bazel.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rail and @rickystewart)

@rickystewart rickystewart force-pushed the crdb_test branch 4 times, most recently from 1160195 to 596957d Compare November 17, 2021 20:42
@rickystewart
Copy link
Collaborator Author

Is there some way to teach bazel to put the build tag in the cache key such that it doesn't evict builds with a different cache key?

Making that change would probably require patching upstream rules_go somehow. Not really sure what change would be required. I'll file an issue upstream and we'll go from there.

@rickystewart
Copy link
Collaborator Author

OK, so this latest version of the patch uses crdb_test_on.go or crdb_test_off.go accordingly for opt builds. For dbg or fastbuild builds, we instead use crdb_test_dyn.go, a new file that dynamically sets the value of CrdbTestBuild depending on a value passed in at link time. (Context: builds in Bazel can be built in one of three compilation modes, which are fastbuild, dbg, and opt. fastbuild is the default and opt is for optimized builds. Since fastbuild is the default, devs will default to using it, so they'll get the "fast-path" behavior. For release we'll use opt, obviously, which will get CrdbTestBuild as an appropriate const. You'll still thrash the cache if you swap between build and test while building w/ opt, but presumably if you're building an optimized build you should be ready to deal with slightly longer compile times.)

AFAIC this is ready to merge.

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very clever! :lgtm_strong:

Not strictly related to this PR - if it doesn't already, it would be nice if the build version (e.g. what you see in the logs or when you do SELECT version()) indicated fastbuild/dbg for those builds. That will help reduce confusion if we ever use the wrong build type in some context.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @rail and @rickystewart)


pkg/util/buildutil/BUILD.bazel, line 40 at r5 (raw file):

    outs = ["gen-crdb_test_off.go"],
    cmd = REMOVE_GO_BUILD_CONSTRAINTS,
)

Very cool!

@rickystewart
Copy link
Collaborator Author

if it doesn't already, it would be nice if the build version (e.g. what you see in the logs or when you do SELECT version()) indicated fastbuild/dbg for those builds. That will help reduce confusion if we ever use the wrong build type in some context.

Makes sense to me: #72891

In a couple different places in-tree we use the build tags `crdb_test`
and `crdb_test_off` to support conditional compilation. Unfortunately,
this results in low cache utilization when swapping between builds and
tests, since recompiling with different `gotags` causes Bazel to
recompile EVERYTHING, including the entire standard library, with these
different tags. This means that you can't build `cockroach` and then run
a quick test using cached build artifacts, for example.

Fix this by adding a new configuration, `//build/toolchains:crdb_test`.
We use `select()` in `pkg/util/buildutil/BUILD.bazel` to choose between
`crdb_test_on.go` and `crdb_test_off.go` accordingly.

Also add logic to `build/bazelutil/check.sh` to detect new uses of these
build constraints and warn people that the logic also needs to be
Bazelfied.

Closes cockroachdb#71857.

Release note: None
@rickystewart
Copy link
Collaborator Author

TFTR :)

bors r=RaduBerinde

@craig
Copy link
Contributor

craig bot commented Nov 18, 2021

Build succeeded:

@craig craig bot merged commit b6aeffc into cockroachdb:master Nov 18, 2021
@rickystewart
Copy link
Collaborator Author

Making that change would probably require patching upstream rules_go somehow. Not really sure what change would be required. I'll file an issue upstream and we'll go from there.

bazel-contrib/rules_go#3012

rickystewart added a commit to rickystewart/cockroach that referenced this pull request Feb 15, 2022
This partially reverts cockroachdb#72838. That change was made to avoid thrashing
the cache when swapping between test- and non-test configurations.
Today we have `dev cache` which can retain build artifacts across the
different configurations, so this doesn't really serve a purpose any
more. Indeed, you can now swap between `bazel build
pkg/cmd/cockroach-short` and `bazel build pkg/cmd/cockroach-short
--config=test` very quickly with Bazel downloading old versions of the
built libraries if the built-in cache gets wiped.

We still filter out the `go:build` lines in `crdb_test_{off,on}.go` so
we don't have to set `gotags` for test and non-test, which still saves a
lot of time and unnecessary recompilation. We have a check for this in
CI so no one should be able to add a build constraint without us
knowing.

Release note: None
craig bot pushed a commit that referenced this pull request Feb 16, 2022
73878: sql: introduce MVCC-compliant index backfiller r=ajwerner a=stevendanna

Previously, the index backfilling process depended upon non-MVCC
compliant AddSSTable calls which potentially rewrote previously read
historical values.

To support an MVCC-compliant AddSSTable that writes at the _current_
timestamp, this change implements a new backfilling process described
in the following RFC:

https://github.com/cockroachdb/cockroach/blob/master/docs/RFCS/20211004_incremental_index_backfiller.md

In summary, the new index backfilling process depends on backfilling
the new index when it is in a BACKFILLING state (added in #72281). In
this state it receives no writes or deletes. Writes that occur during
the backfilling process are captured by a "temporary index."  This
temporary index uses the DeletePreservingEncoding to ensure it
captures deletes as well as writes.

After the of bulk backfill using the MVCC-compliant AddSSTable, the
index is moved into a MERGING state
(added in #75663) in which it receives writes and deletes. Writes
previously captured by the temporary index are then transactionally
merged into the newly added index.

This feature is currently behind a new boolean cluster setting which
default to true. Schema changes that contains both old and new-style
backfills are rejected.

Reverting the default to false will require updating various tests
since many tests depend on the exact index IDs of newly added indexes.

Release note: None

76252: kvserver: account for AddSSTableRequests in QPS r=kvoli a=kvoli

Previously, Queries-Per-Second (QPS) was calculated uniformly per
`BatchRequest` as 1. This patch introduces variable QPS calculation
for `AddSSTableRequest`, which use an order of magnitude more
resources than other request types.

This patch introduces the
`kv.replica_stats.addsst_request_size_factor` cluster setting. This
setting is used to attribute QPS to `AddSSTableRequest` sizes. The
calculation is done as QPS = 1 + size(AddSSTableRequest) / factor.
When `kv.replica_stats.addsst_request_size_factor` is less than 1, or
no `AddSSTableRequest` exists within a `BatchRequest`, then QPS = 1;
the current behavior today.

resolves #73731

Release note (performance improvement):
Introduced `kv.replica_stats.addsst_request_size_factor` cluster
setting. This setting is used to tune Queries-Per-Second (QPS)
sensitivity to large imports. By default, this setting is disabled.
When enabled, the size of any AddSSTableRequest will contribute to
QPS in inverse relation to this settings magnitude. By default this
setting configured to a conservative 50,000; every 50 kilobytes will
be accounted for as an additional 1 QPS.

76617: bazel: make `CrdbTestBuild` `const` r=RaduBerinde a=rickystewart

This partially reverts #72838. That change was made to avoid thrashing
the cache when swapping between test- and non-test configurations.
Today we have `dev cache` which can retain build artifacts across the
different configurations, so this doesn't really serve a purpose any
more. Indeed, you can now swap between `bazel build
pkg/cmd/cockroach-short` and `bazel build pkg/cmd/cockroach-short
--config=test` very quickly with Bazel downloading old versions of the
built libraries if the built-in cache gets wiped.

We still filter out the `go:build` lines in `crdb_test_{off,on}.go` so
we don't have to set `gotags` for test and non-test, which still saves a
lot of time and unnecessary recompilation. We have a check for this in
CI so no one should be able to add a build constraint without us
knowing.

Release note: None

76627: util/tracing: improve span recording interface r=andreimatei a=andreimatei

Before this patch, the Span's recording interface was left over from a
time when there were only one recording mode: verbose. We now have two
modes: verbose recording and structured recording. They can be enabled
at span creation time through the WithRecording(<type>) option. This
patch changes the Span's SetVerbose() method to expose the two options.

Release note: None

76667: roachtest/test: fix ORM testing due to removed cluster setting r=rafiss a=otan

`sql.catalog.unsafe_skip_system_config_trigger.enabled` got removed
recently and was part of an alpha. Let's clean it up in ORMs too.

Resolves #76654
Resolves #76655
Resolves #76656
Resolves #76657
Resolves #76658
Resolves #76659
Resolves #76660
Resolves #76661
Resolves #76662
Resolves #76663
Resolves #76664
Resolves #76665
Resolves #76666

Release note: None

Co-authored-by: Steven Danna <[email protected]>
Co-authored-by: Rui Hu <[email protected]>
Co-authored-by: Austen McClernon <[email protected]>
Co-authored-by: Ricky Stewart <[email protected]>
Co-authored-by: Andrei Matei <[email protected]>
Co-authored-by: Oliver Tan <[email protected]>
RajivTS pushed a commit to RajivTS/cockroach that referenced this pull request Mar 6, 2022
This partially reverts cockroachdb#72838. That change was made to avoid thrashing
the cache when swapping between test- and non-test configurations.
Today we have `dev cache` which can retain build artifacts across the
different configurations, so this doesn't really serve a purpose any
more. Indeed, you can now swap between `bazel build
pkg/cmd/cockroach-short` and `bazel build pkg/cmd/cockroach-short
--config=test` very quickly with Bazel downloading old versions of the
built libraries if the built-in cache gets wiped.

We still filter out the `go:build` lines in `crdb_test_{off,on}.go` so
we don't have to set `gotags` for test and non-test, which still saves a
lot of time and unnecessary recompilation. We have a check for this in
CI so no one should be able to add a build constraint without us
knowing.

Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bazel: low cache utilization between builds and tests
5 participants