-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bazel: acceptance tests fail when run under Bazel #59446
Labels
A-build-system
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
T-dev-inf
Comments
rickystewart
added
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
A-build-system
labels
Jan 26, 2021
I assume it's this. |
See documentation for how to appropriately access binaries from a test. The gotags for the test also need to be updated. I didn't look to see whether there were other issues that would prevent the tests from running. This issue is specifically about the |
3 tasks
rail
added a commit
to rail/cockroach
that referenced
this issue
Aug 7, 2021
Touches cockroachdb#59446 Previously, in order to run the acceptance test we were implicitly identifying the `cockroach` binary to the acceptance test. This is a bit problematic under bazel, where the binary location may be in different places. This patch explicitly passes the location if the cockroach binary to the acceptance test. Release note: None
rail
added a commit
to rail/cockroach
that referenced
this issue
Aug 9, 2021
Touches cockroachdb#59446 Previously, in order to run the acceptance test we were implicitly identifying the `cockroach` binary to the acceptance test. This is a bit problematic under bazel, where the binary location may be in different places. This patch explicitly passes the location if the cockroach binary to the acceptance test. Release note: None
craig bot
pushed a commit
that referenced
this issue
Aug 9, 2021
68418: sqlproxy: add a timeout when dialing tenant pods r=JeffSwenson a=JeffSwenson Dialing dead or non existent pods on GKS can black hole the network connection. This leaves the connection process stuck until the OS TCP timeout of 2 minutes. The SQL proxy and the tenant pod are colocated. 5 seconds should be more than enough to establish the TCP connection. Timeing out will cause the retry loop in proxy_handler.go to refetch the tenant IP address and retry the connection. Release note: None 68483: release: update cockroach-cloud-images repository location r=DuskEagle a=DuskEagle This commit updates the location of the cockroach-cloud-images Docker repository. For reasons described in https://cockroachlabs.atlassian.net/browse/CC-4667, we are switching our internal CockroachDB Docker repository to this location, which is build on top of Google Artifact Registry. Release note: None 68524: sql: change physical planning heuristics a bit to prefer local execution r=yuzefovich a=yuzefovich This commit changes two parts of the physical planner heuristics: - we now say that the lookup join "can be distributed" rather than "should be distributed" - for top K sort we also say that it "can be" rather than "should be" distributed. I'm not certain whether both of these changes are always beneficial, but here is some justification. The change to the lookup join heuristic will make it so that the distribution of the join reader stage is determined by other stages of the physical plan in `distsql=auto` mode. Consider an example when the input to the lookup join is the table reader that scans only a handful of rows. Previously, because of the "should distribute" heuristic, such a plan would be "distributed" meaning we would plan a single table reader on the leaseholder for the relevant range (most likely a remote node from the perspective of the gateway node for the query); this, in turn, would force the planning of the join reader on the same node, and all consequent stages - if any - too. Such a decision can create a hotspot if that particular range is queried often (think append-only access pattern where the latest data is accessed most frequently). With this change in such a scenario we will get more even compute utilization across the cluster because the flow will be fully planned on the gateway (which assumed to be chosen randomly by a load balancer), and the lookup join will be performed from the gateway (we'll still need to perform a remote read from the leaseholder of that single range). The change to the top K sort heuristic seems less controversial to me, yet I don't have a good justification. My feeling is that usually the value of K is small, so it's ok if we don't "force" ourselves to distribute the sort operation if the physical plan otherwise isn't calling for it. Overall, the choice of making changes to these two heuristics isn't very principled and is driven by a single query from one of our largest customers which happened to hit the hot spot scenario as described above. In their case, they have append-like workload that is constantly updating a single range. Eventually that range is split automatically, but both new ranges stay on the same node. The latest data is accessed far more frequently than any other data in the table, yet according to the KV heuristics the ranges aren't being reallocated because the scans hitting the hot ranges aren't exceeding the threshold. What isn't accounted for is the fact that other parts of the flow are far more compute-intensive, so this change attempts to alleviate such a hot node scenario. Release note (sql change): Some queries with lookup joins and/or top K sorts are now more likely to be executed in "local" manner with `distsql=auto` session variable. 68569: acceptance: explicitly specify binary path r=rickystewart a=rail Touches #59446 Previously, in order to run the acceptance test we were implicitly identifying the `cockroach` binary to the acceptance test. This is a bit problematic under bazel, where the binary location may be in different places. This patch explicitly passes the location if the cockroach binary to the acceptance test. Release note: None 68573: sql: fix COPY CSV so it handles multiple records at a time r=otan a=rafiss fixes #68484 This was broken since the golang csv reader reads the entire input all at once, so the underlying buffer is consumed. This caused the loop that reads each record to terminate early. Release note (bug fix): Fixed the COPY CSV command so that it handles multiple records separated by newline characters. 68605: sql: rework SHOW JOBS WHEN COMPLETE delegate r=arulajmani a=ajwerner The previous pattern was fragile because the filter could be re-arranged. Release note: None 68609: cloud: bump orchestrator version to 21.1.7 r=mgartner a=mgartner Release note: None 68615: roachtest: bump predecessor to v21.1.7 r=mgartner a=mgartner Release note: None Co-authored-by: Jeff <[email protected]> Co-authored-by: Joel Kenny <[email protected]> Co-authored-by: Yahor Yuzefovich <[email protected]> Co-authored-by: Rail Aliiev <[email protected]> Co-authored-by: Rafi Shamim <[email protected]> Co-authored-by: Andrew Werner <[email protected]> Co-authored-by: Marcus Gartner <[email protected]>
sajjadrizvi
pushed a commit
to sajjadrizvi/cockroach
that referenced
this issue
Aug 10, 2021
Touches cockroachdb#59446 Previously, in order to run the acceptance test we were implicitly identifying the `cockroach` binary to the acceptance test. This is a bit problematic under bazel, where the binary location may be in different places. This patch explicitly passes the location if the cockroach binary to the acceptance test. Release note: None
rail
added a commit
to rail/cockroach
that referenced
this issue
Oct 25, 2021
This patch makes the acceptance test work under Bazel. * Add `AbsCertsDir()` in order to keep track of certificate path for cases when tests change the working directory. * docker-compose tests to use interpolation and environment variables in order to override `CERTS_DIR` and `COCKROACH_BINARY`. * Add `copyRunfiles()` in order to copy Bazel-generated symlinked runfiles as regular files to make them available in docker mounted volumes. Related: cockroachdb#71932, cockroachdb#71930 Fixes cockroachdb#59446 Release note: None
rail
added a commit
to rail/cockroach
that referenced
this issue
Oct 25, 2021
This patch makes the acceptance test work under Bazel. * Add `AbsCertsDir()` in order to keep track of certificate path for cases when tests change the working directory. * docker-compose tests to use interpolation and environment variables in order to override `CERTS_DIR` and `COCKROACH_BINARY`. * Add `copyRunfiles()` in order to copy Bazel-generated symlinked runfiles as regular files to make them available in docker mounted volumes. Related: cockroachdb#71932, cockroachdb#71930 Fixes cockroachdb#59446 Release note: None
craig bot
pushed a commit
that referenced
this issue
Nov 23, 2021
71770: bazel: run acceptance tests under Bazel r=rickystewart a=rail This patch makes the acceptance test work under Bazel. * Add `AbsCertsDir()` in order to keep track of certificate path for cases when tests change the working directory. * docker-compose tests to use interpolation and environment variables in order to override `CERTS_DIR` and `COCKROACH_BINARY`. * Add `copyRunfiles()` in order to copy Bazel-generated symlinked runfiles as regular files to make them available in docker mounted volumes. Related: #71932, #71930 Fixes: #59446 Release note: None 72574: ci: add bazel roachtest gce teamcity job r=rail a=rickystewart Release note: None 73055: ui: prevent undefined axis label on custom chart r=zachlite a=zachlite Release note (bug fix): Y-axis labels on custom charts no longer display 'undefined'. @thtruo, this fix addresses the issue as described in #72115. Now, when the user selects a new unit from the dropdown, the user will experience up to a 10 second delay before the axis label refreshes. @nathanstilwell and I investigated this last week, and we'd need more time to think of a solution that doesn't create long term maintenance headaches. The benefit of this PR as it stands is that the Y-axis label is noticeably less broken. FYI. 73080: bazel: don't shard `kvserver` test r=rail a=rickystewart The `exclusive` tag here prevents the shards from running concurrently. See #65407, #65582. Release note: None Co-authored-by: Rail Aliiev <[email protected]> Co-authored-by: Ricky Stewart <[email protected]> Co-authored-by: Zach Lite <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
A-build-system
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
T-dev-inf
bazel test pkg/acceptance:acceptance_test
fails immediately with:Not sure why this is happening. I assume the acceptance tests run the
go
binary from within the test somewhere?Epic CRDB-8306
The text was updated successfully, but these errors were encountered: