vendor: bump etcd/raft #88985

Closed

tbg commented Sep 29, 2022

This picks up etcd-io/etcd#14413.

TODO:

  • make it build
  • remove the maxIndex param and handling from Task.AckCommittedEntriesBeforeApplication
  • check that single node write latencies don't regress

Closes #87264.

Release note: None


tbg commented Sep 29, 2022

@rickystewart unfortunately this also runs into the same problem I encountered when trying to vendor this PR in before it merged:

go get go.etcd.io/etcd/raft/v3@d379e6221e10c32d83dc72ce39e0b19b3f00366f
./dev gen bazel --mirror
./dev build short
ERROR: /private/var/tmp/_bazel_tobias/7e77f37e5521082217aaab37b4b8407f/external/io_etcd_go_etcd_raft_v3/raftpb/BUILD.bazel:16:17: no such package '@io_etcd_go_etcd_raft_v3//etcd/api/versionpb': BUILD file not found in directory 'etcd/api/versionpb' of external repository @io_etcd_go_etcd_raft_v3. Add a BUILD file to a directory to mark it as a package. and referenced by '@io_etcd_go_etcd_raft_v3//raftpb:raftpb_go_proto'
ERROR: Analysis of target '//pkg/cmd/cockroach-short:cockroach-short' failed; build aborted: no such package '@io_etcd_go_etcd_raft_v3//etcd/api/versionpb': BUILD file not found in directory 'etcd/api/versionpb' of external repository @io_etcd_go_etcd_raft_v3. Add a BUILD file to a directory to mark it as a package.

This is likely caused by https://github.com/etcd-io/etcd/blob/d379e6221e10c32d83dc72ce39e0b19b3f00366f/raft/raftpb/raft.proto#L5
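One way to check which packages a vendored `.proto` file expects to find is to list its `import` statements. The following is a minimal sketch, not part of the PR: `list_proto_imports` is a hypothetical helper name, and it only handles `import` statements written on a single line.

```bash
#!/bin/bash
# list_proto_imports FILE
# Hypothetical helper: prints the path of every single-line
# `import "...";` statement in a .proto file, one path per line.
list_proto_imports() {
  sed -n 's/^import .*"\([^"]*\)";.*/\1/p' "$1"
}
```

Running it against a checkout of the file linked above (for example `list_proto_imports raft/raftpb/raft.proto`) would show whether any import path falls under `etcd/api/versionpb`, the directory named in the Bazel error.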

tbg commented Oct 10, 2022

#89632 seems to work locally at least? Maybe taking out the old replace first did the trick, or maybe they since added a new tag upstream.

tbg closed this Oct 10, 2022
tbg mentioned this pull request Oct 10, 2022
craig bot pushed a commit that referenced this pull request Nov 3, 2022
89632: go.mod: bump raft r=nvanbenschoten a=tbg

```
go get go.etcd.io/etcd/raft/v3@d19116e6ee66e52a5fd8cce2e10f9422fb80e42f

go: downloading go.etcd.io/etcd/raft/v3 v3.6.0-alpha.0.0.20221009201006-d19116e6ee66
go: module github.com/golang/protobuf is deprecated: Use the "google.golang.org/protobuf" module instead.
go: upgraded go.etcd.io/etcd/api/v3 v3.5.0 => v3.6.0-alpha.0
go: upgraded go.etcd.io/etcd/raft/v3 v3.0.0-20210320072418-e51c697ec6e8 => v3.6.0-alpha.0.0.20221009201006-d19116e6ee66
```
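The version that `go get` resolved is a Go pseudo-version, which ends in a UTC commit timestamp and a 12-character commit hash prefix. As a small sketch (the helper name `split_pseudo_version` is an assumption made for illustration), the two parts can be pulled apart to confirm that the resolved version points at the requested commit:

```bash
#!/bin/bash
# split_pseudo_version VERSION
# Hypothetical helper: prints "<timestamp> <commit-prefix>" for a Go
# pseudo-version such as v3.6.0-alpha.0.0.20221009201006-d19116e6ee66.
split_pseudo_version() {
  local v="$1"
  local commit="${v##*-}"     # text after the last '-'
  local rest="${v%-*}"        # everything before the last '-'
  local ts="${rest##*[.-]}"   # text after the last '.' or '-' in the remainder
  echo "$ts $commit"
}
```

For the version above this prints `20221009201006 d19116e6ee66`, and `d19116e6ee66` is the first 12 characters of the commit hash passed to `go get`.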

This picks up

- etcd-io/etcd#14413
- etcd-io/etcd#14538

Compared single-node performance on gceworker via

```bash
#!/bin/bash
set -euxo pipefail
pkill -9 cockroach || true
rm -rf cockroach-data
cr=./cockroach-$1
$cr start-single-node --background --insecure
$cr workload init kv
$cr workload run kv --splits 100 --max-rate 2000 --duration 10m --read-percent 0 --min-block-bytes 10 --max-block-bytes 10 | tee $1.txt
```

```
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
  600.0s        0        1199604         1999.3      6.7      7.1     10.0     11.0     75.5  write #master
  600.0s        0        1199614         1999.4      6.8      7.1     10.0     11.0     79.7  write #PR
```
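To put a number on the difference between the two summary lines, a small helper can compute the percent change of one column. This is a sketch under assumptions: `pct_change` is a hypothetical name, and the column index counts whitespace-separated fields in the summary line (5 is `avg(ms)`, 8 is `p99(ms)`).

```bash
#!/bin/bash
# pct_change BASE_LINE NEW_LINE COLUMN
# Hypothetical helper: prints the percent change (one decimal place) of a
# numeric, whitespace-separated column between two workload summary lines.
pct_change() {
  awk -v base="$1" -v new="$2" -v col="$3" 'BEGIN {
    split(base, b, " ")   # a single-space separator splits on runs of whitespace
    split(new, n, " ")
    printf "%.1f\n", (n[col] - b[col]) / b[col] * 100
  }'
}
```

Applied to the master and PR lines above, column 5 gives `1.5` (average latency 6.7ms versus 6.8ms) and column 8 gives `0.0` (p99 unchanged at 11.0ms).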

Closes #87264.

- [x] [make it build](#88985 (comment))
- [x] remove the maxIndex param and handling from Task.AckCommittedEntriesBeforeApplication
- [x] check that single node write latencies don't regress

Release note: None


91117: sql: reduce the overhead of EXPLAIN ANALYZE r=yuzefovich a=yuzefovich

In order to propagate the execution stats across the distributed query plan, we use the tracing infrastructure, where each stats object is added as "structured metadata" to the trace. Thus, whenever we're collecting the exec stats for a statement, we must enable tracing. Previously, in many cases we would enable it at the highest verbosity level, which has non-trivial overhead. In some cases this was overkill (e.g. in `EXPLAIN ANALYZE` we don't really care about the trace containing all of the gory details, since we won't expose it anyway), so this is now fixed by using the less verbose "structured" verbosity level. As a concrete example of the difference: a statement that takes around 190ms without `EXPLAIN ANALYZE` would previously run for about 1.8s with `EXPLAIN ANALYZE`, and now takes around 210ms.
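The timings quoted above can be turned into slowdown factors with a one-line calculation. This is only an illustrative sketch; `slowdown` is a hypothetical helper and both arguments are in milliseconds.

```bash
#!/bin/bash
# slowdown BASE_MS WITH_MS
# Hypothetical helper: prints how many times slower WITH_MS is than BASE_MS,
# rounded to one decimal place.
slowdown() {
  awk -v b="$1" -v w="$2" 'BEGIN { printf "%.1f\n", w / b }'
}
```

With the numbers from the description, `slowdown 190 1800` prints `9.5` (before the change) and `slowdown 190 210` prints `1.1` (after the change).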

This required some minor changes to the row-by-row outbox and router
setups to collect stats even if the recording is not verbose.

Addresses: #90739.

Epic: None

Release note (performance improvement): The overhead of running `EXPLAIN ANALYZE` and `EXPLAIN ANALYZE (DISTSQL)` has been significantly reduced. The overhead of `EXPLAIN ANALYZE (DEBUG)` didn't change.

91119: roachprod: improve error in ParallelE r=smg260 a=tbg

Prior to this commit, the error's stack trace did not link back
to the caller of `ParallelE`. Now it does.

Epic: none
Release note: None


91126: dev: allow whitespace separated regexps for testlogic files r=ajwerner a=ajwerner

This was a feature of `make testlogic` and it was liked.

Fixes #91125

Release note: None

Co-authored-by: Tobias Grieger
Co-authored-by: Yahor Yuzefovich
Co-authored-by: Andrew Werner
Development

Successfully merging this pull request may close these issues.

raft: re-work leader self-ack mechanism