cdc: changefeed TeamCity flakiness #72706
Comments
samiskin added the C-bug label on Nov 12, 2021
samiskin changed the title from "cdc: changefeed CI flakiness" to "cdc: changefeed TeamCity flakiness" on Nov 12, 2021
cc @cockroachdb/cdc
I'm on it. Seems related to #71239.
craig bot pushed a commit that referenced this issue on Nov 15, 2021
72669: bazel: properly generate `.eg.go` code in `pkg/sql/colconv` via bazel r=rail a=rickystewart

Release note: None

72714: sql/catalog/lease: permit gaps in descriptor history r=ajwerner a=ajwerner

In #71239, we added a new mechanism to look up historical descriptors. I erroneously informed @jameswsj10 that we would never have gaps in the descriptor history, and, thus, when looking up historical descriptors, we could always use the earliest descriptor's modification time as the bounds for the relevant query.

This turns out to not be true. Consider the case where version 3 is a historical version and then version 4 pops up and gets leased. Version 3 will get removed if it is not referenced. In the meantime, version 3 existed when we went to go find version 2. At that point, we'll inject version 2 and have version 4 leased. We need to make sure we can handle the case where we need to go fetch version 3.

In the meantime, this change also removes some logic added to support the eventual resurrection of #59606 whereby we'll use the export request to fetch descriptor history to power historical queries even in the face of descriptors having been deleted.

Fixes #72706.

Release note: None

72740: sql/catalog/descs: fix perf regression r=ajwerner a=ajwerner

This commit in #71936 had the unfortunate side-effect of allocating and forcing reads on the `uncommittedDescriptors` set even when we aren't looking for the system database. This has an outsized impact on the performance of the single-node, high-core-count KV runs. Instead of always initializing the system database, just do it when we access it.

```
name             old ops/s    new ops/s    delta
KV95-throughput  88.6k ± 0%   94.8k ± 1%   +7.00%  (p=0.008 n=5+5)

name             old ms/s     new ms/s     delta
KV95-P50         1.60 ± 0%    1.40 ± 0%   -12.50%  (p=0.008 n=5+5)
KV95-Avg         0.60 ± 0%    0.50 ± 0%   -16.67%  (p=0.008 n=5+5)
```

The second commit is more speculative and came from looking at a profile where 1.6% of the allocated garbage was due to that `NameInfo` even though we'll never, ever hit it.

<img width="2345" alt="Screen Shot 2021-11-15 at 12 57 31 AM" src="https://user-images.githubusercontent.com/1839234/141729924-d00eebab-b35c-42bd-8d0b-ee39f3ac7d46.png">

Fixes #72499

Co-authored-by: Ricky Stewart <[email protected]>
Co-authored-by: Andrew Werner <[email protected]>
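For readers following the gap scenario in 72714, here is a toy Go sketch of the situation where version 2 and version 4 are cached but version 3 is not. It is not the `sql/catalog/lease` code: the `version` type, its integer timestamps, and the `findGaps` and `lookup` helpers are all hypothetical names invented for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// version is a hypothetical stand-in for a cached descriptor version.
// It is treated as valid over the half-open interval [modTime, expiration).
type version struct {
	num        int
	modTime    int64
	expiration int64
}

// findGaps returns the version numbers missing between the lowest and
// highest cached versions. The input may be unsorted and is not modified.
func findGaps(cached []version) []int {
	sorted := append([]version(nil), cached...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].num < sorted[j].num })
	var missing []int
	for i := 1; i < len(sorted); i++ {
		for n := sorted[i-1].num + 1; n < sorted[i].num; n++ {
			missing = append(missing, n)
		}
	}
	return missing
}

// lookup returns the cached version whose validity interval covers ts.
// It reports false when ts falls in a gap, in which case a real system
// would need to fetch the missing version from storage.
func lookup(cached []version, ts int64) (version, bool) {
	for _, v := range cached {
		if v.modTime <= ts && ts < v.expiration {
			return v, true
		}
	}
	return version{}, false
}

func main() {
	// Version 3 was dropped after version 4 was leased; version 2 was
	// injected later, so the cached history is no longer contiguous.
	cached := []version{
		{num: 2, modTime: 10, expiration: 20},
		{num: 4, modTime: 30, expiration: 40},
	}
	fmt.Println("missing versions:", findGaps(cached))
	_, ok := lookup(cached, 25)
	fmt.Println("timestamp 25 served from cache:", ok)
}
```

The point of the sketch is only that a lookup cannot assume the earliest cached modification time bounds a contiguous history; a timestamp that lands in the hole left by version 3 has to be handled explicitly.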
There've been recent test failures around a couple Changefeed tests in TeamCity:
TestChangefeedWorksOnRBRChange, TestChangefeedSchemaChangeAllowBackfill, TestChangefeedSchemaChangeAllowBackfill
On running them in roachprod, they seem to fail after around 3k runs. The command I used to test this was:
In investigating the RBR error, it appeared to be blocking somewhere in the `eventToRow` call in `changefeed_processors.go:ConsumeEvent`.
Stack trace of the error