-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pkg/ccl/spanconfigccl/spanconfigreconcilerccl/spanconfigreconcilerccl_test: TestDataDriven failed #98038
Comments
Doesn't immediately reproduce under stress:
But I think I see what's going on -- it's just raciness in the test itself rather an a real underlying bug. At some point in the test we have protection records over /Table/10{6-8} at ts=3 and ts=4: cockroach/pkg/ccl/spanconfigccl/spanconfigreconcilerccl/testdata/protectedts Lines 89 to 90 in 7416aab
Then we release those protection records: cockroach/pkg/ccl/spanconfigccl/spanconfigreconcilerccl/testdata/protectedts Lines 133 to 138 in 7416aab
And expect that for /Table/10{6-8}, we should observe mutations that replace the existing record with one that has no protected timestamps. That does happen, but its possible to observe the intermediate mutation getting rid of just one protected timestamp (the test is not getting rid of pts records atomically). We should just assert on the final state instead of intermediate mutations. |
98082: spanconfig: deflake spanconfigreconcilerccl/TestDataDriven r=irfansharif a=irfansharif Fixes #98038. This test set up two protection records over two schema objects at two timestamps ts=3 and ts=4. ``` /Table/10{6-7} protection_policies=[{ts: 3} {ts: 4}] /Table/10{7-8} protection_policies=[{ts: 3} {ts: 4}] ``` When it later released those protection records: ``` release record-id=3 release record-id=4 ---- ``` It asserted that the span config mutations showed that we did infact get rid of the protected state: ``` mutations ---- delete /Table/10{6-7} upsert /Table/10{6-7} range default delete /Table/10{7-8} upsert /Table/10{7-8} range default ``` But since release of these protections was non-atomic, in #98038 we observed the following transition instead. ``` delete /Table/10{6-7} upsert /Table/10{6-7} protection_policies=[{ts: 4}] delete /Table/10{7-8} upsert /Table/10{7-8} protection_policies=[{ts: 4}] delete /Table/10{6-7} upsert /Table/10{6-7} range default delete /Table/10{7-8} upsert /Table/10{7-8} range default ``` That is, we first got rid of the record with ts=3 and only then got rid of ts=4. We just rewrite the test to assert on the final state of the records that show no remaining protections, instead of trying to add synchronization for mutations. Release note: None Co-authored-by: irfan sharif <[email protected]>
Fixes #98038. This test set up two protection records over two schema objects at two timestamps ts=3 and ts=4. /Table/10{6-7} protection_policies=[{ts: 3} {ts: 4}] /Table/10{7-8} protection_policies=[{ts: 3} {ts: 4}] When it later released those protection records: release record-id=3 release record-id=4 ---- It asserted that the span config mutations showed that we did infact get rid of the protected state: mutations ---- delete /Table/10{6-7} upsert /Table/10{6-7} range default delete /Table/10{7-8} upsert /Table/10{7-8} range default But since release of these protections was non-atomic, in #98038 we observed the following transition instead. delete /Table/10{6-7} upsert /Table/10{6-7} protection_policies=[{ts: 4}] delete /Table/10{7-8} upsert /Table/10{7-8} protection_policies=[{ts: 4}] delete /Table/10{6-7} upsert /Table/10{6-7} range default delete /Table/10{7-8} upsert /Table/10{7-8} range default That is, we first got rid of the record with ts=3 and only then got rid of ts=4. We just rewrite the test to assert on the final state of the records that show no remaining protections, instead of trying to add synchronization for mutations. Release note: None
Fixes #98038. This test set up two protection records over two schema objects at two timestamps ts=3 and ts=4. /Table/10{6-7} protection_policies=[{ts: 3} {ts: 4}] /Table/10{7-8} protection_policies=[{ts: 3} {ts: 4}] When it later released those protection records: release record-id=3 release record-id=4 ---- It asserted that the span config mutations showed that we did infact get rid of the protected state: mutations ---- delete /Table/10{6-7} upsert /Table/10{6-7} range default delete /Table/10{7-8} upsert /Table/10{7-8} range default But since release of these protections was non-atomic, in #98038 we observed the following transition instead. delete /Table/10{6-7} upsert /Table/10{6-7} protection_policies=[{ts: 4}] delete /Table/10{7-8} upsert /Table/10{7-8} protection_policies=[{ts: 4}] delete /Table/10{6-7} upsert /Table/10{6-7} range default delete /Table/10{7-8} upsert /Table/10{7-8} range default That is, we first got rid of the record with ts=3 and only then got rid of ts=4. We just rewrite the test to assert on the final state of the records that show no remaining protections, instead of trying to add synchronization for mutations. Release note: None
pkg/ccl/spanconfigccl/spanconfigreconcilerccl/spanconfigreconcilerccl_test.TestDataDriven failed with artifacts on master @ cf14ad694ee562676f53e36fa8495206c3aed61f:
Parameters:
TAGS=bazel,gss,deadlock
Help
See also: How To Investigate a Go Test Failure (internal)
This test on roachdash | Improve this report!
Jira issue: CRDB-25036
The text was updated successfully, but these errors were encountered: