Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kv/kvserver/protectedts/ptstorage: TestCorruptData failed #67972

Closed
cockroach-teamcity opened this issue Jul 23, 2021 · 9 comments · Fixed by #72503
Closed

kv/kvserver/protectedts/ptstorage: TestCorruptData failed #67972

cockroach-teamcity opened this issue Jul 23, 2021 · 9 comments · Fixed by #72503
Assignees
Labels
branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot.

Comments

@cockroach-teamcity
Copy link
Member

kv/kvserver/protectedts/ptstorage.TestCorruptData failed with artifacts on master @ 2ca4acb22aade1175232187bb2b1754057e89f9b:

=== RUN   TestCorruptData
--- FAIL: TestCorruptData (0.33s)
=== RUN   TestCorruptData/corrupt_hlc_timestamp
    test_log_scope.go:73: test logs captured to: /go/src/github.com/cockroachdb/cockroach/artifacts/logTestCorruptData_corrupt_hlc_timestamp743071762
    test_log_scope.go:74: use -show-logs to present logs inline
E210723 06:26:02.392390 18466 kv/kvserver/protectedts/ptstorage/storage.go:224  [-] 1  failed to parse row as record: failed to parse timestamp for a6a729ae-6379-43ab-81c8-4bd376d46a02: logical part has too many digits
E210723 06:26:02.393380 17261 kv/kvserver/protectedts/ptstorage/storage.go:224  [-] 2  failed to parse row as record: failed to parse timestamp for a6a729ae-6379-43ab-81c8-4bd376d46a02: logical part has too many digits
    storage_test.go:579: 
        	Error Trace:	storage_test.go:579
        	Error:      	"[{ERROR %!s(int64=1627021562393380000) %!s(int64=17261) kv/kvserver/protectedts/ptstorage/storage.go %!s(int64=224) failed to parse row as record: failed to parse timestamp for a6a729ae-6379-43ab-81c8-4bd376d46a02: logical part has too many digits - %!s(uint64=58) %!s(bool=false) DEV %!s(uint32=0) %!s(uint32=0) %!s(uint32=0)} {ERROR %!s(int64=1627021562392390000) %!s(int64=18466) kv/kvserver/protectedts/ptstorage/storage.go %!s(int64=224) failed to parse row as record: failed to parse timestamp for a6a729ae-6379-43ab-81c8-4bd376d46a02: logical part has too many digits - %!s(uint64=57) %!s(bool=false) DEV %!s(uint32=0) %!s(uint32=0) %!s(uint32=0)}]" should have 1 item(s), but has 2
        	Test:       	TestCorruptData/corrupt_hlc_timestamp
    panic.go:613: -- test log scope end --
test logs left over in: /go/src/github.com/cockroachdb/cockroach/artifacts/logTestCorruptData_corrupt_hlc_timestamp743071762
    --- FAIL: TestCorruptData/corrupt_hlc_timestamp (0.16s)
Reproduce

To reproduce, try:

make stressrace TESTS=TestCorruptData PKG=./pkg/kv/kvserver/protectedts/ptstorage TESTTIMEOUT=5m STRESSFLAGS='-timeout 5m' 2>&1

Parameters in this failure:

  • GOFLAGS=-parallel=4

/cc @cockroachdb/kv ajwerner

This test on roachdash | Improve this report!

@cockroach-teamcity cockroach-teamcity added branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. labels Jul 23, 2021
@tbg
Copy link
Member

tbg commented Jul 27, 2021

@ajwerner we're getting two entries where one is expected by the test:

entries, err := log.FetchEntriesFromFiles(0, math.MaxInt64, 100, msg,
log.WithFlattenedSensitiveData)
require.NoError(t, err)
require.Len(t, entries, 1)
for _, e := range entries {
require.Equal(t, severity.ERROR, e.Severity)
}
})

It seems silly - we're getting the "failed to parse row as record:" message twice, which, I don't know why there would even only ever be only one message? Don't we want to test that there is at least one message matching this pattern? Why are we expecting exactly one message to be logged? As far as I can tell this is operating on "the logger" that everyone and their dog can write to.

The fix seems clear (what I said above) but want to check that I'm not missing anything.

@knz
Copy link
Contributor

knz commented Jul 27, 2021

NB: we have facilities for more clever filtering nowadays. See for example what pgwire/auth_test.go does to extract log entries.

@ajwerner
Copy link
Contributor

Don't we want to test that there is at least one message matching this pattern?

Yeah, I doubt there's any reason for it to be so strict.

@tbg
Copy link
Member

tbg commented Jul 27, 2021

@AlexTalks this is another, more straightforward thing you could pick off.

@cockroach-teamcity
Copy link
Member Author

kv/kvserver/protectedts/ptstorage.TestCorruptData failed with artifacts on master @ 701b177d8f4b81d8654dfb4090a2cd3cf82e63a7:

=== RUN   TestCorruptData
--- FAIL: TestCorruptData (0.38s)
=== RUN   TestCorruptData/corrupt_spans
    test_log_scope.go:73: test logs captured to: /go/src/github.com/cockroachdb/cockroach/artifacts/logTestCorruptData_corrupt_spans188771539
    test_log_scope.go:74: use -show-logs to present logs inline
E210731 06:40:01.216074 17034 kv/kvserver/protectedts/ptstorage/storage.go:224  [-] 1  failed to parse row as record: failed to unmarshal spans for 4bfbed8b-5e7a-4068-9243-c39fe0940fe2: unexpected EOF
E210731 06:40:01.216606 15941 kv/kvserver/protectedts/ptstorage/storage.go:224  [-] 2  failed to parse row as record: failed to unmarshal spans for 4bfbed8b-5e7a-4068-9243-c39fe0940fe2: unexpected EOF
    storage_test.go:527: 
        	Error Trace:	storage_test.go:527
        	Error:      	"[{ERROR %!s(int64=1627713601216606000) %!s(int64=15941) kv/kvserver/protectedts/ptstorage/storage.go %!s(int64=224) failed to parse row as record: failed to unmarshal spans for 4bfbed8b-5e7a-4068-9243-c39fe0940fe2: unexpected EOF - %!s(uint64=58) %!s(bool=false) DEV %!s(uint32=0) %!s(uint32=0) %!s(uint32=0)} {ERROR %!s(int64=1627713601216074000) %!s(int64=17034) kv/kvserver/protectedts/ptstorage/storage.go %!s(int64=224) failed to parse row as record: failed to unmarshal spans for 4bfbed8b-5e7a-4068-9243-c39fe0940fe2: unexpected EOF - %!s(uint64=57) %!s(bool=false) DEV %!s(uint32=0) %!s(uint32=0) %!s(uint32=0)}]" should have 1 item(s), but has 2
        	Test:       	TestCorruptData/corrupt_spans
    panic.go:613: -- test log scope end --
test logs left over in: /go/src/github.com/cockroachdb/cockroach/artifacts/logTestCorruptData_corrupt_spans188771539
    --- FAIL: TestCorruptData/corrupt_spans (0.22s)
Reproduce

To reproduce, try:

make stressrace TESTS=TestCorruptData PKG=./pkg/kv/kvserver/protectedts/ptstorage TESTTIMEOUT=5m STRESSFLAGS='-timeout 5m' 2>&1

Parameters in this failure:

  • GOFLAGS=-parallel=4

/cc @cockroachdb/kv ajwerner

This test on roachdash | Improve this report!

@cockroach-teamcity
Copy link
Member Author

kv/kvserver/protectedts/ptstorage.TestCorruptData failed with artifacts on master @ 7897f24246bef3cb94f9f4bfaed474ecaa9fdee6:

=== RUN   TestCorruptData
--- FAIL: TestCorruptData (0.46s)
=== RUN   TestCorruptData/corrupt_spans
    test_log_scope.go:73: test logs captured to: /go/src/github.com/cockroachdb/cockroach/artifacts/logTestCorruptData_corrupt_spans653245200
    test_log_scope.go:74: use -show-logs to present logs inline
E210820 06:21:49.454581 19189 kv/kvserver/protectedts/ptstorage/storage.go:240  [-] 1  failed to parse row as record: failed to unmarshal spans for 8dd7559c-dccc-49e7-bc86-4e119d3a4ee2: unexpected EOF
E210820 06:21:49.455517 18044 kv/kvserver/protectedts/ptstorage/storage.go:240  [-] 2  failed to parse row as record: failed to unmarshal spans for 8dd7559c-dccc-49e7-bc86-4e119d3a4ee2: unexpected EOF
    storage_test.go:574: 
        	Error Trace:	storage_test.go:574
        	Error:      	"[{ERROR %!s(int64=1629440509455517000) %!s(int64=18044) kv/kvserver/protectedts/ptstorage/storage.go %!s(int64=240) failed to parse row as record: failed to unmarshal spans for 8dd7559c-dccc-49e7-bc86-4e119d3a4ee2: unexpected EOF - %!s(uint64=60) %!s(bool=false) DEV %!s(uint32=0) %!s(uint32=0) %!s(uint32=0)} {ERROR %!s(int64=1629440509454581000) %!s(int64=19189) kv/kvserver/protectedts/ptstorage/storage.go %!s(int64=240) failed to parse row as record: failed to unmarshal spans for 8dd7559c-dccc-49e7-bc86-4e119d3a4ee2: unexpected EOF - %!s(uint64=59) %!s(bool=false) DEV %!s(uint32=0) %!s(uint32=0) %!s(uint32=0)}]" should have 1 item(s), but has 2
        	Test:       	TestCorruptData/corrupt_spans
    panic.go:613: -- test log scope end --
test logs left over in: /go/src/github.com/cockroachdb/cockroach/artifacts/logTestCorruptData_corrupt_spans653245200
    --- FAIL: TestCorruptData/corrupt_spans (0.21s)
Reproduce

To reproduce, try:

make stressrace TESTS=TestCorruptData PKG=./pkg/kv/kvserver/protectedts/ptstorage TESTTIMEOUT=5m STRESSFLAGS='-timeout 5m' 2>&1

Parameters in this failure:

  • GOFLAGS=-parallel=4

/cc @cockroachdb/kv ajwerner

This test on roachdash | Improve this report!

@cockroach-teamcity
Copy link
Member Author

kv/kvserver/protectedts/ptstorage.TestCorruptData failed with artifacts on master @ ef22c9e3aaed735eb36f9a11da6af72396705e3e:

=== RUN   TestCorruptData
--- FAIL: TestCorruptData (0.73s)
=== RUN   TestCorruptData/corrupt_spans
    test_log_scope.go:79: test logs captured to: /go/src/github.com/cockroachdb/cockroach/artifacts/logTestCorruptData_corrupt_spans317553198
    test_log_scope.go:80: use -show-logs to present logs inline
    storage_test.go:574: 
        	Error Trace:	storage_test.go:574
        	Error:      	"[{ERROR %!s(int64=1634886923070603000) %!s(int64=18199) kv/kvserver/protectedts/ptstorage/storage.go %!s(int64=240) failed to parse row as record: failed to unmarshal spans for f50f25f3-3979-4049-b24a-25a98f17b271: unexpected EOF - %!s(uint64=51) %!s(bool=false) DEV %!s(uint32=0) %!s(uint32=0) %!s(uint32=0)} {ERROR %!s(int64=1634886923070185000) %!s(int64=19443) kv/kvserver/protectedts/ptstorage/storage.go %!s(int64=240) failed to parse row as record: failed to unmarshal spans for f50f25f3-3979-4049-b24a-25a98f17b271: unexpected EOF - %!s(uint64=50) %!s(bool=false) DEV %!s(uint32=0) %!s(uint32=0) %!s(uint32=0)}]" should have 1 item(s), but has 2
        	Test:       	TestCorruptData/corrupt_spans
    panic.go:613: -- test log scope end --
test logs left over in: /go/src/github.com/cockroachdb/cockroach/artifacts/logTestCorruptData_corrupt_spans317553198
    --- FAIL: TestCorruptData/corrupt_spans (0.55s)
Help

See also: [How To Investigate a Go Test Failure \(internal\)](https://cockroachlabs.atlassian.net/l/c/HgfXfJgM)Parameters in this failure:

  • GOFLAGS=-parallel=4

/cc @cockroachdb/kv ajwerner

This test on roachdash | Improve this report!

@cockroach-teamcity
Copy link
Member Author

kv/kvserver/protectedts/ptstorage.TestCorruptData failed with artifacts on master @ 46b7efa32c57b15f4ae6e0d75845f63360b62161:

=== RUN   TestCorruptData
--- FAIL: TestCorruptData (0.39s)
=== RUN   TestCorruptData/corrupt_spans
    test_log_scope.go:79: test logs captured to: /go/src/github.com/cockroachdb/cockroach/artifacts/logTestCorruptData_corrupt_spans802016722
    test_log_scope.go:80: use -show-logs to present logs inline
    storage_test.go:574: 
        	Error Trace:	storage_test.go:574
        	Error:      	"[{ERROR %!s(int64=1636182178601228000) %!s(int64=19761) kv/kvserver/protectedts/ptstorage/storage.go %!s(int64=240) failed to parse row as record: failed to unmarshal spans for eebcb81f-0d1a-4ee1-9416-21d4f1cb021a: unexpected EOF - %!s(uint64=51) %!s(bool=false) DEV %!s(uint32=0) %!s(uint32=0) %!s(uint32=0)} {ERROR %!s(int64=1636182178598992000) %!s(int64=18580) kv/kvserver/protectedts/ptstorage/storage.go %!s(int64=240) failed to parse row as record: failed to unmarshal spans for eebcb81f-0d1a-4ee1-9416-21d4f1cb021a: unexpected EOF - %!s(uint64=50) %!s(bool=false) DEV %!s(uint32=0) %!s(uint32=0) %!s(uint32=0)}]" should have 1 item(s), but has 2
        	Test:       	TestCorruptData/corrupt_spans
    panic.go:613: -- test log scope end --
test logs left over in: /go/src/github.com/cockroachdb/cockroach/artifacts/logTestCorruptData_corrupt_spans802016722
    --- FAIL: TestCorruptData/corrupt_spans (0.24s)
Help

See also: [How To Investigate a Go Test Failure \(internal\)](https://cockroachlabs.atlassian.net/l/c/HgfXfJgM)Parameters in this failure:

  • GOFLAGS=-parallel=4

/cc @cockroachdb/kv

This test on roachdash | Improve this report!

@ajwerner
Copy link
Contributor

ajwerner commented Nov 7, 2021

Sorry for ignoring this: #72503.

craig bot pushed a commit that referenced this issue Nov 8, 2021
72358: ccl/server/sql: Fix failing reset index usage unit test r=lindseyjin a=lindseyjin

Resolves #72254

Previously, the unit test for resetting index usage statistics was
failing. This was happening because of two reasons:
1. The IndexUsageStatsController was not being set properly for DistSQL
server configs.
2. Noise from other tests was sometimes populating the index usage
statistics table right after it was reset, causing the check for a newly
reset table to sometimes randomly fail.

To fix these issues, we have implemented the following:
1. Created a test case to reproduce the first error, and correctly
populated IndexUsageStatsController for DistSQL server configs.
2. Updated the unit test to query for the specific test table id, to
avoid other tables polluting the test space. The test has also been
updated to check for a correct LastReset time.

Release note: none

72466: opt: clean up FK ON UPDATE CASCADE code and opt trees r=mgartner a=mgartner

#### opt: make FK ON UPDATE CASCADE input columns anonymous

This commit anonymizes the columns in the input of a foreign key
ON UPDATE CASCADE expression. This is safe because these columns can
only be referenced by other expressions if they are update columns, and
in that case, `mutationBuilder.addUpdateCascade` will give them a
distinct name in the scope it produces. This change allows a special
case added in #57153 to be removed, without failing the regression test.

Release note: None

#### opt: use more accurate FK ON UPDATE column metadata names

Columns produced by the WithScan expression in a foreign key ON UPDATE
CASCADE now have metadata names based on the column names of the child
table, rather than the parent table. This more accurately describes the
columns in the cascade plan because the plan is concerned with the child
table, not the parent. Metadata names are only used in opt expression
trees, so this is purely an aesthetic change, not a semantic one.

Release note: None


72493: sessionphase: fix service latency computation in an error case r=yuzefovich a=yuzefovich

If we encounter an error during the logical planning, then the end
execution phase is never set. Previously, we would use that unset value
in order to compute the latency of planning and execution and would get
a negative result. This is now fixed.

Release note: None

72503: protectedts/ptstorage: deflake test r=ajwerner a=ajwerner

The test asserted that the error message appeared just once. If there's a retry
the error can appear in the trace more than once. This commit cleans up that
condition.

Fixes #67972

Release note: None

Co-authored-by: Lindsey Jin <[email protected]>
Co-authored-by: Marcus Gartner <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Andrew Werner <[email protected]>
@craig craig bot closed this as completed in b76d3c5 Nov 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants