Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

teamcity: failed test: TestPlannerLogic #34911

Closed
cockroach-teamcity opened this issue Feb 14, 2019 · 2 comments
Closed

teamcity: failed test: TestPlannerLogic #34911

cockroach-teamcity opened this issue Feb 14, 2019 · 2 comments
Assignees
Labels
C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot.
Milestone

Comments

@cockroach-teamcity
Copy link
Member

The following tests appear to have failed on master (test): TestPlannerLogic/local, TestPlannerLogic/local: TestPlannerLogic/local/show_trace, TestPlannerLogic

You may want to check for open issues.

#1138761:

TestPlannerLogic/local
--- FAIL: test/TestPlannerLogic: TestPlannerLogic/local (0.000s)




TestPlannerLogic/local: TestPlannerLogic/local/show_trace
...<semantic_type:INT width:64 precision:0 visible_type:BIGINT > nullable:false hidden:false > columns:<name:"v" id:2 type:<semantic_type:INT width:64 precision:0 visible_type:BIGINT > nullable:true hidden:false > next_column_id:3 families:<name:"primary" id:0 column_names:"k" column_names:"v" column_ids:1 column_ids:2 default_column_id:2 > next_family_id:1 primary_index:<name:"primary" id:1 unique:true column_names:"k" column_directions:ASC column_ids:1 foreign_key:<table:0 index:0 name:"" validity:Validated shared_prefix_len:0 on_delete:NO_ACTION on_update:NO_ACTION match:SIMPLE > interleave:<> partitioning:<num_columns:0 > type:FORWARD > next_index_id:3 privileges:<users:<user:"admin" privileges:2 > users:<user:"root" privileges:2 > > next_mutation_id:3 format_version:3 state:DROP draining_names:<parent_id:53 name:"kv" > view_query:"" drop_time:... replacement_of:<id:0 time:<> > audit_mode:DISABLED drop_job_id:... gc_mutations:<index_id:2 drop_time:... job_id:... > >
                    dist sender send  r6: sending batch 1 Put to (n1,s1):1
                    sql txn           rows affected: 0
                    dist sender send  r11: sending batch 8 QueryIntent to (n1,s1):1
                    dist sender send  r6: sending batch 1 EndTxn to (n1,s1):1
                    
                but found (query options: "") :
                    dist sender send  r6: sending batch 1 Get to (n1,s1):1
                    dist sender send  r6: sending batch 1 Get to (n1,s1):1
                    dist sender send  r6: sending batch 1 Get to (n1,s1):1
                    dist sender send  r6: sending batch 1 Get to (n1,s1):1
                    dist sender send  r6: sending batch 1 Get to (n1,s1):1
                    dist sender send  r6: sending batch 1 Get to (n1,s1):1
                    dist sender send  r6: sending batch 1 Get to (n1,s1):1
                    dist sender send  r6: sending batch 1 Get to (n1,s1):1
                    dist sender send  r6: sending batch 1 Get to (n1,s1):1
                    dist sender send  r6: sending batch 1 Get to (n1,s1):1
                    flow              Put /Table/3/1/54/2/1 -> table:<name:"kv" id:54 parent_id:53 version:8 modification_time:<wall_time:... > columns:<name:"k" id:1 type:<semantic_type:INT width:64 precision:0 visible_type:BIGINT > nullable:false hidden:false > columns:<name:"v" id:2 type:<semantic_type:INT width:64 precision:0 visible_type:BIGINT > nullable:true hidden:false > next_column_id:3 families:<name:"primary" id:0 column_names:"k" column_names:"v" column_ids:1 column_ids:2 default_column_id:2 > next_family_id:1 primary_index:<name:"primary" id:1 unique:true column_names:"k" column_directions:ASC column_ids:1 foreign_key:<table:0 index:0 name:"" validity:Validated shared_prefix_len:0 on_delete:NO_ACTION on_update:NO_ACTION match:SIMPLE > interleave:<> partitioning:<num_columns:0 > type:FORWARD > next_index_id:3 privileges:<users:<user:"admin" privileges:2 > users:<user:"root" privileges:2 > > next_mutation_id:3 format_version:3 state:DROP draining_names:<parent_id:53 name:"kv" > view_query:"" drop_time:... replacement_of:<id:0 time:<> > audit_mode:DISABLED drop_job_id:... gc_mutations:<index_id:2 drop_time:... job_id:... > >
                    dist sender send  r6: sending batch 1 Put to (n1,s1):1
                    sql txn           rows affected: 0
                    dist sender send  r11: sending batch 8 QueryIntent to (n1,s1):1
                    dist sender send  r6: sending batch 1 EndTxn to (n1,s1):1
                    dist sender send  r6: sending batch 1 PushTxn to (n1,s1):1
                    dist sender send  r11: sending batch 2 ResolveIntent to (n1,s1):1
                    
                
            logic.go:2345: 
                testdata/planner_test/show_trace:348: error while processing
            logic.go:2346: testdata/planner_test/show_trace:348: too many errors encountered, skipping the rest of the input



TestPlannerLogic
--- FAIL: test/TestPlannerLogic (10.880s)

------- Stdout: -------
    test_log_scope.go:81: test logs captured to: /tmp/logTestPlannerLogic088024876
    test_log_scope.go:62: use -show-logs to present logs inline




Please assign, take a look and update the issue accordingly.

@cockroach-teamcity cockroach-teamcity added this to the 2.2 milestone Feb 14, 2019
@cockroach-teamcity cockroach-teamcity added C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. labels Feb 14, 2019
@tbg tbg assigned andy-kimball and unassigned jordanlewis Feb 19, 2019
@andy-kimball
Copy link
Contributor

This doesn't seem to be a planning issue. The diff shows these two additional rows at the end of the log:

                    dist sender send  r6: sending batch 1 PushTxn to (n1,s1):1
                    dist sender send  r11: sending batch 2 ResolveIntent to (n1,s1):1

I'm assuming this is a test flake. @nvanbenschoten, do you know what's causing these to be added? Are they non-deterministic? Do we need to somehow exclude them from expected trace results?

@nvanbenschoten
Copy link
Member

This looks like a duplicate of #33720, where we also saw an unexpected push. Without digging in too deeply here, I suspect that we're seeing essentially what @andreimatei speculated in #33720 (comment): a previous transaction may have failed to asynchronously clean up its intent in time and we're running into it in a later transaction.

We're already ignoring a number of trace messages here. We might just want to ignore this source of non-determinism as well.

andy-kimball added a commit to andy-kimball/cockroach that referenced this issue Feb 21, 2019
Filter out PushTxn/ResolveIntent/SystemConfigSpan trace messages, since they
represent intent cleanup / system activity from earlier operations that
interferes with subsequent tests. This was already being done in other
variations, so this commit extends that to all the variations that are doing
filtering of messages.

Fixes cockroachdb#34911
Fixes cockroachdb#35093

Release note: None
craig bot pushed a commit that referenced this issue Feb 21, 2019
35081: sql: Fix test flake r=andy-kimball a=andy-kimball

Filter out PushTxn/ResolveIntent trace messages, since they represent intent
cleanup from earlier operations that interferes with subsequent tests. This
was already being done in other variations, so this commit extends that to
all the variations that are doing filtering of messages.

Fixes #34911

Release note: None

35086: mvcc: don't write below provisional commit timestamp on WriteTooOld errors r=tbg a=nvanbenschoten

Fixes #34853.

This change fixes #34853, which appears to have been broken by
57d0201#diff-41e9e1b7829f8e25daeb564f42c26017L1514.
That change made it so that intents are now written at a transaction's
provisional commit timestamp instead of its original timestamp (which is
fantastic!). In doing so, it removed special-purpose code to forward the
`MVCCLogicalOpDetails` timestamp to the provisional commit timestamp of the
transaction because it assumed that the new `writeTimestamp` variable that the
change introduced already assumed this role.

Unfortunately, the change allowed this `writeTimestamp` variable to regress
below the transaction's provisional commit timestamp in subtle cases where a
committed value ended up between the transaction's original timestamp and its
provisional commit timestamp (a case that generates `WriteTooOldError`s). In
this case, we were replacing the `writeTimestamp` with the committed value's
timestamp + 1 logical tick. This regression allowed `MVCCWriteIntentOp`s to look
like they were ignoring closed timestamps.

The fix was to either introduce the special purpose code again or to avoid
letting `writeTimestamp` regress in this `WriteTooOldError` code path. This PR
chose the latter solution.

In doing so, this PR also improves upon the benefit already provided by #32487 -
it reduces the number of intents that need to be moved on transaction commit.
This is because it ensures that if a transaction has an original (read)
timestamp below a committed value and a provisional commit (write) timestamp
above a committed value, an intent left by that transaction will be in a
"committable" state without a need to rewrite it during intent resolution.

Release note: None

35112: kv: assert an absence of retries in TestTxnCoordSenderRetries  r=tbg,andreimatei a=nvanbenschoten

The test was only asserting that cases that should observe retries did, not the cases that should not observe retries did not.

The PR also fixes `TestPropagateTxnOnError`, which wasn't really testing anything. We now set txn.Writing on the client, so it wasn't verifying what it thought it was. We switch it to assert that observed timestamps are propagated on error instead.

35118: storage: Downgrade a spammy log message r=tbg a=bdarnell

This log message only occurs on clusters with multiple stores per
node, but on those clusters it fires multiple times per second.

Release note: None

Co-authored-by: Andrew Kimball <[email protected]>
Co-authored-by: Nathan VanBenschoten <[email protected]>
Co-authored-by: Ben Darnell <[email protected]>
@craig craig bot closed this as completed in #35081 Feb 21, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot.
Projects
None yet
Development

No branches or pull requests

4 participants