Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TestRangeLookupWithOpenTransaction timeout #747

Closed
tbg opened this issue Apr 22, 2015 · 2 comments · Fixed by #776
Closed

TestRangeLookupWithOpenTransaction timeout #747

tbg opened this issue Apr 22, 2015 · 2 comments · Fixed by #776
Labels
C-test-failure Broken test (automatically or manually discovered).

Comments

@tbg
Copy link
Member

tbg commented Apr 22, 2015

The following test appears to have failed:

#1637:

I0422 19:03:16.912589     230 node.go:225] Started node with [[]=] engine(s) and attributes []
I0422 19:03:16.912720     230 server.go:163] starting https server at 127.0.0.1:40591
I0422 19:03:16.913986     230 multiraft.go:410] node 100000001 starting
I0422 19:03:17.958851     230 client.go:148] client 127.0.0.1:40591 connected
I0422 19:03:17.959470     230 client.go:222] client 127.0.0.1:40591 starting heartbeat
--- FAIL: TestRangeLookupWithOpenTransaction (1.12s)
    dist_sender_server_test.go:79: get request did not succeed in face of range metadata intent
=== RUN TestMultiRangeScan
I0422 19:03:17.995105     230 rocksdb.go:87] opening rocksdb instance at ""
I0422 19:03:18.000988     230 multiraft.go:410] node 100000001 starting
I0422 19:03:18.001999     230 raft.go:390] raft: 100000001 became follower at term 5
I0422 19:03:18.002214     230 raft.go:207] raft: newRaft 100000001 [peers: [100000001], term: 5, commit: 10, applied: 10, lastindex: 10, lastterm: 5]
I0422 19:03:18.002376     230 raft.go:464] raft: 100000001 is starting a new election at term 5
I0422 19:03:18.002456     230 raft.go:403] raft: 100000001 became candidate at term 6
I0422 19:03:18.002509     230 raft.go:447] raft: 100000001 received vote from 100000001 at term 6
I0422 19:03:18.002610     230 raft.go:426] raft: 100000001 became leader at term 6
--
I0422 19:03:50.801251     230 multiraft.go:688] node 100000001: group 1 raft ready
I0422 19:03:50.801333     230 multiraft.go:693] HardState updated: {Term:6 Vote:4294967297 Commit:14 XXX_unrecognized:[]}
I0422 19:03:50.801801     230 multiraft.go:696] New Entry[0]: 6/14 EntryNormal 13d76b579fcbe5a708c618565f3cc9d7: raft_id:1 cmd:<put:<header:<timestamp:<wall_time:1429729430800730575 logical:0 > cmd_id:<wall_time:1429729430800688551 random:632219556936206807 > ke
I0422 19:03:50.802157     230 multiraft.go:699] Committed Entry[0]: 6/14 EntryNormal 13d76b579fcbe5a708c618565f3cc9d7: raft_id:1 cmd:<put:<header:<timestamp:<wall_time:1429729430800730575 logical:0 > cmd_id:<wall_time:1429729430800688551 random:632219556936206807 > ke
--- PASS: TestKeysAndBodyArePreserved (0.33s)
FAIL
FAIL    github.com/cockroachdb/cockroach/kv 83.604s
=== RUN TestHeartbeatSingleGroup
I0422 19:02:37.917025     238 multiraft.go:410] node 1 starting
I0422 19:02:37.917764     238 multiraft.go:410] node 2 starting
I0422 19:02:37.918623     238 raft.go:390] raft: 1 became follower at term 5
I0422 19:02:37.918785     238 raft.go:207] raft: newRaft 1 [peers: [1,2], term: 5, commit: 10, applied: 10, lastindex: 10, lastterm: 5]
I0422 19:02:37.919933     238 raft.go:390] raft: 2 became follower at term 5
I0422 19:02:37.920104     238 raft.go:207] raft: newRaft 2 [peers: [1,2], term: 5, commit: 10, applied: 10, lastindex: 10, lastterm: 5]
I0422 19:02:37.920287     238 raft.go:464] raft: 1 is starting a new election at term 5
I0422 19:02:37.920413     238 raft.go:403] raft: 1 became candidate at term 6
I0422 19:02:37.920491     238 raft.go:447] raft: 1 received vote from 1 at term 6
I0422 19:03:16.912589     230 node.go:225] Started node with [[]=] engine(s) and attributes []
I0422 19:03:16.912720     230 server.go:163] starting https server at 127.0.0.1:40591
I0422 19:03:16.913986     230 multiraft.go:410] node 100000001 starting
I0422 19:03:17.958851     230 client.go:148] client 127.0.0.1:40591 connected
I0422 19:03:17.959470     230 client.go:222] client 127.0.0.1:40591 starting heartbeat
--- FAIL: TestRangeLookupWithOpenTransaction (1.12s)
    dist_sender_server_test.go:79: get request did not succeed in face of range metadata intent
=== RUN TestMultiRangeScan
I0422 19:03:17.995105     230 rocksdb.go:87] opening rocksdb instance at ""
I0422 19:03:18.000988     230 multiraft.go:410] node 100000001 starting
I0422 19:03:18.001999     230 raft.go:390] raft: 100000001 became follower at term 5
I0422 19:03:18.002214     230 raft.go:207] raft: newRaft 100000001 [peers: [100000001], term: 5, commit: 10, applied: 10, lastindex: 10, lastterm: 5]
I0422 19:03:18.002376     230 raft.go:464] raft: 100000001 is starting a new election at term 5
I0422 19:03:18.002456     230 raft.go:403] raft: 100000001 became candidate at term 6
I0422 19:03:18.002509     230 raft.go:447] raft: 100000001 received vote from 100000001 at term 6
I0422 19:03:18.002610     230 raft.go:426] raft: 100000001 became leader at term 6
--
I0422 19:03:50.801251     230 multiraft.go:688] node 100000001: group 1 raft ready
I0422 19:03:50.801333     230 multiraft.go:693] HardState updated: {Term:6 Vote:4294967297 Commit:14 XXX_unrecognized:[]}
I0422 19:03:50.801801     230 multiraft.go:696] New Entry[0]: 6/14 EntryNormal 13d76b579fcbe5a708c618565f3cc9d7: raft_id:1 cmd:<put:<header:<timestamp:<wall_time:1429729430800730575 logical:0 > cmd_id:<wall_time:1429729430800688551 random:632219556936206807 > ke
I0422 19:03:50.802157     230 multiraft.go:699] Committed Entry[0]: 6/14 EntryNormal 13d76b579fcbe5a708c618565f3cc9d7: raft_id:1 cmd:<put:<header:<timestamp:<wall_time:1429729430800730575 logical:0 > cmd_id:<wall_time:1429729430800688551 random:632219556936206807 > ke
--- PASS: TestKeysAndBodyArePreserved (0.33s)
FAIL
FAIL    github.com/cockroachdb/cockroach/kv 83.604s
=== RUN TestHeartbeatSingleGroup
I0422 19:02:37.917025     238 multiraft.go:410] node 1 starting
I0422 19:02:37.917764     238 multiraft.go:410] node 2 starting
I0422 19:02:37.918623     238 raft.go:390] raft: 1 became follower at term 5
I0422 19:02:37.918785     238 raft.go:207] raft: newRaft 1 [peers: [1,2], term: 5, commit: 10, applied: 10, lastindex: 10, lastterm: 5]
I0422 19:02:37.919933     238 raft.go:390] raft: 2 became follower at term 5
I0422 19:02:37.920104     238 raft.go:207] raft: newRaft 2 [peers: [1,2], term: 5, commit: 10, applied: 10, lastindex: 10, lastterm: 5]
I0422 19:02:37.920287     238 raft.go:464] raft: 1 is starting a new election at term 5
I0422 19:02:37.920413     238 raft.go:403] raft: 1 became candidate at term 6
I0422 19:02:37.920491     238 raft.go:447] raft: 1 received vote from 1 at term 6

Please assign, take a look and update the issue accordingly.

@tbg tbg added the C-test-failure Broken test (automatically or manually discovered). label Apr 22, 2015
@tbg tbg changed the title Test failure in CI build 1637 TestRangeLookupWithOpenTransaction timeout Apr 22, 2015
@tbg
Copy link
Member Author

tbg commented Apr 22, 2015

note the one second sleep here:

I0422 19:03:16.912589     230 node.go:225] Started node with [[]=] engine(s) and attributes []
I0422 19:03:16.912720     230 server.go:163] starting https server at 127.0.0.1:40591
I0422 19:03:16.913986     230 multiraft.go:410] node 100000001 starting
I0422 19:03:17.958851     230 client.go:148] client 127.0.0.1:40591 connected
I0422 19:03:17.959470     230 client.go:222] client 127.0.0.1:40591 starting heartbeat
    dist_sender_server_test.go:79: get request did not succeed in face of range metadata intent

This looks like an error caused by a slow machine, as the command executed (twice even, but @bdarnell said that shouldn't keep the caller waiting).

@tbg tbg closed this as completed Apr 22, 2015
@tbg
Copy link
Member Author

tbg commented Apr 23, 2015

This happens alarmingly often. I've tried to get it locally and eventually managed (stress -d 15 -c 15 -i 15 -m 15 in the other window and running a tight loop of ./kv.test), but unfortunately even with -v 8 you can see the Node.Get return similarly in both failing and non-failing runs, and it's probably just a timeout. I'll up the timeout significantly tomorrow; after all, if it's a deadlock, we'll see it again.
Reopening for visibility.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-test-failure Broken test (automatically or manually discovered).
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant