stability: crash at startup: panic: [group 903] entries[1:12) is unavailable from storage #7389

Closed
mberhault opened this issue Jun 22, 2016 · 19 comments

@mberhault
Contributor

Trying to upgrade rho to the beta candidate sha: f447eb0

On restart, all nodes crashed with:

I160622 14:16:15.777437 cli/start.go:317  CockroachDB beta-20160609-386-gf447eb0 (linux amd64, built 2016/06/22 13:18:56, go1.6.2)
I160622 14:16:15.777587 cli/start.go:148  writing go and jemalloc memory profiles to /home/ubuntu/logs every 10s
I160622 14:16:15.777782 server/context.go:265  1 storage engine initialized
I160622 14:16:15.777909 cli/start.go:334  starting cockroach node
W160622 14:16:15.791920 gossip/gossip.go:897  not connected to cluster; use --join to specify a connected node
I160622 14:16:15.805285 gossip/gossip.go:923  starting client to 104.196.42.152:26257
I160622 14:16:15.806082 storage/engine/rocksdb.go:129  opening rocksdb instance at "/mnt/data"
I160622 14:16:16.805855 gossip/gossip.go:923  starting client to 104.196.105.82:26257
I160622 14:16:16.806195 /go/src/google.golang.org/grpc/clientconn.go:499  grpc: Conn.resetTransport failed to create client transport: connection error: desc = "transport: dial tcp 104.196.42.152:26257: getsockopt: connection refused"; Reconnecting to "104.196.42.152:26257"
I160622 14:16:18.221311 server/node.go:414  initialized store store=2:2 ([]=/mnt/data): {Capacity:396201193472 Available:340016451584 RangeCount:0}
I160622 14:16:18.221748 server/node.go:302  node ID 2 initialized
I160622 14:16:18.222765 storage/stores.go:292  read 3 node addresses from persistent storage
I160622 14:16:18.223194 server/node.go:535  connecting to gossip network to verify cluster ID...
I160622 14:16:18.228667 gossip/gossip.go:923  starting client to cockroach-rho-3:26257
E160622 14:16:18.466050 raft/log.go:308  [group 903] entries[1:12) is unavailable from storage
panic: [group 903] entries[1:12) is unavailable from storage

goroutine 44 [running]:
panic(0x1a75440, 0xc8203826d0)
        /usr/local/go/src/runtime/panic.go:481 +0x3ff
github.com/cockroachdb/cockroach/storage.(*raftLogger).Panicf(0xc82039def8, 0x2039a40, 0x2a, 0xc82041a340, 0x2, 0x2)
        /go/src/github.com/cockroachdb/cockroach/storage/raft.go:117 +0x1cb
github.com/coreos/etcd/raft.(*raftLog).slice(0xc8201ced90, 0x1, 0xc, 0xffffffffffffffff, 0x0, 0x0, 0x0, 0x0, 0x0)
        /go/src/github.com/coreos/etcd/raft/log.go:308 +0x581
github.com/coreos/etcd/raft.(*raftLog).nextEnts(0xc8201ced90, 0x0, 0x0, 0x0)
        /go/src/github.com/coreos/etcd/raft/log.go:142 +0x107
github.com/coreos/etcd/raft.newReady(0xc820c76b60, 0xc8203825b0, 0x6, 0x1, 0xb, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /go/src/github.com/coreos/etcd/raft/node.go:475 +0xe1
github.com/coreos/etcd/raft.(*RawNode).newReady(0xc820c6af00, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /go/src/github.com/coreos/etcd/raft/rawnode.go:40 +0xbb
github.com/coreos/etcd/raft.(*RawNode).Ready(0xc820c6af00, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /go/src/github.com/coreos/etcd/raft/rawnode.go:182 +0x58
github.com/cockroachdb/cockroach/storage.(*Replica).handleRaftReady.func1(0xc820c6af00, 0x0, 0x0)
        /go/src/github.com/cockroachdb/cockroach/storage/replica.go:1352 +0xc8
github.com/cockroachdb/cockroach/storage.(*Replica).withRaftGroupLocked(0xc820425180, 0xc82086d4d0, 0x0, 0x0)
        /go/src/github.com/cockroachdb/cockroach/storage/replica.go:280 +0x60d
github.com/cockroachdb/cockroach/storage.(*Replica).handleRaftReady(0xc820425180, 0x0, 0x0)
        /go/src/github.com/cockroachdb/cockroach/storage/replica.go:1355 +0x19f
github.com/cockroachdb/cockroach/storage.(*Store).processRaft.func1()
        /go/src/github.com/cockroachdb/cockroach/storage/store.go:2106 +0x548
github.com/cockroachdb/cockroach/util/stop.(*Stopper).RunWorker.func1(0xc820398f50, 0xc820824c90)
        /go/src/github.com/cockroachdb/cockroach/util/stop/stopper.go:139 +0x60
created by github.com/cockroachdb/cockroach/util/stop.(*Stopper).RunWorker
        /go/src/github.com/cockroachdb/cockroach/util/stop/stopper.go:140 +0x70

Here are the per-node panic messages:

104.196.42.152
E160622 14:16:14.654867 raft/log.go:308  [group 903] entries[1:632) is unavailable from storage
panic: [group 903] entries[1:632) is unavailable from storage

104.196.105.82
E160622 14:16:18.466050 raft/log.go:308  [group 903] entries[1:12) is unavailable from storage
panic: [group 903] entries[1:12) is unavailable from storage

104.196.108.90
E160622 14:16:21.794437 raft/log.go:308  [group 901] entries[1:57) is unavailable from storage
panic: [group 901] entries[1:57) is unavailable from storage

104.196.19.81
E160622 14:16:43.618335 raft/log.go:308  [group 901] entries[1:57) is unavailable from storage
panic: [group 901] entries[1:57) is unavailable from storage
@mberhault
Contributor Author

I'll leave things as they are for now, but then I'll try reverting back to the previous binary (sha a51f9e4)

@mberhault
Contributor Author

data and logs backed up on all nodes to /mnt/data/backup.7389 and ~/logs.7389 respectively.

@mberhault
Contributor Author

confirmed that rolling back to a51f9e4 works.

@petermattis
Collaborator

This looks related to (or the same as) #7386.

@mberhault
Contributor Author

This is still broken on rho using sha 6d60fb6 (before the merge of the original PR).
see: https://github.com/cockroachdb/cockroach/tree/data-race for actual code

@petermattis
Collaborator

@mberhault Still broken as in you're still seeing the same crashes?

@mberhault
Contributor Author

mberhault commented Jun 22, 2016

Yup. Here's a summary of the running sha and panic message (or lack thereof) per node.

104.196.42.152
I160622 14:16:12.840036 cli/start.go:317  CockroachDB beta-20160609-386-gf447eb0 (linux amd64, built 2016/06/22 13:18:56, go1.6.2)
E160622 14:16:14.654867 raft/log.go:308  [group 903] entries[1:632) is unavailable from storage
panic: [group 903] entries[1:632) is unavailable from storage
I160622 14:58:34.821947 cli/start.go:317  CockroachDB beta-20160609-386-gf447eb0 (linux amd64, built 2016/06/22 13:18:56, go1.6.2)
E160622 14:58:37.492956 raft/log.go:308  [group 903] entries[1:632) is unavailable from storage
panic: [group 903] entries[1:632) is unavailable from storage
I160622 14:59:34.710186 cli/start.go:228  CockroachDB beta-20160609-194-ga51f9e4 (linux amd64, built 2016/06/15 18:33:37, go1.6.2)
I160622 15:46:06.689434 cli/start.go:317  CockroachDB beta-20160609-377-g6d60fb6 (linux amd64, built 2016/06/22 15:41:01, go1.6.2)
E160622 15:46:11.426498 raft/log.go:308  [group 809] entries[1:144) is unavailable from storage
panic: [group 809] entries[1:144) is unavailable from storage

104.196.105.82
I160620 12:15:35.823889 cli/start.go:228  CockroachDB beta-20160609-194-ga51f9e4 (linux amd64, built 2016/06/15 18:33:37, go1.6.2)
I160622 14:16:15.777437 cli/start.go:317  CockroachDB beta-20160609-386-gf447eb0 (linux amd64, built 2016/06/22 13:18:56, go1.6.2)
E160622 14:16:18.466050 raft/log.go:308  [group 903] entries[1:12) is unavailable from storage
panic: [group 903] entries[1:12) is unavailable from storage
I160622 14:58:37.569923 cli/start.go:317  CockroachDB beta-20160609-386-gf447eb0 (linux amd64, built 2016/06/22 13:18:56, go1.6.2)
E160622 14:58:39.802549 raft/log.go:308  [group 903] entries[1:12) is unavailable from storage
panic: [group 903] entries[1:12) is unavailable from storage
I160622 14:59:37.541498 cli/start.go:228  CockroachDB beta-20160609-194-ga51f9e4 (linux amd64, built 2016/06/15 18:33:37, go1.6.2)
I160622 15:47:11.002354 cli/start.go:317  CockroachDB beta-20160609-377-g6d60fb6 (linux amd64, built 2016/06/22 15:41:01, go1.6.2)
E160622 15:47:13.799225 raft/log.go:308  [group 809] entries[1:147) is unavailable from storage
panic: [group 809] entries[1:147) is unavailable from storage

104.196.108.90
I160622 14:16:18.842993 cli/start.go:317  CockroachDB beta-20160609-386-gf447eb0 (linux amd64, built 2016/06/22 13:18:56, go1.6.2)
E160622 14:16:21.794437 raft/log.go:308  [group 901] entries[1:57) is unavailable from storage
panic: [group 901] entries[1:57) is unavailable from storage
I160622 14:58:40.241841 cli/start.go:317  CockroachDB beta-20160609-386-gf447eb0 (linux amd64, built 2016/06/22 13:18:56, go1.6.2)
E160622 14:58:42.673173 raft/log.go:308  [group 901] entries[1:57) is unavailable from storage
panic: [group 901] entries[1:57) is unavailable from storage
I160622 14:59:40.668983 cli/start.go:228  CockroachDB beta-20160609-194-ga51f9e4 (linux amd64, built 2016/06/15 18:33:37, go1.6.2)
I160622 15:48:15.250553 cli/start.go:317  CockroachDB beta-20160609-377-g6d60fb6 (linux amd64, built 2016/06/22 15:41:01, go1.6.2)
E160622 15:48:42.235954 raft/log.go:308  [group 895] entries[1:4495) is unavailable from storage
panic: [group 895] entries[1:4495) is unavailable from storage

104.196.19.81
I160622 14:16:41.893305 cli/start.go:317  CockroachDB beta-20160609-386-gf447eb0 (linux amd64, built 2016/06/22 13:18:56, go1.6.2)
E160622 14:16:43.618335 raft/log.go:308  [group 901] entries[1:57) is unavailable from storage
panic: [group 901] entries[1:57) is unavailable from storage
I160622 14:58:42.957817 cli/start.go:317  CockroachDB beta-20160609-386-gf447eb0 (linux amd64, built 2016/06/22 13:18:56, go1.6.2)
E160622 14:58:44.578452 raft/log.go:308  [group 901] entries[1:57) is unavailable from storage
panic: [group 901] entries[1:57) is unavailable from storage
I160622 14:59:43.307353 cli/start.go:228  CockroachDB beta-20160609-194-ga51f9e4 (linux amd64, built 2016/06/15 18:33:37, go1.6.2)
I160622 15:49:06.357349 cli/start.go:317  CockroachDB beta-20160609-377-g6d60fb6 (linux amd64, built 2016/06/22 15:41:01, go1.6.2)

@tbg
Member

tbg commented Jun 22, 2016

I'll look at the original test failure. It's not surprising that once we get into this corrupted state, we stay there. Will update this issue with my findings.

@mberhault
Contributor Author

mberhault commented Jun 22, 2016

The same is happening on gamma. I'm pretty sure (90%) this happened without my having even touched the data, using code that has Pete's change in it.

Summary of running binaries and panics, plus annotations of what I did here and there. This is just node 1, but I was doing the same thing on the others, including full stops of all nodes before restarting with a different binary.

104.196.125.166
I160620 19:01:59.726796 cli/start.go:317  CockroachDB beta-20160609-299-g554347c (linux amd64, built 2016/06/20 09:58:52, go1.6.2)

# Stop all nodes and backup all data for each node.

I160622 16:32:09.149953 cli/start.go:317  CockroachDB beta-20160609-386-gf447eb0 (linux amd64, built 2016/06/22 14:06:28, go1.6.2)
E160622 16:32:17.008855 raft/log.go:308  [group 4365] entries[1:13) is unavailable from storage
panic: [group 4365] entries[1:13) is unavailable from storage

# Restore data.

I160622 16:33:23.264607 cli/start.go:317  CockroachDB beta-20160609-299-g554347c (linux amd64, built 2016/06/20 09:58:52, go1.6.2)
I160622 18:32:39.909682 cli/start.go:317  CockroachDB beta-20160609-377-g6d60fb6 (linux amd64, built 2016/06/22 15:30:48, go1.6.2)
I160622 19:01:52.240503 cli/start.go:317  CockroachDB beta-20160609-377-g6d60fb6 (linux amd64, built 2016/06/22 15:30:48, go1.6.2)
E160622 19:01:55.343373 raft/log.go:308  [group 4403] entries[1:12) is unavailable from storage
panic: [group 4403] entries[1:12) is unavailable from storage

# Restore data.

I160622 19:03:40.642784 cli/start.go:317  CockroachDB beta-20160609-299-g554347c (linux amd64, built 2016/06/20 09:58:52, go1.6.2)
I160622 19:11:30.887252 cli/start.go:317  CockroachDB beta-20160609-377-g6d60fb6 (linux amd64, built 2016/06/22 15:30:48, go1.6.2)
E160622 19:11:33.058289 raft/log.go:308  [group 4365] entries[1:16) is unavailable from storage
panic: [group 4365] entries[1:16) is unavailable from storage

# Restore data.

I160622 19:14:22.811144 cli/start.go:317  CockroachDB beta-20160609-377-g6d60fb6 (linux amd64, built 2016/06/22 15:30:48, go1.6.2)
E160622 19:14:24.449988 raft/log.go:308  [group 4365] entries[1:13) is unavailable from storage
panic: [group 4365] entries[1:13) is unavailable from storage

# Restore data.

I160622 19:15:00.796534 cli/start.go:317  CockroachDB beta-20160609-299-g554347c (linux amd64, built 2016/06/20 09:58:52, go1.6.2)

@tbg
Member

tbg commented Jun 22, 2016

Documenting some of the post-mortem work:

ubuntu@cockroach-rho-1:~$ sudo ./cockroach.race debug check-store /mnt/data/
range 901: truncated index 0 should equal first index 11 - 1
range 809: truncated index 0 should equal first index 11 - 1
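
The invariant this check is complaining about boils down to one comparison: the raft log's first index must sit directly after the truncation point recorded in the TruncatedState. Here is a minimal standalone sketch of that relation (illustrative only, not the actual debug check-store code):

package main

import "fmt"

// checkTruncatedState verifies the invariant reported above: the truncated
// index must equal the log's first index minus one.
func checkTruncatedState(rangeID, truncatedIndex, firstIndex uint64) error {
	if truncatedIndex != firstIndex-1 {
		return fmt.Errorf("range %d: truncated index %d should equal first index %d - 1",
			rangeID, truncatedIndex, firstIndex)
	}
	return nil
}

func main() {
	// Reproduces the report for range 901: the TruncatedState read from disk
	// is the zero value (index 0) while the log actually starts at index 11.
	if err := checkTruncatedState(901, 0, 11); err != nil {
		fmt.Println(err)
	}
}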

I looked at some other dumps with @mberhault (preceding his upgrades, I think on gamma), and there, too, we saw similarly problematic ranges. But in this case, these ranges are the ones causing the crash (see the panics above).

From sudo ./cockroach.race debug range-data /mnt/data/ 809 | less -S we see that there is data in the state, albeit almost nothing (indicating that likely something went through and deleted everything, like in #7386). We see exactly two transaction entries:

/Local/Range/"\x04tsd\x12cr.store.rocksdb.block.cache.hits\x00\x01\x89\xf8\x06621"/Transaction/addrKey:/id:"0c93bcd2-cf10-41a1-ab9b-2e5113598929": "storage/replica_command.go:2687 (

/Local/Range/"\x04tsd\x12cr.store.rocksdb.block.cache.hits\x00\x01\x89\xf8\x06621"/Transaction/addrKey:/id:"7ed6542b-43cd-4d35-9513-91088d92b5d3": "storage/replica_command.go:2170 (

That's not really a smoking gun since this range contains mostly time-series data, so we didn't really expect any other transactions to be anchored here. Still, let's look at the timestamps.

ChangeReplicas: 1466425098 (6/20/2016, 8:18:18 AM GMT-4:00 DST)
AdminSplit: 1466094798 (6/16/2016, 12:33:18 PM GMT-4:00 DST)

In the most likely interpretation, something went wrong when the ChangeReplicas went through on the morning of 6/20 - it's conceivable that the GC queue would've picked up the Replica and destroyed it, but that something (some other queue) managed to use the Replica struct, leading to an error similar to #7386.

Playing the same game for the other Replica, 901, reveals only a split but no rebalance. Possibly another node would have more.

What's a bit puzzling with the above narrative is that this should really only clobber a Replica, not the Replica set. Going to look more.

@tbg
Member

tbg commented Jun 22, 2016

Picking up from that last clue (all of the replica set is clobbered), I looked at the replica on the second node. The output of sudo ./cockroach.race debug range-data /mnt/data/ 809 | less -S is identical, but there's something interesting going on. For simplicity, here's (almost) the full thing (omitting only some abort cache entries), truncated for legibility:

/Local/RangeID/809/r/RaftAppliedIndex: 56
/Local/RangeID/809/r/RangeLeaderLease: replica {1 1 3} 2016-06-20 13:11:59.5639
/Local/RangeID/809/r/LeaseAppliedIndex: 14
/Local/RangeID/809/r/RangeStats: last_update_nanos:1466425098261818005 intent_a
1466425098.261818005,0 /Local/Range/"\x04tsd\x12cr.store.rocksdb.block.cache.hi
        Raw:range_id:809 start_key:"\004tsd\022cr.store.rocksdb.block.cache.hit

1466425097.923246709,0 /Local/Range/"\x04tsd\x12cr.store.rocksdb.block.cache.hi
        Raw:range_id:809 start_key:"\004tsd\022cr.store.rocksdb.block.cache.hit

1466094798.533753849,0 /Local/Range/"\x04tsd\x12cr.store.rocksdb.block.cache.hi
        Raw:range_id:809 start_key:"\004tsd\022cr.store.rocksdb.block.cache.hit

1466094798.533753849,0 /Local/Range/"\x04tsd\x12cr.store.rocksdb.block.cache.hi

/Local/Range/"\x04tsd\x12cr.store.rocksdb.block.cache.hits\x00\x01\x89\xf8\x066

/Local/Range/"\x04tsd\x12cr.store.rocksdb.block.cache.hits\x00\x01\x89\xf8\x066

/System/"tsd\x12cr.store.rocksdb.block.cache.hits\x00\x01\x89\xf8\x06621": "2\x

The interesting bit is that the ChangeReplicas timestamp (1466425098.*) occurs three times, showing up both in the last MVCC update and in the RangeDescriptor key version, and that the stats are completely messed up (maybe explaining why there is only a single timeseries value in the range).

Considering these things makes it unlikely that the replica GC queue is at fault here. After all, the history of the RangeDescriptor shows three healthy nodes, then the addition of Node4, and finally its removal. We're looking at Node{1,2}, so there's no reason we should ever have been clobbered, negating earlier assumptions.

@tbg
Member

tbg commented Jun 22, 2016

At this point, my impression is that we're simply looking at a very skinny range - it contains only a single key, /System/"tsd\x12cr.store.rocksdb.block.cache.hits\x00\x01\x89\xf8\x06621". Why that range was ever created in the first place is a new mystery, but it's mostly empty simply because there was never much in it.

We might be looking at a corruption of the internal state here which comes from a different angle. I'm looking at the Raft log (and internal Raft state) next to get more intel.

@tbg
Member

tbg commented Jun 22, 2016

For posterity: the easiest way to grab everything under the range prefix is

sudo ./cockroach.race debug keys /mnt/data/ --from=$(echo -n -e '\x01i\xf7\x03)') --to=$(echo -n -e '\x01i\xf7\x03)\xff') --raw --values | less -S

where the gibberish hex was obtained by adding this in keys_test.go:

func TestFoo(t *testing.T) {
	// Print the raw key prefix for range ID 1 (the test fails on purpose so
	// the output shows up); copy the %q output into --from above.
	k := MakeRangeIDUnreplicatedPrefix(1)
	t.Fatalf("%q", string(k))
}

(and the --to is just --from with \xff appended).
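
A hypothetical variant of the snippet above (meant to sit in the same keys_test.go, so package and imports are already in place; the test name is made up) would print both bounds at once instead of appending the \xff by hand:

func TestRangePrefixBounds(t *testing.T) {
	// Print both --from and --to in one shot; the %q escapes can be fed to
	// `echo -n -e` exactly as in the command above.
	from := MakeRangeIDUnreplicatedPrefix(1)
	to := append(append([]byte(nil), from...), 0xff)
	t.Fatalf("--from=%q --to=%q", string(from), string(to))
}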

@tbg
Member

tbg commented Jun 22, 2016

Looking at 903 on rho1 (which panic'ed with panic: [group 903] entries[1:632) is unavailable from storage).

From sudo ./cockroach.race debug keys /mnt/data/ --from=$(echo -n -e '\x01i\xf7\x03\x87') --to=$(echo -n -e '\x01i\xf7\x03\x87\xff') --raw --values | less -S I can see that

  • the Raft log starts at 667, with a consistent TruncatedState (index:666 term:23122)
  • HardState {Term:23255 Vote:3 Commit:681} is also consistent with the applied index (677)
  • LastIndex 708 is consistent too.

If I look at the backup instead (/mnt/data/backup.7389), many of those state keys are not written, and that makes total sense. Looking at #7379, we see the reference tests failing on some assertions about persisted state.

It's not unlikely that similar discrepancies would trip up older clusters (for example, before my refactorings some state was synthesized on demand, whereas after them any such state was persisted before the raft group needed said state).

It may be difficult to prove conclusively, but armed with a data dump and an old enough version we should be able to verify this, and hopefully we can add a migration strategy (writing the missing state when booting up the stores). I'll take a look at that tomorrow. After all, #7310 is the change I think is responsible.

If the aforementioned is true, #7404 really seems like an unrelated problem with similar symptoms that happened to pop up at about the same time.

@tbg
Member

tbg commented Jun 22, 2016

The precise location which I think causes trouble is

https://github.com/cockroachdb/cockroach/pull/7310/files#diff-2967750a9f426e20041d924947ff1d46L255

As mentioned before, the ranges we were looking at barely held any data. In particular, nothing really ever touched them, so there was no need to truncate the logs and, as a result, no TruncatedState was ever written. However, the truncated state is used to generate the initial state (required when creating the in-memory Raft group), and the old code used to synthesize the truncatedState (and everything derived from it) on demand.

With the new code, it wouldn't synthesize anything but simply return a zero value, which explains the problem exactly: it should return a FirstIndex of 11 (which is indeed the first index present in the never-truncated log), but the zero truncated index yields a much smaller first index instead. Thus Raft tries to load entries from index 1 onward that were never written, and the panic follows.
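
To make the failure mode concrete, here is a minimal standalone sketch (illustrative types and values, not the actual storage code) of how the first index falls out of the truncated state:

package main

import "fmt"

// TruncatedState stands in for the persisted raft truncated state; Index is
// the highest log index discarded by truncation (not the real proto type).
type TruncatedState struct {
	Index uint64
	Term  uint64
}

// firstIndex captures the relationship relied upon above: the first entry
// available in the log immediately follows the truncation point.
func firstIndex(ts TruncatedState) uint64 {
	return ts.Index + 1
}

func main() {
	// What the old code would have synthesized for a never-truncated range
	// (index 10 matches the expected first index of 11; the term is made up).
	synthesized := TruncatedState{Index: 10, Term: 5}
	fmt.Println(firstIndex(synthesized)) // 11

	// What the new code reads back when nothing was ever persisted: the zero
	// value. The first index collapses to 1, raft then asks storage for
	// entries starting at 1 (e.g. the entries[1:12) in the panic above) that
	// were never written, and the crash follows.
	var missing TruncatedState
	fmt.Println(firstIndex(missing)) // 1
}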

@tbg
Member

tbg commented Jun 23, 2016

I'm going to whip up a "migration strategy" (it's a bit silly: at startup, before the corresponding Replica becomes active, we'll have to synthesize a TruncatedState for those Raft groups which don't have one. The TruncatedState is part of the replicated state, but it should work out fine to write it on each individual replica, as we always overwrite this value blindly to handle the case where a truncation is applied on only some of the replicas). Will test with @mberhault.
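
A rough sketch of that migration idea, under the assumption stated above (a missing TruncatedState simply gets the value the old code would have synthesized); the type, the map standing in for the store, and the constants are illustrative, not the actual change:

package main

import "fmt"

// TruncatedState is an illustrative stand-in for the persisted state.
type TruncatedState struct {
	Index uint64
	Term  uint64
}

const (
	synthesizedTruncatedIndex uint64 = 10 // so that the first index comes out as 11
	synthesizedTruncatedTerm  uint64 = 5  // illustrative value
)

// migrateTruncatedState writes a synthesized TruncatedState for every range
// that has none persisted, before the corresponding Replica becomes active.
// Each replica does this independently; since truncations overwrite the value
// blindly anyway, writing it per replica should be safe in practice.
func migrateTruncatedState(persisted map[uint64]TruncatedState, rangeIDs []uint64) {
	for _, id := range rangeIDs {
		if _, ok := persisted[id]; ok {
			continue // already on disk; nothing to migrate
		}
		persisted[id] = TruncatedState{
			Index: synthesizedTruncatedIndex,
			Term:  synthesizedTruncatedTerm,
		}
	}
}

func main() {
	// Ranges 809 and 901 from the dumps above had no TruncatedState on disk.
	persisted := map[uint64]TruncatedState{}
	migrateTruncatedState(persisted, []uint64{809, 901})
	fmt.Println(persisted[809], persisted[901]) // {10 5} {10 5}
}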

@tamird
Contributor

tamird commented Jun 23, 2016

This migration strategy should be able to salvage the currently borked beta cluster, correct?

@tbg
Member

tbg commented Jun 23, 2016

I thought beta was borked on unrelated bugs. But yes, this will hopefully salvage rho and gamma (though they are possibly too fried at this point; who knows) and allow us to release a beta version.

tbg added a commit to tbg/cockroach that referenced this issue Jun 23, 2016
Add those which were not decoded while looking at cockroachdb#7389.
tbg added a commit to tbg/cockroach that referenced this issue Jun 23, 2016
In cockroachdb#7310, we added (and enforced) assumptions on what was
persisted to disk, but did not ensure that older versions
were migrated properly. This in turn led to cockroachdb#7389.

A migration is only necessary for ranges which had never
been truncated, and so it should be fairly rare in practice.

The migration simply writes the truncated state that would
otherwise been synthesized by the old code, which is technically
a consistency violation, but should work well in practice because
all nodes will write that state on startup when they are rebooted.
tbg added a commit to tbg/cockroach that referenced this issue Jun 24, 2016
Also straightened out some kinks with the existing reference tests, namely that
they were creating two single-node clusters per test which were never used.
Instead, we can now create zero-node clusters and run one-shot containers on
it (one would argue that that's still weird, but it seems acceptable for the
time being).

The newly added reference test will crash when run for commits between cockroachdb#7310
and cockroachdb#7429. See cockroachdb/postgres-test#13 for the initial state used in the
test.
@tbg
Member

tbg commented Jul 5, 2016

Closing as the issue is sufficiently handled (and the more fundamental aspects are discussed separately in #7600).

@tbg tbg closed this as completed Jul 5, 2016