
stability: local storage fills up on connection problems #4977

Closed
mberhault opened this issue Mar 8, 2016 · 5 comments

@mberhault
Contributor

Build sha: c6ebc42

In a scenario similar to #4925, we have node problems at the following times:
node1: SIGABRT due to pthread_create failed
W160308 01:23:45
node1.log.parse.txt.gz

node2: kept running indefinitely; stopped manually.
node2.log.parse.txt.gz

node3: OOM
W160308 01:15:16
node3.log.parse.txt.gz

node4: filled up /mnt/data (all 30GB of it).
W160308 08:12:18
node4.log.parse.txt.gz

Interestingly, node4 is the only one not in the list of DNS entries for beta.cockroachlabs.com, which is the URL used by the photos app.
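For reference, that DNS list is just the A records for the hostname; a plain lookup with a stock resolver (nothing CockroachDB-specific) shows which node addresses the photos app can resolve:

# list the A records the photos app resolves for beta.cockroachlabs.com
$ dig +short beta.cockroachlabs.com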

I'm running cockroach debug keys on the whole data directory and doing a basic count by prefix. Will update when done.
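A rough sketch of that count-by-prefix pass, assuming the store is under /mnt/data; the fixed cut width is only an illustration of the bucketing, not necessarily the exact command used:

# dump all keys, bucket them by a fixed-width prefix of the printed key,
# then count each bucket and sort so the largest buckets come last
$ cockroach debug keys /mnt/data | cut -c1-20 | sort | uniq -c | sort -n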

@mberhault
Contributor Author

For info, the largest prefixes on node4:/mnt/data, by number of keys under each prefix, are:

    732 "/Local/RangeID/7/u
    836 "/Table/52/1/\"
    915 "/Local/RangeID/8/r
   1145 "/Table/53/1/\"
   1325 "/Local/RangeID/5/u
  15819 "/Table/53/2/\"\"
  64787 "/Table/53/1/\"\"
  70586 "/Local/RangeID/7/r
 248790 "/Local/RangeID/1/u
457249476 "/Local/RangeID/4/u

@bdarnell
Contributor

bdarnell commented Mar 9, 2016

/Local/RangeID/*/u is almost certainly the raft log (there are other things that appear in that space but the raft log is the only one that is expected to grow to many keys). The rebalancer was moving things around quite a bit; node 4 was the last one to be re-added to range 4:

$ zgrep -h _REPLICA * | sort | grep 'range 4'
I160307 20:34:22.358123 storage/replica.go:1072  raft: proposing ADD_REPLICA {4 4 2} for range 4
I160307 20:34:22.369000 storage/replica.go:1072  raft: proposing ADD_REPLICA {3 3 3} for range 4
I160307 20:53:22.535040 storage/replica.go:1072  raft: proposing ADD_REPLICA {2 2 4} for range 4
I160307 20:53:22.550613 storage/replica.go:1072  raft: proposing REMOVE_REPLICA {4 4 2} for range 4
I160307 20:55:12.549447 storage/replica.go:1072  raft: proposing ADD_REPLICA {4 4 5} for range 4
I160307 20:55:12.575113 storage/replica.go:1072  raft: proposing REMOVE_REPLICA {1 1 1} for range 4
I160307 21:02:00.401559 storage/replica.go:1072  raft: proposing ADD_REPLICA {1 1 6} for range 4
I160307 21:02:00.417585 storage/replica.go:1072  raft: proposing REMOVE_REPLICA {2 2 4} for range 4
I160307 21:17:18.141883 storage/replica.go:1072  raft: proposing ADD_REPLICA {2 2 7} for range 4
I160307 21:17:18.167182 storage/replica.go:1072  raft: proposing REMOVE_REPLICA {3 3 3} for range 4
I160307 21:25:27.730692 storage/replica.go:1072  raft: proposing ADD_REPLICA {3 3 8} for range 4
I160307 21:25:27.745389 storage/replica.go:1072  raft: proposing REMOVE_REPLICA {1 1 6} for range 4
I160307 21:25:27.747938 storage/replica.go:1072  raft: proposing REMOVE_REPLICA {1 1 6} for range 4
I160307 21:48:33.313587 storage/replica.go:1072  raft: proposing ADD_REPLICA {1 1 9} for range 4
I160307 21:48:33.343425 storage/replica.go:1072  raft: proposing REMOVE_REPLICA {3 3 8} for range 4
I160307 21:49:06.507886 storage/replica.go:1072  raft: proposing ADD_REPLICA {3 3 10} for range 4
I160307 21:49:06.530137 storage/replica.go:1072  raft: proposing REMOVE_REPLICA {4 4 5} for range 4
I160307 21:57:25.885974 storage/replica.go:1072  raft: proposing ADD_REPLICA {4 4 11} for range 4
I160307 21:57:25.904065 storage/replica.go:1072  raft: proposing REMOVE_REPLICA {2 2 7} for range 4

Due to RocksDB write amplification, node 4 could be expected to take up more space than the others, since all of its data was recently written. What was disk usage like on the other three nodes (especially nodes 1 and 3; node 2 was not a member of the range at the end of the logs)? That is, did node 4 contain more log entries and other data than the others, or was that data just stored less efficiently because it had just been rewritten?
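A quick way to gather those numbers, assuming each node keeps its store under /mnt/data and is reachable over SSH (the hostnames below are placeholders):

# print each node's name and total store size
$ for h in node1 node2 node3 node4; do ssh "$h" 'hostname; du -sh /mnt/data'; done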

@petermattis modified the milestone: Beta · Mar 9, 2016
@bdarnell
Contributor

bdarnell commented Mar 9, 2016

From offline discussion: node 4 filled up its 30GB of disk while the other three nodes were using less than 1GB. So it's not just write amplification from the last applySnapshot.

The data has been deleted, but next time this happens it would be helpful to run cockroach debug raft-log $DATA_DIR $RANGE_ID. That will produce a ton of output but the last few KB are probably the interesting part.
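For example, keeping just the tail instead of the full dump; the store path and range ID below are the ones from this incident, and the 64KB tail size is arbitrary:

# dump the raft log for range 4 and keep only the last 64KB of output
$ cockroach debug raft-log /mnt/data 4 | tail -c 65536 > raft-log-r4-tail.txt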

@tamird
Contributor

tamird commented Mar 23, 2016

Should we close this since the data has been deleted? Seems there's nothing actionable here right now.

@mberhault
Contributor Author

I think we should. It hasn't happened again since; I'll file a new issue if it does.
