stability: local storage fills up on connection problems #4977
Comments
For info, the largest prefixes on node4:/mnt/data by count of keys under that prefix are:
Due to RocksDB write amplification, this means that node 4 could be expected to take up more space than the others, given all its recently written data. What was disk usage like on the other three nodes (especially nodes 1 and 3; node 2 was not a member of the range at the end of the logs)? That is, did node 4 contain more log entries and other data than the others, or was that data just stored less efficiently because it had just been rewritten?
From offline discussion: node 4 filled up its 30GB of disk while the other three nodes were using less than 1GB. So it's not just write amplification from the last
The data has been deleted, but next time this happens it would be helpful to run
Should we close this since the data has been deleted? Seems there's nothing actionable here right now.
I think we should. I'll file a new one if this occurs again. It hasn't since.
Build sha: c6ebc42
In a scenario similar to #4925, we have node problems at the following times:
node1: SIGABRT due to pthread_create failed
W160308 01:23:45
node1.log.parse.txt.gz
node2: endlessly running, stopped manually.
node2.log.parse.txt.gz
node3: OOM
W160308 01:15:16
node3.log.parse.txt.gz
node4: filled up /mnt/data (all 30GB of it).
W160308 08:12:18
node4.log.parse.txt.gz
Interestingly, node4 is the only node not in the list of DNS entries for beta.cockroachlabs.com, which is the URL used by the photos app.
I'm running `cockroach debug keys` on the whole data directory and doing a basic count by prefix. Will update when done.
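For anyone repeating the count by prefix, here is a minimal sketch of how it could be done. It assumes the key dump has been saved as text with one pretty-printed, slash-separated key per line; that format, the sample keys in the comments, and the helper names `key_prefix`, `count_by_prefix`, and `report` are all assumptions for illustration, not something taken from the actual tool output.

```python
import sys
from collections import Counter


def key_prefix(key, depth=3, sep="/"):
    """Return the first `depth` non-empty components of a slash-separated key.

    Assumption: keys look like "/Local/RangeID/1/u/RaftLog/..." (hypothetical
    sample). Keys whose components themselves contain "/" would be split
    incorrectly by this naive approach.
    """
    parts = [p for p in key.strip().split(sep) if p]
    return sep + sep.join(parts[:depth])


def count_by_prefix(lines, depth=3):
    """Count how many keys fall under each prefix of the given depth."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the dump
        counts[key_prefix(line, depth)] += 1
    return counts


def report(lines, depth=3, top=10, out=sys.stdout):
    """Print the `top` most common prefixes, largest count first."""
    for prefix, n in count_by_prefix(lines, depth).most_common(top):
        print(f"{n:>12}  {prefix}", file=out)
```

Calling `report(sys.stdin)` at the bottom of the script would let it sit at the end of a pipeline fed by the key dump. Raising or lowering `depth` trades detail for a shorter list.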