Question: Need a cronjob to trigger defrag if already got auto-compaction configured? #8496

Closed
frank12268 opened this issue Sep 5, 2017 · 4 comments

Comments

@frank12268

I have been using etcd v3.2.2 for some time and run into Error: etcdserver: mvcc: database space exceeded every day or every other day, with auto-compaction set to 24h. I have tried:

  1. running defrag and then alarm disarm directly; sometimes it worked
  2. compacting manually; sometimes the same error came back soon afterwards
  3. compacting manually, then defrag, then alarm disarm; this seemed to work most of the time.

I know I could try shortening the compaction interval to 1h, but I have some questions first:

  1. Is it ok to run defrag right after compaction every hour? pros/cons? plan to support auto-defrag? what is the best practice?
  2. What triggers defragment? A superficial glance at the code shows the defragments happen only after: a.) node (re)start; and b.) defrag command received. Is it correct?
  3. Does compaction decrease the db size? If it is true, is there any way to guarantee not exceeding the database space limitation if user retires or deletes obsolete data in time?
  4. Does (compaction with physical=true) == (defrag) ?

By the way, is there any update on the issue 'Adding guidance of configuring compaction related parameters #8018'?

@heyitsanthony
Contributor

Is it ok to run defrag right after compaction every hour? pros/cons? plan to support auto-defrag? what is the best practice?

It's fine to run defrag every hour but while etcd is defragmenting it can't serve keys.
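
Because a member cannot serve keys while it is defragmenting, one way to limit the impact is to defragment members one at a time. The following is a minimal sketch of that idea, not an official procedure: the function name defrag_rolling and the endpoint addresses passed to it are assumptions made for illustration, while defrag and endpoint health are existing etcdctl subcommands.

```shell
#!/bin/sh
# Sketch only: defragment members one at a time so that at most one member
# is blocked at any moment. The endpoints are passed as arguments.
export ETCDCTL_API=3

defrag_rolling() {
    for ep in "$@"; do
        echo "defragmenting $ep"
        etcdctl --endpoints="$ep" defrag || return 1
        # Stop here if the member does not report healthy afterwards.
        etcdctl --endpoints="$ep" endpoint health || return 1
    done
}
```

It could be called as defrag_rolling 10.0.0.1:2379 10.0.0.2:2379 10.0.0.3:2379 (placeholder addresses); the loop stops at the first failure rather than moving on to the next member.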

What triggers defragment?

It is only triggered on an RPC request. An in-progress defragment operation is not resumed on member restart.

Does compaction decrease the db size?

Compaction only deletes keys from the database and makes that space available to the db internally. Unlike defragmenting, it does not free file system space, and file system space is what the database size quota checks. This is explained in the maintenance documentation at https://github.com/coreos/etcd/blob/master/Documentation/op-guide/maintenance.md#defragmentation.
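
One way to observe this difference is to print the on-disk database size before compaction, after compaction, and after defragmentation. The sketch below assumes that the JSON output of endpoint status contains a "dbSize" field; the helper names db_size and print_db_size and the default endpoint are made up for this example.

```shell
#!/bin/sh
# Sketch only: report the on-disk db size of a single endpoint.
export ETCDCTL_API=3
ENDPOINT="${ENDPOINT:-127.0.0.1:2379}"   # placeholder endpoint

# Read `endpoint status --write-out=json` on stdin and print the dbSize value.
db_size() {
    grep -Eo '"dbSize":[0-9]+' | grep -Eo '[0-9]+' | head -n 1
}

# Print the current db size, prefixed with a label given as $1.
print_db_size() {
    size=$(etcdctl --endpoints="$ENDPOINT" --write-out=json endpoint status | db_size)
    echo "$1: dbSize=${size} bytes"
}
```

Calling print_db_size "after compact" and then print_db_size "after defrag" around the corresponding etcdctl commands should show the size staying put after the compaction and dropping only after the defragmentation, if the explanation above holds for the deployment.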

Does (compaction with physical=true) == (defrag) ?

No. etcd does compaction in the background by default; the physical flag makes the RPC wait until the compaction is entirely written to the db. The behavior is explained in the RPC documentation at https://github.com/coreos/etcd/blob/master/Documentation/dev-guide/api_reference_v3.md#message-compactionrequest-etcdserveretcdserverpbrpcproto.

@frank12268
Author

Thanks @heyitsanthony.
I started an etcd member with the default --quota-backend-bytes and ran the following commands (the ones from the 'space' section of the maintenance doc, minus 'defrag'):

# while [ 1 ]; do dd if=/dev/urandom bs=1024 count=1024  | ETCDCTL_API=3 /opt/etcd/v3.2.2/etcdctl --endpoints=10.201.102.122:13210 --write-out=json put key || break; done
<...omit log here...>
Error:  etcdserver: mvcc: database space exceeded
# rev=$(ETCDCTL_API=3 /opt/etcd/v3.2.2/etcdctl --endpoints=10.201.102.122:13210 --write-out=json endpoint status | egrep -o '"revision":[0-9]*' | egrep -o '[0-9]*')
# echo $rev
2040
# ETCDCTL_API=3 /opt/etcd/v3.2.2/etcdctl --endpoints=10.201.102.122:13210 --write-out=json compact $rev 
compacted revision 2040
# ETCDCTL_API=3 /opt/etcd/v3.2.2/etcdctl --endpoints=10.201.102.122:13210 --write-out=json alarm disarm
{"alarms":[{"memberID":5889610180455465045,"alarm":1}]}
# ETCDCTL_API=3 /opt/etcd/v3.2.2/etcdctl --endpoints=10.201.102.122:13210 --write-out=json alarm list
{}
# while [ 1 ]; do dd if=/dev/urandom bs=1024 count=1024  | ETCDCTL_API=3 /opt/etcd/v3.2.2/etcdctl --endpoints=10.201.102.122:13210 --write-out=json put key || break; done
1024+0 records in
1024+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.136127 s, 7.7 MB/s
Error:  etcdserver: mvcc: database space exceeded
# rev=$(ETCDCTL_API=3 /opt/etcd/v3.2.2/etcdctl --endpoints=10.201.102.122:13210 --write-out=json endpoint status | egrep -o '"revision":[0-9]*' | egrep -o '[0-9]*')
# echo $rev
2040

I presumed that more data could be inserted after the compaction, if compaction makes space available to the db internally, but that does not seem to be the case. Did I miss an operation? (I have tried compact both with and without --physical=true.)
If not, I am a little concerned about what I asked in question 3: "is there any way to guarantee not exceeding the database space limitation if user retires or deletes obsolete data in time?"

@heyitsanthony
Contributor

The quota is based on the database size on disk; compacting the db will not change that. It has to be defragmented to reclaim the space after a compaction.
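
Put together, the recovery sequence is compact, then defrag, then alarm disarm. The sketch below wraps those steps in a function; it follows the commands shown earlier in this thread plus the defrag step, and the names current_revision and reclaim_space as well as the default endpoint are assumptions for illustration.

```shell
#!/bin/sh
# Sketch only: reclaim space on one endpoint after "database space exceeded".
export ETCDCTL_API=3
ENDPOINT="${ENDPOINT:-127.0.0.1:2379}"   # placeholder endpoint

# Read `endpoint status --write-out=json` on stdin and print the revision.
current_revision() {
    grep -Eo '"revision":[0-9]+' | grep -Eo '[0-9]+' | head -n 1
}

reclaim_space() {
    rev=$(etcdctl --endpoints="$ENDPOINT" --write-out=json endpoint status | current_revision)
    [ -n "$rev" ] || return 1
    # Compact up to the current revision, free the file system space,
    # then clear the NOSPACE alarm so that writes are accepted again.
    etcdctl --endpoints="$ENDPOINT" compact "$rev" &&
    etcdctl --endpoints="$ENDPOINT" defrag &&
    etcdctl --endpoints="$ENDPOINT" alarm disarm
}
```

The steps are chained with && so that the alarm is only disarmed once the defragmentation has succeeded; disarming it earlier would let writes hit the quota again straight away.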

You can avoid hitting the quota by periodically compacting and defragmenting based on the cluster workload.
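
How often to do this depends on the workload. One possible policy, given here purely as an assumption and not as a recommendation from the etcd maintainers, is to run a periodic job that defragments only once the db size has reached a chosen percentage of the quota. The function name needs_defrag and the 80 percent threshold in the usage note are hypothetical.

```shell
#!/bin/sh
# Sketch only: decide whether a defrag is worthwhile.
# Arguments: current db size in bytes, quota in bytes, threshold in percent.
# Succeeds (exit status 0) when size >= quota * pct / 100.
needs_defrag() {
    size=$1
    quota=$2
    pct=$3
    [ "$size" -ge $(( quota / 100 * pct )) ]
}
```

For instance, needs_defrag "$size" "$quota" 80 && etcdctl defrag could run from an hourly cron job, with $size read from endpoint status and $quota set to whatever --quota-backend-bytes the member was started with.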

@heyitsanthony
Contributor

Questions answered; closing.
