Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP, RFC *: a new option for size based compaction #7782

Closed
wants to merge 1 commit into from

Conversation

mitake
Copy link
Contributor

@mitake mitake commented Apr 20, 2017

This commit adds a new option --snapshot-size for specifying size
based compaction. The option takes a number of bytes. If the
accumulated size of in memory log entries becomes larger than the
given size, etcd triggers compaction.

I'm still working on evaluation for understanding the performance affection from this option.

/cc @xiang90

@mitake
Copy link
Contributor Author

mitake commented Apr 20, 2017

related issue: #7162

@mitake mitake requested review from xiang90 and removed request for xiang90 April 20, 2017 05:20
@mitake mitake force-pushed the size-based-compaction branch 2 times, most recently from c7b7a23 to c88edf7 Compare May 10, 2017 08:31
@codecov-io
Copy link

codecov-io commented May 10, 2017

Codecov Report

❗ No coverage uploaded for pull request base (master@8fbf7ce). Click here to learn what that means.
The diff coverage is 96.15%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master    #7782   +/-   ##
=========================================
  Coverage          ?   75.71%           
=========================================
  Files             ?      341           
  Lines             ?    26590           
  Branches          ?        0           
=========================================
  Hits              ?    20133           
  Misses            ?     5011           
  Partials          ?     1446
Impacted Files Coverage Δ
etcdserver/config.go 77.77% <100%> (ø)
raft/storage.go 90.22% <100%> (ø)
embed/config.go 69.76% <100%> (ø)
etcdmain/config.go 79.75% <100%> (ø)
etcdserver/raft.go 87.87% <100%> (ø)
embed/etcd.go 73.7% <100%> (ø)
etcdserver/server.go 80.09% <88.88%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8fbf7ce...503f9e1. Read the comment docs.

@gyuho gyuho force-pushed the master branch 2 times, most recently from 44ca396 to 4301f49 Compare June 2, 2017 15:53
@mitake mitake force-pushed the size-based-compaction branch from c88edf7 to 17ef348 Compare June 6, 2017 08:36
@mitake
Copy link
Contributor Author

mitake commented Jun 6, 2017

@xiang90 I think this PR is ready to review. Could you take a look?

@mitake mitake force-pushed the size-based-compaction branch 2 times, most recently from 389478a to 503f9e1 Compare June 6, 2017 11:49
@xiang90
Copy link
Contributor

xiang90 commented Jun 16, 2017

the name is confusing. users will probably feel it is the actual snapshot file size. but the limit is actually the total size of in memory entries.

we need to come up with a better name. and, yes, snap-count is also a bad name in my opinion. it makes sense for v2, but not for v3.

@mitake
Copy link
Contributor Author

mitake commented Jun 19, 2017

@xiang90 I see, then how about --snapshot-trigger-log-size? Too long?

@mitake
Copy link
Contributor Author

mitake commented Jul 12, 2017

Probably I should cc @bdarnell because this PR relates with the raft package?

Copy link
Contributor

@bdarnell bdarnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes to raft.MemoryStorage are OK with me (we don't use it in CockroachDB).

@mitake
Copy link
Contributor Author

mitake commented Jul 25, 2017

@bdarnell thanks for your review!

@heyitsanthony
Copy link
Contributor

@mitake I would like to separate the policy from the rest of MemoryStorage, if at least to avoid the disruption from having to change the memory storage constructor. I think it can be done in the etcdserver package like:

type MemoryStorageCompactHinter interface {
    raft.Storage
    CompactIndex() int64
    CreateSnapshot(...) // + anything else needed from memorystorage not in the storage interface
}

type storageCompactHintNone struct { *raft.MemoryStorage }

func (nh *storageCompactHintNone) CompactIndex() int64 {
    // no-op compaction
    v, err := nh.FirstIndex()
    if err != nil { panic(err) }
    return v
}

type storageCompatHintSize struct {
    *raft.MemoryStorage
    entsSize uint64
    snapSize uint64
    snapSizeCompactIdx int64
}

...

func newMemoryStorage(ms *raft.MemoryStorage, sizeHint int) MemoryStorageCompactHinter {
    if sizeHint == 0 {
       return &storageCompactHintNone{ms}
    }
    return &storageCompactHintSize{ms, sizeHint}
}

where raftNodeConfig.raftStorage is changed to use MemoryStorageCompactHinter.

the size hinting storage implementation would need an extra lock, but I don't think the performance hit will be too bad

@mitake
Copy link
Contributor Author

mitake commented Jul 27, 2017

@heyitsanthony the separation seems to be good. I'll follow the design in the next update.

This commit adds a new option --snapshot-size for specifying size
based compaction. The option takes a number of bytes. If the
accumulated size of in memory log entries becomes larger than the
given size, etcd triggers compaction.
@mitake mitake force-pushed the size-based-compaction branch from 503f9e1 to df37a38 Compare August 1, 2017 05:18
@mitake
Copy link
Contributor Author

mitake commented Aug 1, 2017

@heyitsanthony I found that implementing the memory storage w/ compaction hint only in etcdserver side is difficult. This is because information required for size based compaction depends on MemoryStorage internals like the length and the total size MemoryStorage.ents. Implementing the hint wrapper in raft side is ok?

@gyuho gyuho added the WIP label Aug 8, 2017
@heyitsanthony
Copy link
Contributor

@mitake OK, so long as entsSize etc are still in a structure separated from MemoryStorage like storageCompactHintSize

@mikelnrd
Copy link

mikelnrd commented Aug 12, 2017 via email

@gyuho
Copy link
Contributor

gyuho commented Oct 2, 2017

@mitake I am moving this after 3.3, unless there are noticeable performance gains.

@gyuho gyuho modified the milestones: v3.3.0, v3.4.0 Oct 2, 2017
@mitake
Copy link
Contributor Author

mitake commented Oct 3, 2017

@gyuho yes the moving is ok because it isn't emergent.

@stale
Copy link

stale bot commented Apr 6, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Apr 6, 2020
@stale stale bot closed this Apr 27, 2020
@chaochn47
Copy link
Member

Hi @mitake Do you know why this promising approach went stale? It can provide a reliable mechanism to predict the raft log memory consumption beforehand when doing memory budget planning.

I want to collaborate with you and continue this effort if no huge blockers.

@mitake
Copy link
Contributor Author

mitake commented Apr 10, 2023

hi @chaochn47 so sorry for my very late response. Are you still interested in this? If so I can cooperate with you for revisiting the idea.

@chaochn47
Copy link
Member

Hi @mitake No worry. I am happy to work with the idea towards implementation. How about I raise a issue in etcd-io/raft, demonstrate my understanding of the PR and go from there?

@mitake
Copy link
Contributor Author

mitake commented Apr 11, 2023

@chaochn47 That's great, but as described in the roadmap: #15499 current top priority of the project is correctness. So I think we should be careful for not compromising correctness for performance.
Could you open a discussion on raft repository side then?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging this pull request may close these issues.

8 participants