-
Notifications
You must be signed in to change notification settings - Fork 9.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[3.5] etcdserver: backport check scheduledCompactKeyName and finishedCompac… #16068
Conversation
// checkPrevCompactionCompleted checks whether the previous scheduled compaction is completed. | ||
func (s *store) checkPrevCompactionCompleted() bool { | ||
tx := s.b.ReadTx() | ||
tx.Lock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious why not use tx.RLock()
but understand the PR is a backport of #15985
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lock()
for ReadTx()
should be by default a Rlock()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For my understanding, do you mean the following RWMutex Lock
is equal to RLock
to protect the read buffer?
etcd/server/storage/backend/read_tx.go
Lines 127 to 130 in 5e40a8b
func (rt *readTx) Lock() { rt.mu.Lock() } | |
func (rt *readTx) Unlock() { rt.mu.Unlock() } | |
func (rt *readTx) RLock() { rt.mu.RLock() } | |
func (rt *readTx) RUnlock() { rt.mu.RUnlock() } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After investigating a little bit, I think you are right, we should use RLock() instead. Lock() should only be used when the read buffer is updated by the writes committed by the BatchTxs. However, there are several positions where use read-only operations but Lock() is used. I think it is good to open a separate PR to correct them all.
Please correct me if I'm wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. That aligns with my understanding.
I also don't think it blocks this PR to be merged. This can be corrected separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Old implementation uses stricter lock, which is sub-optimal, but is not a bug.
stricter lock
is true, it's also wrong lock
. It's a bug to me.
It should also have impact on performance for Non-K8s use case when auth is enabled. [Of course, it will be better if we have some performance comparison, but I will NOT force contributor to do it given it's a straightforward change]
To ensure release stability, we need to assume that the list you proposed can have a mistake.
That's why we need careful review. I wouldn't be afraid backporting it give it's a simple change and if we have confidence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note I prefer to backporting it, but I am not insist on backporting it if there is strong objection from other maintainers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stricter lock is true, it's also wrong lock. It's a bug to me.
Without the change system behaves correctly, with the change only thing you get is performance improvement. How this is a bug?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say it's harmless (on functionality) bug, obviously the lock isn't correctly be used. Let alone it has impact on performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bugs is when locks don't protect the critical section, relaxing locks is performance improvement. Nothing is harmless until proven, let alone concurrency change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
_, scheduledCompactBytes := tx.UnsafeRange(buckets.Meta, scheduledCompactKeyName, nil, 0) | ||
scheduledCompact := int64(0) | ||
if len(scheduledCompactBytes) != 0 { | ||
scheduledCompact = bytesToRev(scheduledCompactBytes[0]).main | ||
} | ||
|
||
_, finishedCompactBytes := tx.UnsafeRange(buckets.Meta, finishedCompactKeyName, nil, 0) | ||
finishedCompact := int64(0) | ||
if len(finishedCompactBytes) != 0 { | ||
finishedCompact = bytesToRev(finishedCompactBytes[0]).main | ||
|
||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a UnsafeReadScheduledCompact
and UnsafeReadFinishedCompact
in server/mvcc/util.go for 3.5.
The two functions should be reused in multiple places, (Please search scheduledCompactKeyName
and finishedCompactKeyName
in the code base)
It's accepted to fix it in a separate PR. Please also raise a followup ticket.
Please raise two followup ticket as mentioned in #16068 (comment) and #16068 (comment). Pls also rebase this PR. |
…tKeyName before writing hash to release-3.5. Fix etcd-io#15919. Check ScheduledCompactKeyName and FinishedCompactKeyName before writing hash to hashstore. If they do not match, then it means this compaction has once been interrupted and its hash value is invalid. In such cases, we won't write the hash values to the hashstore, and avoids the incorrect corruption alarm. Signed-off-by: caojiamingalan <[email protected]>
f4b2f7a
to
6ac9d94
Compare
Replace unnecessary Lock()/Unlock()s with RLock()/RUnlock()s Signed-off-by: caojiamingalan <[email protected]>
Just raised #16247. I will solve #16068 (comment) real quick, so no need for another PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
thx @CaojiamingAlan
Add a UnsafeReadScheduledCompact and UnsafeReadFinishedCompact Signed-off-by: caojiamingalan <[email protected]>
Add a UnsafeReadScheduledCompact and UnsafeReadFinishedCompact Signed-off-by: caojiamingalan <[email protected]>
Add a UnsafeReadScheduledCompact and UnsafeReadFinishedCompact Signed-off-by: caojiamingalan <[email protected]>
…tKeyName before writing hash to release-3.5.
Follow up of #15985
Fix #15919.
Check ScheduledCompactKeyName and FinishedCompactKeyName before writing hash to hashstore.
If they do not match, then it means this compaction has once been interrupted and its hash value is invalid. In such cases, we won't write the hash values to the hashstore, and avoids the incorrect corruption alarm.
Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.