This repository has been archived by the owner on Jan 13, 2025. It is now read-only.

fix race in handle_new_root #18984

Closed
wants to merge 2 commits into from

Conversation

jbiseda
Contributor

@jbiseda jbiseda commented Jul 30, 2021

Problem

Race in handle_new_root.

  • take write lock on bank_forks
  • set bank_forks root
  • drop write lock on bank_forks
  • take read lock on bank_forks
  • expect to access new_root in bank_forks (the lock was not held continuously since the root was set, so another writer could have intervened)

Summary of Changes

  • hold write lock on bank_forks across setting new_root and subsequent computation

@jbiseda jbiseda marked this pull request as ready for review July 30, 2021 14:06
@behzadnouri
Contributor

Lock on cluster-nodes will be removed in #18971 which also addresses several other issues with how cluster-nodes are computed and then cached in current master code.

@jbiseda jbiseda marked this pull request as draft August 7, 2021 21:57
@codecov

codecov bot commented Aug 8, 2021

Codecov Report

Merging #18984 (0364c87) into master (0e5ea36) will increase coverage by 0.0%.
The diff coverage is 78.9%.

@@           Coverage Diff           @@
##           master   #18984   +/-   ##
=======================================
  Coverage    82.8%    82.8%           
=======================================
  Files         455      455           
  Lines      130132   130069   -63     
=======================================
+ Hits       107794   107795    +1     
+ Misses      22338    22274   -64     

@jbiseda jbiseda marked this pull request as ready for review August 8, 2021 01:01
@jbiseda jbiseda changed the title retransmit without holding lock on cluster_nodes improve replay_stage lock granularity Aug 8, 2021
Contributor

@behzadnouri behzadnouri left a comment


Thanks for working on this change. I think inspecting the locks in the code and ensuring that there are no redundant lock contentions is very valuable, such as the case of bank_forks.read() which is patched in this change and I am +1 on that.

However, I have a couple of general concerns:

  • The lock, unlock, then lock-again pattern generally adds race conditions when the object is modified between the first lock and the second. The values read under the first lock may no longer be valid by the time we obtain the second lock (which can be significantly later due to lock contention), or the object may be in an entirely different state.
  • Making locks more fine-grained or pushing lock acquisitions down the call-stack can easily introduce deadlocks if different call-stacks in different threads lock two mutexes in different orders. There have already been reports of such concurrency bugs in the code.
  • Pushing lock acquisitions down the call-stack may actually hurt performance: for example, instead of locking a mutex once and performing some task for all the values in a vector, the code ends up locking the mutex once per item, releasing the lock and reacquiring it for the next item, adding extra locking overhead and contention between every two elements of the vector.
  • Given the above risks, it is always good to drive these changes with performance benchmarks, or metrics where we already observe lock contention, so that improvements can be measured. For example, in the case of authorized_voter_keypairs or &r_bank_forks[new_root] in this change, I doubt it makes any performance difference, yet it may introduce the risks above.

core/src/replay_stage.rs (outdated, resolved)
core/src/replay_stage.rs (outdated, resolved)
runtime/src/bank_forks.rs (outdated, resolved)
core/src/replay_stage.rs (outdated, resolved)
@jbiseda jbiseda changed the title improve replay_stage lock granularity fix race in handle_new_root Aug 17, 2021
@stale

stale bot commented Aug 28, 2021

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the stale [bot only] Added to stale content; results in auto-close after a week. label Aug 28, 2021
@behzadnouri
Contributor

@jbiseda Is this code ready to be reviewed again?

LGTM, though maybe update the description to explain what causes the race condition here.
Also, cc @carllin to take a look, since he is more familiar with this part of the code.

@stale stale bot removed the stale [bot only] Added to stale content; results in auto-close after a week. label Aug 29, 2021
Comment on lines +2701 to +2707
```diff
 let mut w_bank_forks = bank_forks.write().unwrap();
 w_bank_forks.set_root(
     new_root,
     accounts_background_request_sender,
     highest_confirmed_root,
 );
-let r_bank_forks = bank_forks.read().unwrap();
-let new_root_bank = &r_bank_forks[new_root];
+let new_root_bank = &w_bank_forks[new_root];
```
Contributor


hmm, what was the race we were trying to prevent here? Replay is the only thread that ever inserts into or removes from BankForks, so the contents shouldn't change.

Ideally we would hold the write lock for as little time as possible, so that things like RPC don't contend with the remaining logic.

Contributor


bank_forks might be modified between the time write-lock is released and the read-lock is obtained here:
https://github.com/solana-labs/solana/blob/9d3afba04/core/src/replay_stage.rs#L2701-L2706

Effectively we need an atomic downgrade here to release the write-lock and obtain the read-lock without letting any writes go in between, but that is not supported yet: rust-lang/rust#32527

Contributor Author


It looks like there is only one caller running updates on this structure, and it does so from a single thread. After the update we don't really need to hold any lock; I guess we're just using the read lock to get access to the structure. Is this correct, @carllin? If so, a comment may be in order. I just noticed this from code inspection; without checking callers, the drop/reacquire looks suspect.

Contributor

@carllin carllin Aug 31, 2021


I guess we're just using the read lock to get access to the structure

Yeah exactly only this replay thread ever updates BankForks, which is why it should be safe.

Contributor


I think the risk here is that one can easily introduce changes to replay, bank_forks, or elsewhere which invalidate that assumption and introduce the race condition here, without any compile-time or test errors.
We had exactly this scenario with push-votes:
#18552 (review)

Contributor Author


Thoughts on resolving this? I can leave this as-is and add a comment describing that the drop/reacquire is safe provided this update function is only ever called from a single thread. I could make this more obvious by renaming the function to include a "_single_threaded" suffix.

Contributor


My vote is to target correctness and thread-safety, and optimize later if this turns out to be a bottleneck. But I'll leave this to Carl since he knows this code much better than I do.

Contributor

@carllin carllin Sep 1, 2021


I'm for keeping it as is right now.

I see Behzad's point that if we introduced the ability for another thread to write to BankForks this could break, but that would break a bunch of other places in Replay as well. If we ever decided to introduce a model in which more than one thread can update BankForks, I think we would have to overhaul a significant portion of ReplayStage, at which point it would be appropriate to revisit these things.

Until then, working under the assumption that BankForks only receives single-threaded updates seems fine.

Contributor

@carllin carllin Sep 1, 2021


If this is the case a comment may be in order. I just noticed this from code inspection.

@jbiseda adding a comment here is not a bad idea :smile:

Contributor Author


leaving as-is

@jbiseda jbiseda closed this Sep 1, 2021
@jbiseda jbiseda deleted the lock_cleanup branch September 1, 2021 21:58
@jbiseda jbiseda restored the lock_cleanup branch September 2, 2021 00:26
3 participants