This repository has been archived by the owner on Jan 13, 2025. It is now read-only.

fix race in handle_new_root #18984

Closed
wants to merge 2 commits into from

Conversation

jbiseda
Contributor

@jbiseda jbiseda commented Jul 30, 2021

Problem

Race in handle_new_root.

  • take write lock on bank_forks
  • set bank_forks root
  • drop write lock on bank_forks
  • take read lock on bank_forks
  • expect to access new_root in bank_forks (the lock was not held continuously since the root was set, so another writer could have intervened)

Summary of Changes

  • hold write lock on bank_forks across setting new_root and subsequent computation

@jbiseda jbiseda marked this pull request as ready for review July 30, 2021 14:06
@behzadnouri
Contributor

Lock on cluster-nodes will be removed in #18971 which also addresses several other issues with how cluster-nodes are computed and then cached in current master code.

@jbiseda jbiseda marked this pull request as draft August 7, 2021 21:57
@codecov

codecov bot commented Aug 8, 2021

Codecov Report

Merging #18984 (0364c87) into master (0e5ea36) will increase coverage by 0.0%.
The diff coverage is 78.9%.

@@           Coverage Diff           @@
##           master   #18984   +/-   ##
=======================================
  Coverage    82.8%    82.8%           
=======================================
  Files         455      455           
  Lines      130132   130069   -63     
=======================================
+ Hits       107794   107795    +1     
+ Misses      22338    22274   -64     

@jbiseda jbiseda marked this pull request as ready for review August 8, 2021 01:01
@jbiseda jbiseda changed the title retransmit without holding lock on cluster_nodes improve replay_stage lock granularity Aug 8, 2021
Contributor

@behzadnouri behzadnouri left a comment


Thanks for working on this change. I think inspecting the locks in the code and ensuring that there are no redundant lock contentions is very valuable, such as the case of bank_forks.read() which is patched in this change and I am +1 on that.

However, I have a couple of general concerns:

  • The lock, unlock, then lock-again pattern generally adds race conditions when the object is modified between the first lock and the second. The values read under the first lock may no longer be valid by the time we obtain the second lock (which can be significantly later due to lock contention), or the object may be in an entirely different state.
  • Making locks more fine-grained or pushing lock acquisitions down the call-stack can easily introduce deadlocks if different call-stacks in different threads lock two mutexes in different orders. There have already been reports of such concurrency bugs in the code.
  • Pushing lock acquisitions down the call-stack may actually hurt performance: for example, instead of locking a mutex once and performing some task for all the values in a vector, the code ends up locking the mutex once per item, releasing the lock and reacquiring it for the next item, adding extra locking overhead and contention between every two elements of the vector.
  • Given the above risks, it is always good to drive these changes with performance benchmarks, or metrics where we already observe lock contention, so that improvements can be measured. For example, in the case of authorized_voter_keypairs or &r_bank_forks[new_root] in this change, I doubt it makes any performance difference, yet it may introduce the risks above.

core/src/replay_stage.rs (outdated, resolved)
core/src/replay_stage.rs (outdated, resolved)
runtime/src/bank_forks.rs (outdated, resolved)
core/src/replay_stage.rs (outdated, resolved)
@jbiseda jbiseda changed the title improve replay_stage lock granularity fix race in handle_new_root Aug 17, 2021
@stale

stale bot commented Aug 28, 2021

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the stale [bot only] Added to stale content; results in auto-close after a week. label Aug 28, 2021
@behzadnouri
Contributor

@jbiseda Is this code ready to be reviewed again?

LGTM, though maybe update the description to explain what causes the race condition here.
Also, cc @carllin to take a look, since he is more familiar with this part of the code.

@stale stale bot removed the stale [bot only] Added to stale content; results in auto-close after a week. label Aug 29, 2021
Comment on lines +2701 to +2707
```diff
 let mut w_bank_forks = bank_forks.write().unwrap();
 w_bank_forks.set_root(
     new_root,
     accounts_background_request_sender,
     highest_confirmed_root,
 );
-let r_bank_forks = bank_forks.read().unwrap();
-let new_root_bank = &r_bank_forks[new_root];
+let new_root_bank = &w_bank_forks[new_root];
```
Contributor


hmm, what was the race we were trying to prevent here? Replay is the only thread that ever inserts into or removes from BankForks, so the contents shouldn't change.

Ideally we would hold the write lock for as little time as possible, so that things like RPC don't contend with the remaining logic.

Contributor


bank_forks might be modified between the time write-lock is released and the read-lock is obtained here:
https://github.com/solana-labs/solana/blob/9d3afba04/core/src/replay_stage.rs#L2701-L2706

Effectively we need an atomic downgrade here to release the write-lock and obtain the read-lock without letting any writes go in between, but that is not supported yet: rust-lang/rust#32527

Contributor Author


It looks like there is only one caller running updates on this structure, and it does so from a single thread. After the update we don't really need to hold any lock; I guess we're just using the read lock to get access to the structure. Is this correct, @carllin? If so, a comment may be in order. I just noticed this from code inspection; without checking callers, the drop/reacquire looks suspect.

Contributor

@carllin carllin Aug 31, 2021


I guess we're just using the read lock to get access to the structure

Yeah exactly only this replay thread ever updates BankForks, which is why it should be safe.

Contributor


I think the risk here is that one can easily introduce changes to replay, bank_forks, or elsewhere which invalidate that assumption and introduce the race condition here, without any compile-time or test errors.
We had exactly this scenario with push-votes:
#18552 (review)

Contributor Author


Thoughts on resolving this? I can leave this as-is and add a comment describing that the drop/reacquire is safe provided this update function is only ever called from a single thread. I could make this more obvious by renaming the function to include a "_single_threaded" suffix.

Contributor


My vote is to target correctness and thread-safety, and optimize later if this turns out to be a bottleneck. But I'll leave this to Carl since he knows this code much better than I do.

Contributor

@carllin carllin Sep 1, 2021


I'm for keeping it as is right now.

I see Behzad's point that if we introduced the ability for another thread to write to BankForks this could break, but that would break a bunch of other places in Replay as well. If we ever decided to introduce a model in which more than one thread can update BankForks, I think we would have to overhaul a significant portion of ReplayStage, at which point it would be appropriate to revisit these things.

Until then, working under the assumption that BankForks only receives single-threaded updates seems fine.

Contributor

@carllin carllin Sep 1, 2021


If this is the case a comment may be in order. I just noticed this from code inspection.

@jbiseda adding a comment here is not a bad idea :smile:

Contributor Author


leaving as-is

@jbiseda jbiseda closed this Sep 1, 2021
@jbiseda jbiseda deleted the lock_cleanup branch September 1, 2021 21:58
@jbiseda jbiseda restored the lock_cleanup branch September 2, 2021 00:26
3 participants