Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implements copy-on-write for staked-nodes #19090

Merged
merged 1 commit into from
Aug 10, 2021

Conversation

behzadnouri
Copy link
Contributor

Problem

Bank::staked_nodes and Bank::epoch_staked_nodes redundantly clone
staked-nodes HashMap even though an immutable reference will suffice:
https://github.com/solana-labs/solana/blob/a9014cece/runtime/src/vote_account.rs#L77

Summary of Changes

This commit implements copy-on-write semantics for staked-nodes by
wrapping the underlying HashMap in Arc<...>.

@codecov
Copy link

codecov bot commented Aug 5, 2021

Codecov Report

Merging #19090 (6e3799e) into master (8878f52) will increase coverage by 0.0%.
The diff coverage is 100.0%.

@@           Coverage Diff           @@
##           master   #19090   +/-   ##
=======================================
  Coverage    82.8%    82.8%           
=======================================
  Files         449      449           
  Lines      128281   128312   +31     
=======================================
+ Hits       106280   106319   +39     
+ Misses      22001    21993    -8     

@behzadnouri behzadnouri requested review from carllin and sakridge August 5, 2021 21:20
@behzadnouri behzadnouri force-pushed the arc-staked-nodes-lite branch from b6694f4 to c8850ad Compare August 5, 2021 21:53
Comment on lines +128 to +130
let mut staked_nodes = self.staked_nodes.write().unwrap();
let staked_nodes = Arc::make_mut(&mut staked_nodes);
staked_nodes
Copy link
Contributor

@carllin carllin Aug 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmmm where the old code persisted the new pubkey in self.staked_nodes, does this clone self.staked_nodes and only add the new entry to the local, cloned copy of staked_nodes? This won't persist the insert into the self.staked_nodes for other callers to observe right?

From what I can tell this function is only called from Bank::store_account() which is used in ledger-tool and expects the store to be persisted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It clones the inner pointer, so the changes are persisted.
Please see the added test_staked_nodes_cow or the example in Arc::make_mut:
https://doc.rust-lang.org/std/sync/struct.Arc.html#method.make_mut

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, I mixed up what was being cloned, thanks for the clarification.

And the cost of cloning is acceptable because Bank::store_account() is only used in ledger-tool where there are no other outstanding references that would trigger this clone?

To avoid silent perf issues in the future if somebody calls store_account more intensively, should we be more explicit about this and use get_mut() instead of make_mut() and return a boolean from this function to indicate success/failure,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code only makes a clone if:

  • .staked_nodes was invoked since last mutation (i.e. since last make_mut).
  • That returned Arc<...> is still alive when you want to mutate the hash-map again (i.e. when you call make_mut).

Current master code is making a clone every time .staked_ndoes is invoked:
https://github.com/solana-labs/solana/blob/fd937548a/runtime/src/vote_account.rs#L77

So this code always does fewer clones than current master code, and make_mut does not necessarily make a clone.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That returned Arc<...> is still alive when you want to mutate the hash-map again (i.e. when you call make_mut).

Right, I'm worried about these calls

solana/ledger-tool/src/main.rs

Lines 2064 to 2067 in fd93754

bank.store_account(
&faucet_pubkey,
&AccountSharedData::new(faucet_lamports, 0, &system_program::id()),
);
silently incurring a clone in the future if some other thread is holding another reference to this Arc. I feel like being explicit with a result from something like a store_without_clone that we can assert on makes this more transparent, and avoids silently incurring this cost.

Copy link
Contributor Author

@behzadnouri behzadnouri Aug 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please describe the scenario that previously it was not making a clone and now it does?

  1. Previously every time one calls .staked_nodes() it had to make a clone. No exceptions there.
  2. Now it clones only if all these happen:
    • you call .staked_nodes() (since the very last time make_mut was invoked).
    • and hold on to the returned Arc<...>
    • and then also try to mutate stake-nodes while holding to that Arc.
      • After which there is no more clones if you mutate staked-nodes again.

(2) is a subset of (1), so for every time that this code ends up making a clone, it was making a clone previously anyways; and so this code is removing clones, not adding clones.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please describe the scenario that previously it was not making a clone and now it does?

I don't think there is one currently, was referring to future-proofing calls like

solana/ledger-tool/src/main.rs

Lines 2064 to 2067 in fd93754

bank.store_account(
&faucet_pubkey,
&AccountSharedData::new(faucet_lamports, 0, &system_program::id()),
);
in case anybody does add a case where they hold a returned Arc. This way anybody calling these functions will be cognizant of when a clone is incurred.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still like current call-sites, this code cannot introduce any new clones (where with the old code there was no clone).
I would be happy to discuss this more over a call or something next time we chat. For now, any objections to merge this?

Copy link
Contributor

@carllin carllin Aug 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be happy to discuss this more over a call or something next time we chat

Sounds good, will probably be clearer with a discussion. No other objections!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you 🙏

Bank::staked_nodes and Bank::epoch_staked_nodes redundantly clone
staked-nodes HashMap even though an immutable reference will suffice:
https://github.com/solana-labs/solana/blob/a9014cece/runtime/src/vote_account.rs#L77

This commit implements copy-on-write semantics for staked-nodes by
wrapping the underlying HashMap in Arc<...>.
@behzadnouri behzadnouri force-pushed the arc-staked-nodes-lite branch from c8850ad to 6e3799e Compare August 6, 2021 13:51
Copy link
Member

@sakridge sakridge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@behzadnouri behzadnouri merged commit f302774 into solana-labs:master Aug 10, 2021
@behzadnouri behzadnouri deleted the arc-staked-nodes-lite branch August 10, 2021 12:59
behzadnouri added a commit to behzadnouri/solana that referenced this pull request Aug 11, 2021
Bank::vote_accounts redundantly clones vote-accounts HashMap even though
an immutable reference will suffice:
solana-labs#19090

This commit implements copy-on-write semantics for vote-accounts by
wrapping the underlying HashMap in Arc<...>.
behzadnouri added a commit to behzadnouri/solana that referenced this pull request Aug 11, 2021
Bank::vote_accounts redundantly clones vote-accounts HashMap even though
an immutable reference will suffice:
solana-labs#19090

This commit implements copy-on-write semantics for vote-accounts by
wrapping the underlying HashMap in Arc<...>.
mergify bot pushed a commit that referenced this pull request Jan 14, 2022
Bank::staked_nodes and Bank::epoch_staked_nodes redundantly clone
staked-nodes HashMap even though an immutable reference will suffice:
https://github.com/solana-labs/solana/blob/a9014cece/runtime/src/vote_account.rs#L77

This commit implements copy-on-write semantics for staked-nodes by
wrapping the underlying HashMap in Arc<...>.

(cherry picked from commit f302774)

# Conflicts:
#	runtime/src/bank.rs
#	runtime/src/stakes.rs
#	runtime/src/vote_account.rs
mergify bot added a commit that referenced this pull request Jan 14, 2022
* implements copy-on-write for staked-nodes (#19090)

Bank::staked_nodes and Bank::epoch_staked_nodes redundantly clone
staked-nodes HashMap even though an immutable reference will suffice:
https://github.com/solana-labs/solana/blob/a9014cece/runtime/src/vote_account.rs#L77

This commit implements copy-on-write semantics for staked-nodes by
wrapping the underlying HashMap in Arc<...>.

(cherry picked from commit f302774)

# Conflicts:
#	runtime/src/bank.rs
#	runtime/src/stakes.rs
#	runtime/src/vote_account.rs

* removes backport merge conflicts

Co-authored-by: behzad nouri <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants