Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Consensus] Protect mapBlockIndex with its own mutex #951

Merged
merged 2 commits into from
Jun 15, 2021

Conversation

Zannick
Copy link
Collaborator

@Zannick Zannick commented May 18, 2021

Problem

tsan identified a data race on CChainState, particularly between AcceptBlock and ActivateBestChain (used often by the miner threads).

Solution

Mark mapBlockIndex as guarded by cs_mapblockindex, and ensure that the lock is taken prior to any use. Some code has been reorganized slightly to reduce the time spent holding the lock and avoid locking a second time for the same information.

Move many lookups to use LookupBlockIndex, which grabs the lock itself, and remove lots of cs_main uses that would have otherwise blocked only for LookupBlockIndex.

Tested

Original: Configured with --with-sanitizers=tsan, built with clang, run on regtest with 2 SHA mining threads enabled.
With cs_mapblockindex: configured --with-sanitizers=tsan, built with clang, run on testnet to catch up and mine 2 sha threads.

@Zannick Zannick self-assigned this May 18, 2021
@CaveSpectre11 CaveSpectre11 added Component: Consensus Part of the core cryptocurrency consensus protocol Component: Core App Related to the application itself. Tag: Waiting For Code Review Waiting for code review from a core developer labels May 18, 2021
Copy link
Collaborator

@CaveSpectre11 CaveSpectre11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing that bugs me is that cs_main is used as a general lock, when we should be locking individual things. Since this is targeting mapBlockIndex, do you see a reason to not create a lock specifically for mapBlockIndex so we can start separating from this giant super lock that is preventing significant multithreading to happen?

Comment on lines 3610 to 3611
LOCK(cs_main);
// If mapBlockIndex isn't bloated, don't bother taking the time.
if (chainActive.Height() > static_cast<int>(mapBlockIndex.size() - PRUNE_COUNT)) {
return;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This area of code was a major place of inefficiency (primarily later on since IsAncestor took way longer than chainActive.Contains, and it was being done first). It's probably not something I want to lock through the whole thing, since it's looking for stale tips that are old (PRUNE_DEPTH).

I think I would prefer locking to check if mapBlockIndex is bloated; and if so, make a copy, unlock and then let it run with the copy even if mapBlockIndex changes; so we're not locked for the whole check if it's a long check [e.g. on startup].

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated to use a separate lock, but missed this section in the subsequent passes. Will get back to this when I have a chance.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to split this in two critical sections, and not use cs_main at all.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great work so far!

@Zannick
Copy link
Collaborator Author

Zannick commented May 19, 2021

Let me look into using a separate lock. I think at the time it was not especially clear whether cs_main was already locking for mapBlockIndex in some places.

@CaveSpectre11
Copy link
Collaborator

Let me look into using a separate lock. I think at the time it was not especially clear whether cs_main was already locking for mapBlockIndex in some places.

it was definitely requiring it (in some places ;))

inline CBlockIndex* LookupBlockIndex(const uint256& hash)
{
    AssertLockHeld(cs_main);
    BlockMap::const_iterator it = mapBlockIndex.find(hash);
    return it == mapBlockIndex.end() ? nullptr : it->second;
}

@Zannick Zannick added Tag: Waiting For Developer and removed Tag: Waiting For Code Review Waiting for code review from a core developer labels May 22, 2021
@Zannick Zannick force-pushed the mapblockindex branch 2 times, most recently from fb97aab to 042bcdc Compare May 29, 2021 19:49
@Zannick Zannick added Tag: Waiting For Code Review Waiting for code review from a core developer Tag: Waiting For Developer and removed Tag: Waiting For Developer labels May 29, 2021
@Zannick
Copy link
Collaborator Author

Zannick commented May 29, 2021

tsan reported a data race in GetAncestor, so this is not ready to merge. I will have to look into that.

@CaveSpectre11 CaveSpectre11 removed the Tag: Waiting For Code Review Waiting for code review from a core developer label May 30, 2021
Zannick and others added 2 commits May 31, 2021 07:51
Declare mapBlockIndex to be protected by cs_mapblockindex.

Move a lot of generic lookups to LookupBlockIndex, while
other more complex uses (e.g. atomic lookup+insert) are left alone.

Make sure the lock is taken everywhere necessary (places found with clang
thread-safety warnings) and only as tightly as possible to avoid
potential lock cycles (detectable by tsan).

Many uses of cs_main are replaced by cs_mapblockindex if they were only
protecting mapBlockIndex, or removed (LookupBlockIndex taking the lock
itself), or moved to after LookupBlockIndex calls.

Rework PruneStaleBlockIndexes to do most work without the lock.
===
Veil: cannot presently remove SetNull helper due to other use,
however, this is still worthwhile due to the initialization changes.
@Zannick
Copy link
Collaborator Author

Zannick commented May 31, 2021

Cherry-picked explicit default initialization in CBlockIndex, this quiets the tsan warning about a read conflict with a write in operator::new.

@Zannick Zannick added Tag: Waiting For Code Review Waiting for code review from a core developer and removed Tag: Waiting For Developer labels May 31, 2021
@Zannick Zannick changed the title [Consensus] Declare mapBlockIndex protected by cs_main. [Consensus] Protect mapBlockIndex with its own mutex May 31, 2021
Copy link
Collaborator

@codeofalltrades codeofalltrades left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK ad442a4

Copy link
Collaborator

@CaveSpectre11 CaveSpectre11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK 13dd025

@CaveSpectre11 CaveSpectre11 added Code Review: Passed and removed Tag: Waiting For Code Review Waiting for code review from a core developer labels Jun 12, 2021
@codeofalltrades codeofalltrades merged commit 18742ec into Veil-Project:master Jun 15, 2021
@Zannick Zannick deleted the mapblockindex branch October 22, 2021 20:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Code Review: Passed Component: Consensus Part of the core cryptocurrency consensus protocol Component: Core App Related to the application itself.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants