Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mining SHA256 via wallet becomes stuck - stopping and starting the miner fixes the problem temporarily #862

Closed
codeofalltrades opened this issue Nov 2, 2020 · 4 comments
Labels
Algo: Sha256d Component: Miner Both PoW and PoS block creation Issue Type: Bug Something isn't working

Comments

@codeofalltrades
Copy link
Collaborator

codeofalltrades commented Nov 2, 2020

Describe the issue
Mining SHA256 via wallet becomes stuck though i never got left behind or banned. Feels like when the 5 in a row limit is reached while on sha, wallet swithes all resources trying to build the 6th but it gets rejected, creating a loop.

**What behavior did you expect? **
Mining SHA256 should not stop syncing the chain.

Mining is stuck at block until you stop the miner generatecontinuous false
and then restart it generatecontinuous true 1

How reliably can you reproduce the issue, what are the steps to do so?

What version of the Veil client are you using, where did you get it (website, self-compiled, etc)?
Qt wallet

What type of machine are you observing the error on (OS/CPU and disk type)?
Windows 10 - Ryzen 3900x
AMD r7 3700x

Any extra information that might be useful in the debugging process.
I have tried with both staking enabled and disabled

Log file
2020-11-02T20:08:01Z UpdateTip: new best=ff8881348a6283a2ec09740be6a30088bada4587490c0c4a7bc5d889f8c5063f height=942845 type=PoS version=0x30000000 tx=2155963 date='2020-11-02T20:07:24Z' cache=0.2MiB(1262txo)
2020-11-02T20:08:03Z ERROR: CreateNewBlock: stale tip.
2020-11-02T20:08:08Z ERROR: CreateNewBlock: stale tip.
2020-11-02T20:08:24Z UpdateTip: new best=fd18796a4c88dd8c24f119f73c020fa2c7053b5db7beae516395c868620d6af1 height=942846 type=PoS

@codeofalltrades codeofalltrades added Issue Type: Bug Something isn't working Component: Miner Both PoW and PoS block creation Algo: Sha256d labels Nov 2, 2020
@codeofalltrades
Copy link
Collaborator Author

codeofalltrades commented Nov 11, 2020

2020-11-06T06:42:43Z PruneStaleBlockIndexes: Checking for stale indexes. Block index size=947775; Chain height=947763
2020-11-06T07:01:15Z ERROR: CreateNewBlock: stale tip.
2020-11-06T07:50:10Z PruneStaleBlockIndexes: Checking for stale indexes. Block index size=947857; Chain height=947833
2020-11-06T07:50:10Z PruneStaleBlockIndexes: Checking for stale indexes. Block index size=947857; Chain height=947834```
Log file data at the block my test got stuck on.

@seanPhill
Copy link
Collaborator

This happened to me again, as I was away from the wallet for half the week.
generatecontinuous false
and it's on its way back. Didn't mine any, as it must have done this on the first day I was away.
So, a couple of times the wallet has kept running and mining fine, and a couple of times it has got stuck on the first day.
Stop mining always fixes it.

@seanPhill
Copy link
Collaborator

seanPhill commented Dec 4, 2020

Tested the master build from merge 874 (while mining SHA265D) for several days and it still got stuck. Notice the "current" date and time at the top, and the "Last block time" at the bottom. (Maybe it is interesting that the time it got stuck was not so long after the last successfully mined block.) Note that that block at 2:44 AM was successfully mined, and confirmed after stopping mining and completing sync, but it still had confirmations remaining to be confirmed at the time of the stoppage.)
Screen Shot 2020-12-04 at 7 20 18 pm

@Zannick
Copy link
Collaborator

Zannick commented Mar 26, 2021

I started looking at this yesterday, and I'm pretty certain it's contention preventing the node from catching up. Once the miner threads themselves started to stall, the time spent building the block templates grew a bunch, and the net logging messages were complaining about timeouts a lot. I added some debug messages to check when locks were waited on for 250ms in miner.cpp and net_processing.cpp:

2021-03-26T18:44:47Z LONG LOCK WAIT: cs_main | miner.cpp:172 | 1072ms
2021-03-26T18:44:47Z LONG LOCK WAIT: cs_main | miner.cpp:1010 | 2611ms
2021-03-26T18:44:47Z BitcoinMiner (2c98): Long nonce selection: 2612ms (avg 0 over 55 loops)
2021-03-26T18:44:47Z LONG LOCK WAIT: cs_main | miner.cpp:172 | 630ms
2021-03-26T18:44:47Z LONG LOCK WAIT: cs_main | net_processing.cpp:3528 | 1531ms
2021-03-26T18:44:47Z LONG LOCK WAIT: cs_main | miner.cpp:172 | 1066ms
2021-03-26T18:44:48Z LONG LOCK WAIT: cs_main | miner.cpp:1010 | 877ms
[... miner.cpp is very common, disabling those ...]
2021-03-26T18:49:33Z LONG LOCK WAIT: cs_main | net_processing.cpp:2495 | 641ms
2021-03-26T18:49:34Z LONG LOCK WAIT: cs_main | net_processing.cpp:661 | 10387ms
2021-03-26T18:50:32Z LONG LOCK WAIT: cs_main | net_processing.cpp:2495 | 58891ms
2021-03-26T18:50:32Z LONG LOCK WAIT: cs_main | net_processing.cpp:3528 | 660ms
2021-03-26T18:50:33Z LONG LOCK WAIT: cs_main | net_processing.cpp:652 | 2172ms
2021-03-26T18:50:35Z LONG LOCK WAIT: cs_main | net_processing.cpp:2251 | 2852ms

Looking at the differences between RandomX mining and SHA256D mining:

  • RandomX threads run in a loop: create a block template, get a nonce, run many hashes. The inner hash loop stops after 100000 hashes.
    static const uint32_t RANDOMX_INNER_LOOP_COUNT = 100000;

    With my hash rate of 250 H/s/thread, that would take a thread 400s to finish, but it jumps out when there's a new block (and block time is set to 60s roughly). So, 1 block per new tip.
  • SHA256D threads run in a loop: create a block template, get a nonce, run many hashes. The inner loop stops after 0x1000 hashes (65536).
    static const int nInnerLoopCount = 0x010000;

    I was able to get a hash rate of 2-2.5 MH/s/thread, so it can take around 0.026s per outer loop, or around 38 block templates per second per thread--over 2200 per new tip.

And since getting a block template and the nonce grab the cs_main lock 3 times, this is a lot of contention with anything that needs that lock, including the other miner threads (which explains why they all seem to stop sometimes).

Setting the nInnerLoopCount to 0x1000000 (256x) was an improvement immediately: net no longer seems blocked, and quickly caught back up. With 8 SHA threads I'm generating about 1 block template per second. I still see occasional complaints about taking a long time to make a block template or wait on a lock, but it's not fully stalled there, and recovers in a bit. EDIT: Hours later, and it's 20 blocks behind and caught up about a minute after turning off mining, staging 20 blocks at once and then applying them.

I note that SHA doesn't have an early exit for a new block, as RandomX does (although RandomX is accessing chainActive every hash here without a lock--is that okay? the code seems inconsistent about locking for chainActive tip or height one-off access):

veil/src/miner.cpp

Lines 1113 to 1114 in fc2233e

if (pblock->nHeight <= chainActive.Height()) {
fBlockFoundAlready = true;

Should that happen for SHA as well? Maybe not every hash, but enough to avoid doing meaningless work.

codeofalltrades added a commit that referenced this issue Apr 18, 2021
a3168e7 [Mining] IncrementExtraNonce without cs_main (Zannick)

Pull request description:

  ### Problem ###
  While mining SHA256D, the wallet periodically gets stuck, failing to stay caught up with the chain. (At the same time, the mining significantly slows down.)

  ### Root Cause ###
  Lock contention; SHA256D is very fast, completing its loop in subsecond time on a modern cpu, meaning every thread goes through lock-heavy sections of the block template creation code multiple times per second. In comparison, RandomX on the same machine takes minutes to complete its loop, usually being interrupted by the creation of a new block instead of reaching the maximal number of attempts.

  ### Solution (partial) ###
  Reduce lock contention on `cs_main`. The nonce setup prior to running the hashes includes grabbing `cs_main` to perform `IncrementExtraNonce`. It takes this lock to safely get the chain tip and then to safely compare the current thread's `prevHeaderHash` against a static value, resetting the nonce to 0 and updating the static value if it doesn't match.

  However, it only uses the chain tip for its height, which we can get without a lock.

  Second, the nonce being reset to 0 is thread-local; other threads are not affected by the first thread to reach this part in a new block. Also, the nonce base is constantly increasing in a separate section, so there does not really appear to be a requirement that nonces start from 0 on a new block; the lock on `nNonce_base` already assures us that we get a unique number for each thread. As such, I believe we can safely remove the nonce reset here. (I think we can also remove the extra increment, but I've left it in for now.)

  Cleanup: Also removes one extraneous copy of the nonce into a thread local variable
  (there were previously two, and I suspect the extra was already being optimized away).

  ### Bounty PR ###
  #862

  ### Bounty Payment Address ##
  `sv1qqpswvjy7s9yrpcmrt3fu0kd8rutrdlq675ntyjxjzn09f965z9dutqpqgg85esvg8mhmyka5kq5vae0qnuw4428vs9d2gu4nz643jv5a72wkqqq73mnxr`

  ### Unit Testing Results ###
  `test_veil.exe` appears to crash on my system, so manually ran all the tests via a script and compared to master. No differences.

Tree-SHA512: 9a5e2d7f7c13af93b82e437625c3164dd8d500cfba3f49c25cef4b9bf8fd0fb48c1c08ef391b0549d030d27dfefd29dff9ff81ef29abdcc2287b6e29749ac242
Zannick added a commit to Zannick/veil that referenced this issue Apr 30, 2021
Validating zerocoin spends requires determining the height of an
accumulator hash, which is done by searching through all the blocks
in the chain starting from genesis, for every txin.

Since the height of a particular hash-denomination pair is not going
to change, we can cache it. This is most useful for in-wallet block
template generation, since each mining thread performs these
validations separately (while holding cs_main).

Adds a SimpleLRUCache template based on 7c9e97a and uses it to
store checksum heights. I chose a particular size for it that should
be large enough for significant spends.

This results in a roughly 30+x speedup of zerocoin spend validation
(26-30ms/txin -> 0.4-0.9ms/txin). Vastly improves Veil-Project#862 alongside Veil-Project#915.
Zannick added a commit to Zannick/veil that referenced this issue Apr 30, 2021
Validating zerocoin spends requires determining the height of an
accumulator hash, which is done by searching through all the blocks
in the chain starting from genesis, for every txin.

Since the height of a particular hash-denomination pair is not going
to change, we can cache it. This is most useful for in-wallet block
template generation, since each mining thread performs these
validations separately (while holding cs_main).

Adds a SimpleLRUCache template based on 7c9e97a and uses it to
store checksum heights. I chose a particular size for it that should
be large enough for significant spends.

This results in a roughly 30+x speedup of zerocoin spend validation
(26-30ms/txin -> 0.4-0.9ms/txin). Vastly improves Veil-Project#862 alongside Veil-Project#915.
Zannick added a commit to Zannick/veil that referenced this issue Apr 30, 2021
Validating zerocoin spends requires determining the height of an
accumulator hash, which is done by searching through all the blocks
in the chain starting from genesis, for every txin.

Since the height of a particular hash-denomination pair is not going
to change, we can cache it. This is most useful for in-wallet block
template generation, since each mining thread performs these
validations separately (while holding cs_main).

Adds a SimpleLRUCache template based on 7c9e97a and uses it to
store checksum heights. I chose a particular size for it that should
be large enough for significant spends.

This results in a roughly 30+x speedup of zerocoin spend validation
(26-30ms/txin -> 0.4-0.9ms/txin). Vastly improves Veil-Project#862 alongside Veil-Project#915.
codeofalltrades added a commit that referenced this issue May 2, 2021
2270c94 [Mining] Reduce lock contention with SHA256D (Zannick)

Pull request description:

  ### Problem ###
  While mining SHA256D, the wallet periodically gets stuck, failing to stay caught up with the chain. (At the same time, the mining significantly slows down.)

  ### Root Cause ###
  Lock contention; SHA256D is very fast, completing its loop in subsecond time on a modern cpu, meaning every thread goes through lock-heavy sections of the block template creation code multiple times per second. In comparison, RandomX on the same machine takes minutes to complete its loop, usually being interrupted by the creation of a new block instead of reaching the maximal number of attempts.

  ### Solution ###
  Increase the number of loop iterations for SHA256D. To avoid doing useless work, also periodically check for a new block to end the outer iteration early.

  * Adds an additional loop inside `BitcoinMiner` for SHA256D only, which reduces (ideally, minimizes) the number of times per chain tip that the mining thread needs to grab `cs_main`.
  *  Periodically checks `chainActive.Height()` to see if a block has already been found, in order to exit early, similar to RandomX.

  ### Bounty PR ###
  #862

  ### Bounty Payment Address ##
  `sv1qqpswvjy7s9yrpcmrt3fu0kd8rutrdlq675ntyjxjzn09f965z9dutqpqgg85esvg8mhmyka5kq5vae0qnuw4428vs9d2gu4nz643jv5a72wkqqq73mnxr`

  ### Unit Testing Results ###
  Manually ran all tests in `src/test/` via a script and compared to master. No differences.
  Ran on testnet.

Tree-SHA512: 30ddd85fcac9e732691468da3764b43961b29c4fd3ed07f1e3b8d1edd7c4c67f640897d19dc6a607a04b0899e7ab24537c9ca301ecec98bb37047b47075c9855
Zannick added a commit to Zannick/veil that referenced this issue May 2, 2021
Validating zerocoin spends requires determining the height of an
accumulator hash, which is done by searching through all the blocks
in the chain starting from genesis, for every txin.

Since the height of a particular hash-denomination pair is not going
to change, we can cache it. This is most useful for in-wallet block
template generation, since each mining thread performs these
validations separately (while holding cs_main).

Adds a SimpleLRUCache template based on 7c9e97a and uses it to
store checksum heights. I chose a particular size for it that should
be large enough for significant spends.

This results in a roughly 30+x speedup of zerocoin spend validation
(26-30ms/txin -> 0.4-0.9ms/txin). Vastly improves Veil-Project#862 alongside Veil-Project#915.
codeofalltrades added a commit that referenced this issue May 3, 2021
1235e1c [Zerocoin][Validation] Cache checksum heights (Zannick)

Pull request description:

  ### Problem ###
  While mining SHA256D, the wallet periodically gets stuck, failing to stay caught up with the chain, **even with #915**. (At the same time, the mining significantly slows down.)

  ### Root Cause ###
  Lock contention; the cs_main lock is held throughout critical parts of block template creation, done independently per mining thread. The intermittent stuckness appears correlated with large numbers of txins during zerocoin spend validation. This requires determining the height of an accumulator hash, which is done by searching through all the blocks in the chain starting from genesis, for every txin. My debug logs (with -debug=bench) show very consistent examples like `- Verify 646 txins: 17392.00ms (26.923ms/txin)`.

  ### Solution ###
  Since the height of a particular hash-denomination pair is not going to change, we can cache it. One mining thread will still have to find the height to begin with, but all the other threads will not need to. (Also, hashes may persist to be reused in later blocks; there are likely multiple coins of the same denom with the same accumulator hash.)

  Adds a SimpleLRUCache template based on 7c9e97a and uses it to store checksum heights. I chose a somewhat arbitrary size for it that should be large enough for significant spends.

  This results in a **significant** speedup of zerocoin spend validation (alternately got `- Verify 101 txins: 1.00ms (0.010ms/txin)` , `- Verify 101 txins: 0.00ms (0.000ms/txin)` on one block). Vastly improves #862 alongside #915--I can confirm it still gets stuck without #915.

  ### Bounty PR ###
  #862

  ### Bounty Payment Address ##
  `sv1qqpswvjy7s9yrpcmrt3fu0kd8rutrdlq675ntyjxjzn09f965z9dutqpqgg85esvg8mhmyka5kq5vae0qnuw4428vs9d2gu4nz643jv5a72wkqqq73mnxr`

  ### Testing ###
  testnet: built with `-with-sanitizers=thread -O0 -g` and run with 3/4 threads mining sha256d, and spending tons of 10s at once to stress-test this section.
  mainnet: running 12 threads sha256d, built with #915 and `-O3` and

  ```patch
  diff --git a/src/validation.cpp b/src/validation.cpp
  index f92427109..a2a2c9ef5 100644
  --- a/src/validation.cpp
  +++ b/src/validation.cpp
  @@ -2863,0 +2895,2 @@ bool CChainState::ConnectBlock(const CBlock& block, CValidationState& state, CBl
           return state.DoS(100, error("%s: CheckQueue failed", __func__), REJECT_INVALID, "block-validation-failed");

       int64_t nTime4 = GetTimeMicros(); nTimeVerify += nTime4 - nTime2;
  +    if (nInputs > 20)
  +        LogPrintf("    - Verify %u txins: %.2fms (%.3fms/txin) [%.2fs (%.2fms/blk)]\n", nInputs - 1, MILLI * (nTime4 - nTime2), nInputs <= 1 ? 0 : MILLI * (nTime4 - nTime2) / (nInputs-1), nTimeVerify * MICRO, nTimeVerify * MILLI / nBlocksTotal);
       LogPrint(BCLog::BENCH, "    - Verify %u txins: %.2fms (%.3fms/txin) [%.2fs (%.2fms/blk)]\n", nInputs - 1, MILLI * (nTime4 - nTime2), nInputs <= 1 ? 0 : MILLI * (nTime4 - nTime2) / (nInputs-1), nTimeVerify * MICRO, nTimeVerify * MILLI / nBlocksTotal);
  ```
  for simple performance monitoring.

Tree-SHA512: 922f79e50e592a69d8ad14092218da515b53223aa77664b60c7ae0fa8444d1587e64ec37cdca17234bb3742cd54f52a8d7c1de9f3333d0521612fec651536395
@Zannick Zannick closed this as completed Jul 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Algo: Sha256d Component: Miner Both PoW and PoS block creation Issue Type: Bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants