-
Notifications
You must be signed in to change notification settings - Fork 91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mining SHA256 via wallet becomes stuck - stopping and starting the miner fixes the problem temporarily #862
Comments
|
This happened to me again, as I was away from the wallet for half the week. |
I started looking at this yesterday, and I'm pretty certain it's contention preventing the node from catching up. Once the miner threads themselves started to stall, the time spent building the block templates grew a bunch, and the net logging messages were complaining about timeouts a lot. I added some debug messages to check when locks were waited on for 250ms in miner.cpp and net_processing.cpp:
Looking at the differences between RandomX mining and SHA256D mining:
And since getting a block template and the nonce grab the cs_main lock 3 times, this is a lot of contention with anything that needs that lock, including the other miner threads (which explains why they all seem to stop sometimes). Setting the nInnerLoopCount to 0x1000000 (256x) was an improvement immediately: net no longer seems blocked, and quickly caught back up. With 8 SHA threads I'm generating about 1 block template per second. I still see occasional complaints about taking a long time to make a block template or wait on a lock, but it's not fully stalled there, and recovers in a bit. EDIT: Hours later, and it's 20 blocks behind and caught up about a minute after turning off mining, staging 20 blocks at once and then applying them. I note that SHA doesn't have an early exit for a new block, as RandomX does (although RandomX is accessing Lines 1113 to 1114 in fc2233e
Should that happen for SHA as well? Maybe not every hash, but enough to avoid doing meaningless work. |
a3168e7 [Mining] IncrementExtraNonce without cs_main (Zannick) Pull request description: ### Problem ### While mining SHA256D, the wallet periodically gets stuck, failing to stay caught up with the chain. (At the same time, the mining significantly slows down.) ### Root Cause ### Lock contention; SHA256D is very fast, completing its loop in subsecond time on a modern cpu, meaning every thread goes through lock-heavy sections of the block template creation code multiple times per second. In comparison, RandomX on the same machine takes minutes to complete its loop, usually being interrupted by the creation of a new block instead of reaching the maximal number of attempts. ### Solution (partial) ### Reduce lock contention on `cs_main`. The nonce setup prior to running the hashes includes grabbing `cs_main` to perform `IncrementExtraNonce`. It takes this lock to safely get the chain tip and then to safely compare the current thread's `prevHeaderHash` against a static value, resetting the nonce to 0 and updating the static value if it doesn't match. However, it only uses the chain tip for its height, which we can get without a lock. Second, the nonce being reset to 0 is thread-local; other threads are not affected by the first thread to reach this part in a new block. Also, the nonce base is constantly increasing in a separate section, so there does not really appear to be a requirement that nonces start from 0 on a new block; the lock on `nNonce_base` already assures us that we get a unique number for each thread. As such, I believe we can safely remove the nonce reset here. (I think we can also remove the extra increment, but I've left it in for now.) Cleanup: Also removes one extraneous copy of the nonce into a thread local variable (there were previously two, and I suspect the extra was already being optimized away). ### Bounty PR ### #862 ### Bounty Payment Address ## `sv1qqpswvjy7s9yrpcmrt3fu0kd8rutrdlq675ntyjxjzn09f965z9dutqpqgg85esvg8mhmyka5kq5vae0qnuw4428vs9d2gu4nz643jv5a72wkqqq73mnxr` ### Unit Testing Results ### `test_veil.exe` appears to crash on my system, so manually ran all the tests via a script and compared to master. No differences. Tree-SHA512: 9a5e2d7f7c13af93b82e437625c3164dd8d500cfba3f49c25cef4b9bf8fd0fb48c1c08ef391b0549d030d27dfefd29dff9ff81ef29abdcc2287b6e29749ac242
Validating zerocoin spends requires determining the height of an accumulator hash, which is done by searching through all the blocks in the chain starting from genesis, for every txin. Since the height of a particular hash-denomination pair is not going to change, we can cache it. This is most useful for in-wallet block template generation, since each mining thread performs these validations separately (while holding cs_main). Adds a SimpleLRUCache template based on 7c9e97a and uses it to store checksum heights. I chose a particular size for it that should be large enough for significant spends. This results in a roughly 30+x speedup of zerocoin spend validation (26-30ms/txin -> 0.4-0.9ms/txin). Vastly improves Veil-Project#862 alongside Veil-Project#915.
Validating zerocoin spends requires determining the height of an accumulator hash, which is done by searching through all the blocks in the chain starting from genesis, for every txin. Since the height of a particular hash-denomination pair is not going to change, we can cache it. This is most useful for in-wallet block template generation, since each mining thread performs these validations separately (while holding cs_main). Adds a SimpleLRUCache template based on 7c9e97a and uses it to store checksum heights. I chose a particular size for it that should be large enough for significant spends. This results in a roughly 30+x speedup of zerocoin spend validation (26-30ms/txin -> 0.4-0.9ms/txin). Vastly improves Veil-Project#862 alongside Veil-Project#915.
Validating zerocoin spends requires determining the height of an accumulator hash, which is done by searching through all the blocks in the chain starting from genesis, for every txin. Since the height of a particular hash-denomination pair is not going to change, we can cache it. This is most useful for in-wallet block template generation, since each mining thread performs these validations separately (while holding cs_main). Adds a SimpleLRUCache template based on 7c9e97a and uses it to store checksum heights. I chose a particular size for it that should be large enough for significant spends. This results in a roughly 30+x speedup of zerocoin spend validation (26-30ms/txin -> 0.4-0.9ms/txin). Vastly improves Veil-Project#862 alongside Veil-Project#915.
2270c94 [Mining] Reduce lock contention with SHA256D (Zannick) Pull request description: ### Problem ### While mining SHA256D, the wallet periodically gets stuck, failing to stay caught up with the chain. (At the same time, the mining significantly slows down.) ### Root Cause ### Lock contention; SHA256D is very fast, completing its loop in subsecond time on a modern cpu, meaning every thread goes through lock-heavy sections of the block template creation code multiple times per second. In comparison, RandomX on the same machine takes minutes to complete its loop, usually being interrupted by the creation of a new block instead of reaching the maximal number of attempts. ### Solution ### Increase the number of loop iterations for SHA256D. To avoid doing useless work, also periodically check for a new block to end the outer iteration early. * Adds an additional loop inside `BitcoinMiner` for SHA256D only, which reduces (ideally, minimizes) the number of times per chain tip that the mining thread needs to grab `cs_main`. * Periodically checks `chainActive.Height()` to see if a block has already been found, in order to exit early, similar to RandomX. ### Bounty PR ### #862 ### Bounty Payment Address ## `sv1qqpswvjy7s9yrpcmrt3fu0kd8rutrdlq675ntyjxjzn09f965z9dutqpqgg85esvg8mhmyka5kq5vae0qnuw4428vs9d2gu4nz643jv5a72wkqqq73mnxr` ### Unit Testing Results ### Manually ran all tests in `src/test/` via a script and compared to master. No differences. Ran on testnet. Tree-SHA512: 30ddd85fcac9e732691468da3764b43961b29c4fd3ed07f1e3b8d1edd7c4c67f640897d19dc6a607a04b0899e7ab24537c9ca301ecec98bb37047b47075c9855
Validating zerocoin spends requires determining the height of an accumulator hash, which is done by searching through all the blocks in the chain starting from genesis, for every txin. Since the height of a particular hash-denomination pair is not going to change, we can cache it. This is most useful for in-wallet block template generation, since each mining thread performs these validations separately (while holding cs_main). Adds a SimpleLRUCache template based on 7c9e97a and uses it to store checksum heights. I chose a particular size for it that should be large enough for significant spends. This results in a roughly 30+x speedup of zerocoin spend validation (26-30ms/txin -> 0.4-0.9ms/txin). Vastly improves Veil-Project#862 alongside Veil-Project#915.
1235e1c [Zerocoin][Validation] Cache checksum heights (Zannick) Pull request description: ### Problem ### While mining SHA256D, the wallet periodically gets stuck, failing to stay caught up with the chain, **even with #915**. (At the same time, the mining significantly slows down.) ### Root Cause ### Lock contention; the cs_main lock is held throughout critical parts of block template creation, done independently per mining thread. The intermittent stuckness appears correlated with large numbers of txins during zerocoin spend validation. This requires determining the height of an accumulator hash, which is done by searching through all the blocks in the chain starting from genesis, for every txin. My debug logs (with -debug=bench) show very consistent examples like `- Verify 646 txins: 17392.00ms (26.923ms/txin)`. ### Solution ### Since the height of a particular hash-denomination pair is not going to change, we can cache it. One mining thread will still have to find the height to begin with, but all the other threads will not need to. (Also, hashes may persist to be reused in later blocks; there are likely multiple coins of the same denom with the same accumulator hash.) Adds a SimpleLRUCache template based on 7c9e97a and uses it to store checksum heights. I chose a somewhat arbitrary size for it that should be large enough for significant spends. This results in a **significant** speedup of zerocoin spend validation (alternately got `- Verify 101 txins: 1.00ms (0.010ms/txin)` , `- Verify 101 txins: 0.00ms (0.000ms/txin)` on one block). Vastly improves #862 alongside #915--I can confirm it still gets stuck without #915. ### Bounty PR ### #862 ### Bounty Payment Address ## `sv1qqpswvjy7s9yrpcmrt3fu0kd8rutrdlq675ntyjxjzn09f965z9dutqpqgg85esvg8mhmyka5kq5vae0qnuw4428vs9d2gu4nz643jv5a72wkqqq73mnxr` ### Testing ### testnet: built with `-with-sanitizers=thread -O0 -g` and run with 3/4 threads mining sha256d, and spending tons of 10s at once to stress-test this section. mainnet: running 12 threads sha256d, built with #915 and `-O3` and ```patch diff --git a/src/validation.cpp b/src/validation.cpp index f92427109..a2a2c9ef5 100644 --- a/src/validation.cpp +++ b/src/validation.cpp @@ -2863,0 +2895,2 @@ bool CChainState::ConnectBlock(const CBlock& block, CValidationState& state, CBl return state.DoS(100, error("%s: CheckQueue failed", __func__), REJECT_INVALID, "block-validation-failed"); int64_t nTime4 = GetTimeMicros(); nTimeVerify += nTime4 - nTime2; + if (nInputs > 20) + LogPrintf(" - Verify %u txins: %.2fms (%.3fms/txin) [%.2fs (%.2fms/blk)]\n", nInputs - 1, MILLI * (nTime4 - nTime2), nInputs <= 1 ? 0 : MILLI * (nTime4 - nTime2) / (nInputs-1), nTimeVerify * MICRO, nTimeVerify * MILLI / nBlocksTotal); LogPrint(BCLog::BENCH, " - Verify %u txins: %.2fms (%.3fms/txin) [%.2fs (%.2fms/blk)]\n", nInputs - 1, MILLI * (nTime4 - nTime2), nInputs <= 1 ? 0 : MILLI * (nTime4 - nTime2) / (nInputs-1), nTimeVerify * MICRO, nTimeVerify * MILLI / nBlocksTotal); ``` for simple performance monitoring. Tree-SHA512: 922f79e50e592a69d8ad14092218da515b53223aa77664b60c7ae0fa8444d1587e64ec37cdca17234bb3742cd54f52a8d7c1de9f3333d0521612fec651536395
Describe the issue
Mining SHA256 via wallet becomes stuck though i never got left behind or banned. Feels like when the 5 in a row limit is reached while on sha, wallet swithes all resources trying to build the 6th but it gets rejected, creating a loop.
**What behavior did you expect? **
Mining SHA256 should not stop syncing the chain.
Mining is stuck at block until you stop the miner
generatecontinuous false
and then restart it
generatecontinuous true 1
How reliably can you reproduce the issue, what are the steps to do so?
What version of the Veil client are you using, where did you get it (website, self-compiled, etc)?
Qt wallet
What type of machine are you observing the error on (OS/CPU and disk type)?
Windows 10 - Ryzen 3900x
AMD r7 3700x
Any extra information that might be useful in the debugging process.
I have tried with both staking enabled and disabled
Log file
2020-11-02T20:08:01Z UpdateTip: new best=ff8881348a6283a2ec09740be6a30088bada4587490c0c4a7bc5d889f8c5063f height=942845 type=PoS version=0x30000000 tx=2155963 date='2020-11-02T20:07:24Z' cache=0.2MiB(1262txo)
2020-11-02T20:08:03Z ERROR: CreateNewBlock: stale tip.
2020-11-02T20:08:08Z ERROR: CreateNewBlock: stale tip.
2020-11-02T20:08:24Z UpdateTip: new best=fd18796a4c88dd8c24f119f73c020fa2c7053b5db7beae516395c868620d6af1 height=942846 type=PoS
The text was updated successfully, but these errors were encountered: