Multithreaded POW mining worker #9629
Like I said before, multithreaded PoW is already working.
On the other hand, this PR forces the mining thread to follow a particular lifecycle style (it must deliberately be built around stream/future), and that is not always possible. Some mining algorithms have certain "tricks" where they coordinate multithreading themselves. For those algorithms, this PR makes the module unusable. The pow module should strive to be as generic as possible so that it supports as many Substrate chains as possible.
@sorpaas In the current code the mining worker is wrapped in an Arc and a mutex, so multithreaded access to the underlying worker, and hence to the mining metadata, is not possible. Each thread races to acquire the lock on the worker; once one thread holds it, all other threads block waiting for it, so in essence only a single thread is used to compute the seal. In the stream approach, the metadata is broadcast across a channel and every listening thread can access it. True multithreading can be achieved because each thread uses the data it received to try to compute a seal, and no thread is blocked or starved of mining metadata.
@sorpaas I don't think this obstructs any third party from implementing multithreading; they just have to listen to the stream instead of acquiring a lock on a mutex. With this approach we eliminate the possibility of deadlocks caused by faulty lock-acquisition strategies.
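For readers following the thread, here is a minimal sketch of the broadcast idea described in the two comments above, using only the Rust standard library. The `Metadata` and `Broadcaster` types, the toy `try_seal` check, and the use of one `mpsc` channel per thread are all assumptions made for illustration; this is not the PR's stream-based code and not the `sc_consensus_pow` API.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// Hypothetical stand-in for the mining metadata discussed above.
#[derive(Clone, Debug, PartialEq)]
struct Metadata {
    pre_hash: u64,
    difficulty: u64,
}

/// Minimal broadcaster: one sender per subscriber, every update is cloned to all.
struct Broadcaster {
    subscribers: Vec<Sender<Metadata>>,
}

impl Broadcaster {
    fn new() -> Self {
        Broadcaster { subscribers: Vec::new() }
    }

    /// Register a new listener and hand back its receiving end.
    fn subscribe(&mut self) -> Receiver<Metadata> {
        let (tx, rx) = channel();
        self.subscribers.push(tx);
        rx
    }

    /// Send a copy of the metadata to every listener, dropping those that hung up.
    fn broadcast(&mut self, metadata: &Metadata) {
        self.subscribers.retain(|tx| tx.send(metadata.clone()).is_ok());
    }
}

/// Toy seal search (not a real PoW): first nonce in `start..end` for which
/// `pre_hash ^ nonce` is divisible by `difficulty`.
fn try_seal(metadata: &Metadata, start: u64, end: u64) -> Option<u64> {
    (start..end).find(|&nonce| (metadata.pre_hash ^ nonce) % metadata.difficulty == 0)
}

/// Spawn `threads` workers, broadcast the metadata once, and collect each
/// worker's result for its own disjoint nonce range.
fn mine_on_all_threads(threads: u64, metadata: Metadata) -> Vec<Option<u64>> {
    let mut broadcaster = Broadcaster::new();
    let handles: Vec<_> = (0..threads)
        .map(|i| {
            let rx = broadcaster.subscribe();
            thread::spawn(move || {
                // Each worker waits on its own channel; no lock is shared.
                let metadata = rx.recv().expect("broadcaster dropped before sending");
                try_seal(&metadata, i * 1_000, (i + 1) * 1_000)
            })
        })
        .collect();
    broadcaster.broadcast(&metadata);
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    let metadata = Metadata { pre_hash: 12345, difficulty: 997 };
    println!("{:?}", mine_on_all_threads(4, metadata));
}
```

Each worker owns its receiver, so no thread waits on another to read the metadata. The trade-off is that every update is cloned once per listener.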
That's a misunderstanding of Rust's futures. The performance gain of Rust futures is obtained when you have fewer threads than concurrent tasks. If you already need that many sync threads, then you're already paying all the costs of threads. If you look into the
We're not using futures for performance gain; we're using futures for synchronization, which in turn gives us performance gain.
[Image: sealing threads before this PR]
[Image: sealing threads after this PR]
I'd appreciate it if you could show some performance metrics if you want to convince me that it's actually useful to use stream/future here. It's all the same thread lock primitives under the hood, so I'm not sure why using a plain
Will get back to you with that ASAP.
If your loop is really "hot" (in that the compute is really fast), then you can create an atomic value in
In general, futures use the same primitives as threads, so unless you're doing the things futures are designed for, nothing will magically be faster.
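The atomic-value suggestion above is cut off in this copy of the thread, so the following is only one plausible reading of it, sketched with the standard library: a version counter lets a hot loop skip the mutex until the metadata actually changes. The `Shared` and `Worker` types, the `reloads` counter, and the XOR "hash" are hypothetical names and stand-ins, not part of Substrate.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical shared mining state: the metadata sits behind a mutex,
/// and a version counter is bumped on every update.
struct Shared {
    version: AtomicUsize,
    pre_hash: Mutex<u64>,
}

impl Shared {
    fn new(pre_hash: u64) -> Self {
        Shared { version: AtomicUsize::new(0), pre_hash: Mutex::new(pre_hash) }
    }

    /// Store the new metadata first, then bump the version so workers notice.
    fn update(&self, pre_hash: u64) {
        *self.pre_hash.lock().unwrap() = pre_hash;
        self.version.fetch_add(1, Ordering::SeqCst);
    }
}

/// Per-thread worker that caches the metadata it last saw.
struct Worker {
    shared: Arc<Shared>,
    seen_version: Option<usize>,
    pre_hash: u64,
    reloads: usize, // how many times this worker took the lock
}

impl Worker {
    fn new(shared: Arc<Shared>) -> Self {
        Worker { shared, seen_version: None, pre_hash: 0, reloads: 0 }
    }

    /// One iteration of the hot loop: a cheap atomic load on every call,
    /// and the mutex only when the version has changed.
    fn step(&mut self, nonce: u64) -> u64 {
        let current = self.shared.version.load(Ordering::SeqCst);
        if self.seen_version != Some(current) {
            self.pre_hash = *self.shared.pre_hash.lock().unwrap();
            self.seen_version = Some(current);
            self.reloads += 1;
        }
        // Stand-in for the real hashing work.
        self.pre_hash ^ nonce
    }
}

fn main() {
    let shared = Arc::new(Shared::new(7));
    let workers: Vec<_> = (0..4)
        .map(|_| {
            let mut worker = Worker::new(shared.clone());
            thread::spawn(move || {
                for nonce in 0..100_000u64 {
                    worker.step(nonce);
                }
                worker.reloads
            })
        })
        .collect();
    shared.update(9);
    for (i, w) in workers.into_iter().enumerate() {
        println!("thread {} took the lock {} time(s)", i, w.join().unwrap());
    }
}
```

If an update lands between the version load and the lock, the worker reads the fresh metadata but records the old version, so it reloads once more on the next iteration. That costs one extra lock and does not affect correctness.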
@sorpaas Here are the benchmark results. For the default mutex worker use the
Default Mutex worker
Platform: linux
Intel(R) Core(TM) i5-9300H CPU @ 2.40GHz
Total memory: 25.047011328 GB
Difficulty: 1_000_000_000
Block times (seconds) from Genesis block
Average block time: 892.329052636498
Multithreaded mining worker
Platform: linux
Total memory: 25.047011328 GB
Benchmarked with 8 threads
Block times (seconds) from Genesis block
Average block time: 346.687263162512
You're using
Also, I'd appreciate it if you could port https://github.com/kulupu/kulupu/blob/master/pow/src/lib.rs over to work with this PR. It has an optimization where the execution of a new loop depends on the previous loop.
This PR actually looks like a revert of #7060. We did handle the whole thread management within Substrate for mining workers, but later figured out that it just won't work for any mining algorithm that's slightly complex.
@sorpaas I was not able to get benchmarks for the multithreaded mining worker based on the mutex worker; it keeps deadlocking once multiple threads are enabled. If you can take a look at the code and see what's wrong, that would go a long way toward speeding up those benchmarks. https://github.com/polytope-labs/sybil -> branch ->
Let me get this straight: unless this change brings a drastic performance improvement, I don't think we'd want to revert to the model in this PR, and I see the chance of this ever getting merged as slim. Handling threads internally in the
@sorpaas Any luck getting the multithreading to work here? https://github.com/polytope-labs/sybil -> branch ->
@Wizdave97 Try #9698 and see if it works for you. It now handles locking internally, so it should prevent most possible misuses.
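To illustrate what "handles locking internally" can mean in general, here is a minimal sketch of a cloneable handle that never exposes its mutex to callers. It is not the code from #9698: the `Handle` type, its `on_new_build`, `metadata`, and `submit` methods, and the toy seal check are all hypothetical.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for the mining metadata.
#[derive(Clone, Debug, PartialEq)]
struct Metadata {
    pre_hash: u64,
    difficulty: u64,
}

/// Hypothetical handle: cloning it shares the same state, and every method
/// takes and releases the lock itself, so callers never hold a guard.
#[derive(Clone)]
struct Handle {
    inner: Arc<Mutex<Option<Metadata>>>,
}

impl Handle {
    fn new() -> Self {
        Handle { inner: Arc::new(Mutex::new(None)) }
    }

    /// Publish a new build to mine on.
    fn on_new_build(&self, metadata: Metadata) {
        *self.inner.lock().unwrap() = Some(metadata);
    }

    /// Return a copy of the current metadata; the lock is released on return,
    /// so it is never held while a caller computes a seal.
    fn metadata(&self) -> Option<Metadata> {
        self.inner.lock().unwrap().clone()
    }

    /// Accept a seal only if a build is still pending and the toy check passes;
    /// an accepted seal consumes the build so later submissions are rejected.
    fn submit(&self, seal: u64) -> bool {
        let mut guard = self.inner.lock().unwrap();
        let valid = guard
            .as_ref()
            .map_or(false, |m| (m.pre_hash ^ seal) % m.difficulty == 0);
        if valid {
            *guard = None;
        }
        valid
    }
}

fn main() {
    let handle = Handle::new();
    handle.on_new_build(Metadata { pre_hash: 5, difficulty: 1 });
    let workers: Vec<_> = (0..4u64)
        .map(|i| {
            let handle = handle.clone();
            thread::spawn(move || handle.submit(i))
        })
        .collect();
    let accepted = workers
        .into_iter()
        .map(|w| w.join().unwrap())
        .filter(|ok| *ok)
        .count();
    println!("accepted seals: {}", accepted);
}
```

Because the guard never leaves the handle's methods, a caller cannot hold the lock across its own mining loop, which is the kind of misuse the earlier comments describe.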
Mutex based worker
Platform: linux
Block times (seconds)
Avg block time: 270.0583499908447
Stream based worker
Platform: linux
Block times (seconds)
Average block time: 408.88565000295637
What does this PR do?
This PR modifies the sc_consensus_pow::start_mining worker to return a clonable channel that receives a broadcast of the latest mining metadata, so nodes can try to compute the seal using a multithreaded approach.