-
Notifications
You must be signed in to change notification settings - Fork 132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Split up mempool fetch and mempool write #77
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Some comments that apply to whole commits and not specific code lines)
7389f42 Connection duration logging
Why is including the time spent sending the reply desirable? (I also commented on that here: #75 (comment))
d2c6ab5 Additional verbose error handling
The error-chain
crate's chain_err()
method already does chain the original error as the context for the error returned from the closure, so there should be no need to do this manually.
You can use display_chain()
where the error is displayed to show its full context (like for example here).
72a90bc Fn duration instrumentation
We already have a mechanism in place to instrument function execution durations using Prometheus. Why introduce another one?
Prometheus has some pretty useful reporting, visualization and querying abilities, and should be better suited and more robust for this. It already covers many of the functions, including some that the log_fn_duration
instrumentation was also added to.
The rust-prometheus
API is also pretty ergonomic to work with - the timer stops automatically when dropped so that you don't have to calculate durations manually, which also means that you don't have to hold return values in a temporary res
variable and can instead return them directly.
Also, it seems like this might be overly verbose - do we really need to measure things like the simple match in get_electrum_height()
or the hex string parsing in hash_for_value()
? Ideally I'd prefer the code not to be overly cluttered with instrumentations.
93571b9 Using thread::available_parallelism for rayon::ThreadPoolBuilder
Using the number of available cores is rayon's default. Removing the call to ThreadPoolBuilder::num_threads()
should have the same effect.
(Note that the documentation does mention that this behavior may change in the future, but IMO we should deal with this when/if we upgrade to a rayon release that changes this. It might be the case that they never do, or that the new default behavior they implement is preferable for us too.)
28d3c6b
to
5ebb797
Compare
b091634
to
acc05bb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two tiny nits, all LGTM other than that. Thanks James!
Could you please squash the "address review" commits (leaving the "remove stale mempool txs" commit separate)? Also, can you mention the change regarding the mempool sync startup loop in the commit?
acc05bb
to
a0ff028
Compare
@shesek Thanks for the review! I believe I've addressed your latest comments. Please let me know if I missed anything / misunderstood. |
… comments from shesek
430ef38
to
8514a47
Compare
8514a47
to
359edaa
Compare
See this commit on the PR based off this PR: mempool/electrs@9e0ecad If someone tried to call broadcast with a tx we are trying to update in between the grabbing of old_txids and before adding the tx in the update call, the tx will already exist and since the history entries are in Vecs you could get duplicate history entries. |
Currently, the mempool's lock is acquired before doing the following steps to update the in-memory version of the mempool:
Right now, all four steps are done while holding a write lock on the mempool.
We suspect that this is causing incoming requests which read from the query's mempool to fail, because steps 1 - 4 together
In reality, step 1 only needs a read lock on the mempool, step 2 and 3 need no lock and only step 4 needs the write lock.
This PR updates the code to only acquire a write lock for step 4.
It also adds logging around each step, so that we can more-easily see were the bottlenecks are when long response times occasionally occur.
Since the lock is dropped between step 1 and step 4, this opens up potential consistency issues. However, step 4 is the only thing which updates the mempool -- no updates come from other threads, so this shouldn't cause any issues.