RNG is not thread safe. #72

Closed
evoskuil opened this issue Oct 21, 2016 · 4 comments

Police Terror:

I'm using the current master branch to sync with testnet, but I'm
getting this error:

...
14:15:56.657549 WARNING [server] Failed to connect secure notification
worker: operation failed
14:15:56.756083 WARNING [server] Failed to connect secure notification
worker: operation failed
terminate called after throwing an instance of 'std::runtime_error'
what(): random_device::random_device(const std::string&)
14:15:57.415901 INFO [server] Stop signal detected (code: 6).
Aborted (core dumped)

Here's what debug.log is showing:

14:15:57.415478 DEBUG [network] Failure connecting outbound: operation
failed
14:15:57.415501 DEBUG [network] Failure connecting outbound: operation
failed
14:15:57.415570 DEBUG [node] Redundant block
[0000000057bd296621bc2f314e3594c412a0cd9e94f1083554c3184160612450] from
[177.189.250.198:18333]
14:15:57.415605 DEBUG [network] Connecting to [121.40.72.188:18333]
14:15:57.415703 DEBUG [network] Valid block payload from
[177.189.250.198:18333](207 bytes)
14:15:57.415757 DEBUG [network] Connecting to [212.93.226.185:18333]
14:15:57.415816 DEBUG [network] Connecting to [2.55.186.124:18333]
14:15:57.415901 INFO [server] Stop signal detected (code: 6).
14:15:57.415937 DEBUG [node] Redundant block
[000000008a675b534ea86b312f29c9fdcce7b7b07ca1879c144b8e20215ac201] from
[177.189.250.198:18333]
14:15:57.416008 DEBUG [network] Valid block payload from
[177.189.250.198:18333](207 bytes)

I've attached the .cfg file I'm using.

Looking at libbitcoin/src/utility/random.cpp, I see that
nonzero_pseudo_random() is throwing. Why isn't that an assert instead?

https://github.com/libbitcoin/libbitcoin/blob/master/src/utility/random.cpp#L46

Here is the backtrace:

#9 libbitcoin::pseudo_random () at src/utility/random.cpp:39

#10 0x00007ffff5a3d48d in libbitcoin::network::hosts::fetch
(this=this@entry=0x657310, out=...) at src/collections/hosts.cpp:76

#11 0x00007ffff5a27e7a in
libbitcoin::network::p2p::fetch_address(std::function<void
(std::error_code const&, libbitcoin::message::network_address const&)>)
const (this=0x656ad0, handler=...) at src/p2p.cpp:430

#12 0x00007ffff5a5ae5f in
libbitcoin::network::session::fetch_address(std::function<void
(std::error_code const&, libbitcoin::config::authority const&)>) const
(this=this@entry=0x7fff280d6330, handler=...)
at src/sessions/session.cpp:83

#13 0x00007ffff5a6064e in
libbitcoin::network::session_batch::new_connect(std::shared_ptr<libbitcoin::network::connector>,
std::function<void (std::error_code const&,
std::shared_ptr<libbitcoin::network::channel>)>)
(this=this@entry=0x7fff280d6330,
connect=std::shared_ptr (count 115, weak 1) 0x7ffef03f5760, handler=...)
at src/sessions/session_batch.cpp:70

#14 0x00007ffff5a60b53 in
libbitcoin::network::session_batch::connect(std::shared_ptr<libbitcoin::network::connector>,
std::function<void (std::error_code const&,
std::shared_ptr<libbitcoin::network::channel>)>)
(this=0x7fff280d6330,
connect=std::shared_ptr (count 115, weak 1) 0x7ffef03f5760, handler=...)
at src/sessions/session_batch.cpp:56

#15 0x00007ffff5a6f335 in
libbitcoin::network::session_outbound::new_connection
(this=this@entry=0x7fff280d6330,
connect=std::shared_ptr (count 115, weak 1) 0x7ffef03f5760)
at src/sessions/session_outbound.cpp:87

#16 0x00007ffff5a6f7d1 in
libbitcoin::network::session_outbound::handle_connect
(this=0x7fff280d6330, ec=..., channel=...,
connect=std::shared_ptr (count 115, weak 1) 0x7ffef03f5760)
at src/sessions/session_outbound.cpp:97

@evoskuil evoskuil added the bug label Oct 21, 2016
@evoskuil

The following indicates that the method threw due to encountering 100 zeros in succession. The only explanations I can think of are lack of thread safety (which is documented) or flawed RNG implementation.

terminate called after throwing an instance of 'std::runtime_error'
what(): random_device::random_device(const std::string&)

@evoskuil

The zero return failure can be resolved by bounding the distribution:

std::uniform_int_distribution<uint64_t> distribution(begin, end);

This does not, however, resolve the core issue of 100 consecutive zero returns from the RNG.
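For illustration, a minimal sketch of the bounded-distribution idea, assuming a 64-bit Mersenne Twister engine. The function name `nonzero_random_sketch` is hypothetical and this is not the actual libbitcoin implementation. Both bounds of `std::uniform_int_distribution` are inclusive, so a lower bound of 1 rules out a zero result without any retry loop.

```cpp
#include <cstdint>
#include <limits>
#include <random>

// Hypothetical helper for illustration only, not libbitcoin's code.
// Returns a value in [1, UINT64_MAX]; zero is excluded by the lower bound.
uint64_t nonzero_random_sketch(std::mt19937_64& engine)
{
    // Both bounds are inclusive.
    std::uniform_int_distribution<uint64_t> distribution(
        1, std::numeric_limits<uint64_t>::max());

    return distribution(engine);
}
```

The caller still owns the engine here, so this sketch by itself does nothing about concurrent access to a shared generator.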

evoskuil commented Oct 21, 2016

To provide efficient thread safety, the generator needs to be moved to thread-local storage.
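A rough sketch of what moving the generator to thread-local storage could look like, assuming C++11 `thread_local` is available. The names `thread_engine_sketch` and `pseudo_random_sketch` are hypothetical and do not reflect the actual fix. Each thread lazily constructs and seeds its own engine, so no locking is needed and no engine state is shared between threads.

```cpp
#include <cstdint>
#include <limits>
#include <random>

// Hypothetical helper for illustration only, not libbitcoin's code.
// Returns this thread's private engine, seeded once on first use.
std::mt19937_64& thread_engine_sketch()
{
    // std::random_device is constructed only once per thread, for seeding.
    // Note that its constructor may itself throw if no entropy source
    // can be opened.
    thread_local std::mt19937_64 engine{ std::random_device{}() };
    return engine;
}

// Hypothetical helper: a full-range value drawn from the calling
// thread's own engine.
uint64_t pseudo_random_sketch()
{
    std::uniform_int_distribution<uint64_t> distribution(
        0, std::numeric_limits<uint64_t>::max());

    return distribution(thread_engine_sketch());
}
```

Because the engine lives for the lifetime of the thread, repeated calls reuse it rather than reseeding or reopening the entropy source each time.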
