NewSeeded: rename new → try_new, add new_with_fallback #21

Closed
wants to merge 7 commits

Conversation

@dhardy (Owner) commented Oct 23, 2017

Rename OsRng::new → try_new too for consistency
Add dependency on time crate
(Note: better to use chrono, but it has no real equivalent to precise_time_ns().)

@pitdicker, can you review?

@pitdicker

Yes, thank you for asking!

As a test I printed 100 numbers generated with ClockRng, see this gist. It seems to me only the low 20 bits are useful (about 1 second), or even less than that. And then they are only useful once. Also the difference between two times is not constant, but pretty predictable. But taking the difference once should just about be enough to supply the remaining 12 bits for a u32. Something like this:

    let time = precise_time_ns();
    let diff = precise_time_ns() - time;
    let int = (time as u32) ^ ((diff as u32) << 20);

A very small PRNG seeded with these 32 bits should give more 'random' results than taking any more from the system clock.
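The time/diff sketch above can be made runnable; here is a self-contained version using std's `SystemTime` in place of the `time` crate's `precise_time_ns()` (the std substitution and the helper names `mix`/`clock_u32` are my assumptions, not code from the PR):

```rust
use std::time::{SystemTime, UNIX_EPOCH};

// Pure mixing step from the sketch above: the low ~20 bits of `time` vary
// fastest, and the diff is shifted up to supply the remaining high bits.
fn mix(time: u64, diff: u64) -> u32 {
    (time as u32) ^ ((diff as u32) << 20)
}

// Hypothetical helper: take two clock readings and mix them.
fn clock_u32() -> u32 {
    let nanos = || {
        SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .expect("system clock set before the Unix epoch")
            .as_nanos() as u64
    };
    let time = nanos();
    let diff = nanos().wrapping_sub(time);
    mix(time, diff)
}
```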

So while the system clock should be good enough for one u32, I don't think having an RNG based on it is workable.

When using the system clock as a fallback, what do you think of the following: initialize a 32-bit 'PCG RXS M XS (LCG)' with it, and use that to initialize the target RNG?

PCG is not as difficult as it seems. With the constants written out, a step takes just 4 lines of code. I have the start of something here (a different version though).
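For reference, a self-contained sketch of such a generator; the LCG and output constants are the ones that appear later in this thread, but treat this as an illustration rather than a vetted implementation:

```rust
// Minimal 32-bit 'PCG RXS M XS (LCG)' sketch.
struct Pcg32 {
    state: u32,
}

impl Pcg32 {
    fn new(seed: u32) -> Pcg32 {
        Pcg32 { state: seed }
    }

    fn next_u32(&mut self) -> u32 {
        let old = self.state;
        // LCG step
        self.state = old.wrapping_mul(747796405).wrapping_add(2891336453);
        // RXS M XS output permutation: variable xorshift, multiply, xorshift
        let word = ((old >> ((old >> 28) + 4)) ^ old).wrapping_mul(277803737);
        (word >> 22) ^ word
    }
}
```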

@pitdicker

Can't we use the standard library instead of the time crate? It seems to provide everything we need with SystemTime::now(), duration_since(UNIX_EPOCH) and subsec_nanos().
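A sketch of that std-only reading (`subsec_nanos()` returns the sub-second part in the range 0..1_000_000_000; the helper name is my assumption):

```rust
use std::time::{SystemTime, UNIX_EPOCH};

// Std-only replacement for `precise_time_ns()`: nanoseconds within the
// current second.
fn subsec_nanos_now() -> u32 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock set before the Unix epoch")
        .subsec_nanos()
}
```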

@dhardy (Owner, Author) commented Oct 23, 2017

@pitdicker That sounds much smarter than the code I wrote. Assembling roughly a u32 of entropy and expanding with a small RNG sounds reasonable (better than zero-padding or repeating, and probably not much worse than calling subsec_nanos() repeatedly). On the other hand, generating a fresh u32 via subsec_nanos() or similar for each 4-bytes of the PRNG may still get more entropy overall.

The lower precision part of the time (up to a few days, e.g. 10^6 sec) still provides some varying numbers, but is much more predictable. Whether or not it counts as "entropy" depends on the application (i.e. if you know this was calculated in response to a web query in the last second there is significantly less entropy than if it happened when the server was last rebooted — not that we're going to recommend anyone use system time as entropy for cryptography of course). But using a high-precision u32 repeatedly instead of alternating with the low-precision part like my ClockRng does still only needs to get another 20-bits to get ahead; with 256-bits that's four extra u32s so even with just 5 bits per call (per your gist, looks like we have at least this) this method comes out ahead.

If we wanted to make the fallback as secure as possible, we could require the use of some RNG wrapper, like Reseeding but combining new entropy with current state, with exponential or quadratic back-off on seeding frequency. There's a couple of tradeoffs though: the requirement for an RNG wrapper, reseeding complexity, and susceptibility to entropy-depletion attacks. So not worth it I think.

@pitdicker

I agree that making more calls to subsec_nanos() gives a bit more entropy. But I am not really comfortable with it.

Even with very basic statistics the numbers don't look very random. If I sort by the diff2 column in the gist, 0 occurs 8 times, and the numbers between -6 and 5 occur about 3 times each on average. Only because there are 13 outliers (probably due to scheduling) in diff1 does it look somewhat random.

In rust-lang/rfcs#722 there has been some discussion about reseeding. The linked article https://eprint.iacr.org/2013/338.pdf seems to be the state of the art of analysis around reseeding. It is way too difficult for me, at least without spending days to read it. The takeaway seems to be that extracting entropy and integrating it into an RNG's state is hard.

I don't trust myself (or maybe us) to come up with a good way to really extract more than 32 bits of entropy from the system time. With repeated runs I suspect there are a lot of details that start to matter: actual precision of the hardware timer (in the cpu?), rounding in the kernel, variance of the time between runs...

For a fall-back I would go with something simple.

@pitdicker

The lower precision part of the time (up to a few days, e.g. 10^6 sec) still provides some varying numbers, but is much more predictable. Whether or not it counts as "entropy" depends on the application

If I understand you right, this means we could construct a u64? If we only needed one random number, I agree. But if someone were to initialize multiple RNGs, they would not be as different as when each of them was based on a u32 extracted from the system time.

@dhardy (Owner, Author) commented Oct 24, 2017

If I understand you right, this means we could construct a u64?

Constructing a u64 is easy; how much of it is unpredictable is another matter. Regarding an attack, it depends on how precisely the attacker knows when the number was drawn; regarding two different seeds being different in the same program, it depends how closely they were made.

I don't trust myself (or maybe us) to come up with a good way to really extract more than 32 bits of entropy from the system time.

Given enough time, and mixed scheduling (not making all the "time" readings in a tight loop), it should be possible to extract an unlimited amount of entropy. But not doing stuff in a tight loop is really going to help, so I think making only one or two readings each time next_u32 is called is much better than only reading from the timer when the ClockRng is constructed.

Perhaps best would be to combine the two approaches: construct a simple PRNG, and each time next_u32 is called read the time once or twice and mix it into the PRNG state (possibly with variable shift?).

I still wouldn't trust this approach to withstand a serious attack, but for anything else (including low-resource attacks) it should be fine. Makes me wonder, if we can extract a little entropy from the system timer at a very low cost, maybe we should mix that into the output of OsRng, just in case the system happens to have a broken or compromised entropy source?

@dhardy (Owner, Author) commented Oct 24, 2017

I did a little more work to try to estimate available entropy. The good news is that output of the high-resolution counter is well distributed, so there isn't too much rounding going on (on my test system at least). The bad news is that diffs between samples are not at all well distributed, but there may still be 2-4 bits of entropy per call.

That example is simply calling the timer in a tight loop and doing very little work per cycle; if the timer is called less frequently or with allocs or yields between calls there could be far more entropy available.

@pitdicker

Wow, you are serious about estimating the entropy of the system clock!

That example is simply calling the timer in a tight loop and doing very little work per cycle; if the timer is called less frequently or with allocs or yields between calls there could be far more entropy available.

I also saw that the difference is much less predictable for the first two rounds of the loop.

What would you suggest after testing? Constructing a single u32/u64, or really having a ClockRng?

@dhardy (Owner, Author) commented Oct 24, 2017

That's true, the difference between the first two samples may be much greater. I initially had differences greater than 1ms, but realised it was probably caused by a HashMap alloc, so switched to pre-allocated array buckets (ignoring large diffs, which I haven't really looked at).

I'm thinking a hybrid approach may be best.

Remove dependency on external crate
Use variation on PCG to mix internal state and generate output
Use multiple init cycles
@dhardy (Owner, Author) commented Oct 24, 2017

@pitdicker maybe you'd like to review my latest version? I'm not sure how to estimate entropy, but I think it should be reasonably good (well, still not close to the 64-bits of state I suppose).

It's a tad slower than PCG however...

@dhardy (Owner, Author) commented Oct 24, 2017

FYI

test init_clock0        ... bench:       2,126 ns/iter (+/- 31)
test init_clock2        ... bench:      59,307 ns/iter (+/- 728)
test init_clock12       ... bench:     344,871 ns/iter (+/- 6,067)
test init_clock20       ... bench:     573,214 ns/iter (+/- 8,513)
test init_clock32       ... bench:     915,750 ns/iter (+/- 8,663)

@dhardy (Owner, Author) commented Oct 24, 2017

See last commit for another approach which is simpler but should be stronger (though slower).

@pitdicker

To test the statistical quality I have run a few variants through PractRand.

d6f8ebf fails after 256 MiB, with only 3 tests not failing.
a306a69 fails after 256 MiB, with only 19 tests not failing.
64f1f63 using ClockRng::new(32); fails after 128 MiB, with only 52 tests not failing.
e40a030 using StrongClockRng fails after 16 MiB (surprisingly low).

If I take a306a69 and apply the MurmurHash finalizer on it, it fails after 256 MiB, with 65 tests not failing.

    fn next_u32(&mut self) -> u32 {
        let mut state = get_time() as u32;
        
        state = (state ^ (state >> 16)).wrapping_mul(0x85ebca6b);
        state = (state ^ (state >> 13)).wrapping_mul(0xc2b2ae35);
        state ^ (state >> 16);

        state as u32
    }

So the experiments to extract more randomness from the system clock work against us :-(.

But a hybrid approach works great! Combining a hash of the clock as source of some extra entropy with PCG (messy implementation):

#[derive(Debug)]
pub struct ClockRng {
    state: u32,
}

impl ClockRng {
    /// Create a `ClockRng` (very low cost)
    pub fn new() -> ClockRng {
        let mut state = precise_time_ns() as u32;
        
        state = (state ^ (state >> 16)).wrapping_mul(0x85ebca6b);
        state = (state ^ (state >> 13)).wrapping_mul(0xc2b2ae35);
        state ^ (state >> 16);

        ClockRng { state: state }
    }
}

impl Rng for ClockRng {
    fn next_u32(&mut self) -> u32 {
        // Input 1: high 32 bits of the current time in nanoseconds, hashed with
        // the MurmurHash finalizer.
        let mut time = precise_time_ns() as u32;
        
        time = (time ^ (time >> 16)).wrapping_mul(0x85ebca6b);
        time = (time ^ (time >> 13)).wrapping_mul(0xc2b2ae35);
        time ^ (time >> 16);

        // Input 2: PCG 32 RXS M XS
        let mut state = self.state;
        self.state = state * 747796405 + 2891336453;
        state = ((state >> ((state >> 28) + 4)) ^ state) * 277803737;
        state = (state >> 22) ^ state;

        time ^ state
    }
    /* other methods */
}

A PractRand run is now at 32 GiB without anything to report.

@dhardy (Owner, Author) commented Oct 25, 2017

@pitdicker PractRand is designed to measure bias, not entropy. While bias in our output isn't a good thing, it's much less important than entropy. Remember, a fully deterministic PRNG can pass PractRand (not forever maybe, but to more data than you could possibly run). PractRand may still be useful for trying to remove bias I guess.

I am surprised that 64f1f63 does worse than the previous commit. a306a69 does no mixing at all yet gets one of the best results. This surprises me. What happens if you run the same test a couple more times?

The other funny thing is StrongClockRng doing so badly. I expected some bias, but it does worse than a single call to precise_time_ns()? It could be that the many calls to the time function in a tight loop interact badly, but given that they are combined after bit-shift I doubt it.

@dhardy (Owner, Author) commented Oct 25, 2017

Directly measuring entropy via the Shannon method is possible, so I did that (roughly 4-5.5 bits per call, but only the last 2 bits have high entropy).

So if we mix the result well we should get a good quality generator, right? Let's try that...
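The Shannon estimate itself is straightforward once the timer deltas are bucketed into a histogram; a minimal sketch (the bucketing strategy and function name are my assumptions, not dhardy's actual test code):

```rust
// Shannon entropy (in bits) of a histogram of observed timer deltas:
// H = -sum(p_i * log2(p_i)) over the non-empty buckets.
fn shannon_entropy(counts: &[u32]) -> f64 {
    let total: f64 = counts.iter().map(|&c| c as f64).sum();
    counts
        .iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / total;
            -p * p.log2()
        })
        .sum()
}
```

A uniform histogram over 2^n buckets gives exactly n bits; a single bucket gives zero.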

Also appears to double performance; this might be because the >sec parts
of the time are now ignored and/or algorithm being more parallelizable.
@dhardy (Owner, Author) commented Oct 25, 2017

@pitdicker if you care to, please test the new StrongClockRng; hopefully it's much less biased (although I'm not that familiar with mixing).

I'm considering removing ClockRng and replacing it with StrongClockRng; although the current ClockRng is faster it would be faster still to seed a PRNG like ChaCha from StrongClockRng if much output is required, and stronger too.

@pitdicker

I am surprised that 64f1f63 does worse than the previous commit. a306a69 does no mixing at all yet gets one of the best results. This surprises me. What happens if you run the same test a couple more times?

Yes, it surprises me too. But I only had time to run the test, not to figure out the cause... PractRand has very reliable tests. I have run them several times but the results stay the same.

if you care to, please test the new StrongClockRng; hopefully it's much less biased (although I'm not that familiar with mixing).

No problem. It is better, but not much, sorry:

RNG_test using PractRand version 0.93
RNG = RNG_stdin32, seed = 0x18026a4
test set = normal, folding = standard (32 bit)

rng=RNG_stdin32, seed=0x18026a4
length= 32 megabytes (2^25 bytes), time= 2.2 seconds
  Test Name                         Raw       Processed     Evaluation
  BCFN(2+0,13-4,T)                  R=+63388  p = 0           FAIL !!!!!!!!  
  BCFN(2+1,13-4,T)                  R=+63272  p = 0           FAIL !!!!!!!!  
  BCFN(2+2,13-5,T)                  R=+78762  p = 0           FAIL !!!!!!!!  
  BCFN(2+3,13-5,T)                  R=+75308  p = 0           FAIL !!!!!!!!  
  BCFN(2+4,13-5,T)                  R=+71051  p = 0           FAIL !!!!!!!!  
  BCFN(2+5,13-6,T)                  R=+80227  p = 0           FAIL !!!!!!!!  
  BCFN(2+6,13-6,T)                  R=+66259  p = 0           FAIL !!!!!!!!  
  BCFN(2+7,13-7,T)                  R=+57279  p = 0           FAIL !!!!!!!!  
  BCFN(2+8,13-8,T)                  R=+40052  p = 0           FAIL !!!!!!!!  
  BCFN(2+9,13-8,T)                  R=+20541  p =  5e-5214    FAIL !!!!!!!!  
  BCFN(2+10,13-9,T)                 R=+11783  p =  1e-2648    FAIL !!!!!!!!  
  BCFN(2+11,13-9,T)                 R= +5870  p =  4e-1320    FAIL !!!!!!!!  
  DC6-9x1Bytes-1                    R= +6709  p =  2e-4517    FAIL !!!!!!!!  
  Gap-16:A                          R= +2938  p =  1e-2409    FAIL !!!!!!!!  
  Gap-16:B                          R=+16748  p = 0           FAIL !!!!!!!!  
  FPF-14+6/16:(1,14-1)              R=+113.3  p =  3.7e-100   FAIL !!!!!     
  FPF-14+6/16:(2,14-2)              R= +47.5  p =  2.4e-41    FAIL !!!       
  FPF-14+6/16:(3,14-2)              R= +45.8  p =  7.4e-40    FAIL !!!       
  FPF-14+6/16:(4,14-3)              R= +68.4  p =  1.0e-59    FAIL !!!!      
  FPF-14+6/16:(5,14-4)              R=  +7.5  p =  4.5e-6   unusual          
  FPF-14+6/16:(7,14-5)              R=+516.8  p =  2.4e-428   FAIL !!!!!!!   
  FPF-14+6/16:(8,14-6)              R=+357.4  p =  1.5e-273   FAIL !!!!!!    
  FPF-14+6/16:(9,14-7)              R=+230.7  p =  2.1e-183   FAIL !!!!!!    
  FPF-14+6/16:(10,14-8)             R=+166.1  p =  1.5e-119   FAIL !!!!!     
  FPF-14+6/16:(11,14-8)             R= +88.3  p =  1.4e-63    FAIL !!!!      
  FPF-14+6/16:(12,14-9)             R= +59.3  p =  1.2e-37    FAIL !!!       
  FPF-14+6/16:(13,14-10)            R= +42.0  p =  6.2e-23    FAIL !!        
  FPF-14+6/16:(14,14-11)            R= +26.1  p =  3.3e-12   VERY SUSPICIOUS 
  FPF-14+6/16:(15,14-11)            R= +18.3  p =  8.1e-9   very suspicious  
  FPF-14+6/16:all                   R=+22915  p = 0           FAIL !!!!!!!!  
  FPF-14+6/16:all2                  R=+96772  p = 0           FAIL !!!!!!!!  
  FPF-14+6/16:cross                 R=+732039 p = 0           FAIL !!!!!!!!  
  BRank(12):128(4)                  R= +34.5  p~=  3.7e-19    FAIL !         
  BRank(12):256(2)                  R= +97.5  p~=  2.2e-30    FAIL !!!       
  BRank(12):384(1)                  R=+152.8  p~=  5.1e-47    FAIL !!!       
  BRank(12):512(2)                  R=+337.8  p~=  1.0e-102   FAIL !!!!!     
  BRank(12):768(1)                  R=+411.1  p~=  8.8e-125   FAIL !!!!!     
  BRank(12):1K(1)                   R=+583.3  p~=  1.2e-176   FAIL !!!!!!    
  [Low8/32]BCFN(2+0,13-5,T)         R=+316293 p = 0           FAIL !!!!!!!!  
  [Low8/32]BCFN(2+1,13-5,T)         R=+263569 p = 0           FAIL !!!!!!!!  
  [Low8/32]BCFN(2+2,13-6,T)         R=+260166 p = 0           FAIL !!!!!!!!  
  [Low8/32]BCFN(2+3,13-6,T)         R=+188056 p = 0           FAIL !!!!!!!!  
  [Low8/32]BCFN(2+4,13-6,T)         R=+112303 p = 0           FAIL !!!!!!!!  
  [Low8/32]BCFN(2+5,13-7,T)         R=+70944  p = 0           FAIL !!!!!!!!  
  [Low8/32]BCFN(2+6,13-8,T)         R=+41820  p = 0           FAIL !!!!!!!!  
  [Low8/32]BCFN(2+7,13-8,T)         R=+20745  p =  7e-5266    FAIL !!!!!!!!  
  [Low8/32]BCFN(2+8,13-9,T)         R=+11858  p =  1e-2665    FAIL !!!!!!!!  
  [Low8/32]BCFN(2+9,13-9,T)         R= +5896  p =  5e-1326    FAIL !!!!!!!!  
  [Low8/32]DC6-9x1Bytes-1           R=+50897  p = 0           FAIL !!!!!!!!  
  [Low8/32]Gap-16:A                 R=+68886  p = 0           FAIL !!!!!!!!  
  [Low8/32]Gap-16:B                 R=+444386 p = 0           FAIL !!!!!!!!  
  [Low8/32]FPF-14+6/16:(1,14-2)     R=+86776  p = 0           FAIL !!!!!!!!  
  [Low8/32]FPF-14+6/16:(2,14-3)     R=+63292  p = 0           FAIL !!!!!!!!  
  [Low8/32]FPF-14+6/16:(3,14-4)     R=+42429  p = 0           FAIL !!!!!!!!  
  [Low8/32]FPF-14+6/16:(4,14-5)     R=+30154  p = 0           FAIL !!!!!!!!  
  [Low8/32]FPF-14+6/16:(5,14-5)     R=+21577  p = 0           FAIL !!!!!!!!  
  [Low8/32]FPF-14+6/16:(6,14-6)     R=+14558  p = 0           FAIL !!!!!!!!  
  [Low8/32]FPF-14+6/16:(7,14-7)     R=+10546  p =  7e-8393    FAIL !!!!!!!!  
  [Low8/32]FPF-14+6/16:(9,14-8)     R= +4640  p =  1e-3338    FAIL !!!!!!!!  
  [Low8/32]FPF-14+6/16:(10,14-9)    R=+388.6  p =  5.5e-245   FAIL !!!!!!    
  [Low8/32]FPF-14+6/16:(11,14-10)   R=+10150  p =  2e-5400    FAIL !!!!!!!!  
  [Low8/32]FPF-14+6/16:(12,14-11)   R= +4119  p =  5e-1796    FAIL !!!!!!!!  
  [Low8/32]FPF-14+6/16:(13,14-11)   R= +5558  p =  3e-2423    FAIL !!!!!!!!  
  [Low8/32]FPF-14+6/16:all          R=+185222 p = 0           FAIL !!!!!!!!  
  [Low8/32]FPF-14+6/16:all2         R=+3842559587 p = 0           FAIL !!!!!!!!  
  [Low8/32]FPF-14+6/16:cross        R=+1040458 p = 0           FAIL !!!!!!!!  
  [Low8/32]BRank(12):128(2)         R=+337.8  p~=  1.0e-102   FAIL !!!!!     
  [Low8/32]BRank(12):256(2)         R=+824.9  p~=  2.3e-249   FAIL !!!!!!    
  [Low8/32]BRank(12):384(1)         R=+927.7  p~=  2.6e-280   FAIL !!!!!!    
  [Low8/32]BRank(12):512(1)         R= +1272  p~=  5.4e-384   FAIL !!!!!!!   
  [Low1/32]BCFN(2+0,13-7,T)         R=+334440 p = 0           FAIL !!!!!!!!  
  [Low1/32]BCFN(2+1,13-7,T)         R=+158068 p = 0           FAIL !!!!!!!!  
  [Low1/32]BCFN(2+2,13-8,T)         R=+90156  p = 0           FAIL !!!!!!!!  
  [Low1/32]BCFN(2+3,13-8,T)         R=+43780  p = 0           FAIL !!!!!!!!  
  [Low1/32]BCFN(2+4,13-8,T)         R=+21431  p =  5e-5440    FAIL !!!!!!!!  
  [Low1/32]BCFN(2+5,13-9,T)         R=+12134  p =  1e-2727    FAIL !!!!!!!!  
  [Low1/32]BCFN(2+6,13-9,T)         R= +5993  p =  7e-1348    FAIL !!!!!!!!  
  [Low1/32]DC6-9x1Bytes-1           R=+157070 p = 0           FAIL !!!!!!!!  
  [Low1/32]Gap-16:A                 R=+96472  p = 0           FAIL !!!!!!!!  
  [Low1/32]Gap-16:B                 R=+279673 p = 0           FAIL !!!!!!!!  
  [Low1/32]FPF-14+6/16:cross        R=+1853012 p = 0           FAIL !!!!!!!!  
  [Low1/32]BRank(12):128(2)         R= +3748  p~=  3e-1129    FAIL !!!!!!!!  
  [Low1/32]BRank(12):256(1)         R= +5405  p~=  3e-1628    FAIL !!!!!!!!  
  ...and 2 test result(s) without anomalies

A few small experiments from my side fared much worse though.

@pitdicker

Directly measuring entropy via the Shannon method is possible, so I did that (roughly 4-5.5 bits per call, but only the last 2 bits have high entropy).

Good work! About 4-5 bits matches my feeling. This paper (which I have not yet read fully) explains on pages 6 and 7 that Shannon entropy is the best case, and we can realistically expect to be able to extract a little less.

But not doing stuff in a tight loop is really going to help, so I think making only one or two readings each time next_u32 is called is much better than only reading from the timer when the ClockRng is constructed.

Perhaps best would be to combine the two approaches: construct a simple PRNG, and each time next_u32 is called read the time once or twice and mix it into the PRNG state (possibly with variable shift?).

Yes, this is better than my proposal to seed a small PRNG once with the system clock. When speaking in terms of entropy, every round then adds a little, and every little bit helps.

I still wouldn't trust this approach to withstand a serious attack, but for anything else (including low-resource attacks) it should be fine. Makes me wonder, if we can extract a little entropy from the system timer at a very low cost, maybe we should mix that into the output of OsRng, just in case the system happens to have a broken or compromised entropy source?

OsRng already uses the system timer as an input source, and is very conservative. Mixing it with the system timer will probably even make it worse. I have read serious advice against trying to improve on it, but can't remember where.

PractRand is designed to measure bias, not entropy. While bias in our output isn't a good thing, it's much less important than entropy.

I am really learning as I go. You are right about PractRand. But I expect a good method to extract entropy and mix it to produce practically unbiased results. So it is useful as an indication.

I do believe being unbiased is a very important property, especially when seeding other RNGs.

I'm thinking a hybrid approach may be best.

What did you think about #21 (comment)? It only adds about 4 bits of entropy each round, as you calculated. It is statistically good; I canceled PractRand after 1 TiB with no anomalies.

I think we either have to go with a simple solution, or do good research. Wikipedia has, among others, the suggestion to use a cryptographic hash function to extract entropy.
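The shape of that hash-extractor idea can be sketched with std alone; note this is an illustration only, since std's `DefaultHasher` (SipHash-based) is not a cryptographic hash and a real extractor would use something like SHA-256 instead:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::time::{SystemTime, UNIX_EPOCH};

// Feed several timer readings into a hash and use the digest as output.
// Hypothetical sketch: a real extractor would substitute a cryptographic
// hash for DefaultHasher.
fn extract_u64(rounds: u32) -> u64 {
    let mut h = DefaultHasher::new();
    for _ in 0..rounds {
        SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .expect("system clock set before the Unix epoch")
            .subsec_nanos()
            .hash(&mut h);
    }
    h.finish()
}
```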

@@ -294,14 +299,37 @@ mod thread_local;
 #[cfg(feature="std")]
 pub trait NewSeeded: Sized {
     /// Creates a new instance, automatically seeded via `OsRng`.
-    fn new() -> Result<Self, Error>;
+    fn try_new() -> Result<Self, Error>;


Has this been discussed somewhere? What are we trying to be consistent with?

When I started learning about rand, I was impressed by how easy it was to do the right thing. new just seeds an RNG from the OsRng. So I am a bit of a fan of the simple new().

The API guidelines seem to suggest using new is fine.

Would you like to open an issue to discuss the rename?

@dhardy (Owner, Author):

new() here would be fine I guess; it's just that the last proposal had try_new. The bit about being consistent is renaming in OsRng to match. Maybe that was a bad idea.


I'd personally prefer just new as well, should be clear from the type what's going on.

@@ -122,7 +122,8 @@ pub struct ReseedWithNew;
 #[cfg(feature="std")]
 impl<R: Rng + NewSeeded> Reseeder<R> for ReseedWithNew {
     fn reseed(&mut self, rng: &mut R) {
+        // TODO: should we use new_with_fallback instead?
         match R::new() {


Reseeding a strong RNG like ISAAC from something much weaker like the system timer does not sound like a good idea.

ReseedingRng wraps an already seeded RNG, so we only have to care about reseeding, not initializing.

In some situations it makes sense to ignore an error, like with ThreadRng. Instead of taking down a process, it seems preferable to continue using the existing RNG. But someone relying on reseeding to keep the wrapped RNG secure would probably like to be informed of an error. It should be passed on through try_fill, and otherwise panic.

So a best default is not clear to me. Maybe we should just simplify ThreadRng to not rely on ReseedingRng, and handle errors differently for each.

In both cases I don't think new_with_fallback is a good idea.

@dhardy (Owner, Author):

If a ReseedingRng was initially seeded from the OS then this failed when reseeding, then yes, it would be better to keep using the old state or mix in new entropy than replace. But the only case I have heard of OsRng failing is that "broken DLL" problem on Windows (i.e. never works) or (possibly) slow initialisation (doesn't work at first, does later).

We could in any case always mix in fresh entropy instead of replace. I don't know.

As for "not seeding a strong RNG like ISAAC from a weak source", why not? It's still better than seeding a weak RNG from a weak source. As that bug report about Firefox shows, sometimes a fallback option is needed, but there's no reason they need to use a weak RNG too.

@@ -51,7 +51,8 @@ impl Rng for ThreadRng {

 thread_local!(
     static THREAD_RNG_KEY: Rc<RefCell<ReseedingStdRng>> = {
-        let r = match StdRng::new() {
+        // TODO: consider using new_with_fallback instead of try_new
+        let r = match StdRng::try_new() {


Seems exactly what Firefox would need rust-lang/rust#44911

@dhardy (Owner, Author) commented Oct 25, 2017

What did you think about #21 (comment)?

Uh, sorry but there are a few issues with that code.

state ^ (state >> 16);
time ^ (time >> 16);

You have two lines of code which don't do anything.

self.state = state * 747796405 + 2891336453;

Time never affects the state, it only affects the output with a simple XOR. So basically you have a simple LCG like PCG with some extra tweaking of the output; you're never accumulating entropy.

Further, you never try to assemble much entropy. I still think ChaChaRng::seed_from(&mut StrongClockRng::new()) would be much stronger because even if there is some bias in the seed it should still be close to the maximum 256 bits of entropy in theory (if I understand the issue correctly, which I admit I may not).

@burdges I wonder if you are able to help us out here?

let xorshifted = ((state >> 18) ^ state) >> 27;
let rot = state >> 59;
let rot2 = (-rot) & w(31);
((xorshifted >> rot.0 as usize) | (xorshifted << rot2.0 as usize)).0 as u32


It is strange that this version fails so quickly on PractRand. What is happening on these lines? Are the two rotations partly cancelling each other out? I would like to test it with the normal PCG version.

@pitdicker

Does this sound right to you?

  • entropy from first round: ~20 bits
  • entropy per next round: ~4 bits
  • ClockRng is meant as a fallback to initialize another PRNG
  • collecting more entropy in ClockRng makes the resulting PRNG harder to brute-force.

My previous thought was to set up a small RNG with the result from the first round.
Then for each next_u32 use the result of that RNG combined with about 4 new bits of entropy.
There is no collecting of entropy; every little bit is directly returned with the next_u32 it was collected in.

You want to go with something stronger, and that makes sense. That would mean mixing the new values from get_time into the state of the PCG RNG, as you do, is better. Also using 64 bits of internal state is better, as that can hold 64 bits of entropy. I wonder why its results are worse, but maybe because of the modification of the output function.

In principle I like the idea behind StrongClockRng. What do you think of this? Collect about 8 rounds of get_time for each next_u32 (~32 bits of entropy), and after each round combine it in as the PCG increment. Finally, apply the output function of PCG to get a 32-bit number.

Hopefully it can give both the good entropy we want, and be unbiased.

@dhardy (Owner, Author) commented Oct 26, 2017

@pitdicker you can try that variant on ClockRng; it should have more entropy with 8 rounds of get_time but in terms of bias I don't know if it would be any different. I'm surprised that my modification of PCG didn't work well since one is supposed to be able to use any odd number as increment. Maybe it would be better with a fixed increment, and then do XOR or something to combine the time entropy into the state?

Overall though I still don't think bias in the crypto keys should be a problem aside from the reduction in entropy, and I'm not sure how to measure that (I would expect it to be small, but don't know). I still think something like StrongClockRng would be a good choice for generating keys, because in theory you can get close to the maximum entropy that the key can hold — unless there's some major weakness in the time counters (I certainly don't trust it enough to recommend for cryptography). The bias in the key shouldn't matter because CSPRNGs must prevent calculation of the key from the output, and if a poor key resulted in poor (biased) output, that would already give a massive hint as to what the key is. For non-crypto PRNGs, well, some have poor output with certain keys, but I doubt it would be a significant problem for most applications.
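The fixed-increment variant suggested above (standard PCG step, with fresh timer bits XORed into the state) might look like the following sketch; the struct name and structure are mine, not a tested design from the PR:

```rust
use std::time::{SystemTime, UNIX_EPOCH};

// Sketch: XOR fresh timer bits into the state, then take an ordinary
// 32-bit PCG step with fixed constants, instead of using the time as
// the LCG increment.
struct MixingClockRng {
    state: u32,
}

impl MixingClockRng {
    fn next_u32(&mut self) -> u32 {
        let nanos = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .expect("system clock set before the Unix epoch")
            .subsec_nanos();
        self.state ^= nanos; // mix timer entropy into the state
        let old = self.state;
        // ordinary PCG LCG step with fixed odd increment
        self.state = old.wrapping_mul(747796405).wrapping_add(2891336453);
        // RXS M XS output permutation
        let word = ((old >> ((old >> 28) + 4)) ^ old).wrapping_mul(277803737);
        (word >> 22) ^ word
    }
}
```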

@pitdicker

I'm surprised that my modification of PCG didn't work well since one is supposed to be able to use any odd number as increment.

That part was a good idea, and it works well. I only did a quick test during breakfast, where I replaced the output function. I think the problem is with the output function, which is different from normal PCG.

@pitdicker

I still feel uneasy about coming up with something ourselves. What do you think about creating an implementation based on this Jitter RNG?

It also uses the system clock as source, and has excellent documentation and analysis. It would be quite some work though; there are about 1000 lines of C code. The license is BSD or alternatively GPL.
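To give a rough feel for the approach (this is an illustration of the general jitter idea, not the jitterentropy algorithm itself, which adds health tests and much more careful mixing):

```rust
use std::time::Instant;

// Time a small memory workload and fold the low bits of each timing delta
// into a pool; the workload and mixing here are placeholder choices.
fn jitter_u32(rounds: usize) -> u32 {
    let mut pool: u32 = 0;
    let mut sink = [0u8; 64];
    for i in 0..rounds {
        let start = Instant::now();
        for b in sink.iter_mut() {
            *b = b.wrapping_add(i as u8); // noise source: touch memory
        }
        std::hint::black_box(&mut sink); // keep the workload from being optimized out
        let delta = start.elapsed().subsec_nanos();
        pool = pool.rotate_left(7) ^ delta;
    }
    pool
}
```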

@pitdicker

Apparently that is also the method Linux uses to extract entropy from the timer.

@dhardy (Owner, Author) commented Oct 27, 2017

Nice find; that does sound like what we want. At this level of complexity it might also make sense to use an external crate, or to start with an external implementation and consider importing into rand later.

But this is getting a bit beyond the scope of the rand refactor so I'd rather give it less focus (still happy to do code reviews etc.). Do you want to try implementing this @pitdicker? If not I'll get it mentioned in the next "impl period" newsletter or something; it should be a relatively easy project for someone else to get into. (Of course one could instead just wrap the C library in an external crate.)

@dhardy mentioned this pull request Oct 27, 2017
@pitdicker

But this is getting a bit beyond the scope of the rand refactor so I'd rather give it less focus (still happy to do code reviews etc.). Do you want to try implementing this @pitdicker?

Currently I have an implementation of Xorshift* and PCG mostly finished, and am polishing it a bit while waiting on the discussion around initialization. Also OsRng needs a good look. Diving into Jitter RNG would be a nice project though... Can you get it mentioned in the "impl period" newsletter? If no one picks it up, I would like to work on it.

How to proceed with this PR? Remove StrongClockRng, make ClockRng private for now and use it as a fallback?

@dhardy (Owner, Author) commented Oct 27, 2017

How to proceed for now — not sure; see #23.

@pitdicker

On closer look it seems quite doable to implement jitterentropy-library. I will give it a try.

@dhardy (Owner, Author) commented Nov 17, 2017

JitterRng has been merged now and #54 covers usage as a fallback. That finally makes this PR obsolete.

@dhardy closed this Nov 17, 2017
@dhardy deleted the seeding_fallback branch December 19, 2017