Improve route randomization #1012

ZmnSCPxj · 2018-02-16T03:59:47Z

Fixes: #928

I want to integrate pay command also to use route randomization if the maxfeepercent is high enough, so, this will wait for #993 first so I can add it to payalgo.c.

ZmnSCPxj · 2018-02-16T19:06:03Z

Ready for review

rustyrussell

Hi! I like this patch, but I have some suggestions:

I think the API should rename inefficiency to "fuzz", and make it a percentage.
I think the seed should be optional: it's only really useful for testing reproducibility.
I think the default should be 5% or so, not 0.

ZmnSCPxj · 2018-02-19T02:48:52Z

Okay, will do later.
seed is already optional, but perhaps you mean it should be a randomly-generated seed if not specified, rather than fixed to 8x 0 bytes?
Presumably inefficiency/fuzz? Currently, multiple getroute with the same arguments will return the same route, modulo any channel updates that make the route shorter. I wanted to retain this property, so default inefficiency/fuzz to 0. However, it is also true that channel updates made by other nodes opening new channels and closing old ones will cause getroute to change its return value, so perhaps such a default of 0.05/5% is reasonable.

cdecker

Excellent PR, just the scaling up and down that feels a bit dodgy.

cdecker · 2018-02-19T10:52:57Z

gossipd/gossip.c

+					 &source, &destination,
+					 &msatoshi, &riskfactor, &final_cltv,
+					 &inefficiency_scaled, &seed);
+	inefficiency = ((double) inefficiency_scaled) / 4294967296.0;


Like @rustyrussell I'd prefer having an integer percentage that doesn't require scaling up and down.

ZmnSCPxj · 2018-02-19T12:21:43Z

Rebased and modified as per @rustyrussell 's initial review. However, @cdecker:

Like @rustyrussell I'd prefer having an integer percentage that doesn't require scaling up and down.

Is this what is wanted? Internally I would drop to a doublefor random generation anyway. Indeed I think I would want some way to serialize a double (towire_double, fromwire_double) if I can find some easy way to do so.

Or do you mean, I should accept integer percentage, serialize to u8, then perform the conversion from percentage to a double 0.0 -> 1.0 inside gossipd?

cdecker · 2018-02-19T20:57:07Z

@ZmnSCPxj yeah that was what I was thinking of, upconverting from double to int for transfer and then back again seems strange to me. Having a percentage and downconverting it to double once will reduce any strange imprecision effect.

ZmnSCPxj · 2018-02-20T02:05:54Z

4294967296.0 is exactly a power of two, so it should just be manipulation of the exponent part (modulo weird numbers like NAN and infinity and zero). Then conversion to and from u32 should just be cutting off some bits of mantissa.

Another concern is that our other "percent" argument is pay maxfeepercent, which accepts fractionals, and indeed defaults to 0.5 (0.5%). Do we accept fractionals for fuzzpercent? If not, this would be unexpected since maxfeepercent accepts fractions but fuzzpercent does not. If accept, then we would have roundoff/conversion issue regardless, unless we make a towire_double or fromwire_double.

rustyrussell

Minor changes in normalization, and defaults...

rustyrussell · 2018-02-20T03:50:32Z

lightningd/payalgo.c

-	pay->tries = 0;
+	pay->getroute_tries = 0;
+	pay->sendpay_tries = 0;
+	pay->fuzz = 0.75;


We hand pay->fuzz into towire_gossip_getroute_request() which takes a u32, so this turns into 0 fuzz. That's probably not what we want; should we use the same default value as getroute(), or something else?

rustyrussell · 2018-02-20T03:52:14Z

gossipd/gossip.c

+					 &source, &destination,
+					 &msatoshi, &riskfactor, &final_cltv,
+					 &fuzz_scaled, &seed);
+	fuzz = ((double) fuzz_scaled) / 4294967296.0;


You can't reach 100% here. Perhaps that's OK, but I think it's unintended.

rustyrussell · 2018-02-20T03:54:52Z

lightningd/gossip_control.c

@@ -362,16 +366,16 @@ static void json_getroute(struct command *cmd, const char *buffer, const jsmntok
 	}
 	/* Convert from percentage */
 	fuzz = fuzz / 100.0;
+	fuzz_scaled = (u32) (fuzz * 4294967296.0);


If you specify 100%, this may evaluate to 0.

rustyrussell · 2018-02-21T00:27:40Z

lightningd/gossip_control.c

+	}
+	/* Convert from percentage */
+	fuzz = fuzz / 100.0;
+	fuzz_scaled = (u32) (fuzz * 4294967296.0);


This still means 100% == 0!

Let's fix this properly and do towire_double/fromwire_double. It can just memcpy, since our inter-daemon comms is all same-architecture, and I can't find a portable way to encode a double other than printf/scanf :(

rustyrussell · 2018-02-21T00:28:23Z

lightningd/payalgo.c

-	pay->sendpay_parent = NULL;
+	pay->getroute_tries = 0;
+	pay->sendpay_tries = 0;
+	pay->fuzz = 0.75;


Why is default here 75%, vs 5% for getroute?

This is not a default, but an initial value. pay respects maxfeepercent, and if it gets a route that has too high fee, it will reduce its fuzz value in the hope that the next returned route with a lower fuzz will have a lower fee. This means that to take advantage of higher-than-usual maxfeepercent for improved privacy we should start with a large fuzz (which has a chance of getting much higher fee than would be typical for lower or 0 fuzz).

The point of route randomization is to avoid low-fee routes, which might be honeypots used by three-letter monitoring agencies, by occassionally jumping to higher-than-usual fee routes.

If 75% is a good starting point for pay, why it it not a good default for getroute? It it anticipated that we will see significant failures at 75%? I'm not sure that's true. These numbers seem made up, which is fine, but they do need a comment on what we'd need to know to set decent defaults in future...

Well, my initial idea was to have getroute default to 0%, for compatibility with before this patch, where getroute never fuzzed and (if there were no changes in network topology) would always return the same route again and again. A mere 5% for fuzz is unlikely to change anything, either.

If that behavior is not necessary for getroute, then 75% would be a reasonable default for getroute, too.

ZmnSCPxj · 2018-02-21T12:27:05Z

Added serialization of double instead of converting to u32 back and forth. Also rebased. Also answered @rustyrussell question in review item.

rustyrussell

OK, let's use a fixed 16-byte seed, ready for siphash, and use siphash internally.

We could also use the node pubkey (directly, feed it into siphash) instead of scid to make the iteration more stable (since fuzz would depend on the node itself rather than what edge we came in on).

We can get faster in future by using isaac with a single initialization, but that loses per-node determinism.

rustyrussell · 2018-02-23T00:12:14Z

lightningd/payalgo.c

 	pay->try_parent = tal_free(pay->try_parent);
 	pay->try_parent = tal(pay, char);

-	++pay->tries;
+	/* Generate random seed */
+	seed = tal_arr(pay->try_parent, u8, ISAAC64_SEED_SZ_MAX);


2048 bytes of randonmess is extreme!

8 bytes of non-cryptographic randomness is more than sufficient; we should just add a psuedorand64() which wraps isaac64_next_uint64() and use that here I think.

Okay, will do.

rustyrussell · 2018-02-23T00:14:22Z

lightningd/payalgo.c

-	pay->sendpay_parent = NULL;
+	pay->getroute_tries = 0;
+	pay->sendpay_tries = 0;
+	pay->fuzz = 0.75;


If 75% is a good starting point for pay, why it it not a good default for getroute? It it anticipated that we will see significant failures at 75%? I'm not sure that's true. These numbers seem made up, which is fine, but they do need a comment on what we'd need to know to set decent defaults in future...

rustyrussell · 2018-02-23T00:48:19Z

gossipd/routing.c

+		towire_short_channel_id(&scid, &c->short_channel_id);
+		isaac64_reseed(&myrng,
+			       (const unsigned char *) scid, tal_len(scid));
+		tal_free(scid);


That's pretty heavy. It's nice because it makes the per-edge decision deterministic, though.

run-bench-find_route 10000 10 with 0 fuzz:

10 (10 succeeded) routes in 10000 nodes in 3990 msec (399066864 nanoseconds per route)

With 0.75 fuzz and a fixed rng:

10 (10 succeeded) routes in 10000 nodes in 73423 msec (7342349728 nanoseconds per route)

Ouch!

I switched it to hand in a 64-bit base_seed, and use short_channel_id_to_uint() instead of wire allocation, and initialize the rng with [base_seed, short_channel_id_to_uint()], but it's still v. slow:

10 (10 succeeded) routes in 10000 nodes in 67433 msec (6743351288 nanoseconds per route)

This is partially because it's expensive to set up an isaac64 context just to pull one number out; we're better using siphash for this:

10 (10 succeeded) routes in 10000 nodes in 5668 msec (566855365 nanoseconds per route)

I've put a rough patch (enough to compile and run the bench): https://gist.github.com/rustyrussell/c3334112c096d97e0adb0d3a8fc3fc32

The patch is actually pretty good and I would ACK that.

ZmnSCPxj · 2018-02-23T04:06:18Z

We could also use the node pubkey (directly, feed it into siphash) instead of scid to make the iteration more stable (since fuzz would depend on the node itself rather than what edge we came in on).

My understanding of the algorithm is that each edge is what is assigned a cost, and if we encounter a node multiple times then if the cumulative cost from an edge is lower than the current cost of the node, then the current cost of the node is updated to be coming from the edge. If we use per-node, then all outgoing channels from that node will have the same fuzz cost. At least, that is my understanding of the algorithm.... I will defer to you if you think my understanding is wrong, as you wrote it in the first place anyway.

In preparation for improved route randomization.

Fixes: ElementsProject#928

Now try_parent is used as the parent for all allocations needed for a try.

@rustyrussell

Primary idea by @rustyrussell

ZmnSCPxj · 2018-02-24T01:24:59Z

Rebased, used siphash as per @rustyrussell patch, change getroute default fuzz to 75%, explain fuzz.

rustyrussell

ACK d3493aa

rustyrussell · 2018-02-26T02:02:21Z

doc/lightning-getroute.7.txt

+to provide some randomization to the route generated.
+0.0 means the exact fee of that channel is used,
+while 100.0 means the fee used might be from 0 to twice the actual fee.
+The default is 5.0, or up to 5% fee distortion.


The default is 75.0, or up to 75% fee distortion.

rustyrussell · 2018-02-26T02:05:33Z

gossipd/routing.c

+		 * 2*fuzz*rand              random number between 0.0 -> 2*fuzz
+		 * 2*fuzz*rand - fuzz       random number between -fuzz -> +fuzz
+		 */
+		fee_scale = 1.0 + (2.0 * fuzz * h / UINT64_MAX) - fuzz;


Pretty sure we can just make h an s64 and divide by INT64_MAX: fee_scale = 1.0 + fuzz * h / INT64_MAX

But it's trivial and unlikely to be any faster, so I'll just leave this as a comment.

Negative number absolute range is one higher than positive range, so getting INT64_MIN by chance would break that. Of course very unlikely but well.

rustyrussell · 2018-02-26T02:17:22Z

lightningd/payalgo.c

@@ -243,7 +243,7 @@ static bool json_pay_try(struct pay *pay)
 	pay->try_parent = tal(pay, char);

 	/* Generate random seed */
-	seed = tal_arr(pay->try_parent, u8, ISAAC64_SEED_SZ_MAX);
+	seed = tal_arr(pay->try_parent, u8, sizeof(struct siphash_seed));


I've produced a patch to actually hand a siphash_seed across the wire, which IMHO makes this neater, but I'll create a separate PR after this.

rustyrussell · 2018-02-26T02:32:30Z

ACK d3493aa

rustyrussell · 2018-02-26T02:36:01Z

You are right on the bfg_one_edge logic; per-edge makes more sense than per-node.

ZmnSCPxj force-pushed the route-random branch 2 times, most recently from 8411066 to 5099930 Compare February 16, 2018 14:16

ZmnSCPxj changed the title ~~[WIP] Improve route randomization~~ Improve route randomization Feb 16, 2018

rustyrussell requested review from rustyrussell and cdecker February 17, 2018 11:32

rustyrussell requested changes Feb 19, 2018

View reviewed changes

cdecker reviewed Feb 19, 2018

View reviewed changes

ZmnSCPxj force-pushed the route-random branch from 24a9bbf to 474c015 Compare February 19, 2018 12:13

rustyrussell requested changes Feb 20, 2018

View reviewed changes

rustyrussell requested changes Feb 21, 2018

View reviewed changes

ZmnSCPxj force-pushed the route-random branch 2 times, most recently from bfe76f0 to a557fa8 Compare February 21, 2018 12:25

rustyrussell requested changes Feb 23, 2018

View reviewed changes

ZmnSCPxj added 7 commits February 24, 2018 00:45

wire: Implement fromwire/towire_double.

0e6aae8

gossip_control: Add fuzz and seed parameters.

5d03899

In preparation for improved route randomization.

gossipd: Add fuzz and seed to getroute request.

bb46f5e

gossipd: Implement improved randomization of routes.

f223d5e

Fixes: ElementsProject#928

payalgo: Change sendpay_parent to try_parent.

d1a64f4

Now try_parent is used as the parent for all allocations needed for a try.

payalgo: Randomize paths as long as we respect maxfeepercent.

2cb6023

doc: Update manpage for getroute and pay for route randomization.

56203c2

ZmnSCPxj force-pushed the route-random branch from a557fa8 to dbdf68d Compare February 24, 2018 00:50

ZmnSCPxj added 2 commits February 24, 2018 01:23

routing: Use siphash24 for route randomization

3d2e910

Primary idea by @rustyrussell

gossip_control: Change defult fuzz of getroute to 75%.

9913e3f

Explain fuzz initial values.

d3493aa

ZmnSCPxj force-pushed the route-random branch from f0c4e46 to d3493aa Compare February 24, 2018 01:23

This was referenced Feb 24, 2018

Add delay limit to pay #1086

Closed

Randomized overpaying (treat as routing fees) to reduce traceability #1089

Closed

rustyrussell approved these changes Feb 26, 2018

View reviewed changes

rustyrussell merged commit d23650d into ElementsProject:master Feb 26, 2018

ZmnSCPxj mentioned this pull request Feb 26, 2018

payalgo: Improve telemetry #1100

Merged

ZmnSCPxj deleted the route-random branch February 26, 2018 05:23

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve route randomization #1012

Improve route randomization #1012

ZmnSCPxj commented Feb 16, 2018

ZmnSCPxj commented Feb 16, 2018

rustyrussell left a comment

ZmnSCPxj commented Feb 19, 2018

cdecker left a comment

cdecker Feb 19, 2018

ZmnSCPxj commented Feb 19, 2018

cdecker commented Feb 19, 2018

ZmnSCPxj commented Feb 20, 2018

rustyrussell left a comment

rustyrussell Feb 20, 2018

rustyrussell Feb 20, 2018

rustyrussell Feb 20, 2018

rustyrussell Feb 21, 2018

rustyrussell Feb 21, 2018

ZmnSCPxj Feb 21, 2018

rustyrussell Feb 23, 2018

ZmnSCPxj Feb 23, 2018

ZmnSCPxj commented Feb 21, 2018

rustyrussell left a comment

rustyrussell Feb 23, 2018

ZmnSCPxj Feb 23, 2018

rustyrussell Feb 23, 2018

rustyrussell Feb 23, 2018

ZmnSCPxj Feb 23, 2018

ZmnSCPxj commented Feb 23, 2018

ZmnSCPxj commented Feb 24, 2018

rustyrussell left a comment

rustyrussell Feb 26, 2018

ZmnSCPxj Feb 26, 2018

rustyrussell Feb 26, 2018

ZmnSCPxj Feb 26, 2018

rustyrussell Feb 26, 2018

rustyrussell commented Feb 26, 2018

rustyrussell commented Feb 26, 2018

Improve route randomization #1012

Improve route randomization #1012

Conversation

ZmnSCPxj commented Feb 16, 2018

ZmnSCPxj commented Feb 16, 2018

rustyrussell left a comment

Choose a reason for hiding this comment

ZmnSCPxj commented Feb 19, 2018

cdecker left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ZmnSCPxj commented Feb 19, 2018

cdecker commented Feb 19, 2018

ZmnSCPxj commented Feb 20, 2018

rustyrussell left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ZmnSCPxj commented Feb 21, 2018

rustyrussell left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ZmnSCPxj commented Feb 23, 2018

ZmnSCPxj commented Feb 24, 2018

rustyrussell left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rustyrussell commented Feb 26, 2018

rustyrussell commented Feb 26, 2018