
eth, les, light: enforce CHT checkpoints on fast-sync too #19468

Merged · 1 commit · Apr 17, 2019

Conversation

@karalabe (Member)

Fast sync is susceptible to a griefing variant of an eclipse attack, where a malicious remote node tries to get a freshly started Geth node to fast-sync onto some tiny chain before the real, heavy chain is discovered in the network. This results in Geth falling back to full sync for the main chain, which takes far too much time.

This attack can only be meaningfully mounted against nodes that are properly exposed on a public IP address (i.e. not firewalled, not NATed). Even then, it is a race against the node finding good peers fast enough; if it does, the attack no longer works.

There is no economic advantage in pulling this attack off, only causing sync annoyance. That said, there are currently a number of Parity nodes (at least 4 identified, maybe more) at 207.148.5.229 which are doing variations of this attack. It might be deliberate, or it might be leftover nodes from some experiment that only have a few blocks on their chain and subsequently disabled sync. A bit improbable, though.


This PR repurposes the DAO challenge into a checkpoint challenge based on the recently hard-coded CHTs. It also makes the challenge stricter: while the node is doing a fast sync, remote peers are not permitted to be synced below the checkpoint block. This should also help sync the chain faster by getting rid of stalling or useless peers.
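
To illustrate the general idea (an editorial sketch only, not the code in this PR; challengePeer, checkpointPeer and challengeTimeout are invented names, and the timeout value is assumed): when a peer connects while we are fast-syncing, we request the single header at the hard-coded checkpoint height and arm a drop timer; a peer that cannot produce that header in time gets disconnected.

package sketch

import (
	"errors"
	"time"
)

// challengeTimeout is an assumed value for this example, not the one Geth uses.
const challengeTimeout = 15 * time.Second

// checkpointPeer is a minimal stand-in for an eth protocol peer.
type checkpointPeer interface {
	// RequestHeadersByNumber asks the remote peer for `amount` headers starting at block `origin`.
	RequestHeadersByNumber(origin uint64, amount int, skip int, reverse bool) error
	// Drop disconnects the remote peer with the given reason.
	Drop(reason error)
}

// challengePeer sends the checkpoint challenge and arms a timer that drops the
// peer if no answer arrives in time. The caller is expected to stop the returned
// timer once the peer responds, and to reject the peer during fast sync if the
// response does not contain the expected checkpoint header.
func challengePeer(p checkpointPeer, checkpointNumber uint64) (*time.Timer, error) {
	if err := p.RequestHeadersByNumber(checkpointNumber, 1, 0, false); err != nil {
		return nil, err
	}
	timer := time.AfterFunc(challengeTimeout, func() {
		p.Drop(errors.New("checkpoint challenge timed out"))
	})
	return timer, nil
}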

@karalabe karalabe added this to the 1.9.0 milestone Apr 16, 2019
@karalabe karalabe requested a review from holiman April 16, 2019 10:33
@karalabe karalabe force-pushed the enforce-fastsync-checkpoints branch from f64d13d to 4a07d42 on April 16, 2019 12:34
@holiman (Contributor) commented Apr 16, 2019

This looks clean and nice; however, it has the drawback that peers undergoing sync won't help other unsynced nodes. Would it be possible instead to allow unsynced peers to connect to us once we have already found a good pivot? Because once we're settled on a pivot and have started syncing, we're no longer vulnerable to the tarpit-fastsync attack described above.

@karalabe (Member, Author)

It makes things more complicated. Currently the eth protocol manager that handles connecting peers doesn't know much about the downloader, and definitely nothing about its internal state. I'm also not sure about the change in general, as we were considering doing exactly this anyway: disconnecting peers that can't serve fast sync data.

Whilst I agree that we're going from one end of the spectrum (allow everyone) to the opposite (allow only useful ones), I'm not sure it makes sense to complicate things. There are a few thousand peers in the network and maybe a handful that are currently syncing. Perhaps let's use the thousand synced ones first and only then try to help the remaining joiners. Yes, it puts a bit more burden on the existing peers (that is, if I have a joiner peer too), but it also makes us sync faster, so we can in turn meaningfully help the network faster.

@holiman (Contributor) commented Apr 16, 2019

Yes, I agree that it's probably the sanest choice for now.

@karalabe (Member, Author)

Damn, found a tiny bug. I didn't stop the challenge timer on non-fast sync :P Bleah, need to add a test.

@karalabe (Member, Author)

diff --git a/eth/handler.go b/eth/handler.go
index 9d29f8cb1..bd414e068 100644
--- a/eth/handler.go
+++ b/eth/handler.go
@@ -447,13 +447,13 @@ func (pm *ProtocolManager) handleMsg(p *peer) error {
                }
                // If no headers were received, but we're expencting a checkpoint header, maybe it's that
                if len(headers) == 0 && p.syncDrop != nil {
+                       p.syncDrop.Stop()
+                       p.syncDrop = nil
+
                        // If we're doing a fast sync, we must enforce the checkpoint block to avoid
                        // eclipse attacks. Unsynced nodes are welcome to connect after we're done
                        // joining the network
                        if atomic.LoadUint32(&pm.fastSync) == 1 {
-                               p.syncDrop.Stop()
-                               p.syncDrop = nil
-
                                p.Log().Warn("Dropping unsynced node during fast sync", "addr", p.RemoteAddr(), "type", p.Name())
                                return errors.New("unsynced node cannot serve fast sync")
                        }

@karalabe (Member, Author)

Fixed and tested. @holiman PTAL

@karalabe (Member, Author)

@matthalp This PR might interest you. It changes the fork challenge semantics: instead of requiring peers to be on the same DAO fork, it requires them to have the same header as the one contained in the remote peer's CHT. I think you had some nodes that aggressively pruned headers too; this might make things more complicated for them, since the challenged header can be arbitrary, not a specific fork block header.
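
For illustration, this is roughly the check a challenging peer performs on the response (an editorial sketch; verifyCheckpointResponse is an invented name, while types.Header and common.Hash are the usual go-ethereum types). A node that has pruned the checkpoint header simply cannot answer with a matching header.

package sketch

import (
	"errors"

	"github.com/ethereum/go-ethereum/common"
	"github.com/ethereum/go-ethereum/core/types"
)

// verifyCheckpointResponse compares the headers a peer returned for the checkpoint
// challenge against the locally hard-coded checkpoint number and hash. A peer that
// pruned the checkpoint header, or is on a different chain, fails the check.
func verifyCheckpointResponse(headers []*types.Header, number uint64, hash common.Hash) error {
	if len(headers) != 1 {
		return errors.New("checkpoint challenge: expected exactly one header")
	}
	if headers[0].Number.Uint64() != number || headers[0].Hash() != hash {
		return errors.New("checkpoint challenge: header does not match the hard-coded CHT checkpoint")
	}
	return nil
}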

@holiman (Contributor) commented Apr 17, 2019

I approve

@karalabe (Member, Author)

Fixed the linter and squashed.

@karalabe karalabe force-pushed the enforce-fastsync-checkpoints branch from 234bcbf to 38f6b85 on April 17, 2019 10:16
@rjl493456442 (Member) left a comment

LGTM

@Matthalp-zz (Contributor)

Thanks for keeping me in the loop @karalabe! Fortunately we do keep all headers along the canonical chain, so I don't anticipate any problems.
