Ethereum Core Devs Meeting 118 Agenda #354

Closed
timbeiko opened this issue Jul 9, 2021 · 6 comments

@timbeiko
Collaborator

timbeiko commented Jul 9, 2021

Meeting Info

Agenda

  1. London Updates
    1. Ropsten Issue
    2. gasPrice for 1559 transactions. Comments against it:
  2. Other Discussion Items
    1. December Network Upgrade #356
    2. EIP-3675: Upgrade consensus to Proof-of-Stake #361
    3. Proposal to add EIP-3670 to Shanghai #360
    4. Proposal to add EIP 3651 (Warm COINBASE) to Shanghai #357
  3. Announcements
@AlexeyAkhunov
Contributor

I am not sure I will be able to join the meeting, but I would like to say that there was an interesting discovery made during the Ropsten incident, which we need to be aware of.
Because the majority of miners use Geth, and Geth did not have the ability to repair Ropsten nodes that were past the bad block, miners had to perform a full sync (not fast sync and not snap sync) before they could start mining on the correct chain.
This means that if something like that were to happen on mainnet, there would be no way back, and Geth's version (whatever it is) would need to become the de facto consensus rule.

In general, I think we need a better understanding of "what are we going to do if X happens?", where X is any of the issues that happened on the testnet. Yes, it is unpleasant to think about it, but we need some pre-decisions to relieve Geth devs of the responsibility to make all the hard decisions on the spot, which would be incredibly stressful.

P.S. I have implemented functionality in Erigon today to be able to go back before any bad block in the past (and used it to repair Ropsten), but unfortunately we have not finished mining support for Erigon yet. And even if we had, it would take a while for miners to adopt our code. So this is not a proposed solution, just extra info that such functionality may be possible to have.

@AlexeyAkhunov
Contributor

The Ropsten incident also highlights another potential issue that may apply to many implementations (currently including Erigon). It seems that in a network where two competing chains co-exist, the minority part is very unstable, with nodes disconnecting from each other and reconnecting all the time. I think this may be related to something I have observed: some nodes propagate blocks/headers even though those are not on their best chain. This leads to these nodes being kicked out (disconnected) by other nodes on the correct chain. I am going to think about how to make our implementation a bit more robust, but perhaps others need to take a look as well.

@poojaranjan
Contributor

If time permits, I would like to make announcements for the upcoming PEEPanEIP meetings for Merge & Block Gas Limit.

@holiman

holiman commented Jul 23, 2021

and Geth did not have the ability to repair Ropsten nodes that were past the bad block

That's not correct. There are basically a couple of things that can happen during a fork. I'll outline a couple of scenarios:

Synced node followed wrong chain

You were running geth, and were in sync. At block X, the fork happened. Your node followed the erroneous higher-td chain, and at block Z, you stop the node and update to the patched version.

Problem description: The node is still on the 'bad' chain.
Solution: Do a debug.setHead(X-1) to jump to before the fork. Internally, this will rewind the chain to some state before X. It might not be X-1, since geth might not have the full state for that block, but it will have the state somewhere. Usually, geth flushes the state to disk every ~10K blocks (or whatever corresponds to 1 hour of processing), and/or during shutdown. If geth is running in gcmode=archive, then it flushes every block.
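
The rewind behaviour described above can be sketched in a few lines of Python. This is only a model of the description, not geth's code: `effective_rewind_target`, `set_head_command` and `persisted_states` are hypothetical names, the block numbers are made up, and passing the block number as a hex string is an assumption based on how the call is used further down in this thread.

```python
from bisect import bisect_right
from typing import Sequence


def effective_rewind_target(requested: int, persisted_states: Sequence[int]) -> int:
    """Return the highest block with persisted state at or below `requested`.

    Models the point made above: the rewind may land earlier than the
    requested block if no full state is stored for it.
    `persisted_states` must be sorted ascending (hypothetical input).
    """
    idx = bisect_right(persisted_states, requested)
    if idx == 0:
        raise ValueError("no persisted state at or below the requested block")
    return persisted_states[idx - 1]


def set_head_command(block_number: int) -> str:
    """Format the console call, with the block number as a hex string (assumption)."""
    return f'debug.setHead("{hex(block_number)}")'


# Hypothetical values: fork at block X, state flushed roughly every 10K blocks.
fork_block = 1_234_567
persisted = [1_210_000, 1_220_000, 1_230_000]

print(set_head_command(fork_block - 1))                    # call to type in the console
print(effective_rewind_target(fork_block - 1, persisted))  # block the state resumes from
```

In this model, asking for X-1 resumes from the last flushed state below the fork, and the node then has to re-process the blocks between that point and X-1.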

Syncing in the presence of a wrong higher-td chain

You are syncing a geth node, and a fork has occurred at block X. Since the fork has already happened, and the erroneous chain has higher TD, you will most likely wind up on the 'wrong' side of the chain, with a pivot block X+M. If this happens, you do not have any state for blocks < X+M, so you cannot use debug.setHead to resolve the situation.

In this case, a resync is required. However, you need to prevent geth from winding up on the wrong side of the fork. This can be done with the whitelist command line parameter.

$ geth -h | grep white
  --whitelist value                   Comma separated block number-to-hash mappings to enforce (<number>=<hash>)

So you'd do geth --whitelist 123123=0x2342fafa9af9af9af9af9af9

The whitelist means that geth, when peering with another peer, will ask the peer "what's your block 123123". If it gets a header back with a hash that doesn't match the whitelist, it will disconnect from that peer. So essentially, the node will isolate itself from peers on the wrong chain, and only connect to peers that will deliver blocks from the shorter (but correct) chain.
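
As a rough illustration of that peer check, here is a minimal Python sketch. It is not geth's implementation: `parse_whitelist`, `peer_allowed` and the peer names are hypothetical, the hashes are shortened placeholders, and keeping a peer that returns no header at all is an assumption of this sketch, since the description above only covers the mismatch case.

```python
from typing import Dict


def parse_whitelist(value: str) -> Dict[int, str]:
    """Parse comma separated <number>=<hash> mappings into {number: hash}."""
    mapping: Dict[int, str] = {}
    for entry in value.split(","):
        number, _, block_hash = entry.strip().partition("=")
        if not number.isdigit() or not block_hash.startswith("0x"):
            raise ValueError(f"invalid whitelist entry: {entry!r}")
        mapping[int(number)] = block_hash.lower()
    return mapping


def peer_allowed(whitelist: Dict[int, str], peer_chain: Dict[int, str]) -> bool:
    """Return False if the peer's hash differs at any whitelisted height.

    `peer_chain` is a hypothetical {number: hash} view of what the peer
    answers when asked for a header. A missing header is tolerated here
    (assumption of this sketch).
    """
    for number, expected in whitelist.items():
        got = peer_chain.get(number)
        if got is not None and got.lower() != expected:
            return False
    return True


# Placeholder hashes, shortened for readability.
whitelist = parse_whitelist("123123=0xaaaa")
peers = {
    "peer-good": {123123: "0xaaaa"},     # on the whitelisted chain
    "peer-bad": {123123: "0xbbbb"},      # on the other side of the fork
    "peer-syncing": {},                  # has not reached block 123123 yet
}
kept = sorted(name for name, chain in peers.items() if peer_allowed(whitelist, chain))
print(kept)  # ['peer-good', 'peer-syncing']
```

Only the peer that answers with a different hash is dropped, which is what isolates the node from the wrong chain during the resync.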

@timbeiko
Collaborator Author

Closed in favor of #365

@kf106

kf106 commented Sep 5, 2021

Does anyone have some clear instructions on how to actually fix your node if you upgraded after the split? With the actual numbers?

I am running 1.10.8, and upgraded after the split.

I noticed that the balances on my Ropsten node did not match those on MetaMask and Etherscan a couple of days ago, and that my block height was higher than Etherscan's, so clearly I was on the wrong chain.

But 3.5 GH/s of hashing power was going into continuing that chain, so there are clearly other people still mining away on it too.

The Meeting 118.md article confusingly talks about Ropsten and then lists Mainnet blocks and hashes at the end.

I thought I had understood the issue and how to correct it, and did the following:

  • determined that the problem block was 10679538, so looked up the Etherscan hash for the correct block at that height.
  • added --whitelist 10679538=0x569dccc25294768c23249db843ef7156e8e7c6c94cb82cd84a833f9c3e1d72e5 to my geth node command line
  • restarted my node
  • connected with geth attach
  • tried to rewind my current chain data to a couple of blocks before 10679539, namely 10679537, or 0xA2F4F1, by running debug.setHead("0xA2F4F1")

Instead of a fixed chain, what I got was a segmentation violation error:

INFO [09-05|11:14:18.197] Looking for peers                        peercount=1 tried=25 static=0
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x1c0 pc=0xb3180f]

goroutine 1856 [running]:
github.com/ethereum/go-ethereum/eth/downloader.(*Downloader).findAncestorBinarySearch(0xc00037c540, 0xc034278270, 0x1, 0xa786c6, 0xa19561, 0xa19561, 0x0, 0x17c6020)
  github.com/ethereum/go-ethereum/eth/downloader/downloader.go:966 +0x58f
github.com/ethereum/go-ethereum/eth/downloader.(*Downloader).findAncestor(0xc00037c540, 0xc034278270, 0xc002a518c0, 0xc002a51b00, 0x0, 0x0)
  github.com/ethereum/go-ethereum/eth/downloader/downloader.go:818 +0x3a5
github.com/ethereum/go-ethereum/eth/downloader.(*Downloader).syncWithPeer(0xc00037c540, 0xc034278270, 0x6215089fee241794, 0x743c81e9df254112, 0xe8f5e2540c043313, 0xba9caf4ba6bf2ba, 0xc01709e260, 0x0, 0x0)
  github.com/ethereum/go-ethereum/eth/downloader/downloader.go:475 +0x517
github.com/ethereum/go-ethereum/eth/downloader.(*Downloader).synchronise(0xc00037c540, 0xc03488f880, 0x40, 0x6215089fee241794, 0x743c81e9df254112, 0xe8f5e2540c043313, 0xba9caf4ba6bf2ba, 0xc01709e260, 0xc000000001, 0x0, ...)
  github.com/ethereum/go-ethereum/eth/downloader/downloader.go:431 +0x3b0
github.com/ethereum/go-ethereum/eth/downloader.(*Downloader).Synchronise(0xc00037c540, 0xc03488f880, 0x40, 0x6215089fee241794, 0x743c81e9df254112, 0xe8f5e2540c043313, 0xba9caf4ba6bf2ba, 0xc01709e260, 0xc000000001, 0x4842c0, ...)
  github.com/ethereum/go-ethereum/eth/downloader/downloader.go:326 +0x8c
github.com/ethereum/go-ethereum/eth.(*handler).doSync(0xc004ca5b00, 0xc02ab9c300, 0x0, 0x0)
  github.com/ethereum/go-ethereum/eth/sync.go:324 +0x125
github.com/ethereum/go-ethereum/eth.(*chainSyncer).startSync.func1(0xc0001e0bd0, 0xc02ab9c300)
  github.com/ethereum/go-ethereum/eth/sync.go:300 +0x38
created by github.com/ethereum/go-ethereum/eth.(*chainSyncer).startSync
  github.com/ethereum/go-ethereum/eth/sync.go:300 +0x76

I'm now using the following whitelist parameter and resyncing from scratch instead:

--whitelist 10679538=0x569dccc25294768c23249db843ef7156e8e7c6c94cb82cd84a833f9c3e1d72e5

I hope I've got that right.
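
The numbers quoted in this comment can at least be checked for format with a short Python sketch. This only confirms that the whitelist entry is well-formed and that the hex value matches the decimal block number. It cannot confirm that the hash is the canonical Ropsten block at that height. `check_whitelist_entry` is a hypothetical helper name, and requiring a 32-byte hash is an assumption of the sketch.

```python
import re
from typing import Tuple

# A full block hash is 32 bytes, i.e. 64 hex characters after the 0x prefix.
HASH_RE = re.compile(r"^0x[0-9a-fA-F]{64}$")


def check_whitelist_entry(entry: str) -> Tuple[int, str]:
    """Validate a single <number>=<hash> entry and return its parts."""
    number, sep, block_hash = entry.partition("=")
    if sep != "=" or not number.isdigit():
        raise ValueError("expected <number>=<hash>")
    if not HASH_RE.match(block_hash):
        raise ValueError("block hash is not 32 bytes of hex")
    return int(number), block_hash


entry = "10679538=0x569dccc25294768c23249db843ef7156e8e7c6c94cb82cd84a833f9c3e1d72e5"
number, block_hash = check_whitelist_entry(entry)

print(number)                             # whitelisted height
print(int("0xA2F4F1", 16) == number - 1)  # True: 0xA2F4F1 is 10679537
```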
