[EPIC] Ethereum header chain over swarm #9
Comments
I'd propose the following approach.
The Trinity client team should be able to deliver on each of these phases in the 1-2 week time frame which suggests that if the Swarm team can deliver on a similar timeline we should be able to have a fully working POC of this within a 4 week timeline. |
One issue to resolve is from the client perspective to know what type of node you are connected to. At the beginning we can just use the |
still a little soon for it but we should migrate this to a markdown document that we can open pull requests to at some point so that we can track changes. Probably only needed once we have the two clients talking to each other. |
@zelig https://notes.ethereum.org/k1yEcw1gSo-iCNmEOmpUUg I will not be surprised if there is something broken, there you go, should be a good starting point for us to get our nodes connected to each other. |
Rationale
The header chain for the mainnet is needed by every Ethereum client. The data is effectively append only and regularly accessed. This makes it a prime candidate for storage on the Swarm network.
Owner
@pipermerriam
Stakeholder Point of Contact
Description
Storage
For swarm to store and serve headers, they need to behave like other chunks. This only requires an extra validator for Keccak-256 content addressing (the 256-bit Keccak hash that Ethereum refers to as SHA3).
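As a rough illustration of what such a validator checks, here is a minimal sketch. The function name `is_valid_header_chunk` and the `hash_fn` parameter are hypothetical and not part of Swarm's API. The hash function is passed in because Python's standard library does not ship Keccak-256: `hashlib.sha3_256` is the NIST SHA3-256 variant and produces different digests, so it is used below only as a placeholder to keep the sketch runnable. A real validator would plug in a Keccak-256 implementation.

```python
import hashlib
from typing import Callable

HashFn = Callable[[bytes], bytes]


def is_valid_header_chunk(address: bytes, data: bytes, hash_fn: HashFn) -> bool:
    """Return True if `data` hashes to the 32-byte `address` it is stored under."""
    if len(address) != 32:
        return False
    return hash_fn(data) == address


def placeholder_hash(data: bytes) -> bytes:
    # ASSUMPTION / placeholder only: NIST SHA3-256 is NOT the Keccak-256
    # used by Ethereum. Swap in a real Keccak-256 function in practice.
    return hashlib.sha3_256(data).digest()


payload = b"rlp-encoded header bytes would go here"
address = placeholder_hash(payload)
print(is_valid_header_chunk(address, payload, placeholder_hash))
print(is_valid_header_chunk(address, payload + b"!", placeholder_hash))
```

The point of the design is that the chunk address is fully determined by the content, so a swarm node can reject any header chunk whose bytes do not hash to the address it was requested under.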
For swarm to access header chunks, the localstore needs to be prepopulated. The following options are viable and non-exclusive
The latter option is easiest if we subsume the pull mechanism (swarm nodes requesting data from ethereum clients) under the protocol with which swarm and eth clients communicate.
Communication
For Ethereum nodes to retrieve this data, they will need a communication channel with swarm nodes. Obvious options include:
- `DevP2P` network to talk directly to swarm nodes (probably over a new sub-protocol like `bzzeth`)
Candidate for DevP2P communication
One way for nodes to communicate would be over `devp2p`. This has the following benefits
The following commands define version `1` of a new sub-protocol identified by the string `bzz-eth`.
This p2p protocol is somewhat special in that it is asymmetrical, i.e., the two peers are not sending the same types of messages. In particular, the swarm nodes never send `NewBlockHeaders`; they only receive them.
Handshake (0x00)
This MUST be the first message sent (under this protocol) after a p2p connection has been established.
- `serve_headers`: boolean indicating if this node can be expected to serve requests for headers.
NewBlockHeaders (0x01)
- `hash`: the block hash
- `number`: the block number corresponding to the provided block hash.
Advertise headers that the connected peer may be interested in. For a given session with a peer, no block hash should be sent more than once (never re-advertise the same block hash).
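The "never re-advertise" rule implies some per-session bookkeeping on the sending side. A minimal sketch of that bookkeeping follows. The class name `AdvertisedHeaders` and the `(hash, number)` tuple shape are assumptions made for illustration and are not taken from Trinity or Swarm.

```python
from typing import Iterable, List, Set, Tuple

# (block hash, block number), mirroring the two fields listed above.
Announcement = Tuple[bytes, int]


class AdvertisedHeaders:
    """Tracks which block hashes were already advertised in one peer session."""

    def __init__(self) -> None:
        self._sent: Set[bytes] = set()

    def filter_new(self, candidates: Iterable[Announcement]) -> List[Announcement]:
        """Return only announcements not yet sent, and record them as sent."""
        fresh: List[Announcement] = []
        for block_hash, number in candidates:
            if block_hash in self._sent:
                continue  # already advertised in this session
            self._sent.add(block_hash)
            fresh.append((block_hash, number))
        return fresh


session = AdvertisedHeaders()
first = (b"\x01" * 32, 100)
second = (b"\x02" * 32, 101)
print(len(session.filter_new([first, second, first])))
print(len(session.filter_new([first])))
```

One tracker would be created per connected peer and discarded when the session ends, since the rule is scoped to a session.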
If later we find that swarm nodes do not always need new headers announced, a `GetNewHeaders` message could be introduced.
GetBlockHeaders (0x02)
- `request_id`: any 32-bit integer
- `hashes`: array of 32-byte hashes.
Request a set of headers referenced by their Ethereum hashes.
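To make the field constraints concrete, the request could be modelled as below. This is only a sketch: the dataclass, the helper `new_request`, and the choice to treat `request_id` as an unsigned 32-bit value drawn at random are assumptions, and the wire encoding is deliberately left out.

```python
import secrets
from dataclasses import dataclass
from typing import Iterable, Tuple

GET_BLOCK_HEADERS = 0x02  # command ID from the heading above
HASH_LENGTH = 32
MAX_REQUEST_ID = 2**32 - 1  # ASSUMPTION: unsigned 32-bit range


@dataclass(frozen=True)
class GetBlockHeaders:
    request_id: int
    hashes: Tuple[bytes, ...]

    def __post_init__(self) -> None:
        # Reject values that could not be represented by the message fields.
        if not 0 <= self.request_id <= MAX_REQUEST_ID:
            raise ValueError("request_id must fit in 32 bits")
        for block_hash in self.hashes:
            if len(block_hash) != HASH_LENGTH:
                raise ValueError("each hash must be exactly 32 bytes")


def new_request(hashes: Iterable[bytes]) -> GetBlockHeaders:
    """Build a request with a randomly chosen 32-bit request_id."""
    return GetBlockHeaders(request_id=secrets.randbits(32), hashes=tuple(hashes))


request = new_request([b"\xaa" * 32, b"\xbb" * 32])
print(len(request.hashes))
```

A random `request_id` is one simple way to keep concurrent requests to the same peer distinguishable; a per-session counter would work equally well.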
BlockHeaders (0x03)
- `request_id`: the `request_id` from the `GetBlockHeaders` message.
- `headers`: array of RLP-encoded block headers.
Response to `GetBlockHeaders`. `headers` must be a subset of the requested block headers, RLP encoded. No ordering is enforced on the response headers.
Note that it is allowed to send several `BlockHeaders` responses to the same request. This way, the swarm node can send whatever it has as soon as it has something and serve the eth client with minimal latency.
Note that it is allowed to send a `BlockHeaders` message with an empty `headers` array. This serves as an indication to the requesting eth client that the peer has no more headers available out of the requested batch. Even though this cannot be enforced, it is prudent so that the eth client can mark the request context as closed and fire alternative requests for the outstanding headers. This becomes more relevant once requests become non-free, in order to control the cost vs concurrency trade-off.
Context
Ethereum node implementation notes
It's worth noting that Ethereum clients that want to retrieve this data will need to learn about the latest headers from a separate mechanism such as other ETH peers, since it will not be possible to request headers by their block number. Once an ETH peer has a recent header that they trust, they can use the `parent_hash` to track their way backwards to the genesis block. At a later stage of this track, swarm nodes will need to be able to do the same, see https://hackmd.io/oj9_cT2KQimMdIPe_W_ejQ#
It seems that a reasonable algorithm for syncing the header chain when connected to both a set of ETH peers and a set of BZZ peers would be to use the ETH peers to construct a "header skeleton", which is the header chain with large gaps, and then to use the `bzz-eth` peers to fill the gaps.
Swarm node data validation notes
Swarm nodes will want to validate headers they receive. The things that can be validated are:
- `keccak(rlp-encoded-header-bytes)` matches the expected content hash.
- `ethash` validation of the proof-of-work seal.
- `parent_hash` field
For POC, doing the RLP and `keccak` validation is likely adequate to catch obvious bugs.
Issues
Dependencies
Swarm needs to support a new hash type that is the `keccak(raw-binary-data)` so that Ethereum nodes are able to use the hashes they have available to request data and the Swarm nodes are able to know what data is being requested.
Timeline
- `NewBlockHeaders`.
- `GetBlockHeaders` requests.
- `GetBlockHeaders` requests sent by Ethereum nodes.
The Trinity client team should be able to deliver on each of these phases in a 1-2 week time frame, which suggests that if the Swarm team can deliver on a similar timeline we should be able to have a fully working POC of this within a 4 week timeline.
Acceptance criteria
To implement an easy test harness, we will assume the swarm node will be connected to at least 2 eth clients