
Generalized Merkle Tree Index: use 0-based indexing #1008

Closed
mratsim opened this issue Apr 28, 2019 · 4 comments


mratsim commented Apr 28, 2019

The current spec for the generalized Merkle tree uses 1-based indexing (https://github.com/ethereum/eth2.0-specs/blob/2787fea5feb8d5977ebee7c578c5d835cff6dc21/specs/light_client/merkle_proofs.md#generalized-merkle-tree-index).

The differences between 0-based and 1-based indexing are as follows:

|             | root at 0 | root at 1 |
|-------------|-----------|-----------|
| Left child  | i*2 + 1   | i*2       |
| Right child | i*2 + 2   | i*2 + 1   |
| Parent      | (i-1)/2   | i/2       |
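The table above can be sketched as a pair of navigation helpers (hypothetical names, for illustration only):

```python
# Hypothetical helpers contrasting the two conventions (illustration only).

def children_0based(i):
    """Children of node i when the root sits at index 0."""
    return 2 * i + 1, 2 * i + 2

def parent_0based(i):
    """Parent of node i when the root sits at index 0."""
    return (i - 1) // 2

def children_1based(i):
    """Children of node i when the root sits at index 1 (current spec)."""
    return 2 * i, 2 * i + 1

def parent_1based(i):
    """Parent of node i when the root sits at index 1 (current spec)."""
    return i // 2
```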

Convention

Using 0-based indexing would follow the common convention of most programming languages (save Fortran, MATLAB, and Julia).

Errors

It is well known that mixing 0-based and 1-based indexing tends to create off-by-one errors; we have already had several such spec bugs in the past, and using 1-based indexing introduces new risks.

Memory saving

I expect clients to allocate binary trees with power-of-2 sizes for efficiency reasons.
If we have N nodes in the Merkle tree, where N is a power of 2, we will waste N-1 nodes of space.

Concrete example: 1 byte can store 256 values, in the range [0, 255].
If instead we had values in the range [1, 256], as the current proposal implies, we would need 2 bytes of storage and would waste the [257, 511] range, which represents 255 values.
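The arithmetic behind this example can be checked directly (a sketch using Python's `int.bit_length`):

```python
# Number of bits needed to represent every index up to max_value.
def bits_needed(max_value: int) -> int:
    return max_value.bit_length()

# 0-based: 256 values in [0, 255] fit in exactly one byte.
assert bits_needed(255) == 8
# 1-based: values in [1, 256] force a 9th bit, wasting the [257, 511] range.
assert bits_needed(256) == 9
```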

Alignment issues

Assuming we store Merkle proofs in arrays of uint32 at a low level, they would need to be aligned to 32-bit boundaries on many of the architectures that would benefit most from light clients (ARM, MIPS, ...).

Furthermore, Merkle proofs would probably be excellent candidates for acceleration by SIMD/vector hardware, which also requires alignment.

This off-by-one will cause unaligned loads/stores that implementers will have to work around, probably via padding (further increasing memory waste) or via temporaries.

Potential counter-argument: extra computation

Some might worry that 0-based indexing is slower than 1-based indexing,
where we can use 2*i and 2*i+1 to find children instead of 2*i+1 and 2*i+2,
and i/2 to find parents instead of (i-1)/2:

  1. It doesn't make a difference, because at a low level CPUs support Scale*Index+Offset
    addressing when the scale is a power of 2; execution time is the same. (This is called SIB addressing on x86 and a scaled register on ARM.)
  2. Performance is dominated by fetching data from memory. During
    the previous memory fetch, the CPU can precompute the next location to fetch,
    as there are no data dependencies.
@vbuterin
Contributor

The memory saving is really negligible, and I think it actually improves alignment. "If we have N nodes in the merkle tree where N is a power of 2 we will waste N-1 node space" is incorrect, because most of the Merkle trees we use have 2^N leaves, and so 2^(N+1) - 1 total nodes; adding an empty value in the zero position just gets us to 2^(N+1).

Also, and much more importantly, generalized indices are not meant to correspond to memory positions, and you are not meant to think of them as an array; for most complex SSZ objects, that would lead to a lot of empty space in your serialization because of regions that are unused.

The purpose of a generalized index is to serialize a binary path, and you can think of the algorithm as follows: the highest 1-bit is a flag saying "the bits before this were just zeroes but the bits after this are where the actual path starts", followed by the bits of the path from top to bottom. Viewed this way, offsetting everything down by 1 would break this abstraction.
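The decoding described above can be sketched as follows (a hypothetical helper, not part of the spec):

```python
def path_from_generalized_index(gindex: int) -> list:
    """Decode a 1-based generalized index into a top-to-bottom path.

    The highest set bit is the flag; each bit after it is one step:
    0 = go to the left child, 1 = go to the right child.
    """
    assert gindex >= 1
    path_bits = bin(gindex)[3:]  # strip '0b' and the flag bit
    return [int(b) for b in path_bits]
```

For example, index 6 (binary 110) decodes to [1, 0]: the right child of the root, then that node's left child.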

@mratsim
Contributor Author

mratsim commented Apr 29, 2019

> Also, and much more importantly, generalized indices are not meant to correspond to memory positions, and you are not meant to think of them as an array; for most complex SSZ objects, that would lead to a lot of empty space in your serialization because of regions that are unused.

The issue with that is that we will need a translation layer between generalized indices and the actual implementation, which is 0-based, raising the risk of off-by-one errors.

Assuming we implement sparse Merkle trees, the space difference disappears for both 0-based and 1-based indexing, but the risk of errors remains.

Regarding the bit-abstraction part, this is indeed the same coding as used in Fenwick trees (binary indexed trees):

> A Fenwick tree is most easily understood by considering a one-based array. Each element whose index i is a power of 2 contains the sum of the first i elements. Elements whose indices are the sum of two (distinct) powers of 2 contain the sum of the elements since the preceding power of 2. In general, each element contains the sum of the values since its parent in the tree, and that parent is found by clearing the least-significant bit in the index.
>
> To find the sum up to any given index, consider the binary expansion of the index, and add elements which correspond to each 1 bit in the binary form.
>
> For example, say one wishes to find the sum of the first eleven values. Eleven is 1011₂ in binary. This contains three 1 bits, so three elements must be added: 1000₂, 1010₂, and 1011₂. These contain the sums of values 1–8, 9–10, and 11, respectively.
>
> To modify the eleventh value, the elements which must be modified are 1011₂, 1100₂, 10000₂, and all higher powers of 2 up to the size of the array. These contain the sums of values 11, 9–12, and 1–16, respectively. The maximum number of elements which may need to be updated is limited by the number of bits in the size of the array.
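The quoted operations can be sketched as a minimal Fenwick tree (hypothetical class, for illustration):

```python
class FenwickTree:
    """Minimal 1-based Fenwick tree: point update, prefix sum."""

    def __init__(self, n: int):
        self.tree = [0] * (n + 1)  # index 0 is unused: the tree is 1-based

    def update(self, i: int, delta: int):
        """Add delta to element i, touching O(log n) nodes."""
        while i < len(self.tree):
            self.tree[i] += delta
            i += i & -i  # jump to the next node whose range covers i

    def prefix_sum(self, i: int) -> int:
        """Sum of elements 1..i, following the 1 bits of i as described above."""
        total = 0
        while i > 0:
            total += self.tree[i]
            i -= i & -i  # clear the least-significant bit
        return total
```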

However, in practice, traversing the branches would just use recursive "if" tests; the binary coding would only be used for visual inspection, and we could simply adjust the binary representation to switch the bits around.

Comparatively, off-by-one errors are much harder to notice, debug, and track down, and I'd rather pay the small cost of breaking this abstraction to avoid off-by-ones.

@vbuterin
Contributor

vbuterin commented May 4, 2019

Generalized indices are intended for all kinds of SSZ objects, the great majority of which will be highly unbalanced (e.g. think of the BeaconState), meaning that the mapping of tree leaves to generalized indices is going to have a lot of "holes" in it. So using the generalized index as a memory location would be a terrible idea, as you would have a really inefficient memory setup (for the ShardState, it looks like generalized indices might exceed 2**300).

So for example, in the data structure below (e.g. the SSZ structure `{x: bytes32, y: {z: bytes32, w: bytes32}}`):

```
    R
  A   B
     C D
```

The indices would be: R=1, A=2, B=3, C=6, D=7. Nothing would correspond to 4 or 5.
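The example indices follow directly from the 1-based child rule; a quick check (hypothetical helper name):

```python
def child_gindex(parent: int, right: bool) -> int:
    """Generalized index of a child under 1-based indexing."""
    return parent * 2 + (1 if right else 0)

R = 1
A = child_gindex(R, right=False)  # 2
B = child_gindex(R, right=True)   # 3
C = child_gindex(B, right=False)  # 6
D = child_gindex(B, right=True)   # 7
```

Indices 4 and 5, the would-be children of A, are skipped because A is a leaf.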

The reason I don't use sequential addressing, as in the chart in that wiki article, is that it is a desideratum for the description of a given Merkle path not to depend on anything outside that path. In particular, you don't want the description of a path to depend on the size of a variable-length list somewhere else in the data structure.

@JustinDrake
Collaborator

This was discussed on today's call (cc @protolambda and @djrtwo): sticking with 1-based indexing for generalised indices leads to cleaner implementations.
