Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

beacon/light: add CommitteeChain #27766

Merged
merged 19 commits into from
Dec 8, 2023
Merged
Show file tree
Hide file tree
Changes from 16 commits
Commits
Show all changes
19 commits
Select commit Hold shift + click to select a range
574f085
beacon/light: add CommitteeChain
zsfelfoldi Jul 24, 2023
8ff3896
beacon/light: added comments and removed unused code
zsfelfoldi Jul 25, 2023
6eb5767
beacon/light: comments and various minor changes
zsfelfoldi Sep 3, 2023
46bdc03
beacon/light: more issues fixed
zsfelfoldi Sep 7, 2023
b039c2c
beacon/light: reset invalid chain
zsfelfoldi Sep 8, 2023
2776aef
beacon/light: rollback slot-by-slot instead of in a single batch
zsfelfoldi Sep 8, 2023
7128f9e
beacon/light: changed AfterLast to Next
zsfelfoldi Sep 8, 2023
3a46f01
beacon/light: bubble up getSyncCommittee errors
zsfelfoldi Sep 8, 2023
2baa1bf
beacon/light: moved canonicalStore to separate file
zsfelfoldi Sep 13, 2023
3d68662
beacon/light: multiple small changes
zsfelfoldi Sep 13, 2023
623d195
beacon/light: changed First, Next to Start, End
zsfelfoldi Nov 10, 2023
4d71502
beacon/light: more renames and cleanups
zsfelfoldi Nov 10, 2023
9fd49b0
beacon/light: unexported fixed roots, moved checkpoint init into Comm…
zsfelfoldi Nov 29, 2023
0ee5336
beacon/light: fixed some bugs
zsfelfoldi Nov 30, 2023
94421cf
beacon/light: always use external db parameter in canonicalStore
zsfelfoldi Nov 30, 2023
8b1390e
beacon/light: new Range methods, simplified deleteFrom
zsfelfoldi Nov 30, 2023
0bf9f8e
beacon/light: addressed review comments
zsfelfoldi Dec 2, 2023
3b38b3b
beacon/light: always use rlp encoding in canonicalStore
zsfelfoldi Dec 5, 2023
751dc7a
beacon/light: fix potential database iterator leaking
rjl493456442 Dec 8, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
134 changes: 134 additions & 0 deletions beacon/light/canonical.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,134 @@
// Copyright 2023 The go-ethereum Authors
// This file is part of the go-ethereum library.
//
// The go-ethereum library is free software: you can redistribute it and/or modify
// it under the terms of the GNU Lesser General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// The go-ethereum library is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU Lesser General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public License
// along with the go-ethereum library. If not, see <http://www.gnu.org/licenses/>.

package light

import (
"encoding/binary"
"fmt"

"github.com/ethereum/go-ethereum/common/lru"
"github.com/ethereum/go-ethereum/ethdb"
"github.com/ethereum/go-ethereum/log"
)

// canonicalStore stores instances of the given type in a database and caches
// them in memory, associated with a continuous range of period numbers.
// Note: canonicalStore is not thread safe and it is the caller's responsibility
// to avoid concurrent access.
type canonicalStore[T any] struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if this should live in the light package. Maybe would be better in internal? Also is there an equivalent structure for storing regular chaindata? I think it is good to avoid creating new single-use generic data structures when possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, we can think about it as general purpose but on the other hand I see no other use case for it outside package light right now. I'll leave it as it is for now, according to the 2:1 vote results :)

keyPrefix []byte
periods Range
cache *lru.Cache[uint64, T]
encode func(T) ([]byte, error)
decode func([]byte) (T, error)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure that using generics here is a good thing.

  1. In the constructor to committee chain, you define on-the-fly decoders and encoders, and initiate the canonicalStores. That is a bit clunky, IMO, and not the most elegant/readable way to organize the code.

  2. The caches are all lru of size 100. But typically, we would want to use different cache sizes depending on how heavy the items are. In blockchain, we cache more headers than blocks, for example.

I think this could be written a bit dumbed-down as three explicit stores, perhaps with some shared methods, without expanding the code too much. I'm not 100% sure though, maybe I'll have to try it out.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a proof of concept of a no-generics rewrite: https://github.com/zsfelfoldi/go-ethereum/pull/22/files

+289, -185 LOC, so maybe the generics-version wins. I dunno.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this is a typical use case for generics and it is probably better than the duplication in your version. Maybe I could do encode/decode more elegantly and also use a param for the cache size (though it's not really an issue because they all operate on the same range of periods). I'll think about it.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about keeping generics but getting rid of custom encoders, only rely on rlp-encoding things?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For

  • SerializeSyncCommitte: it would consume 24627 bytes instead of 24624 bytes.
  • LightClientUpdate: no change
  • FixedCommitteeRoot (common.Hash): it would consume 33 bytes instead of 32.

I think that would be much better.

Alternatively, if you want to be more explicit, you could do

type codec struct {
	rlp.Encoder
	rlp.Decoder
}

...
type canonicalStore[T codec] struct {
	keyPrefix []byte
...

But then you need to add custom marshallers/unmarshallers on the types.

}

// newCanonicalStore creates a new canonicalStore and loads all keys associated
// with the keyPrefix in order to determine the ranges available in the database.
func newCanonicalStore[T any](db ethdb.KeyValueStore, keyPrefix []byte,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func newCanonicalStore[T any](db ethdb.KeyValueStore, keyPrefix []byte,
func newCanonicalStore[T any](db ethdb.Iteratee, keyPrefix []byte,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, though I'm not really sure whether it's really better to use different interfaces for the same backing database in different functions. I guess it's mostly a matter of taste, now at least we consistently use minimally required interfaces.

encode func(T) ([]byte, error), decode func([]byte) (T, error)) *canonicalStore[T] {
cs := &canonicalStore[T]{
keyPrefix: keyPrefix,
encode: encode,
decode: decode,
cache: lru.NewCache[uint64, T](100),
}
var (
iter = db.NewIterator(keyPrefix, nil)
kl = len(keyPrefix)
first = true
)
for iter.Next() {
if len(iter.Key()) != kl+8 {
log.Warn("Invalid key length in the canonical chain database", "key", fmt.Sprintf("%#x", iter.Key()))
continue
}
period := binary.BigEndian.Uint64(iter.Key()[kl : kl+8])
if first {
cs.periods.Start = period
} else if cs.periods.End != period {
log.Warn("Gap in the canonical chain database")
break // continuity guaranteed
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, a bit curious about this. We have a similar thing in ancients, where we have n 'tables', and during startup we sanity-check, and if any one table (e.g bodies) is shorter than the others, we need to trim the others so that they align.

In this code, you simply log a message and move on. What is the failure mode here? Is it an acceptable error if e.g. the updates are not aligned with the committee roots? Seems like we have three options

  • As is: use what we have -- the range found, and move on
  • Correlate: try to ensure that the three stores are aligned, if that's even applicable
  • Error: raise a proper error to the calling code

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with the last option, there is no need to fix a faulty database here since it's easy enough to resync. This is also what checkConstraints does, the db is reset if it fails. Probably this is what would happen anyway if one of the stores were truncated. Now I return proper errors, show the error log and then reset the db just like in case of other inconsistencies.

first = false
cs.periods.End = period + 1
}
iter.Release()
return cs
}

// databaseKey returns the database key belonging to the given period.
func (cs *canonicalStore[T]) databaseKey(period uint64) []byte {
var (
kl = len(cs.keyPrefix)
key = make([]byte, kl+8)
)
copy(key[:kl], cs.keyPrefix)
binary.BigEndian.PutUint64(key[kl:], period)
return key
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func (cs *canonicalStore[T]) databaseKey(period uint64) []byte {
var (
kl = len(cs.keyPrefix)
key = make([]byte, kl+8)
)
copy(key[:kl], cs.keyPrefix)
binary.BigEndian.PutUint64(key[kl:], period)
return key
}
func (cs *canonicalStore[T]) databaseKey(period uint64) []byte {
return binary.BigEndian.AppendUint64(cs.keyPrefix, period)
}

Isn't this equivalent?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this is safer

func (cs *canonicalStore[T]) databaseKey(period uint64) []byte {
	return binary.BigEndian.AppendUint64(append(nil, cs.keyPrefix...), period)
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I learned something today :) (never noticed that an AppendUint64 exists)


// add adds the given item to the database. It also ensures that the range remains
// continuous. Can be used either with a batch or database backend.
func (cs *canonicalStore[T]) add(backend ethdb.KeyValueWriter, period uint64, value T) error {
if !cs.periods.CanExpand(period) {
return fmt.Errorf("period expansion is not allowed, first: %d, next: %d, period: %d", cs.periods.Start, cs.periods.End, period)
}
enc, err := cs.encode(value)
if err != nil {
return err
}
if err := backend.Put(cs.databaseKey(period), enc); err != nil {
return err
}
cs.cache.Add(period, value)
cs.periods.Expand(period)
return nil
}

// deleteFrom removes items starting from the given period.
func (cs *canonicalStore[T]) deleteFrom(batch ethdb.Batch, fromPeriod uint64) (deleted Range) {
keepRange, deleteRange := cs.periods.Split(fromPeriod)
deleteRange.Each(func(period uint64) {
batch.Delete(cs.databaseKey(period))
cs.cache.Remove(period)
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func (cs *canonicalStore[T]) deleteFrom(batch ethdb.Batch, fromPeriod uint64) (deleted Range) {
keepRange, deleteRange := cs.periods.Split(fromPeriod)
deleteRange.Each(func(period uint64) {
batch.Delete(cs.databaseKey(period))
cs.cache.Remove(period)
})
func (cs *canonicalStore[T]) deleteFrom(db ethdb.KeyValueWriter, fromPeriod uint64) (deleted Range) {
keepRange, deleteRange := cs.periods.Split(fromPeriod)
deleteRange.Each(func(period uint64) {
db.Delete(cs.databaseKey(period))
cs.cache.Remove(period)
})

Let's use minimal interfaces

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

cs.periods = keepRange
return deleteRange
}

// get returns the item at the given period or the null value of the given type
// if no item is present.
func (cs *canonicalStore[T]) get(backend ethdb.KeyValueReader, period uint64) (value T, ok bool) {
if !cs.periods.Contains(period) {
return
}
if value, ok = cs.cache.Get(period); ok {
return
}
if enc, err := backend.Get(cs.databaseKey(period)); err == nil {
if v, err := cs.decode(enc); err == nil {
value, ok = v, true
cs.cache.Add(period, value)
} else {
log.Error("Error decoding canonical store value", "error", err)
}
} else {
log.Error("Canonical store value not found", "period", period, "start", cs.periods.Start, "end", cs.periods.End)
}
return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic here is a bit non-idiomatic, and it's kind of difficult to see that this chunk of code ignores one error (namely, if the backend.Get returns an error, it gets silently overlooked).

Please move things around so that you terminate on the happy path

  1. Do operation
  2. if error, handle
  3. do another operation
  4. if error handle
    ...
    N. Success all the way

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done; I also removed the named return values, I think it's much more readable now.

}
Loading