-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test: refactor and fuzz tests for checkpoint segments and pools #93
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work!! Some recommendations for refactoring and the fuzz tests:
types/ckpt_segment_pool.go
Outdated
} | ||
|
||
// TODO: generalise to NumExpectedProofs > 2 | ||
// TODO: optimise the complexity by hashmap |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest we implement this TODO in this PR as well. I suppose we could do both TODOs, we havee all the necessary structures and the code will be simplified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first TODO requires support from Babylon, which at the moment only provides ConnectParts
with two parts as input. So probably we cannot add this support at the moment.
The second TODO is in fact implemented, where the pool has a map with two keys (index of checkpoint segment and hash of OP_RETURN data). I feel like finding matches inevitably requires traversal of two lists of checkpoint segments. Could you think of any other optimisation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the second key of the hashmap was something that was shared by both checkpoint segments (instead of the hash of the data contained in them, e.g. epoch number or something even more specific)? Then, we could use that key to do filtering on the amount of stuff we iterate in the second loop. Currently, for every first part we're iterating all second parts which is suboptimal. Won't have much effect in practice since we do not expect the pool to have many elements, but I think we might simplify the code a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the second key of the hashmap was something that was shared by both checkpoint segments (instead of the hash of the data contained in them, e.g. epoch number or something even more specific)?
This scenario is already ruled out by the checksum in the checkpoint, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, we could use the checksum.
types/ckpt_segment_pool.go
Outdated
} | ||
} | ||
|
||
func (p *CkptSegmentPool) Add(ckptSeg *CkptSegment) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest we add docstrings to all of those new functions. Relevant to refactoring.
types/ckpt_segment_pool.go
Outdated
matchedCkpts = append(matchedCkpts, ckpt) | ||
// remove the two ckptSeg in pool | ||
delete(p.Pool[uint8(0)], hash1) | ||
delete(p.Pool[uint8(1)], hash2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would happen in the case that BBN is down when we try to submit the checkpoints? I suppose we panic, although this is not a breaking condition for the vigilante. I suggest that this function only concerns itself to extracting relevant checkpoints and leaves the deletion of existing checkpoints into another function that is triggered externally when a checkpoint has been successfully included in BBN. This way, the reporter won't lose information and can be relied upon in the case that BBN goes down for a while.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest that this function only concerns itself to extracting relevant checkpoints and leaves the deletion of existing checkpoints into another function that is triggered externally when a checkpoint has been successfully included in BBN.
Good idea, will do. A possible way of doing this is to move these matched checkpoints from checkpoint segment pool to another list of matched checkpoints, then report these matched checkpoints, and finally remove them from the list. In fact, ADR-001 (which I think is a good idea) will require reporter to implement such a list anyway (where each checkpoint has to be k-deep before being reported). How does that sound?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a very good idea, agree all the way ✊
types/ckpt_segment_pool.go
Outdated
} | ||
|
||
// Sort the matched pairs by epoch, since they have to be submitted in order | ||
sort.Slice(matchedCkpts, func(i, j int) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe do the sorting elsewhere. The checkpoint pool should not be aware of checkpoint submitting invariants.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, we can do the sorting in the list of matched checkpoints described above. wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed
types/ckpt_segment_pool_test.go
Outdated
} | ||
|
||
func FuzzCkptSegmentPool(f *testing.F) { | ||
datagen.AddRandomSeedsToFuzzer(f, 10) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can use more seeds. Unless they add much overhead.
types/ckpt_segment_pool_test.go
Outdated
f.Fuzz(func(t *testing.T, seed int64) { | ||
rand.Seed(seed) | ||
|
||
tag := btctxformatter.MainTag(48) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest we decide between MainTag
and TestTag
based on the seed. Also, we decide the tag index based on the seed.
types/ckpt_segment_pool_test.go
Outdated
"github.com/stretchr/testify/require" | ||
) | ||
|
||
func randNBytes(n int) []byte { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's the datagen.GenRandomByteArray
function
types/ckpt_segment_pool_test.go
Outdated
Epoch: rand.Uint64(), | ||
LastCommitHash: randNBytes(btctxformatter.LastCommitHashLength), | ||
BitMap: randNBytes(btctxformatter.BitMapLength), | ||
SubmitterAddress: randNBytes(btctxformatter.AddressLength), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should look like a valid BTC address.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test aims at ensuring the correctness of matching checkpoint segments. Would checking the format of addresses be out of the scope?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we should use as realistic data as possible. Eitherway, we won't have to check whether the address is valid since the generation mechanism will accomplish that. Might be a good addition to the datagen functionality of bbn.
types/ckpt_segment_pool_test.go
Outdated
require.Equal(t, 2, pool.Size()) | ||
|
||
// find matched pairs of segments in the pool | ||
ckpts := pool.Match() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function only checks that checkpoints get matched only when a pool contains 2 elements that are matching. Other elements that we should test for
- Variable number of checkpoints inside the pool. We can emulate the number of checkpoints from the seed (e.g. deciding between inserting 0..10 checkpoints.
- The checkpoints are returned in sorted order in terms of the epoch.
- There might be checkpoints which do not have a match (e.g. only one part of the checkpoint has been included). We can emulate this by randomly generating a flag that decides if this should happen (i.e. only add one part of a checkpoint). Then, we can randomly decide if this part is the first one or the second one.
Thanks for the review Vitalis! Now this PR has addressed the comments. Most notably,
Some future works still remain:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work! Only minor stuff:
reporter/btc_handlers.go
Outdated
log.Debug("Found no matched pair of checkpoint segments in this match attempt") | ||
return nil | ||
} | ||
|
||
// Sort the matched pairs by epoch, since they have to be submitted in order | ||
// TODO: find smarter way for sorting | ||
sort.Slice(r.CheckpointCache.Checkpoints, func(i, j int) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could implement a GetSortedCheckpoints
method.
reporter/btc_handlers.go
Outdated
// for each matched pair, wrap to MsgInsertBTCSpvProof and send to Babylon | ||
for _, ckpt := range ckpts { | ||
for r.CheckpointCache.NumCheckpoints() > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe update the comment below to demonstrate that this is a while
loop that modifies the list in place.
Also, we could implement methods HasCheckpoints
and RemoveEarliestCheckpoint
to make the operation more clear.
types/ckpt_cache_test.go
Outdated
|
||
// if we don't want a match, then mess up with one of BabylonData | ||
if !match { | ||
if rand.Intn(2) == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's the datagen.OneInN method
types/ckpt_cache_test.go
Outdated
|
||
// get a random tag, either for mainnet or testnet | ||
var tag btctxformatter.BabylonTag | ||
if rand.Intn(2) == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also use datagen.OneInN
here.
Fixes BM-153
This PR provides the fuzz tests for the checkpoint segment pool. Along the way, this PR also refactors the checkpoint structures for readability and modularity.