Description
The slasher stores all the attestations provided to it individually, even if some of them contain duplicate information. The most common instance of this seems to be unaggregated attestations stored alongside their aggregate, which wastes space. I suspected this was an issue, but hadn't measured how bad it was in practice. The use of `--subscribe-all-subnets` on some of the SigP nodes revealed the extent of the problem: a 70GB database using `--subscribe-all-subnets` vs a 14GB database without.
Version
Lighthouse v1.0.4
Steps to resolve
I think one change that's straightforward to implement would be the following:
Deduplicate the attestations in memory, when they are hashed and stored in the attestation queue prior to being processed as part of a batch. A mapping from `(validator_index, attestation_data_root) => indexed_attestation` could be used, where on insert we keep only the indexed attestation with the most attesters. Some `Arc` magic could gracefully handle the sharing and garbage collection.
This will be close to optimal so long as attestations and their aggregates arrive in the same batch. If that assumption turns out to be too strong, a more sophisticated (and likely more costly) method of deduplicating them upon writing to disk could be used (perhaps in addition to the in-memory deduplication).
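A minimal sketch of that in-memory approach, using hypothetical simplified types (the real `IndexedAttestation`, hashing, and queueing in Lighthouse are more involved): each `(validator_index, attestation_data_root)` key holds an `Arc` to the largest attestation seen so far, so a superseded unaggregated attestation is freed as soon as nothing references it.

```rust
use std::collections::HashMap;
use std::sync::Arc;

type ValidatorIndex = u64;
type Hash256 = [u8; 32];

// Simplified stand-in for Lighthouse's IndexedAttestation.
struct IndexedAttestation {
    attesting_indices: Vec<ValidatorIndex>,
    data_root: Hash256,
}

// In-memory queue keyed by (validator_index, attestation_data_root). Each key
// maps to the largest attestation (by attester count) seen so far, and the Arc
// lets many keys share a single allocation.
#[derive(Default)]
struct AttestationQueue {
    queue: HashMap<(ValidatorIndex, Hash256), Arc<IndexedAttestation>>,
}

impl AttestationQueue {
    fn insert(&mut self, attestation: Arc<IndexedAttestation>) {
        for &validator_index in &attestation.attesting_indices {
            self.queue
                .entry((validator_index, attestation.data_root))
                .and_modify(|existing| {
                    // Keep whichever attestation covers more attesters; the
                    // superseded Arc is dropped and freed once unreferenced.
                    if attestation.attesting_indices.len() > existing.attesting_indices.len() {
                        *existing = attestation.clone();
                    }
                })
                .or_insert_with(|| attestation.clone());
        }
    }
}

fn main() {
    let data_root = [0u8; 32];
    let unaggregated = Arc::new(IndexedAttestation {
        attesting_indices: vec![7],
        data_root,
    });
    let aggregate = Arc::new(IndexedAttestation {
        attesting_indices: vec![3, 7, 11],
        data_root,
    });

    let mut queue = AttestationQueue::default();
    queue.insert(unaggregated);
    queue.insert(aggregate);

    // Validator 7's entry now points at the aggregate; the unaggregated copy
    // is no longer referenced and has been freed.
    assert_eq!(queue.queue[&(7u64, data_root)].attesting_indices.len(), 3);
}
```

In the common case where an aggregate arrives in the same batch as its unaggregated components, the aggregate displaces all of them and only one copy remains to be written out.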
## Issue Addressed
Closes #2112, closes #1861
## Proposed Changes
Collect attestations by validator index in the slasher, and use the magic of reference counting to automatically discard redundant attestations. This results in us storing only 1-2% of the attestations observed when subscribed to all subnets, which carries over to a 50-100x reduction in data stored 🎉
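As a toy illustration of why this works (not the actual slasher code; attestations are reduced here to plain lists of attesting indices), 64 unaggregated attestations plus their aggregate collapse to a single stored allocation once every validator's entry points at the shared aggregate:

```rust
use std::collections::{HashMap, HashSet};
use std::sync::Arc;

fn main() {
    // One aggregate covering 64 validators, plus 64 redundant unaggregated copies.
    let aggregate: Arc<Vec<u64>> = Arc::new((0..64).collect());
    let singles: Vec<Arc<Vec<u64>>> = (0..64).map(|i| Arc::new(vec![i])).collect();

    // Per-validator map; every validator's entry ends up pointing at the aggregate.
    let mut by_validator: HashMap<u64, Arc<Vec<u64>>> = HashMap::new();
    for att in singles.iter().chain(std::iter::once(&aggregate)) {
        for &validator in att.iter() {
            let entry = by_validator.entry(validator).or_insert_with(|| att.clone());
            if att.len() > entry.len() {
                *entry = att.clone();
            }
        }
    }
    drop(singles); // the unaggregated copies are no longer referenced and are freed

    // Count the distinct allocations that remain: 1 out of the 65 observed.
    let distinct: HashSet<*const Vec<u64>> =
        by_validator.values().map(|a| Arc::as_ptr(a)).collect();
    println!("observed = 65, stored = {}", distinct.len()); // stored = 1
}
```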
## Additional Info
There's some nuance to the configuration of the `slot-offset`. It has a profound effect on the effectiveness of de-duplication; see the docs added to the book for an explanation: https://github.com/michaelsproul/lighthouse/blob/5442e695e5256046b91d4b4f45b7d244b0d8ad12/book/src/slasher.md#slot-offset