Verifiable Deal Aggregation #283
Replies: 4 comments 6 replies
-
this looks incredible, thanks @Kubuxu! this would really help us speed up our write pipelines in *.storage :)
-
It seems like in order for this to work you need some zeroes lying around in here to pad things out to the correct M values. If so does this require aggregators shipping data to SPs without using an IPLD transport, require some sideband communication detailing how to transform the IPLD data, or am I missing something? My understanding of the current system from https://spec.filecoin.io/#section-systems.filecoin_files.piece is that the way data is currently ingested into SPs in the "online deal flow" is:
If we need to insert zeroes in the middle of the sector to get the correct deal padding for each DAG this implies that either:
Note that while option 1 sounds nice and easy we'd still likely need some standard format for the sectors so that SPs could create an index mapping of the blocks being stored so that they can make them retrievable for end users who want to request some IPLD DAG stored by the SP.
-
First, I think this would be a great thing to have. Again, I think it's great that Kuba started this FIP, but I just want to mention some of my worries, things we should not lose sight of.
-
Nice work. What actor hooks would we need for a subdeal client that is an actor to be able to confirm that its data was included in a deal made by an aggregator?
For the simple case, would it be better in any way for the aggregator to provide inclusion proofs rather than the set of SubPieceInfos? In total, the SubPieceInfos are less data to prove all sub-pieces, but for any individual client, would an inclusion proof be smaller or easier to verify? Do any of these answers change for the adaptive version?
-
Authors: @Kubuxu @ribasushi @nicola
With deal aggregators emerging and gaining popularity, it is much easier for clients to make deals for small pieces of data.
This is great for ecosystem growth. Unfortunately, it causes clients to lose one important capability: the ability to verify that their data is stored inside a sector.
It is essential to recognise the importance of this capability. Verifiability is one of the core properties of the web3 ecosystem, whether for the purpose of a single user or to enable the composition of services. Without Verifiable Deal Aggregation, possible use cases like data availability for L2 protocols, storing NFTs or small websites cannot evolve beyond using trusted aggregators.
Below I propose two solutions for Verifiable Deal Aggregation. The first is simpler but constrains sub-deal sizes and alignment. The second is more complex in design and execution but allows for more flexible sub-deal sizes.
Simple Verifiable Deal Aggregation
As the name suggests, this protocol is quite simple, but the sub-deal size is constrained to be a power of two after padding.
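As a rough illustration of this size constraint, the sketch below checks whether a padded size fits the simple scheme and evaluates the proof-size bound quoted for it. It assumes the padded size is 128/127 of the unpadded size, rounded up to whole 127-byte chunks, and that valid padded sizes are 32*2^N for N between 2 and 30. The function names are hypothetical and not part of any Filecoin API.

```python
NODE_BYTES = 32        # size of one Merkle node, per the 32*2^N formula
MIN_N, MAX_N = 2, 30   # allowed exponent range for the simple scheme


def padded_size(unpadded: int) -> int:
    """Padded size, assuming every 127 unpadded bytes become 128 bytes."""
    return -(-unpadded // 127) * 128  # ceiling division by 127


def is_valid_simple_size(padded: int) -> bool:
    """True if padded == 32 * 2^N for an integer N in [MIN_N, MAX_N]."""
    if padded < NODE_BYTES or padded % NODE_BYTES != 0:
        return False
    units = padded // NODE_BYTES
    if units & (units - 1):  # more than one bit set: not a power of two
        return False
    n = units.bit_length() - 1
    return MIN_N <= n <= MAX_N


def max_proof_bytes(padded: int) -> int:
    """Upper bound 32 * (30 - log2(padded / 32)) on inclusion proof size."""
    if not is_valid_simple_size(padded):
        raise ValueError("size is not 32 * 2^N with N between 2 and 30")
    n = (padded // NODE_BYTES).bit_length() - 1
    return NODE_BYTES * (MAX_N - n)
```

Under these assumptions a 1MiB padded sub-deal corresponds to N=15, so `max_proof_bytes(1 << 20)` evaluates to 480 bytes, while a 300MiB padded size is rejected because it is not a power of two.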
Protocol:
This protocol reuses facilities for computing UnsealedSectorCID out of PieceCIDs, which, while simple, has one crucial drawback.
Sizes of deals (and thus sub-deals) in this framework are limited to 32*2^{N}, where N is an integer between 2 and 30 inclusive, for data size after padding (128/127th of unpadded data size).
The ComputeUnsealedSectorCID (or equivalent) is fast enough to be exposed on-chain, and the amount of data needed to prove sub-deal inclusion on-chain is minimal (with optimisation, the maximum is 32*(30-log2(SizeOfDeal/32)) and the expected size would be in the area of 128-512B).
Adaptive Verifiable Deal Aggregation
The main draw of this scheme is that the deals don't have to be power-of-2 aligned, increasing the flexibility of deal aggregation at the price of increased verification cost. The verification cost and proof size scale with the misalignment of the sub-deal data.
The misalignment M can be explained as follows:
[...] (x * 256MB)
[...] (x * 128MB)
[...] Pow2Size/(2^M), at offset x * Pow2Size/(2^M)
The effective size of the deal is also rounded up to a multiple of Pow2Size/(2^M). For example, a 300MiB deal, which would normally be rounded up to utilise 512MiB of space, will with M=2 utilise 384MiB of sector space, leaving room to insert after it a 128MiB deal, another 512MiB deal with M=2, or a 1GiB deal with M=3.
Higher misalignment values require more inclusion proofs to prove the misaligned data. The number of required inclusion proofs is 2^M+(0 or 1), with half of them M+log2(DealSize/SubDealSize) long and half M long.
This adaptive property means that:
This protocol version allows the best of both worlds while possibly enabling all aggregated deals to be verifiable. The simple protocol has less chance of adoption for all aggregate deals, as the expected deal size overhead due to rounding to the next power of two is 25%. This overhead is halved for adaptive verifiable deal aggregation with M=1, or reduced to 6.25% for M=2.
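The effective-size rounding described for the adaptive scheme can be sketched as follows. This is a minimal illustration, assuming Pow2Size is the deal size rounded up to the next power of two and that the deal occupies a whole number of Pow2Size/(2^M) units. The helper names are hypothetical.

```python
MIB = 1 << 20


def next_pow2(size: int) -> int:
    """Smallest power of two that is >= size (size must be >= 1)."""
    if size < 1:
        raise ValueError("size must be positive")
    return 1 << (size - 1).bit_length()


def effective_size(size: int, m: int) -> int:
    """Sector space used by a deal of `size` bytes with misalignment `m`.

    The deal is rounded up to a multiple of Pow2Size / 2^m, where
    Pow2Size is the next power of two at or above `size`.
    m = 0 reproduces plain power-of-two rounding.
    """
    if m < 0:
        raise ValueError("m must be non-negative")
    granule = next_pow2(size) >> m
    if granule == 0:
        raise ValueError("m is too large for this deal size")
    return -(-size // granule) * granule  # ceiling to a granule multiple


if __name__ == "__main__":
    for m in (0, 2):
        used = effective_size(300 * MIB, m)
        print(f"300MiB deal, M={m}: {used // MIB}MiB of sector space")
```

With these assumptions the 300MiB example from the text occupies 512MiB at M=0 and 384MiB at M=2.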
cc @nicola @ribasushi @anorth
EDIT 1: The Adaptive proof cost changed from 4^M to 2^M+(0 or 1).