feat: proof of correct derivation #13
Conversation
…tion into cl/ns-filter
Co-authored-by: Gus Gutoski <[email protected]>
feat: proof of correct namespace filtering
feat: commitment equivalence gadget
Partially done. @alxiong will take care of the contract part. |
CI is fixed now, only fails due to out of memory (got killed). I'll review your work first, then update the contract part. The first part of my review attempts to improve the documentation (my next commit) to align our understanding/spec. |
perf: optimize proof generations with precompiles
/// Types for block Merkle tree. This struct is essentially the same as
/// `LightWeightSha3MerkleTree`, as the latter one only keeps a frontier
pub type BlockMerkleTree = SHA3MerkleTree<Commitment<BlockHeader>>;
pub type BlockMerkleTreeProof = <BlockMerkleTree as MerkleTreeScheme>::MembershipProof;
Deserialization of this struct still takes over 1M cycles. `MerkleNode` is an enum, so we can't derive `CanonicalSerialize` for it.
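For illustration only: with a hypothetical `Node` enum standing in for `MerkleNode` (not the real type), a hand-written `CanonicalSerialize` impl typically writes a one-byte variant tag followed by the variant's fields, along these lines (assuming ark-serialize 0.4 APIs):

use ark_serialize::{CanonicalSerialize, Compress, SerializationError, Write};

/// Hypothetical stand-in for a `MerkleNode`-like enum.
enum Node {
    Empty,
    Leaf(u64),
}

impl CanonicalSerialize for Node {
    fn serialize_with_mode<W: Write>(
        &self,
        mut writer: W,
        compress: Compress,
    ) -> Result<(), SerializationError> {
        match self {
            // One-byte tag per variant, then the variant's fields.
            Node::Empty => 0u8.serialize_with_mode(&mut writer, compress),
            Node::Leaf(v) => {
                1u8.serialize_with_mode(&mut writer, compress)?;
                v.serialize_with_mode(&mut writer, compress)
            }
        }
    }

    fn serialized_size(&self, compress: Compress) -> usize {
        1 + match self {
            Node::Empty => 0,
            Node::Leaf(v) => v.serialized_size(compress),
        }
    }
}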
prover log for 0f11a12
Wait, so it's actually not that slow. Not sure why it differs from the previous run outside of tmux... maybe a bad seed?
prover log for cf21fee
impl Serialize for VidCommon {
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: serde::Serializer,
    {
        let mut bytes = Vec::new();
        self.0
            .serialize_uncompressed(&mut bytes)
            .map_err(|e| S::Error::custom(format!("{e:?}")))?;
        Serialize::serialize(&bytes, serializer)
    }
Can you help us understand why this manual `CanonicalSerde` leads to fewer cycles than serde's default?
No, maybe it's not the validity checking. It's about compression and the extra layer of `TaggedBase64`: https://github.com/EspressoSystems/jellyfish/blob/main/utilities/src/serialize.rs#L44-L87
For a prime field element the overhead is not too much, just a `TaggedBase64` wrapper. But for curve points, compression really makes a huge difference.
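To make the compression point concrete, here is a minimal standalone sketch (assuming arkworks 0.4-style `ark-bn254`/`ark-serialize` APIs, which may not match what this crate pins): a compressed G1 point is half the size, but deserializing it must recover `y` from `x` via a modular square root, which is where the extra guest cycles go.

use ark_bn254::G1Affine;
use ark_ec::AffineRepr;
use ark_serialize::{CanonicalDeserialize, CanonicalSerialize};

fn main() {
    let p = G1Affine::generator();

    // Compressed: 32 bytes on BN254, but deserialization re-derives y from x
    // with a field square root, which is expensive in guest cycles.
    let mut compressed = Vec::new();
    p.serialize_compressed(&mut compressed).unwrap();

    // Uncompressed: 64 bytes, but deserialization mostly copies (and
    // optionally validates) the two coordinates.
    let mut uncompressed = Vec::new();
    p.serialize_uncompressed(&mut uncompressed).unwrap();

    assert_eq!(compressed.len() * 2, uncompressed.len());

    let _ = G1Affine::deserialize_compressed(compressed.as_slice()).unwrap();
    let _ = G1Affine::deserialize_uncompressed(uncompressed.as_slice()).unwrap();
}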
Nice work.
(I prioritized reviewing the README.md and test-running the prover logic.)
Cycle count tracing suggests that we spend 1.8M instructions (30% of total instructions) in the Montgomery inversion. Looking at the implementation, it seems the algorithm is already optimized (it uses binary extended Euclidean inversion and avoids unnecessary reduction steps). Did we explore optimizations in this part of our code? trace-output:
Co-authored-by: Anders Konring <[email protected]>
Thanks @akonring for pointing this out. You are right that most of the instructions are spent on the Montgomery inversion, and quite a lot of them are related to (de)serialization. Some serializations are necessary, e.g. the one inside
It would be great to remove these, but I don't know a good way to do it. cc @alxiong
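For reference, the binary extended Euclidean inversion mentioned above looks roughly like this on small words (an illustrative sketch, not the field backend's code, which runs the same idea on multi-limb integers and usually directly in Montgomery form):

/// Modular inverse via the binary extended Euclidean algorithm, for odd p and
/// gcd(a, p) = 1. Invariants: x1 * a ≡ u (mod p) and x2 * a ≡ v (mod p) hold
/// throughout, so whichever of u, v reaches 1 first yields the inverse.
fn mod_inverse(a: u64, p: u64) -> u64 {
    let p = p as u128;
    let mut u = a as u128 % p;
    let mut v = p;
    let (mut x1, mut x2) = (1u128, 0u128);
    assert!(u != 0 && p % 2 == 1, "need gcd(a, p) = 1 and odd p");
    while u != 1 && v != 1 {
        // Halve u while even, mirroring the update on x1 (halving mod p).
        while u % 2 == 0 {
            u /= 2;
            x1 = if x1 % 2 == 0 { x1 / 2 } else { (x1 + p) / 2 };
        }
        // Same for v and x2.
        while v % 2 == 0 {
            v /= 2;
            x2 = if x2 % 2 == 0 { x2 / 2 } else { (x2 + p) / 2 };
        }
        // Subtract the smaller from the larger, keeping x1, x2 reduced mod p.
        if u >= v {
            u -= v;
            x1 = (x1 + p - x2) % p;
        } else {
            v -= u;
            x2 = (x2 + p - x1) % p;
        }
    }
    (if u == 1 { x1 } else { x2 }) as u64
}

fn main() {
    assert_eq!(mod_inverse(3, 7), 5); // 3 * 5 = 15 ≡ 1 (mod 7)
}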
Also, further optimizations will be done in a separate PR. Here, let's focus on the functionality.
I can't think of an easy way, but here's what's possible: IIUC, in places where we don't need the normal form, or don't expect to interpret random bytes into field elements, we can keep the byte representation directly in Montgomery form. This would incur some code repetition, but we don't have to fork arkworks. Alternatively, we can fork
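A rough sketch of that idea, assuming arkworks 0.4 internals where `Fp.0` holds the Montgomery-form limbs and `new_unchecked` skips the conversion (illustration only, not code from this PR):

use ark_bn254::Fr;
use ark_ff::{BigInt, BigInteger};

/// Dump the raw Montgomery-form limbs, skipping the normal-form conversion
/// that `CanonicalSerialize` would otherwise perform.
fn to_mont_bytes(x: &Fr) -> Vec<u8> {
    x.0.to_bytes_le()
}

/// Rebuild the element directly from Montgomery-form bytes. Only safe for
/// bytes produced by `to_mont_bytes`; no validation or reduction happens.
fn from_mont_bytes(bytes: &[u8; 32]) -> Fr {
    let mut limbs = [0u64; 4];
    for (limb, chunk) in limbs.iter_mut().zip(bytes.chunks_exact(8)) {
        *limb = u64::from_le_bytes(chunk.try_into().unwrap());
    }
    Fr::new_unchecked(BigInt::new(limbs))
}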
Agree! Furthermore, if we look at the prover time reported at #13 (comment) and #13 (comment), the inner SNARK generation (128s vs. 94s) is only a minority portion of the overall prover time (473s vs. 393s). That said, we set the parameters to be small in this PR:
/// low degree for demo only
pub const SRS_DEGREE: usize = 8usize;
/// payload bytes for each block shouldn't exceed the max size
/// during encoding, every 30 bytes is converted to a 254-bit field element
pub const MAX_PAYLOAD_BYTES_PER_BLOCK: usize = SRS_DEGREE * 30;
/// number of storage nodes for VID
pub const NUM_STORAGE_NODES: u32 = 10;
/// produce derivation proof for a batch of espresso blocks
pub const NUM_BLOCKS: u64 = 5;
Plus, we greatly simplify the rollup commitment by assuming it's a sha256 hash; for an actual rollup this part would contribute many more cycles. It's still very important to optimize for real integration (beyond the demo).
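For context on the `SRS_DEGREE * 30` bound, here is a minimal sketch of the kind of 30-bytes-per-element packing it assumes (an illustration, not the crate's actual encoder): every 30-byte chunk is 240 bits, comfortably below the 254-bit BN254 scalar modulus, so no modular reduction occurs.

use ark_bn254::Fr;
use ark_ff::PrimeField;

/// Pack a payload into field elements, 30 bytes (240 bits) per element, so
/// each chunk is guaranteed to fit below the 254-bit modulus.
fn bytes_to_field_elements(payload: &[u8]) -> Vec<Fr> {
    payload.chunks(30).map(Fr::from_le_bytes_mod_order).collect()
}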
Mega PR for all prover work
Closes: #6 #9
This PR:
This PR does not:
Key places to review: