Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: clean up tree definitions #1655

Merged
merged 4 commits into from
Dec 23, 2022
Merged

fix: clean up tree definitions #1655

merged 4 commits into from
Dec 23, 2022

Conversation

vmx
Copy link
Contributor

@vmx vmx commented Dec 21, 2022

There were quite a few public type definitions, that were mostly replaced by the SectorShape* types. This commit cleans them up and moves them around if appropriate.

This should make the code easier to follow and the public API surface smaller.

BREAKING CHANGE: BinaryLCMerkleTree, BinaryMerkleTree, BinarySubMerkleTree, LCMerkleTree, LCStore, MerkleStore, MerkleTree, OctLCMerkleTree, OctLCSubMerkleTree, OctLCTopMerkleTree, OctMerkleTree, OctSubMerkleTree, OctTopMerkleTree, QuadLCMerkleTree and QuadMerkleTree are removed from the public interface.


This PR was triggered by a question about optimizing the GPU usage. I then dug deeper into the code to find out which tree sizes we are actually using and then found out that there are lots of types we don't actually use.

There were quite a few public type definitions, that were mostly replaced by
the `SectorShape*` types. This commit cleans them up and moves them around if
appropriate.

This should make the code easier to follow and the public API surface smaller.

BREAKING CHANGE: `BinaryLCMerkleTree`, `BinaryMerkleTree`,
`BinarySubMerkleTree`, `LCMerkleTree`, `LCStore`, `MerkleStore`, `MerkleTree`,
`OctLCMerkleTree`, `OctLCSubMerkleTree`, `OctLCTopMerkleTree`, `OctMerkleTree`,
`OctSubMerkleTree`, `OctTopMerkleTree`, `QuadLCMerkleTree` and `QuadMerkleTree`
are removed from the public interface.
Copy link
Contributor

@storojs72 storojs72 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I support this change, as it removes multiple confusing types of trees, since we have 3 various types of memory layout (Memory, Disk, LevelCacheStore) and 3 levels of tree's arity (Base, Sub, Top).

A small remark - I think that comment about arities is partially correct. Up to you, but I would rewrite it approximately as following:

The base tree arity defines structure of the tree (binary / oct tree) instantiated directly over the data represented as a set of leaves. E.g. DiskTree<_, _, _, U2> means definition of binary disk-backed tree, where each upper-layer node is actually a hash of two underlying nodes / leaves.

The sub-tree arity defines structure of so called "composed" tree - a first layer of base trees composition. E,g, DiskTree<_, _, U8, U2> means that we combine 8 binary trees into a single composed tree under root computed via hashing 8 binary trees' roots.

And, finally, the top-tree arity defines structure of so called "composed-composed" tree - a second layer of base trees composition. E,g, DiskTree<_, U4, U8, U2> means that we combine 4 composed trees (each combines 8 binary trees) into a single composed-composed tree under root computed via hashing 4 composed trees' roots.

storage-proofs-core/src/merkle/mod.rs Outdated Show resolved Hide resolved
storage-proofs-core/src/merkle/mod.rs Outdated Show resolved Hide resolved
@vmx
Copy link
Contributor Author

vmx commented Dec 22, 2022

Thanks @storojs72 for the clarification. Let me try to repeat it, to make sure I understand it correctly.

The base arity is used for all levels from the leaves up to a certain point (I originally misunderstood it as the base arity only be used for a single level).

The sub-artity and the top-arity are single levels only.

@vmx
Copy link
Contributor Author

vmx commented Dec 22, 2022

I had a another look at it. What happens is: You hash the tree with the base arity. We have trees where that would lead to multiple roots (64GiB sectors are an example). In order to get a single root, a sub-=/top-arity is specified. It will then hash the remaining nodes into a single root node.

Though now I have a question: what is the difference between those two:

pub type SectorShapeSub2 = LCTree<DefaultTreeHasher, U8, U2, U0>;
pub type SectorShapeTop2 = LCTree<DefaultTreeHasher, U8, U8, U2>;

@storojs72
Copy link
Contributor

The base arity is used for all levels from the leaves up to a certain point (I originally misunderstood it as the base arity only be used for a single level).

The sub-artity and the top-arity are single levels only.

I think its correct. Depending on size of the data - consider it n , base binary tree can have up to log2[n] intermediate levels. When we group some base trees into a single composed tree, we add one additional level of composition. If we further group some composed trees into a single composed-composed tree, we add second additional level of composition. TBH, I don't know where did this requirement come from and why we have 2 levels of compositions (not 3 or 10) used for representing Filecoin sectors... The only additional detail I can mention is that Poseidon hashing works optimally exactly with 8-0-0, 8-8-0, 8-8-2 (oct) trees

@storojs72
Copy link
Contributor

pub type SectorShapeSub2 = LCTree<DefaultTreeHasher, U8, U2, U0>; - it means we work with LevelCacheStore-backed compound tree (as top-tree arity is U0 - so there is no highest level of composition) that consists from 2 (sub-tree arity is U2) oct trees (base arity is U8),

pub type SectorShapeTop2 = LCTree<DefaultTreeHasher, U8, U8, U2>; - it means we work with LevelCacheStore-backed compound-compound tree (as top-tree arity is not U0), consisted from 2 (top tree arity is U2) compound trees, where each compound tree consists from 8 (sub tree arity is U8) oct trees (base arity is U8).

@vmx
Copy link
Contributor Author

vmx commented Dec 22, 2022

I wonder what practical differences those two trees have (perhaps the caching?), as they are both oct-trees where the top level has arity2. I guess I need to check the source.

@storojs72
Copy link
Contributor

I think that sector size is the key. For 32gib sector we use 8-8-0 trees (compound), while for 64gib - 8-8-2 (compound-compound). Using "compound-compound" trees opens possibility to parallelise computation of inclusion proofs, I guess

@vmx vmx requested a review from storojs72 December 22, 2022 13:53
Copy link
Contributor

@storojs72 storojs72 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@vmx vmx merged commit 128f720 into master Dec 23, 2022
@vmx vmx deleted the tree-cleanup branch December 23, 2022 15:33
storojs72 pushed a commit that referenced this pull request Jan 11, 2023
There were quite a few public type definitions, that were mostly replaced by
the `SectorShape*` types. This commit cleans them up and moves them around if
appropriate.

This should make the code easier to follow and the public API surface smaller.

BREAKING CHANGE: `BinaryLCMerkleTree`, `BinaryMerkleTree`,
`BinarySubMerkleTree`, `LCMerkleTree`, `LCStore`, `MerkleStore`, `MerkleTree`,
`OctLCMerkleTree`, `OctLCSubMerkleTree`, `OctLCTopMerkleTree`, `OctMerkleTree`,
`OctSubMerkleTree`, `OctTopMerkleTree`, `QuadLCMerkleTree` and `QuadMerkleTree`
are removed from the public interface.
storojs72 pushed a commit that referenced this pull request Jan 11, 2023
There were quite a few public type definitions, that were mostly replaced by
the `SectorShape*` types. This commit cleans them up and moves them around if
appropriate.

This should make the code easier to follow and the public API surface smaller.

BREAKING CHANGE: `BinaryLCMerkleTree`, `BinaryMerkleTree`,
`BinarySubMerkleTree`, `LCMerkleTree`, `LCStore`, `MerkleStore`, `MerkleTree`,
`OctLCMerkleTree`, `OctLCSubMerkleTree`, `OctLCTopMerkleTree`, `OctMerkleTree`,
`OctSubMerkleTree`, `OctTopMerkleTree`, `QuadLCMerkleTree` and `QuadMerkleTree`
are removed from the public interface.
storojs72 added a commit that referenced this pull request Jan 19, 2023
* fix: use current process binding to limit thread cores (#1633)

Use the current processes bound cores to limit the possible cores that
threads can be bound to. This allows core binding to work properly when the
lotus-worker service is limited to certain CPUs by cgroups.

* fix: update ec-gpu-gen (#1638)

`ec-gpu-gen` needs to be updated to v0.5 as v0.4 has dependencies that
depend on yanked version. It's an indirect dependency of `bellperson` and
`neptune`, which are upgraded here.

Moving from `memmap` (which is deprecated) to `memmap2` was also needed
als dependencies also switched.

`chrono` updated their API, so there was also a small change needed.

* fix: broken Links in README.md #1637 (#1649)

* feat: Introduce PoRepConfig::new_groth16() (#1635)

Instead of constructing the `PoRepConfig` directly, use a constructor
function. This simplifies the code and makes things less error-prone.

Fixes #1632.

Co-authored-by: 卜翰 <[email protected]>

* fix: clean up tree definitions (#1655)

There were quite a few public type definitions, that were mostly replaced by
the `SectorShape*` types. This commit cleans them up and moves them around if
appropriate.

This should make the code easier to follow and the public API surface smaller.

BREAKING CHANGE: `BinaryLCMerkleTree`, `BinaryMerkleTree`,
`BinarySubMerkleTree`, `LCMerkleTree`, `LCStore`, `MerkleStore`, `MerkleTree`,
`OctLCMerkleTree`, `OctLCSubMerkleTree`, `OctLCTopMerkleTree`, `OctMerkleTree`,
`OctSubMerkleTree`, `OctTopMerkleTree`, `QuadLCMerkleTree` and `QuadMerkleTree`
are removed from the public interface.

* fix: update ec-gpu-gen (#1638)

`ec-gpu-gen` needs to be updated to v0.5 as v0.4 has dependencies that
depend on yanked version. It's an indirect dependency of `bellperson` and
`neptune`, which are upgraded here.

Moving from `memmap` (which is deprecated) to `memmap2` was also needed
als dependencies also switched.

`chrono` updated their API, so there was also a small change needed.

* chore: update Cargo.lock

* fix: there was a memmap -> memmap2 missing

* fix: make poseidon tests pass

Neptune currently is a fork of pasta_curves. That needs patching as
well, in order to get the correct names for the fields out.

* ci: Apply rustfmt and fix clippy

* Pick relevant branch of neptune

* fix: Use SHA256 hasher for binary trees

Co-authored-by: Clint Armstrong <[email protected]>
Co-authored-by: Volker Mische <[email protected]>
Co-authored-by: hanbu97 <[email protected]>
Co-authored-by: 卜翰 <[email protected]>
storojs72 added a commit that referenced this pull request Jan 19, 2023
* fix: use current process binding to limit thread cores (#1633)

Use the current processes bound cores to limit the possible cores that
threads can be bound to. This allows core binding to work properly when the
lotus-worker service is limited to certain CPUs by cgroups.

* fix: update ec-gpu-gen (#1638)

`ec-gpu-gen` needs to be updated to v0.5 as v0.4 has dependencies that
depend on yanked version. It's an indirect dependency of `bellperson` and
`neptune`, which are upgraded here.

Moving from `memmap` (which is deprecated) to `memmap2` was also needed
als dependencies also switched.

`chrono` updated their API, so there was also a small change needed.

* fix: broken Links in README.md #1637 (#1649)

* feat: Introduce PoRepConfig::new_groth16() (#1635)

Instead of constructing the `PoRepConfig` directly, use a constructor
function. This simplifies the code and makes things less error-prone.

Fixes #1632.

Co-authored-by: 卜翰 <[email protected]>

* fix: clean up tree definitions (#1655)

There were quite a few public type definitions, that were mostly replaced by
the `SectorShape*` types. This commit cleans them up and moves them around if
appropriate.

This should make the code easier to follow and the public API surface smaller.

BREAKING CHANGE: `BinaryLCMerkleTree`, `BinaryMerkleTree`,
`BinarySubMerkleTree`, `LCMerkleTree`, `LCStore`, `MerkleStore`, `MerkleTree`,
`OctLCMerkleTree`, `OctLCSubMerkleTree`, `OctLCTopMerkleTree`, `OctMerkleTree`,
`OctSubMerkleTree`, `OctTopMerkleTree`, `QuadLCMerkleTree` and `QuadMerkleTree`
are removed from the public interface.

* fix: update ec-gpu-gen (#1638)

`ec-gpu-gen` needs to be updated to v0.5 as v0.4 has dependencies that
depend on yanked version. It's an indirect dependency of `bellperson` and
`neptune`, which are upgraded here.

Moving from `memmap` (which is deprecated) to `memmap2` was also needed
als dependencies also switched.

`chrono` updated their API, so there was also a small change needed.

* chore: update Cargo.lock

* fix: there was a memmap -> memmap2 missing

* fix: make poseidon tests pass

Neptune currently is a fork of pasta_curves. That needs patching as
well, in order to get the correct names for the fields out.

* ci: Apply rustfmt and fix clippy

* Pick relevant branch of neptune

* fix: Use SHA256 hasher for binary trees

Co-authored-by: Clint Armstrong <[email protected]>
Co-authored-by: Volker Mische <[email protected]>
Co-authored-by: hanbu97 <[email protected]>
Co-authored-by: 卜翰 <[email protected]>
storojs72 added a commit that referenced this pull request Jan 19, 2023
* fix: use current process binding to limit thread cores (#1633)

Use the current processes bound cores to limit the possible cores that
threads can be bound to. This allows core binding to work properly when the
lotus-worker service is limited to certain CPUs by cgroups.

* fix: update ec-gpu-gen (#1638)

`ec-gpu-gen` needs to be updated to v0.5 as v0.4 has dependencies that
depend on yanked version. It's an indirect dependency of `bellperson` and
`neptune`, which are upgraded here.

Moving from `memmap` (which is deprecated) to `memmap2` was also needed
als dependencies also switched.

`chrono` updated their API, so there was also a small change needed.

* fix: broken Links in README.md #1637 (#1649)

* feat: Introduce PoRepConfig::new_groth16() (#1635)

Instead of constructing the `PoRepConfig` directly, use a constructor
function. This simplifies the code and makes things less error-prone.

Fixes #1632.

Co-authored-by: 卜翰 <[email protected]>

* fix: clean up tree definitions (#1655)

There were quite a few public type definitions, that were mostly replaced by
the `SectorShape*` types. This commit cleans them up and moves them around if
appropriate.

This should make the code easier to follow and the public API surface smaller.

BREAKING CHANGE: `BinaryLCMerkleTree`, `BinaryMerkleTree`,
`BinarySubMerkleTree`, `LCMerkleTree`, `LCStore`, `MerkleStore`, `MerkleTree`,
`OctLCMerkleTree`, `OctLCSubMerkleTree`, `OctLCTopMerkleTree`, `OctMerkleTree`,
`OctSubMerkleTree`, `OctTopMerkleTree`, `QuadLCMerkleTree` and `QuadMerkleTree`
are removed from the public interface.

* fix: update ec-gpu-gen (#1638)

`ec-gpu-gen` needs to be updated to v0.5 as v0.4 has dependencies that
depend on yanked version. It's an indirect dependency of `bellperson` and
`neptune`, which are upgraded here.

Moving from `memmap` (which is deprecated) to `memmap2` was also needed
als dependencies also switched.

`chrono` updated their API, so there was also a small change needed.

* chore: update Cargo.lock

* fix: there was a memmap -> memmap2 missing

* fix: make poseidon tests pass

Neptune currently is a fork of pasta_curves. That needs patching as
well, in order to get the correct names for the fields out.

* ci: Apply rustfmt and fix clippy

* Pick relevant branch of neptune

* fix: Use SHA256 hasher for binary trees

Co-authored-by: Clint Armstrong <[email protected]>
Co-authored-by: Volker Mische <[email protected]>
Co-authored-by: hanbu97 <[email protected]>
Co-authored-by: 卜翰 <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants