Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merkleization #77

Closed
7 of 10 tasks
Scooletz opened this issue May 9, 2023 · 10 comments
Closed
7 of 10 tasks

Merkleization #77

Scooletz opened this issue May 9, 2023 · 10 comments
Assignees
Labels
ethereum An Ethereum specific work item that requires a good understanding of Eth 🌴Merkle Merkle construct

Comments

@Scooletz
Copy link
Contributor

Scooletz commented May 9, 2023

Implement Merkleization for the tree

The last 3 will be addressed in separate issues.

@Scooletz Scooletz added this to Paprika May 9, 2023
@Scooletz Scooletz converted this from a draft issue May 9, 2023
@Scooletz Scooletz added the ethereum An Ethereum specific work item that requires a good understanding of Eth label May 9, 2023
@emlautarom1 emlautarom1 self-assigned this May 29, 2023
@emlautarom1 emlautarom1 moved this from Todo to In Progress in Paprika May 29, 2023
@emlautarom1
Copy link
Contributor

To start working on this, I suggest writing some test on the current Nethermind client (here, probably: https://github.com/NethermindEth/nethermind/blob/22cc2b238e72e59a28c4b9e99019a4a145cc5895/src/Nethermind/Nethermind.Trie.Test/TrieTests.cs#L26) and check what we get as the RootHash. Then, migrate those tests into Paprika and use the obtained hashes as expected values during testing.

@emlautarom1
Copy link
Contributor

Following the approach described above, I've written the following test in the Nethermind client (NC moving forward) and translated it into Paprika:

[Test]
public void Single_account()
{
    MemDb memDb = new();
    using TrieStore trieStore = new(memDb, Prune.WhenCacheReaches(1.MB()), Persist.EveryBlock, _logManager);
    PatriciaTree patriciaTree = new(trieStore, _logManager);

    var key = new Keccak(new byte[] { 0, 1, 2, 3, 4, 5, 6, 7, 0, 1, 2, 3, 4, 5, 6, 7, 0, 1, 2, 3, 4, 5, 6, 7, 0, 1, 2, 3, 4, 5, 6, 7 });
    var account = new Account(23, 13);

    patriciaTree.Set(key.Bytes, new AccountDecoder().Encode(account).Bytes);
    patriciaTree.Commit(blockNumber: 0);
    patriciaTree.UpdateRootHash();

    var rootHash = patriciaTree.RootHash;
    TestContext.WriteLine(rootHash);
}

This, translated to Paprika looks something like:

[Test]
public void Single_account()
{
    using var db = PagedDb.NativeMemoryDb(SmallDb);

    var key = Values.Key0;
    var account = new Account(Values.Balance0, Values.Nonce0);

    using (var batch = db.BeginNextBatch())
    {
        batch.Set(key, account);
        batch.Commit(CommitOptions.FlushDataAndRoot);
    }

    using (var batch = db.BeginReadOnlyBatch())
    {
        var rootHash = batch.GetRootHash();
        var expectedRootHash =
            new Keccak(Convert.FromHexString("E2533A0A0C4F1DDB72FEB7BFAAD12A83853447DEAAB6F28FA5C443DD2D37C3FB"));

        Assert.That(rootHash, Is.EqualTo(expectedRootHash));
    }
}

Key and Account are expected to be the same in both projects.

This raises a couple of questions:

  • Is the NC test setup (ex. memDb, trieStore) and Paprika test setup (ex. NativeMemoryDb) correct?
  • Is NC serializing the Account in the same way as Paprika?
  • Could the blockNumber required when committing in NC be problematic?
  • I added GetRootHash to the IReadOnlyBatch interface for now. Is that correct, or should we define it somewhere else?

@emlautarom1
Copy link
Contributor

As for testing, eventually I think it could be useful to have some sort of test generator on NC: this generator would randomly produce a sequence of actions (setting, committing, ???), execute them, compute the root hash, and then save the sequence into a file. Then, Paprika could load those files, run the same actions, compute the root hash and validate it against the expected hash.

@LukaszRozmej
Copy link
Member

As for testing, eventually I think it could be useful to have some sort of test generator on NC: this generator would randomly produce a sequence of actions (setting, committing, ???), execute them, compute the root hash, and then save the sequence into a file. Then, Paprika could load those files, run the same actions, compute the root hash and validate it against the expected hash.

why not bring Nethermind Client modules to Paprika or Paprika to Nethermind Client for this and actually have it integrated? Or create a 3rd project that would integrate and fuzz both and then compare.

@Scooletz
Copy link
Contributor Author

This raises a couple of questions:

* Is the NC test setup (ex. `memDb`, `trieStore`) and Paprika test setup (ex. `NativeMemoryDb`) correct?
* Is NC serializing the Account in the same way as Paprika?
* Could the `blockNumber` required when committing in NC be problematic?
* I added `GetRootHash` to the `IReadOnlyBatch` interface for now. Is that correct, or should we define it somewhere else?

What NC does it uses the RLP serialization of a TreeNode that is of type of Leaf and then hashes it. It uses the idiomatic way of storing nodes. In Paprika if you'd like to get a hash of a leaf, you need to gather parts, write their RLP and then Keccak the output.

I don't think the block number is problematic at all, as Paprika needs to deliver a single hash at the end of the day. The hash is the hash of the state tree. The rest, how it's used etc. it's up to NC.

It's fine to have GetRootHash it there for now and discuss later. Especially, that once #88 is merged the API usage will change a bit. I think it should not impact your work as the underlying premise is to still be working with the existing components of the tree (data pages etc.)

As for testing, eventually I think it could be useful to have some sort of test generator on NC: this generator would randomly produce a sequence of actions (setting, committing, ???), execute them, compute the root hash, and then save the sequence into a file. Then, Paprika could load those files, run the same actions, compute the root hash and validate it against the expected hash.

That would be the ultimate testing scenario. Initially, it'd be great to have a suite of unit tests that are capable of showing that basic scenarios are covered. Unfortunately one needs to write the same tests with NC first so that the value is calculated properly. We could have a submodule for NC or build some parts as packages to be reused.

@LukaszRozmej
Copy link
Member

For parallelization - if you can flatten the work items, then you can easily parallelize the whole process, regardless of branching. Even if you can't it is possible to parallelize it based on some counter. In NC we use Semaphore for that and either visit children in single-threaded manner if threads are saturated or invoke a multi-threading visit if there are threads available.

@Scooletz Scooletz self-assigned this Jun 26, 2023
@Scooletz Scooletz added this to the Merkleization milestone Jun 26, 2023
@Scooletz Scooletz removed this from the Merkleization milestone Aug 9, 2023
@Scooletz Scooletz added the 🌴Merkle Merkle construct label Aug 9, 2023
@Scooletz
Copy link
Contributor Author

Scooletz commented Aug 9, 2023

Memoization will be addressed at #120

@Scooletz
Copy link
Contributor Author

Scooletz commented Aug 9, 2023

Storage will be handled at #121

@Scooletz
Copy link
Contributor Author

Scooletz commented Aug 9, 2023

The PageDb component will be handled at #122

@Scooletz
Copy link
Contributor Author

Scooletz commented Aug 9, 2023

The leftovers are in separate issues.

@Scooletz Scooletz closed this as completed Aug 9, 2023
@github-project-automation github-project-automation bot moved this from In Progress to Done in Paprika Aug 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ethereum An Ethereum specific work item that requires a good understanding of Eth 🌴Merkle Merkle construct
Projects
Status: Done
Development

No branches or pull requests

3 participants