Extract db-related methods from baseTrie #74

s1na · 2019-01-16T14:27:00Z

This PR extracts db-related methods such as getRaw, putRaw from BaseTrie and moves them to a new class DB. It also modifies how checkpointing is done in CheckpointTrie.

The goal was to make these two modifications according to this comment, but after implementing them I found some problems with that design. I experimented with a few other designs, but they all seemed hacky, like the current checkpointing method. This PR is one of those designs, which I liked a bit more than the others. If you think it doesn't improve on the status quo, we can close this PR.

To achieve checkpointing we somehow have to make the original trie read-only, and perform all subsequent writes to a temporary (scratch) db. On revert the scratch db is dropped and trie root is set to the checkpointed root. On commit nodes in the scratch db which correspond to the current trie (with current root) have to be written to the main persistent db.

For this purpose, this PR introduces a ScratchDB, which is an in-memory DB with a backend (pointer to main persistent db). On checkpoint, CheckpointTrie replaces the db that BaseTrie uses with an instance of ScratchDB, therefore all writes are written to the scratch. If a key is not found in the ScratchDB, it tries to fetch it from its backend.
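As a rough illustration of the read-through behavior described above, here is a simplified sketch. It is not the code in this PR: it is synchronous, and `MemoryDB`, the `Map` store and the hex-string keys are assumptions made only for the example.

```javascript
// Hypothetical sketch: synchronous and Map-backed for brevity.
// The classes in this PR wrap leveldb and are asynchronous.
class MemoryDB {
  constructor () {
    this._store = new Map()
  }

  // Returns the stored Buffer, or null when the key is absent.
  get (key) {
    const value = this._store.get(key.toString('hex'))
    return value === undefined ? null : value
  }

  put (key, value) {
    this._store.set(key.toString('hex'), value)
  }
}

class ScratchDB extends MemoryDB {
  constructor (backend) {
    super()
    this._backend = backend // main persistent db, never written to here
  }

  // Look in the scratch first, then fall back to the backend.
  get (key) {
    const value = super.get(key)
    return value !== null ? value : this._backend.get(key)
  }
}

const persistent = new MemoryDB()
persistent.put(Buffer.from('old'), Buffer.from('1'))

// "checkpoint": route all subsequent writes to a scratch db
let db = new ScratchDB(persistent)
db.put(Buffer.from('new'), Buffer.from('2'))

const seenThroughScratch = db.get(Buffer.from('old')) // falls back to backend
const leakedToPersistent = persistent.get(Buffer.from('new')) // null

// "revert": drop the scratch and go back to the persistent db
db = persistent
```

Commit is left out of the sketch on purpose, since it needs to walk the current trie to decide which scratch entries to persist.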

I also have some questions regarding getRaw and putRaw. My impression from reading the existing code is that putRaw will write to the persistent DB even if the trie is in a checkpoint. Is that intentional? If so, should getRaw and delRaw also operate on the persistent db regardless of checkpointing? We should probably highlight this in the documentation, or rename them to make this side effect clear.

Signed-off-by: Sina Mahmoodi <[email protected]>

coveralls · 2019-01-16T14:37:25Z

Coverage increased (+0.3%) to 93.613% when pulling 4ea8d58 on refactor/db into 029b5fe on master.

holgerd77 · 2019-01-16T14:38:01Z

Whew, at least tests are already passing, great! 👍 😄

holgerd77 · 2019-01-17T12:36:22Z

Think I like this approach, this definitely gets more readable, and the functionality and responsibilities are more distinctly sorted out.

I wonder if we should generally switch to a more directory-structured file structure (and also take on some breaking import changes as a consequence), suggestion:

main directory
- base.js
- checkpoint.js
- index.js
- secure.js
- node.js
- proof.js
util directory
- async.js
- hex.js
- nibbles.js
- tasks.js (or eventually merge with async.js)
db directory
- index.js
- scratch.js
readStream directory
- index.js
- scratch.js

Update: Updated the above with latest reflections/ideas.

holgerd77 · 2019-01-17T12:37:12Z

I find the current mix of files and names generally still too messy...

holgerd77 · 2019-01-17T14:19:22Z

Updated the directory structure above with latest reflections/ideas on this.

Side note: we can also address this in a separate PR.

holgerd77 · 2019-01-18T08:25:29Z

Will do some more proper review now. 😄 Let's probably address these file structure changes in a separate PR, will eventually also open a new issue on this. You can nevertheless give some first opinion if you want.

holgerd77 · 2019-01-18T08:34:35Z

Tested the basic usage example, worked.

holgerd77

My review comments look like more than they are; generally this looks really good and I'm very happy with the changes. This brings a lot more transparency to the behavior of the library, and it should also make it easier in the future to switch the DB backend store (or to add this functionality to the API and code base).

holgerd77 · 2019-01-18T08:39:51Z

src/baseTrie.js

+      this.db = db
+    } else {
+      this.db = new DB(db)
+    }


holgerd77 · Jan 18, 2019

I am a bit undecided here, since this adds complexity to the public API. Do we (already) want to expose the DB class through this and allow the additional instantiation with DB? Would you say this should now be the recommended way of using the API? Or should we for now keep the API as is and just forward the level instance passed in?

s1na · Jan 21, 2019

You're right, we should probably keep DB internal and have the public interface work with the level instance.

holgerd77 · Jan 21, 2019

I actually tend toward the other way around: enforcing instantiation through the DB wrapper class. Then we are already well prepared if we want to add new storage options besides LevelDB in the future. What do you think? I am in the breaking-changes mood! 😄

src/baseTrie.js

src/checkpointTrie.js

src/scratch.js

s1na · 2019-01-21T08:06:28Z

Thanks for the extensive review! Will go through your comments now.

Re. the file structure, agree and what you proposed looks good for the current state. Only maybe we can wait a bit more before solidifying the structure, as there's another PR incoming, and then there's transition to typescript which could offer new opportunities for structuring.

holgerd77 · 2019-01-21T08:58:49Z

Makes a lot of sense to let the file structure changes sink in a bit more and do this along with the TypeScript transition. Have opened a separate issue #75 where we can continue the discussion and track the latest state of proposals on that.

holgerd77 · 2019-01-21T09:18:36Z

Ok, have answered everything. Let me know if you have questions or would rather tend to choose other paths than the ones I have suggested.

s1na · 2019-01-21T15:09:27Z

Tried to apply your suggestions. Hope I didn't miss anything. Raw methods are dropped (and their test file). Also made createScratchReadStream private. DB uses this._leveldb internally to refer to the leveldb instance.

holgerd77 · 2019-01-21T16:09:42Z

Hmm, can't we use the tests on the DB class?

There is still a linting error preventing the CI run from passing.

holgerd77 · 2019-01-21T16:13:40Z

src/baseTrie.js

-    } else {
-      this.db = new DB(db)
-    }
+    this.db = db || new DB()


Can you update the API docs here as well?

holgerd77 · 2019-01-22T10:18:57Z

Ready for re-review? 😄

s1na · 2019-01-22T10:21:20Z

Yep 😄 I was just waiting to see if CI passes.

I've replaced the rawOps tests with unit tests for db.js, scratch.js and added some additional cases to checkpointing tests (moved them to a separate file checkpoint.js).

holgerd77 · 2019-01-22T10:25:57Z

Cool, will directly go for it. 😄

(already a +0.4% test coverage increase, great! 👍)

holgerd77 · 2019-01-22T10:27:56Z

Update: ah, strange coverage behavior. The final report now shows -1.5%, while I had +0.4% shown when CI was still running. Hmm.

holgerd77 · 2019-01-22T10:57:18Z

Just as a note, no need to change in this round: maybe we should give DB an even more generic name such as Storage, so that it is already better named and prepared for some future exchange of the storage backend? The exposed DB API is pretty limited in scope (get, put, batch, copy), so it should not be too far-fetched to allow switching here in the future with wrapper classes targeting other backends.
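To make that concrete, a wrapper targeting another backend could look roughly like the following. This is only a sketch under assumptions: `MapStorage` is a hypothetical name, it is synchronous for brevity, and the batch operation shape (`{ type, key, value }`) follows the levelup convention rather than anything decided in this PR.

```javascript
// Hypothetical in-memory backend exposing the same four methods
// (get, put, batch, copy). Synchronous for brevity; not the PR's DB class.
class MapStorage {
  constructor (store) {
    this._store = store || new Map()
  }

  // Returns the stored Buffer, or null when the key is absent.
  get (key) {
    const value = this._store.get(key.toString('hex'))
    return value === undefined ? null : value
  }

  put (key, value) {
    this._store.set(key.toString('hex'), value)
  }

  // ops: array of { type: 'put' | 'del', key, value }
  batch (ops) {
    for (const op of ops) {
      if (op.type === 'put') {
        this.put(op.key, op.value)
      } else if (op.type === 'del') {
        this._store.delete(op.key.toString('hex'))
      } else {
        throw new Error('Unknown batch operation: ' + op.type)
      }
    }
  }

  // New wrapper over the same underlying store (no data is duplicated).
  copy () {
    return new MapStorage(this._store)
  }
}

const storage = new MapStorage()
storage.batch([
  { type: 'put', key: Buffer.from('a'), value: Buffer.from('1') },
  { type: 'put', key: Buffer.from('b'), value: Buffer.from('2') },
  { type: 'del', key: Buffer.from('a') }
])
```

Anything exposing these four methods with the same semantics could then be swapped in without touching the trie code.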

holgerd77 · 2019-01-22T11:02:40Z

Also just as some note and not a PR blocker: these re-generated docs are better than nothing, but after all these changes we have to significantly restructure the docs to reflect the directory/file structure + API exposed.

holgerd77 · 2019-01-22T11:04:36Z

Maybe if we only release after the TypeScript transition, that would also be the point for the docs update.

holgerd77 · 2019-01-22T11:14:26Z

Also a note: we had an extensive discussion with Alex, when he was making these changes, about whether the decision where to put the distribution files should change the API (so that require('secure.js') would not have to be changed to require('dist/secure.js')). We ended up putting them all in the root directory and excluding them in .gitignore.

However, after some time dealing with this I have to say this is a total mess. On the TypeScript transition we should also switch to the standard way we do this in other libraries and put everything in the dist folder. There will be several breaking changes anyhow, so this will be a good occasion to update.

s1na · 2019-01-22T11:22:06Z

Weird that coverage changed at the end! But luckily it helped us find a bug! A few points:

1. Added a simple put/get test for secure, as well as a copy test.
2. Added copy tests for checkpointTrie, both before any checkpointing and after. The after case didn't work. I've now fixed the problem.
3. The copy method of all tries doesn't actually copy the leveldb. This means that if users copy a trie and use both instances in parallel, they're using the same leveldb. Tried to add this to the docs. I don't know, however, if this is desired (the alternative of copying the whole leveldb is also not elegant).
4. I had forgotten to mention that I've changed the DB API. Its methods only accept the Buffer type now and throw an error if the input is not a Buffer. I thought this was a good opportunity to do this, as we're breaking the API anyway :)
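A rough sketch of what the last two points above mean in practice. It is a simplified illustration, not the DB class from this PR: `BufferOnlyDB` and `_assertBuffer` are hypothetical names, the store is an in-memory `Map`, and everything is synchronous.

```javascript
// Hypothetical sketch, not the PR's DB class: synchronous, Map-backed.
class BufferOnlyDB {
  constructor (store) {
    this._store = store || new Map()
  }

  static _assertBuffer (name, input) {
    if (!Buffer.isBuffer(input)) {
      throw new TypeError(name + ' must be a Buffer')
    }
  }

  get (key) {
    BufferOnlyDB._assertBuffer('key', key)
    const value = this._store.get(key.toString('hex'))
    return value === undefined ? null : value
  }

  put (key, value) {
    BufferOnlyDB._assertBuffer('key', key)
    BufferOnlyDB._assertBuffer('value', value)
    this._store.set(key.toString('hex'), value)
  }

  // Shares the underlying store instead of duplicating it.
  copy () {
    return new BufferOnlyDB(this._store)
  }
}

const original = new BufferOnlyDB()
const copied = original.copy()

// A write through the copy is visible through the original as well.
copied.put(Buffer.from('k'), Buffer.from('v'))

let rejected = false
try {
  original.put('k', Buffer.from('v')) // string key is rejected
} catch (err) {
  rejected = err instanceof TypeError
}
```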

holgerd77 · 2019-01-22T11:27:10Z

1. 👍
2. 👍
3. If the behavior before was the same we can leave this for now, already a larger improvement if this is clearly stated in the docs now.
4. That's good/desired, hope we manage to get all breaking changes collected together at the end on release notes though! 😄

s1na · 2019-01-22T11:27:26Z

Regarding your comments:

1. Storage also sounds good. Yeah, the API is really simple, so in principle it should be possible to use various storage backends. We can also probably do experiments and benchmarks easily.
2. Yeah I think during transition to TS it would make sense to update docs.
3. Definitely agree, I wanted to bring this up, glad you mentioned it. Yeah, it'd make sense to put all in dist and ask them to be imported via something like const { SecureTrie } = require('merkle-patricia-tree') (so that import is not dependent on file path). Also, maybe make exports much more explicit.

holgerd77

Ok, had some last overall look, would give this a cautious GO now. Feel free to merge if you feel confident with it.

holgerd77 · 2019-01-22T11:31:23Z

On 3.: Think you have a better overview than me of ways to organize this, feel free to implement the one you think is most appropriate. It's just that throwing everything into the root dir makes no sense.

s1na added 6 commits January 15, 2019 13:45

Mv raw methods to DB class

94a14b5

Signed-off-by: Sina Mahmoodi <[email protected]>

Add ScratchDB which CheckpointTrie uses

7ec6972

Signed-off-by: Sina Mahmoodi <[email protected]>

Rename checkpoint-trie to checkpointTrie

0692c38

Signed-off-by: Sina Mahmoodi <[email protected]>

Add comments to scratch

d157e34

Signed-off-by: Sina Mahmoodi <[email protected]>

Fix linting errors

c490a14

Signed-off-by: Sina Mahmoodi <[email protected]>

Regenerate docs

b8ada89

Signed-off-by: Sina Mahmoodi <[email protected]>

holgerd77 suggested changes Jan 18, 2019

holgerd77 mentioned this pull request Jan 21, 2019

More consistent and better structured directory/file layout #75

Closed

s1na added 3 commits January 21, 2019 15:46

Rm raw methods and their tests

829393d

Signed-off-by: Sina Mahmoodi <[email protected]>

Rename DB._db to DB._leveldb

3a17d8c

Signed-off-by: Sina Mahmoodi <[email protected]>

Make createScratchReadStream private

2b307fb

Signed-off-by: Sina Mahmoodi <[email protected]>

holgerd77 reviewed Jan 21, 2019

s1na added 3 commits January 22, 2019 09:07

Fix linting error and jsdoc comments

5c99116

Signed-off-by: Sina Mahmoodi <[email protected]>

Regenerate docs

defcc7e

Signed-off-by: Sina Mahmoodi <[email protected]>

Add tests for db, scratch and more cases for checkpoint

a4c7f54

Signed-off-by: Sina Mahmoodi <[email protected]>

Fix checkpoint copy, add tests for checkpoint and secure copy

4ea8d58

Signed-off-by: Sina Mahmoodi <[email protected]>

holgerd77 approved these changes Jan 22, 2019

s1na merged commit c315566 into master Jan 22, 2019

s1na deleted the refactor/db branch January 22, 2019 13:17

s1na mentioned this pull request Feb 5, 2019

Data gets lost when using string as a key. (memdown + 2 tries) #77

Closed

holgerd77 mentioned this pull request Apr 29, 2019

Extract db-related logic from baseTrie #67

Closed

s1na mentioned this pull request Jun 24, 2019

Support for proofs of null/absence. Dried up prove/verify. #82

Merged

ryanio mentioned this pull request Apr 17, 2020

Release - v4.0.0 #111

Merged
