
Draft: Multi-Writer DEP #10

Merged 11 commits on Jul 6, 2018

Conversation

bnewbold
Contributor

@bnewbold bnewbold commented Mar 5, 2018

Current status: needs proof-reading, and there are some unresolved issues, but ready for review

Rendered pre-merge

Big TODOs:

  • write "Semantics and Usage" section
  • re-structure and re-write implementation details
  • examples (with valid messages)
  • security and privacy concerns

hypercore feed, and it is broadly considered best practice not to distribute
secret keys between multiple users or multiple devices. In fact, the current
hypercore implementation has no mechanism to resolve disputes or recover if
multiple agents used the same secret key to append to the same feed.


Perhaps note that this is by design (for simplicity, scalability etc)

- secure key distribution and authentication (eg, if a friend should be given
write access to a hyperdb database, how is that friend's feed key found and
verified?)
- merge conflict resolution, potentially using application-layer semantics


the last part could be misunderstood. we detect conflicts and provide apis for the user to deal with those multiple values (if they want)

replicated data type" ([CRDT][crdt]).

[vc]: https://en.wikipedia.org/wiki/Vector_clock
[crdt]: https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type


great section

[scaling]: #scaling

TODO: brief note on scaling properties (eg, reasonable numbers of writers per
database)


up to 1000 in the first impl will prob scale fine. noting that we have ways of scaling to millions of writers but thats for later versions :)

is written to `./local/`. All other feeds are written under directories
prefixed `./peers/<feed-discovery-key>/`.

TODO: the above disk format is incorrect?


yep!

Contributor Author


What is the actual on-disk format? I don't have any actual hyperdbs with more than source and local on my laptop.

- when this node was written, the largest seq # in the third feed I have is 5

For example, Bob's vector clock for Alice's seq=3 entry above would be `[0, 3]`
since he knows of her latest entry (seq=3) and his own (seq=0).


worth noting here that the order of the clock is not consistent across writers. it's only consistent within the inflated feed message of the same feed
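As an illustration of the clock semantics discussed above, here is a hypothetical sketch of vector-clock comparison (not hyperdb's actual API or wire encoding; the function names are made up for this example):

```javascript
// Illustrative sketch of vector-clock causality, not hyperdb's API.
// A clock is an array of sequence numbers, one per feed, e.g. Bob's [0, 3].

// Returns true if clock `a` is causally at-or-after clock `b`
function dominates (a, b) {
  return b.every((seq, i) => (a[i] || 0) >= seq)
}

// Two clocks conflict when neither dominates the other (concurrent writes)
function conflicts (a, b) {
  return !dominates(a, b) && !dominates(b, a)
}

const bob = [0, 3]   // Bob's entry: his own seq=0, Alice's latest seq=3
const alice = [1, 3] // a hypothetical later entry that has seen Bob's

console.log(dominates(alice, bob))     // true: causally follows
console.log(conflicts([2, 0], [0, 3])) // true: neither saw the other
```

Note that, as the comment points out, the index-to-feed mapping in these arrays is per-feed, not global.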


TODO:

- Why is this design the best in the space of possible designs?


maps almost perfectly to append-only logs, proven tech (a HAMT is nothing new)

TODO:

- Why is this design the best in the space of possible designs?
- What other designs have been considered and what is the rationale for not choosing them?


we looked at a bunch of different data structures. it basically boils down to this: we need a kv store that is sparse friendly (ie no external indexing needed), doesn't have too much space overhead, is roundtrip friendly, and can be mapped onto an append-only log.

not many approaches exist that fit this other than a trie. a normal prefix trie might end up being better (lex ordered instead of hash ordered), but it's close to trivial to support both.

you mention complexity here. it's actually the same complexity to implement this data structure in single-writer mode as it is to impl the current hyperdrive one (same scheme, just uses hashes). the multiwriter one has more potential race conditions but is a continuation of the same idea.

Contributor Author


Thanks for these notes! TBC, these questions came straight from the DEP template.


- Why is this design the best in the space of possible designs?
- What other designs have been considered and what is the rationale for not choosing them?
- What is the impact of not doing this?


leaving it up to the ecosystem to figure out multiwriter?

i think the only other valid approach would have been to try and solve multiwriter at the hypercore layer, but it becomes just as tricky there as it has been getting this right

[unresolved]: #unresolved-questions

What is the technical term for the specific CRDT we are using?
"Operation-based"?


unsure

"Operation-based"?

If there are conflicts resulting in ambiguity of whether a key has been deleted
or has a new value, does `db.get(key)` return an array of `[new_value, None]`?


db.get always returns an array of nodes, not values. this is to provide more context. in the scenario you describe, one of the nodes would have value === null (we are considering adding a .deleted flag instead)
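A hedged sketch of interpreting such a node array, based only on the comment above (the node shape `{ feed, seq, value }` is an assumption for illustration, not taken from hyperdb's documentation):

```javascript
// Sketch: interpret the node array a get would return, per the comment
// above. Node shape ({ feed, seq, value }) is assumed for illustration.
function interpret (nodes) {
  if (!nodes) return { status: 'absent' }  // never written
  const conflict = nodes.length > 1        // concurrent heads detected
  const values = nodes.map(n => n.value)   // null marks a deletion
  return { status: conflict ? 'conflict' : 'ok', values }
}

// A conflicting put-vs-delete from two writers:
console.log(interpret([
  { feed: 0, seq: 4, value: 'new' },
  { feed: 1, seq: 2, value: null }
]))
// A normal single-head result:
console.log(interpret([{ feed: 0, seq: 5, value: 'hello' }]))
```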



# Changelog
[changelog]: #changelog


@jimpick has been instrumental in getting this production ready as well

@aral

aral commented Jun 6, 2018

I’m sure you guys have already looked into the current academic research around CRDTs but I just came out of a deep dive where I reached the same conclusions (DAG with publickey auth weaved into the graph). On the conflict resolution side, have you evaluated the causal tree/Logoot/LSEQTree approaches at all?

Links to all three from the quick summary of my own research:
https://indienet.info/other/spikes/crdt/

PS. Given that you’ve basically built what I was planning to, I believe we are going to be using DAT for Indie Site going forward which makes me extremely happy :)

@bnewbold
Contributor Author

@mafintosh needs proof-reading, and there are some unresolved issues, but I think this is mostly ready for review. I left some questions in "unresolved".

I had been procrastinating this for a long time because I thought it was going to be really gnarly, but it's actually super simple compared to all the hyperdb trie nitty gritties.

Looks like the hyperdb DEP will need a small update with the deleted and Header changes.

@bnewbold changed the title from "WIP: Multi-Writer DEP" to "Draft: Multi-Writer DEP" on Jun 17, 2018
multiple nodes will be returned. If a key has been deleted from the database, a
node with the `deleted` flag will be returned; note that such "tombstone" nodes
can still have a `value` field, which may contain application-specific metadata
(such as a timestamp).


if all nodes to be returned are deleted, null will be returned instead (similar to looking up a non-written key)
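The tombstone behavior in the draft text can be sketched as follows (a hypothetical helper; the node shape `{ deleted, value }` follows the DEP wording above, not a verified hyperdb API):

```javascript
// Sketch: split live values from tombstones, following the draft text.
// Node shape ({ deleted, value }) is assumed from the DEP wording.
function liveValues (nodes) {
  if (!nodes) return []  // null result: key never written, or all deleted
  return nodes.filter(n => !n.deleted).map(n => n.value)
}

console.log(liveValues([
  { deleted: false, value: 'v2' },
  { deleted: true, value: { deletedAt: 1528329600 } } // tombstone metadata
]))
// → [ 'v2' ]
```

As the comment notes, the tombstone's `value` can still carry application metadata (a deletion timestamp here), even though the key reads as deleted.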


Couldn't the local feed's sequence number be skipped from vector clocks,
because it's implied by the sequence number of the hypercore entry itself? Same
with the key in the feed list (for inflated entries).


yep, just added for now for simplicity
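The confirmed optimization could look roughly like this (illustrative encoding only, not the actual hyperdb wire format; function names are made up):

```javascript
// Illustrative only: omit the writer's own sequence number from the
// stored clock, since it equals the entry's position in its own feed.
function compressClock (clock, localIndex) {
  return clock.filter((_, i) => i !== localIndex)
}

function expandClock (stored, localIndex, entrySeq) {
  const clock = stored.slice()
  clock.splice(localIndex, 0, entrySeq) // reinsert the implied seq
  return clock
}

// Bob (feed index 0) writes entry seq=0 with full clock [0, 3]:
const stored = compressClock([0, 3], 0) // only [3] needs storing
console.log(expandClock(stored, 0, 0)) // [ 0, 3 ] recovered on read
```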

"Operation-based" or "State-based"?

What is the actual on-disk layout (folder structure), if not what is documented
here?


```
content/<disc-key>   <-- any content feed
peers/<disc-key>     <-- writer feeds
source               <-- original writer
local                <-- local writer, if not the original
```

@pfrazee
Contributor

pfrazee commented Jun 21, 2018

Great work, @bnewbold.

Do we need to take some time to discuss the permissions schemes? I thought we were going to have an owner / writer distinction, where owners can authorize and writers can only write.

[unresolved]: #unresolved-questions

What is the technical term for the specific CRDT we are using?
"Operation-based" or "State-based"?
Contributor


This is tough to answer because there's only one op currently (put) but I'm fairly sure it's ops-based.

Op-based vs State-based

@bnewbold
Contributor Author

bnewbold commented Jul 6, 2018

I integrated most comments.

@pfrazee I think this is the first I've heard of an owner/writer permissions distinction; is that in current hyperdb? If it wasn't in the 3.0 release, I think at this point I'd say we should ship this as draft and update as things change.

@pfrazee
Contributor

pfrazee commented Jul 6, 2018

@bnewbold sounds good

@bnewbold bnewbold merged commit 3ad7ed8 into dat-ecosystem-archive:master Jul 6, 2018
@bnewbold bnewbold deleted the dep-multiwriter branch July 6, 2018 17:40