
a DEP proposal for a standard about related feeds #134

Closed
serapath opened this issue Mar 26, 2020 · 24 comments
Labels
announcement Announcing something to the DAT community.

Comments

@serapath
Member

serapath commented Mar 26, 2020

Deadline: no deadline
Link: datdotorg/datdot-research#17 (comment)
Call for Action: please read through and give some feedback/suggestions/questions/...

@serapath serapath added the announcement Announcing something to the DAT community. label Mar 26, 2020
@dan-mi-sun

count cobox in for a discussion about standards. we're keen to have cobox and datdotorg be as easily interoperable as possible :-)

@okdistribute
Contributor

okdistribute commented Mar 26, 2020

Count mapeo also in, although I think we wouldn't adopt a standard that includes a 'manifest feed' -- I would propose scoping back the proposal to make minimal changes to the existing codebases.

Adding an optional 'manifest' to the handshake that includes all discovery keys to be replicated is what we do in multifeed and what we could commit to with mapeo.

Could this be adapted for corestore as a way to request all the discovery keys present in it? Right now, corestore only does 1 at a time using a discovery-key event. cc @andrewosh
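To make the manifest idea concrete, here is a minimal sketch of what an optional manifest payload could look like: a list of discovery keys serialized for an extension/handshake message. The function names and the JSON encoding are hypothetical, for illustration only; this is not multifeed's actual wire format.

```javascript
// Illustrative sketch of a 'manifest' payload: a JSON object listing
// hex-encoded discovery keys that the sending peer offers to replicate.
// (Names and encoding are hypothetical, not multifeed's real format.)
function encodeManifest (discoveryKeys) {
  return Buffer.from(JSON.stringify({
    keys: discoveryKeys.map(k => k.toString('hex'))
  }))
}

function decodeManifest (buf) {
  return JSON.parse(buf.toString()).keys.map(hex => Buffer.from(hex, 'hex'))
}

// A peer would send this over the protocol's extension channel; the remote
// side decodes it and decides which of the listed keys to replicate.
const keys = [Buffer.alloc(32, 1), Buffer.alloc(32, 2)]
const decoded = decodeManifest(encodeManifest(keys))
```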

@cblgh

cblgh commented Mar 26, 2020

fwiw i think whatever ends up working for mapeo will probably also be able to work for cabal, but maybe @noffle can correct me on that (having experience w/ both) when she has spoons to do so

also kudos on starting the discussion this way, @serapath! i think this is a great approach

@serapath
Member Author

serapath commented Mar 27, 2020

thx everyone for your support :-) I was a bit anxious putting it together and it's still a bit messy.

thx @cblgh that's nice to hear - i'm really trying, but it's pretty tough for me to wrap my head around all the issues with the different approaches :-)


thx @dan-mi-sun that's lovely to hear :-) and thx @okdistribute for your support :-)

Adding an optional 'manifest' to the handshake that includes all discovery keys to be replicated is what we do in multifeed and what we could commit to with mapeo.

Yes, I know. I checked the source code - but ...

Could this be adapted for corestore as a way to request all the discovery keys present in it? Right now, corestore only does 1 at a time using a discovery-key event.

mafintosh said he would not support an extension message based approach, because it lacks the "trust guarantees" of a feed based approach.

I understand some issues with a manifest feed, which according to mafintosh is a performance hit, but on the other hand it could be an additional feature. I was talking to @mafintosh and he actually recommended using the Custom Header that DEP-0007 specifies, but also using a manifest feed, because that's how you get guarantees about the related feeds; plain extension messages lack the trust and history that feeds with merkle trees provide.


@martinheidegger commented on the linked issue above, but i will respond here:

I was talking to mafintosh and am aware of the performance issue, which is why I was thinking about the manifest feed as something additional.

The first message in a hypercore is already now supposed to include information which identifies the data structure type according to DEP-0007 - so this is already an existing standard and ideally data structures and/or protocols should already support it.

The good part of the manifest feed is that it gives you better guarantees, while raw extension messages can't be trusted at all, to my knowledge. The manifest feed is also the approach recommended by mafintosh.
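Since DEP-0007 comes up repeatedly here: it specifies a header in the feed's first message whose first field is a length-delimited `dataStructureType` string (protobuf field 1). As a sketch, the encoding of just that field can be done by hand; real implementations would use a protobuf library, and this assumes the type name is shorter than 128 bytes so the length varint fits in one byte.

```javascript
// Minimal hand-rolled encoding of the DEP-0007 header's first field
// (`dataStructureType`, protobuf field 1, wire type 2). Sketch only.
function encodeHeaderType (type) {
  const body = Buffer.from(type, 'utf8')
  // tag byte 0x0a = (field 1 << 3) | wire type 2 (length-delimited)
  return Buffer.concat([Buffer.from([0x0a, body.length]), body])
}

function decodeHeaderType (buf) {
  if (buf[0] !== 0x0a) throw new Error('expected field 1, wire type 2')
  const len = buf[1]
  return buf.subarray(2, 2 + len).toString('utf8')
}

// chunk 0 of a hyperdrive's metadata feed would identify its type like this:
const chunk0 = encodeHeaderType('hyperdrive')
```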

@serapath
Member Author

serapath commented Mar 27, 2020

edit: related details are written down in a section of a comment here: datdotorg/datdot-research#17 (comment)

What about listing all the approaches and then specifying multiple standards?

e.g. we define some kind of DEP-0011 and specify multiple approaches, like:

  1. to get feed (data structure) type use DEP-0007 approach
  2. use the manifest feed approach to get all related feeds
  3. or use the manifest extension message approach to get all related feeds

Then any data structure can choose one of the different approaches (we can add more to the list above if needed), and for each approach we can define and list the reasons and features, or pros and cons. So a service (e.g. a "generic hosting service" like datdot) that needs to know the (data structure) type and/or the related feeds can try all the approaches we specify, and hopefully one of them will work :-) Additionally, it's possible to support the specialised approaches of each individual type that a service encounters for improved performance, which is probably what hyperdrive needs anyway.
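The "try all the approaches" idea can be sketched as a small resolver that walks a list of strategies in order. Everything here is hypothetical: the strategy functions are stand-ins for reading chunk 0 per DEP-0007, opening a manifest feed, or requesting a manifest extension message.

```javascript
// Hypothetical resolver: try each specified approach in order until one
// yields the related feeds. Each strategy returns an array of feed keys,
// or null if that approach does not apply to this feed.
function resolveRelatedFeeds (feed, strategies) {
  for (const strategy of strategies) {
    const related = strategy(feed)
    if (related !== null) return related
  }
  return [] // no approach applied; nothing discoverable
}

// Stand-in strategies for illustration:
const viaHeader = feed => null                   // DEP-0007 header had no info
const viaManifestFeed = feed => ['key-a', 'key-b'] // manifest feed listed two keys
const related = resolveRelatedFeeds({}, [viaHeader, viaManifestFeed])
```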

@okdistribute
Contributor

that would work!

@serapath
Member Author

serapath commented Mar 27, 2020

edit: adding on top of above comment #134 (comment)

What if multiple approaches are supported but give conflicting answers regarding which feeds are related? I'd like to avoid that.

otherwise...

The combined approach would need to read message 0 of the main feed and then decide:

  1. if the message says it's a hyperdrive, we have to deal with it specially anyway
  2. if it specifies a manifest feed header, we roll with that
  3. if it doesn't have a manifest feed header, we proceed with the manifest extension message
  4. ... @okdistribute you mentioned on the chat that maybe we could use a different multifeed-like structure, but you didn't go into details about what you meant by that.

All in all - the feature and the order in which the mechanisms take precedence are meant to be used in cases where a peer doesn't know the data structure and wants to get the related feeds (e.g. for pinning and such), while keeping the other existing mechanisms that data structures use for performance. That means it could also be an opt-in feature for loading after you get the initial data through e.g. an extension - one that only certain peers will bother invoking?

That order could be included in the specification, so conflicts would not happen: the process stops before an alternative approach could even report something conflicting. As it stands, multifeed would always trigger (3.), hyperdrive would always trigger (1.), and other corestore-related data structures that will exist in the future will hopefully trigger (2.) - or, if they choose to do so, (3.) :-)
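The precedence order described above amounts to a single decision over chunk 0, which is what rules out conflicting answers. A sketch, with a hypothetical header shape:

```javascript
// Inspect chunk 0 once and pick exactly one mechanism, so two approaches
// can never give conflicting answers about related feeds.
// The header object's shape here is hypothetical.
function pickApproach (header) {
  if (header.type === 'hyperdrive') return 1 // type-specific handling
  if (header.manifestFeedKey) return 2       // manifest feed approach
  return 3                                   // manifest extension message
}
```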

@martinheidegger martinheidegger pinned this issue Apr 2, 2020
@martinheidegger
Contributor

martinheidegger commented Apr 24, 2020

@serapath Do you still wish this to be open? Should we announce this further?

@serapath
Member Author

I can take the gist of what was said here and move it to a different place and reference this issue and then it can be closed. When we have more progress, I can open an issue again :-)

Do you think that would be better?

@okdistribute
Contributor

How about the datprotocol/deps repository?

@martinheidegger
Contributor

@serapath let's keep the announcement here just for times when you need feedback that we can collect at comm-comms. As I understand it, at this point other people's input cannot really help this issue, right? Then let's close this issue and open it again when a clear question to the community arises.

@cblgh

cblgh commented May 2, 2020

small ping that cabal is very interested in this kinda thing atm :3 especially with regard to enabling synchronizing of encrypted hypercores using a single identifier

i.e. each log from a cabal is encrypted, and the set of logs for a particular cabal is discoverable via a single identifier - just a long cabal key, cf. ciphercore's blindKey

@serapath
Member Author

serapath commented May 4, 2020

we are very very busy trying to finally get the first MVP ready and finish our first milestone, but all of this is still very high priority. Our first milestone will support only hypercore itself, but as soon as we finally start our second and then third milestones, which are actually long overdue, we urgently need to work on making more complex data structures work which use more than one hypercore - so there is a lot of time reserved to get back to this very issue :-)

@serapath
Member Author

serapath commented May 5, 2020

@cblgh If you or cabal also maintains an issue, that would be great. Let's continue talking and pushing the details forward. Maybe we can do this immediately in parallel :-)

@okdistribute
Contributor

yes I think cobox might also be interested in this

@serapath
Member Author

serapath commented May 5, 2020

@dan-mi-sun do you have an issue in cobox that tracks this proposal?
Would be cool to link those kinds of issues there, so there is always a place to come and visit, and where projects can summarise their thoughts about this proposal and express what they agree with or what they would like to see changed :-)

That would make it easier to address things and be sure people or orgs and their concerns or ideas or rather wishes are not forgotten or lost :-)

@cblgh ...so same for cabal, would be good if you made a cabal/kappa/... issue somewhere and maybe link this one here? I for sure will read and follow it :D

@serapath
Member Author

serapath commented May 9, 2020

regarding url schemes and protocol handlers, i added some thoughts here:

@serapath
Member Author

serapath commented May 15, 2020

How about the datprotocol/deps repository?

@okdistribute i like the proposal you made 18 days ago and had already prepared myself for that, but I guess it is no longer valid, for reasons that are beyond my understanding. Lack of maintainers doesn't seem to be the reason, because I at least offered myself, and taking care of just interoperability standards and nothing else seems manageable.

@serapath
Member Author

serapath commented Jun 8, 2020

edit: adding on top of above comment #134 (comment)

currently the "related feeds proposal" tells you a parent -> children relation, but it won't tell you the relation in reverse. There might be use cases where it is important to relate in the opposite direction, like:

  1. in case you lost your private key or accidentally corrupted your old feed and want to revoke
  2. or if a feed has been accidentally corrupted

Maybe there are many ways to solve these use cases and those proposals should be separate, but I will list below why they actually might be ...related :-)

the related feeds proposal consists of chunk 0 specifying:

  • manifest feed (or hyperdrive, or kappa extension messages) => ("parent pointing to children")
  • certificate feed => ("parent pointing to grandparent" - if one exists)

certificate feed key could work like this:
a standard to write into the first chunk of any feed some information about a "(self signed) certificate feed" (for announcements of revoked feed keys and/or replacement feeds and their keys)

  1. A client listening to a feed which specifies a certificate feed key will subscribe to that feed and/or its parent certificate feed key, etc., if they exist.
  2. Whenever a revoke-and-replace message is received from a certificate feed about a feed that authorized that certificate feed in its chunk 0, that client will from then on listen to the new feed instead, and expect the new feed's merkle tree to be identical from chunk 0 up to a particular length, from which point things are supposed to be a seamless "continuation" of the previous hypercore, and continue as normal
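The steps above can be sketched as a small state transition. All names, the message shape, and the `rootsMatch` callback are hypothetical; in a real implementation the continuation check would compare the two feeds' merkle roots up to the announced length.

```javascript
// Hedged sketch of step 2: on a revoke-and-replace announcement from a
// certificate feed, switch to the replacement feed only if it verifies as a
// seamless continuation of the revoked feed up to the announced length.
function applyRevokeAndReplace (follows, msg, rootsMatch) {
  // msg: { revokedKey, replacementKey, continuationLength } (hypothetical)
  if (!follows.has(msg.revokedKey)) return follows // not a feed we follow
  if (!rootsMatch(msg.revokedKey, msg.replacementKey, msg.continuationLength)) {
    throw new Error('replacement is not a continuation of the revoked feed')
  }
  follows.delete(msg.revokedKey)
  follows.add(msg.replacementKey)
  return follows
}

const follows = applyRevokeAndReplace(
  new Set(['old-key']),
  { revokedKey: 'old-key', replacementKey: 'new-key', continuationLength: 10 },
  () => true // stand-in for a real merkle-root comparison
)
```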

@martinheidegger
Contributor

martinheidegger commented Jun 9, 2020

Just to summarize yesterday's off-thread conversation: I don't think that you will be able to make a "certificate" feed that helps against the loss of private keys. Any additional feed that you post doubles the risk (it doesn't halve it) - our conclusion was: you need to keep it safe, period.

In some sense dat-dns, or better, HyperDomains (a system as outlined by @pfrazee here) would solve the issue of updating an existing reference.

As far as the issue of feed corruption goes: I did write down my thoughts on this topic once: https://gist.github.com/martinheidegger/82dbf775e3ff071d897819d7550cb3d7 - I think it might be a reasonable solution for maintaining an existing dat. But this solution is an edge case that is hard to implement and test for, and generally speaking it is understandable why we rather focus on making the case less and less likely (it was very common with dat 1.0; now with corestore it should be significantly reduced).

While the questions of feed identity and feed corruption are interesting, they seem to be distracting from this issue about common related feeds. Don't you think?

Note: i mistook the gist link (they are unfortunately named similarly) and updated it to the reflections on core healing

@serapath
Member Author

serapath commented Jun 9, 2020

@martinheidegger
Contributor

We had a discussion on this during the last dat conference: https://youtu.be/hzIU5X7g7PI

The content in this issue is quite long: @serapath would you be okay with closing this issue and maybe opening one (or more) issues that summarize the current state?

@martinheidegger martinheidegger unpinned this issue Aug 13, 2020
@serapath
Member Author

serapath commented Nov 5, 2020

Yes, I will open one or more issues and summarize the current state.

I'm quite busy right now, so this will take a bit more time, but my perception is also that nobody needs the solution right away or is urgently waiting for it.

If anyone is reading this comment and needs it super urgent and wants to discuss things sooner, let me know - in that case I can see if I can do it sooner.

@martinheidegger
Contributor

Closing the issue for now, looking forward to updates.
