Feature Request: Linked Dats/files #794
Comments
Related Discussion: #752 |
We'll keep this on our minds. I think subdats may end up being the solution for this but we'll see. |
What are subdats? Also I was thinking something like an additional resources list within dat.json |
Resource listing, for example in a manifest file, is actually a bit
controversial (I know @Treora has thoughts about this).
With <img>, <script>, <link>, etc., we already have a way to declare what
resources a website/app depends on. While using a manifest file does allow
you to do things like state which resources are mandatory and which are
optional, it also introduces maintenance problems. Every time you update a
<script> tag in your document you then have to update the manifest.
Realistically, manifest files won’t be well-maintained, so you have to
wonder if they're worth using to solve this problem at all.
|
That said, this is an important problem to solve. We need to explore
whether or not it makes sense to also cache external assets when you save
an app to your library, or choose to help seed it. I’m just not sure that
using a manifest is the right choice.
|
Thanks for looping me in, Tara; I do have thoughts, but no answers. All I will note right now is that having a deterministic way to tell which assets constitute a document/app/site seems generally desirable, and it might be best to try to find more generic solutions rather than solving this problem in a way that is specific to beaker/dats. I am not very knowledgeable about this, but here are some related specs and efforts I came across that in some way list the resources required for offline consumption:
Instead of adding an explicit manifest with resources, I would also consider the option of extracting the required resources from the document itself; e.g. all the |
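I think the biggest problem with this is that these tags can be changed at runtime. For example, Rotonde right now injects its list of dependencies (<img>, <script>, and <style> tags) after you load the initial <script> tag on the portal's index.html. Also these are generally direct links to single files. For dependencies that function more as a database (a folder of json files), you would need to declare a dependency on a folderset. |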
> I think the biggest problem with this is that these tags can be
> changed at runtime. For example, Rotonde right now injects its list of
> dependencies (<img>, <script>, and <style> tags) after you load the
> initial <script> tag on the portal's index.html.
You would indeed need to explicitly declare that these will (possibly)
be required. The questions are (Q1) how, and (Q2) whether the
'statically' depended-on assets also have to be declared in that way. Two
answer sets that seem natural to me:
* (Q1) declare assets in a separate manifest file; (Q2) yes,
everything goes in there.
* (Q1) add <link rel="preload"> for each dynamic dependency; (Q2) no,
these links will be extracted just like the src of an img.
> Also these are generally direct links to single files. For
> dependencies that function more as a database (a folder of json
> files), you would need to declare a dependency on a folderset.
Another good point. One thought on this: if you solved this using a
syntax for folders, e.g. putting dat:1234ab/mydata/* in your manifest
file (or, better, even without the asterisk), you would also be able to
put it in a link tag. While HTTP does not have a concept of folders,
perhaps dat URLs do?
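To make the second answer set a bit more concrete, here is a minimal sketch of extracting declared resources from a page. It assumes a naive regex scan is good enough for illustration (a real implementation would use a proper HTML parser), and the function name `extractDeclaredResources` and the example URLs are hypothetical. It collects the src of <img> and <script> tags and the href of <link> tags, so a <link rel="preload"> for a dynamically injected dependency is picked up the same way.

```javascript
// Hypothetical helper: list the resources a page declares statically.
// Naive regex scan for illustration only; this is not a real HTML parser.
function extractDeclaredResources(html) {
  const urls = new Set();
  // Match opening <img>, <script>, and <link> tags along with their attribute text.
  const tagPattern = /<(img|script|link)\b([^>]*)>/gi;
  for (const [, tagName, attrs] of html.matchAll(tagPattern)) {
    // <link> points at its target with href; <img> and <script> use src.
    const attrName = tagName.toLowerCase() === 'link' ? 'href' : 'src';
    const attrPattern = new RegExp(`\\b${attrName}\\s*=\\s*["']([^"']+)["']`, 'i');
    const match = attrs.match(attrPattern);
    if (match) urls.add(match[1]);
  }
  return [...urls];
}

// Example page; the dat:// URLs are placeholders, not real archives.
const page = `
  <img src="/logo.png">
  <script src="dat://1234ab/app.js"></script>
  <link rel="preload" href="dat://1234ab/mydata/">
`;
console.log(extractDeclaredResources(page));
```

Note that a folder-style link such as the last one only helps if the client gives trailing-slash dat URLs a "whole folder" meaning, which is the open question above.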
|
Subdats, and eventually a dat-cdn, who knows :-) |
Most of my observations are already captured in #752. I'll just add some observations. HTML elements are not the ideal place for this because any policy we'd want to create regarding "save to library" would operate at the site level, not the page level. So, we need a specific file that can tell us the policy information. JSON is much easier to parse, in that case, than HTML is, and you might not always have HTML in a site, but you will have the dat.json manifest. The 'subdat' concept is an idea that gets brought up a lot. In unix terms, it's basically a symlink from one dat to another. In git terms, it's like a submodule. It's a way to map an archive to a subfolder of another archive. Eg:
Subdats are interesting because they could solve a lot of problems at once -- one such problem being this question of caching dependent dats. We could do a policy where subdats are saved along with their parent dats. I've been hesitant to 👍 subdats so far because they also add complexity to the core rules of dat, but I think there's a good chance we'll end up implementing them eventually. I just want to give us time to think about it. |
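As a rough illustration of the symlink idea only, here is a sketch that assumes a hypothetical mount table mapping a subfolder of the parent archive to another archive's URL. Neither the table shape nor `resolveSubdat` comes from Dat or Beaker, and the archive keys are placeholders.

```javascript
// Hypothetical mount table: subfolder in the parent archive -> another archive.
const mounts = {
  '/vendor/rotonde': 'dat://aaaa..11',
  '/data/friends': 'dat://bbbb..22',
};

// Resolve a path in the parent archive to { archive, path }.
// Paths outside any mount stay in the parent archive.
function resolveSubdat(parentUrl, path, mountTable) {
  // Check longer mount points first so nested mounts win over their parents.
  const mountPoints = Object.keys(mountTable).sort((a, b) => b.length - a.length);
  for (const mountPoint of mountPoints) {
    if (path === mountPoint || path.startsWith(mountPoint + '/')) {
      const rest = path.slice(mountPoint.length) || '/';
      return { archive: mountTable[mountPoint], path: rest };
    }
  }
  return { archive: parentUrl, path };
}

console.log(resolveSubdat('dat://cccc..33', '/vendor/rotonde/rotonde.js', mounts));
console.log(resolveSubdat('dat://cccc..33', '/index.html', mounts));
```

With a table like this, a "save subdats along with their parent" policy could simply walk the mount table and save each mounted archive.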
We should consider the Web Packaging standard in our discussions about this |
Not sure if it was mentioned, but this sounds like a perfect extension for the existing dat.json manifest. Maybe something as simple as {
"title": "Application Title",
"dependencies": [
{ "url": "dat://4483a2..66/" },
{ "url": "dat://4483a2..66/" }
]
} This could have potential for performance improvements by pre-fetching the dat metadata when the initial metadata is being downloaded. Plus this is a dat-specific extension that could work for dats that weren't necessarily made to work with HTML or even a browser. |
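One way a client might read such a list, sketched under the assumption that `dependencies` is an array whose entries are either URL strings or objects with a `url` field. The `dependencies` field is only a proposal in this thread, not part of dat.json, and `listDependencyUrls` is a hypothetical name.

```javascript
// Hypothetical: collect dat:// URLs from a `dependencies` array in a parsed
// dat.json. The field is a proposal from this thread, not part of dat.json.
function listDependencyUrls(manifest) {
  const deps = Array.isArray(manifest.dependencies) ? manifest.dependencies : [];
  const urls = deps
    // Accept either a plain URL string or an object with a `url` field.
    .map((dep) => (typeof dep === 'string' ? dep : dep && dep.url))
    .filter((url) => typeof url === 'string' && url.startsWith('dat://'));
  // De-duplicate so each archive's metadata would be fetched only once.
  return [...new Set(urls)];
}

const manifest = JSON.parse(`{
  "title": "Application Title",
  "dependencies": [
    { "url": "dat://4483a2..66/" },
    { "url": "dat://4483a2..66/" }
  ]
}`);
console.log(listDependencyUrls(manifest));
```

The returned list is what a client could hand to whatever pre-fetching step it uses while the parent's metadata is still downloading.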
I was under the impression that the Dat protocol also uses content addressability via Merkle trees under the hood (does it not?), but it seems that, unlike IPFS, it is scoped to an individual archive. Are there technical reasons (other than the implementation effort it would take) why Dat could not make content addressability work across all of the Dat protocol? It seems like it would resolve the issue and likely improve overall network performance. In general I think supporting links at the protocol level say |
The concepts page in the docs and the security and privacy page have a pretty good overview of why it is the way it is. One of the main advantages of this is privacy. With IPFS, where everything is content addressed, it's easy to see globally who has a given file. With Dat, you only know if somebody is looking for a specific dat. And if you don't know the URL, you don't know what's in it or who has it. If you're looking for a specific piece of content, it's impossible to know which dats contain it. |
Just returning to say that dat.json now has a links object. https://github.com/datprotocol/dat.json It's likely that'll be used for this feature. This opens dat.json up to the possibility of using the subresource, prefetch, dns-prefetch, preconnect, prerender and preload features in browsers, so those are options now. I vote for "subresource". It was a non-standard addition to Chrome (removed in Chrome 50), and while the term doesn't fit the HTTP web use case, I think it fits the Dat web well. Plus many developers are already familiar with using it, and its use in Dat sites wouldn't be far off from its original intent in Chrome (the only problem I can think of right now is confusion with the subresource-integrity feature). EDIT: Also, we should lock this feature down to specific files included in Dats, not entire Dats, as I can definitely see this being a hard drive space problem in the future. We have to avoid the situation where someone new to all of this loads terabytes of files onto many computers just because they wanted to use X amount of Dat-based CDNs. |
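A sketch of the "specific files only" rule. It assumes a `links` object that maps a relation name to an array of entries with an `href` field (an assumption about the shape; check the dat.json spec), and it assumes a `subresource` relation, which is only a suggestion in this thread. `fileSubresources` is a hypothetical name.

```javascript
// Hypothetical: keep only `subresource` links that point at a specific file,
// skipping links to whole Dats or folders (no path, or a path ending in "/").
function fileSubresources(manifest) {
  const entries =
    manifest.links && Array.isArray(manifest.links.subresource)
      ? manifest.links.subresource
      : [];
  return entries
    .map((entry) => entry && entry.href)
    .filter((href) => typeof href === 'string')
    .filter((href) => {
      // Split a dat:// URL into its key part and an optional path.
      const match = /^dat:\/\/[^/]+(\/.*)?$/.exec(href);
      if (!match) return false;
      const path = match[1] || '/';
      return !path.endsWith('/');
    });
}

const example = {
  links: {
    subresource: [
      { href: 'dat://2714774d6c464dd12d5f8533e28ffafd79eec23ab20990b5ac14de940680a6fe/rotonde.js' },
      { href: 'dat://4483a2..66/' }, // whole Dat: skipped
    ],
  },
};
console.log(fileSubresources(example));
```

Filtering on the path alone does not bound file size, so a real policy would probably also need a size limit or a user prompt.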
@pfrazee sorry for necroposting, but just being curious if closing this issue means the idea faded off the radar, or it may have become irrelevant due to other developments? Might you have a pointer to discussions/publications reflecting current state of play, if there are any? You said above “I think subdats may end up being the solution for this but we'll see.”. And indeed, with the one-way mounts now having been introduced in Hyperdrive 10, I suppose one could mount all external resources’ drives and only use relative paths to point at them (though I guess you would have to mount their whole drives..). Does this solve the issue in your view? |
PS Also related seems this recent discussion in dat-ecosystem/comm-comm#134 about a format-agnostic approach to linked dats: “a generic seeding service should not need any data structure specific code to know how to seed the data.” (source) |
@Treora I do think mounts are our answer for Beaker. Ultimately for commanding any remote to cohost data, I think the API will be based on hypercores, so then the client commanding the remote needs to be data-structure aware |
@Treora thank you for linking the comm-comm issue and the source link. If you want to discuss further I'll answer here datdotorg/datdot-research#17 (comment) There are many reasons why feeds need to be linked: parent to dependent to dependencies, dependencies to dependent, domain to content, feed to author, related feeds amongst each other. I think it would be bad to have everyone (app/protocol/datastructure) make those things up instead of following a general standard
I've been thinking about this quite a bit (especially in the context of torrents). We need a feature that lets Dats declare other Dats, or files from them, that they'll need in order to function when we store them offline in our libraries.
Like how almost all Rotonde pages use the same JavaScript file ("dat://2714774d6c464dd12d5f8533e28ffafd79eec23ab20990b5ac14de940680a6fe/rotonde.js").
There should be a way to tell the browser that, when the user adds this Dat site to the library, it should prompt them to also store another Dat or specific files from it, complete with version support to protect the host site from any breaking changes.