Allow users to pin a URI-M or all captures for a URI-R from their local replay UI #201
Comments
We need to understand what we mean by "user" here. Pinning functionality would work in the context of the IPFS service our API is connecting to. For replay, however, we don't necessarily need to connect to a service; instead we can just query the global network to resolve the hash for us. If by client we mean end users, then that scenario would be helpful if we were using a client-side JS-based implementation.
However we fetch the content, through ipwb replay or the js ipfs implementation, it would be useful to allow a user to pin the contents* for persistent offline viewing and allow them to view and share it without needing to reconnect to other peers. We may also want to allow the user who indexes the WARC and pushes to IPFS (via the ipwb indexer) to recommend pinning in the index -- for this I am thinking on an individual or very small group basis. * The locally stored dereferenced IPFS hash contents. Disregard the sloppy nomenclature.
I am still not sure which layer you are focusing on. Some of these things can be achieved by utilizing client-side caching too.
This is dealing with the client side -- the user of the replay system. The discussion above was a tangent about allowing this functionality from the indexer's perspective. Per the ticket title, the goal is to add a means of allowing a user to explicitly pin the payload retrieved from IPFS via the IPWB replay UI. I'd rather make this independent of the client, i.e., the browser -- the ticket essentially amounts to enabling client-side caching using an agent without regard to the user-agent used in subsequent replay.
Replay users in general should have no business telling the replay server to pin or not to pin the content on the IPFS service it is primarily connected to (which is not an essential piece for replay). However, the replay server itself might decide to ask its corresponding IPFS server (if there is one) to pin more frequently resolved (or all) content locally for faster successive fetches. The ultimate client, that is, the browser, can utilize regular caching of the combined response, which will only be useful if the same client is requesting the same URI-M multiple times.
Ok, you're probably right in the scenario where the user is viewing the replay web UI -- that user has no business dictating what's pinned with the potentially remote ipfs daemon. What if there was something akin to "pin locally" that could instruct a local daemon to pin what they're viewing, which more often than not will probably be the same local instance? The idea of browser-based caching is a few steps off, still, as we have yet to really resolve the impending issues of remotely accessing an ipwb replay instance (#146). The crux of this ticket is a single user, reading in a CDXJ shared with them (or locally generated), ensuring that the content of the hashes they push from WARCs is pinned -- a sort of base case.
If the replay is connected to a local IPFS instance (or one controlled by the same body), then we have a couple of potential options to ensure availability of the content when resolved. The replay can ask to pin every resource when it is requested; pinning is an idempotent action, so duplicate requests will cause no harm. If, when creating the index, the content was pushed to the same IPFS node that is linked to the replay, the content will already be there, so there is no need to perform explicit pinning. However, if the index was shared/moved elsewhere and/or the replay is connected to a different IPFS node, then we can have a separate process that can be run one time (after every index change) to pull all the references in the index to the local IPFS instance and pin them. This can be an independent process which is not tied to the replay system.
From my understanding, it may be there now but potentially not in the future if garbage collected. As a related note, a user will likely wish to also pin all embedded resources. There exists an opportunity to pin resources (if replay is running locally) as they are fetched. This would allow a subset of the CDXJ entries to be locally pinned instead of requiring everything listed in the index.
You are mixing something up here. At the moment, losing entries from the CDX would be disastrous, even if those entries are for the embedded resources, because from the replay perspective they are all independent resources; it is the browser that puts them together to compose the page the way it looks. Pinning resources locally and losing entries from the CDX would make them non-discoverable. Pinning, in my opinion, can be done separately as a batch process independent of the replay system. One can grab the list of hashes extracted from the index or from the access log (or from any other source for that matter), drain them down locally, and pin them. This should be done at the node where IPFS is running, and not necessarily coupled with the replay. I think separation of concerns is very important here.
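For illustration, the batch approach described above might look roughly like the following. This is a minimal sketch, not ipwb code: it assumes each CDXJ record carries a JSON field named "locator" of the form urn:ipfs/<header hash>/<payload hash>, that metadata lines start with "!", and that the ipfs command-line client is on the PATH. The function names hashes_from_cdxj and pin_all are hypothetical.

```python
import json
import subprocess


def hashes_from_cdxj(lines):
    """Collect the unique IPFS hashes referenced by CDXJ index lines.

    Assumed record shape (not verified against ipwb):
    '<surt> <datetime> {"locator": "urn:ipfs/<header>/<payload>", ...}'
    """
    seen = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and (assumed) metadata lines such as "!meta ...".
        if not line or line.startswith("!"):
            continue
        parts = line.split(" ", 2)
        if len(parts) < 3:
            continue
        record = json.loads(parts[2])
        locator = record.get("locator", "")
        if not locator.startswith("urn:ipfs/"):
            continue
        for ipfs_hash in locator[len("urn:ipfs/"):].split("/"):
            if ipfs_hash and ipfs_hash not in seen:
                seen.append(ipfs_hash)
    return seen


def pin_all(hashes, run=subprocess.run):
    """Pin each hash on the local node by invoking `ipfs pin add`.

    Pinning is idempotent, so re-running this after an index change is harmless.
    """
    for ipfs_hash in hashes:
        run(["ipfs", "pin", "add", ipfs_hash], check=True)
```

Something like pin_all(hashes_from_cdxj(open("index.cdxj"))) could then be run on the machine hosting the IPFS node, independently of the replay system (the file name here is only an example).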
I agree and batch pinning is not what this ticket is about.
Entries will not be lost; just a subset will be pinned on an interactive basis.
Related to #60.
When replaying a URI-M whose header and payload are accessed through another node via IPFS, the header and payload will eventually get garbage collected from the local system per https://discuss.ipfs.io/t/how-are-conflicts-handled/469/3 . Provide a UI element to allow a user to explicitly indicate that they wish to retain the URI-M (i.e., the payload and headers associated with the URI-M) on their local system. This can be accomplished by pinning with ipfs pin. Doing so will allow a user to accumulate captures locally and will facilitate collaboration of arbitrary sets of captures.
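As a rough sketch of what a "pin this capture" action behind such a UI element could call, the snippet below pins the header and payload hashes of one URI-M through the IPFS daemon's HTTP API (the pin/add endpoint). It is only an assumption-laden illustration: the API address is the daemon's usual local default and may differ per setup, and pin_url and pin_capture are hypothetical names, not part of ipwb.

```python
import urllib.parse
import urllib.request

# Assumed address of the local IPFS daemon's HTTP API; adjust as needed.
API = "http://127.0.0.1:5001/api/v0"


def pin_url(ipfs_hash, api=API):
    """Build the pin/add request URL for a single hash."""
    query = urllib.parse.urlencode({"arg": ipfs_hash})
    return f"{api}/pin/add?{query}"


def pin_capture(header_hash, payload_hash, api=API, opener=urllib.request.urlopen):
    """Pin both objects that make up one URI-M (header and payload).

    The daemon's HTTP API expects POST requests. Repeating the call for an
    already pinned hash is harmless because pinning is idempotent.
    """
    for ipfs_hash in (header_hash, payload_hash):
        request = urllib.request.Request(pin_url(ipfs_hash, api), method="POST")
        with opener(request) as response:
            response.read()
```

Pinning all captures for a URI-R would then amount to calling pin_capture once per URI-M listed for that URI-R in the index.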