Way to get CIDs of intermediate objects when querying with a path #8526
Comments
Thank you for submitting your first issue to this repository! A maintainer will be here shortly to triage and review.
Finally, remember to use https://discuss.ipfs.io if you just need general support.
An alternative to returning an array of the traversed CIDs, which would accomplish the same thing, would be to return a CAR file containing the CIDs and data for every IPFS object along the path given in the initial query.
2022-01-07 discussion: this would be a common, use-case-specific form of #8239. We'd likely implement this specific use case using the more generic ability to fetch for a specific selector. @stbrody: do you have a sense from Ceramic's perspective as to which of these two is higher priority? Also, this isn't something the go-ipfs maintainers expect to get to in the short term, but we could certainly direct others to where/how to solve it. I'm marking this as blocked until #8239 is handled.
I suppose if #8239 were done in such a way that we could get the entire tree structure loaded onto our local IPFS node, then doing multiple iterative calls over the same paths in the tree wouldn't be nearly as bad. You'd still wind up re-processing the same path multiple times, but with data that's all local, so it would be much more performant. My sense is that both this and #8239 are valuable in different ways, but I'd imagine this one would likely be easier to implement. And there are cases where this ticket actually helps more than #8239 does, like if you're doing an in-order traversal over part of a tree structure. If you're only going to wind up processing some part of the tree, then pulling the whole tree to your local node is overkill, which can be especially bad if the tree is large. It would also be bad if you had to wait for all the data matching the selector (in this case the whole tree) to be loaded locally before you could get the result for the first item you want to process.
Yes, that's more or less what I'm imagining, though I'd want it exposed via the http-client.
I'll defer to @oed on this one.
FWIW, while I do see these two as related in the use cases they help improve, technically I think they're probably fairly independent.
2022-03-18 conversations: maintainer priority and plan of record is:
General: As people have been asking about selectors, we're going to add them to more APIs but we don't want to overload users with the more complicated selector syntax. We're treating paths and selectors separately (resolve the path and then apply the selector).
2022-06-03 conversation: this is still blocked per the discussion above. There will be a relevant gateway selector spec in the next month.
Description
Summary:
`dag.get` with a `path` argument should be able to return an array of CIDs, representing all the intermediate IPFS objects it traversed along the `path` to eventually reach the object it ultimately returns. That would enable much more efficient sequential iteration over complex IPLD data structures.

Use case:
Imagine you are trying to do an in-order traversal over a tree structure encoded in IPLD. From knowing the number of elements in the tree (which could be stored in the root of the tree) and how many children each intermediate node has, you can deterministically calculate the depth of the tree. That would allow you to build a `path` selector specifying the path from the root of the tree to the left-most leaf node fairly easily, which could then be passed to `ipfs.dag.get` to get the data from the first leaf node in the tree.

But now you want to fetch the second leaf node. You could once again deterministically build a `path` selector from the root to the second leaf node, but that would have the path once again running from the root, which if the tree is large may involve traversing many intermediate nodes multiple times. Instead, ideally you'd like to already have the CID of the parent node of the first leaf node, and then be able to issue a new query with just the path from that parent node to its second child to get the second leaf node of the overall tree.

The problem is that `dag.get` with the path to the first leaf node will only return the data of the leaf node, not any information about the intermediate nodes it passed through to get there, so you have no way to know the CID of its parent. If the `dag.get` call returned not just the data from the first leaf node, but also an array of the CIDs it passed through to get there when traversing the path, then you'd be able to intelligently pop CIDs off the back of the resulting list to move back up the tree, and issue new dag queries with new paths to other children nodes as you continue to iterate over the tree structure.
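
To make the requested behaviour concrete, here is a minimal sketch of the traversal described in the use case. It runs against an in-memory stand-in, not go-ipfs or the http-client: `FakeDag`, the `(node, traversed_cids)` return shape of its `get`, the string "CIDs", and the assumption of a complete tree with all elements in the leaves are hypothetical and chosen only for illustration.

```python
class FakeDag:
    """In-memory stand-in for an IPFS node (hypothetical, not a real API)."""

    def __init__(self):
        self.blocks = {}  # fake CID string -> node dict
        self.loads = 0    # number of blocks loaded, as a rough cost measure

    def put(self, cid, node):
        self.blocks[cid] = node

    def get(self, cid, path=()):
        """Resolve `path` (child indices) from `cid`.

        Returns (node, traversed_cids): the proposed extra return value is
        the list of every CID visited, starting CID first, target CID last.
        """
        traversed = [cid]
        self.loads += 1
        node = self.blocks[cid]
        for index in path:
            cid = node["children"][index]
            traversed.append(cid)
            self.loads += 1
            node = self.blocks[cid]
        return node, traversed


def tree_depth(leaf_count, fanout):
    """Depth of a complete tree holding `leaf_count` elements in its leaves."""
    depth, capacity = 0, 1
    while capacity < leaf_count:
        capacity *= fanout
        depth += 1
    return depth


def build_tree(dag, fanout, depth):
    """Store a complete tree; leaf values are their in-order index."""
    def build(path):
        cid = "cid:" + "/".join(str(i) for i in path)  # fake CID
        if len(path) == depth:
            value = 0
            for i in path:
                value = value * fanout + i
            dag.put(cid, {"value": value})
        else:
            dag.put(cid, {"children": [build(path + (i,)) for i in range(fanout)]})
        return cid
    return build(())


def iterate_from_root(dag, root, fanout, depth):
    """Baseline: resolve every leaf with a full path from the root."""
    for leaf_index in range(fanout ** depth):
        path, rest = [], leaf_index
        for _ in range(depth):
            path.append(rest % fanout)
            rest //= fanout
        path.reverse()
        yield dag.get(root, path)[0]["value"]


def iterate_leaves(dag, root, fanout, depth):
    """In-order traversal that reuses the traversed CIDs to avoid the root."""
    position = [0] * depth                 # child index taken at each level
    leaf, cids = dag.get(root, position)   # one full-path query
    yield leaf["value"]
    while True:
        # Move back up until an ancestor still has an unvisited child.
        level = depth - 1
        while level >= 0 and position[level] == fanout - 1:
            level -= 1
        if level < 0:
            return
        position[level] += 1
        for lower in range(level + 1, depth):
            position[lower] = 0
        # cids[level] is the ancestor at that level; query only the suffix.
        leaf, suffix = dag.get(cids[level], position[level:])
        cids = cids[:level] + suffix
        yield leaf["value"]


dag = FakeDag()
root = build_tree(dag, fanout=2, depth=tree_depth(8, 2))
print(list(iterate_leaves(dag, root, 2, 3)), dag.loads)
```

In this model, `iterate_leaves` loads fewer blocks than `iterate_from_root`, because each follow-up query starts at the deepest ancestor shared with the previous leaf rather than at the root. Against a real node the saving would depend on caching and on how the path is resolved, which this sketch does not model.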