Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a TraversingLinksystem supporting traversal resumption #354

Closed
wants to merge 3 commits into from

Conversation

willscott
Copy link
Member

This is a proposal for how we would extend the current Progress / traversal structure to optionally support efficient resumption of traversals.

The trade-off is that more state is retained in order to know which links have been seen by which points in the traversal. This proposal handles that decision by allowing the user to wrap their Progress via WithTraversingLinksystem. This method wraps the linksystem block loader in order to track and appropriately skip forward when a subsequent resumer is used.

I'd be happy if we could arrive at a more intuitive design, but this seems like the path of least resistance for maintaining current API compatibility.

Copy link
Contributor

@mvdan mvdan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the cost of resuming a traversal? As a user, that would be my main worry. For example, if I have a deeply nested graph, and I want to resume at a deep point, will the resumption need to do work to get back there? One can imagine a similar situation with very wide nodes, such as a very long list where I want to resume a traversal towards the end of the list.

position: 0,
pathOrder: make(map[int]datamodel.Path),
pathTree: newPath(nil),
target: nil,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you can omit zero values from composite literals

return []datamodel.Link{pn.link}
}
return []datamodel.Link{}
} else if len(segs) == 1 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: use a switch

next := segs[0]
if child, ok := pn.children[next]; ok {
return child.allLinks()
} else if !ok {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this can just be an else

@willscott
Copy link
Member Author

What is the cost of resuming a traversal

Instead of keep track of all previous links that have been followed, which is already done in a progress, you are now keeping track of the ipld.PathSegment map as well.
If you have many small / highly-pathed objects relative to the number of links this could be larger than the existing state, but it shouldn't be that bad overall.

Copy link
Collaborator

@hannahhoward hannahhoward left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fascinating.

I think the code works on its own but I believe the comment I left would make it slightly more efficent.

traversal/traversing_linksystem.go Outdated Show resolved Hide resolved
@warpfork
Copy link
Collaborator

I think my main concern/ask for either this or #358 isn't even so much the code at this point -- it's: can we get a serial fixture to appear in ipld/ipld with it?

I realize some of this may feel like it's golang specific APIs, but I doubt it's going to stay that way -- other languages will need to replicate the behavior at some point, and we might as well be ready for them with fixtures. Also, personally, I'm finding serial fixtures a lot easier to review for this stuff.

There's a spec_test.go file in the same directory that should provide an example, although it may need some extending to test a new feature like this.

@willscott
Copy link
Member Author

I would like to leave this as a golang specific API at this point, and not attempt to define a specification for how resumption is exposed that is cross-language standardized until we understand what is appropriate in other language. It feels generally like we are not proposing a change in spec or standard at this point, but rather providing a language-specific convenience method.

@mvdan
Copy link
Contributor

mvdan commented Feb 16, 2022

@willscott regarding the cost - sounds good. Perhaps it would be good to summarize this in the docs; as someone reading the API docs and not knowing how traversals and selectors are implemented, it would be very hard to estimate how expensive a resumption could be.

@BigLep
Copy link

BigLep commented Mar 22, 2022

@willscott : is this still needed in light of #358? If so, are you able to incorporate @mvdan 's feedback?

@BigLep BigLep requested review from mvdan and removed request for warpfork March 22, 2022 14:27
@willscott
Copy link
Member Author

This is fairly separable, and I plan to close this PR and move this code with the go-car PR that makes use of it. It's probably not a generic interface we want to have to keep supporting forever at this library level, but it was useful for a couple related PRs that have been set up already using the same structure.

@BigLep
Copy link

BigLep commented Apr 5, 2022

@willscott : this came up during 2022-04-05 triage. Are you able to close this now?

@willscott
Copy link
Member Author

Closing as this logic has now been fully transferred to ipld/go-car#291

@willscott willscott closed this Apr 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 🎉 Done
Development

Successfully merging this pull request may close these issues.

5 participants