Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remotecache: Only visit each item once when walking results. #1577

Merged
merged 1 commit into from
Jul 17, 2020

Conversation

sipsma
Copy link
Collaborator

@sipsma sipsma commented Jul 17, 2020

Before this change, walkAllResults did not skip an item if it had already been
visited, so walking the graph of results actually followed every single possible
path in the graph instead of just visiting each item once.

On small graphs this wasn't noticeable, but on sufficiently large remote cache
imports this rapidly scaled out of control. For example, I first encountered
this when importing a max-cache from a build of a full linux rootfs from source
(which takes about 30 minutes to build w/out cache on an 18-core machine) and
after 30 minutes the cache import was still running with all CPUs pegged at 100%

To fix this, walkAllResults now keeps a map of already visited items (keyed by
their pointer) and skips visiting an item if it's already been visited, making
it usable on large remote cache imports.

Signed-off-by: Erik Sipsma [email protected]

Before this change, walkAllResults did not skip an item if it had already been
visited, so walking the graph of results actually followed every single possible
path in the graph instead of just visiting each item once.

On small graphs this wasn't noticeable, but on sufficiently large remote cache
imports this rapidly scaled out of control. For example, I first encountered
this when importing a max-cache from a build of a full linux rootfs from source
(which takes about 30 minutes to build w/out cache on an 18-core machine) and
after 30 minutes the cache import was still running with all CPUs pegged at 100%

To fix this, walkAllResults now keeps a map of already visited items (keyed by
their pointer) and skips visiting an item if it's already been visited, making
it usable on large remote cache imports.

Signed-off-by: Erik Sipsma <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants