Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor Progress Controller #2258

Open
sipsma opened this issue Jul 15, 2021 · 0 comments
Open

Refactor Progress Controller #2258

sipsma opened this issue Jul 15, 2021 · 0 comments

Comments

@sipsma
Copy link
Collaborator

sipsma commented Jul 15, 2021

As discussed in #2254 the progress controller code has some known specific issues in addition to it (and the progress writing in general) being hard to reason about:

  1. The case mentioned in fix dropped pull progress output due to canceled context #2254 where progress inside a flightcontrol group might get lost during parallel calls
  2. Trying to keep track of the state of a progress stream (i.e. whether one needs to be closed, which one is being used within a flightcontrol group, etc.) is difficult and involves tracking hierarchies of Contexts. See i.e. goroutine leak on solves #2112 and Add retry on image push 5xx errors #2043
  3. The progress.Controller is passed through descriptor handlers which then get passed around all over the place through refs, meaning events occurring on single cache record need to synchronize with the controller callbacks that originate from all over the codebase. This makes the above point significantly more convoluted.

All of the above makes the code involving progress (of which there is a lot) harder to change and more bug-prone.

There's a lot of possible approaches to improving this. One initial suggestion would be to centralize progress handling in a singleton ProgressManager object, which has a pub-sub API for publishing events and subscribing progress streams to them. So, as a more concrete example:

  1. At the beginning of a build, a progress stream is registered with the ProgressManager
  2. As the build progresses, it subscribes its stream to certain events it wants to show progress for. E.g. the pull code would, instead of setting up a progress controller in the descriptor handler, just call the ProgressManager and subscribe the current stream to any events for the descriptors it's creating refs for.
  3. Event publishing would happen independently, e.g. when a descriptor is being de-lazied and actually pulled, the CacheManager would just publish those events to the ProgressManager, which would internally handle forwarding those events to any open progress streams subscribed to them. This synchronization of writers+readers is much simpler due to centralization of it.
  4. At the end of the build, the progress stream is closed with the ProgressManager

That's just one starting idea, in general anything that centralizes synchronization logic and decouples the different parts of the code that read/write progress would probably be a promising approach.

cc @coryb @tonistiigi

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant