You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
When an on-demand sync is triggered, the ctx to the call to SyncImage() comes from the request context.Context.
Thus, when the user gives up, the image sync is also canceled.
This is unfortunate, I think, in many use cases where Zot is used as a proxy cache.
Even if we couldn't make it on time this time, it isn't that we don't want to have this image cached.
If the client has a timeout limit and a retry loop (like common setups with Kubernetes and containerd do - at least in mine), the current implementation will not be able to make any progress.
This is because, as soon as the client hits the timeout, the sync job fails (because it makes an HTTP request with a canceled Context, I guess), and when the client retries, the sync has to start over again.
It would be better if the pull keeps running in the background, so that the next time there's a request to this image, they could wait for the remaining progress and have any chance of seeing it finish.
Note that the current logic already supports the joining (to wait) to an ongoing pull in another goroutine:
Somewhere during the code flow, use a background Context instead of the request Context.
Note that we do not want to completely ignore the request Context; we need to halt the control flow and return when the request is canceled.
We could give syncImage a context.Background(), or a larger timeout (15 minutes?).
Then, below, a select {} with both syncResult and ctx.Done() would suffice.
Describe alternatives you've considered
Since we're spawning a goroutine on every cache miss, this could lead to a large number of background goroutines.
In that prospect, having a full-fledged background process controller (like limiting the number, etc.) could be a long-term goal.
Additional context
The image size in question is ~10 GiB, and network speed is as low as 10 MiB/s. That would take 1k seconds (= 16m 40s).
The text was updated successfully, but these errors were encountered:
I didn't go right into writing a PR because I'm not familiar with the code base and am not sure if this is OK.
The leaky goroutine thing feels a bit dangerous so I'd like more opinions on this.
Once there's a resolution, I'd be happy to come up with a PR.
Is your feature request related to a problem? Please describe.
cf. #2416
When an on-demand sync is triggered, the
ctx
to the call toSyncImage()
comes from the requestcontext.Context
.Thus, when the user gives up, the image sync is also canceled.
Code flow:
zot/pkg/api/routes.go
Line 518 in 0d0eae5
zot/pkg/api/routes.go
Line 1941 in 0d0eae5
zot/pkg/extensions/sync/on_demand.go
Line 76 in 0d0eae5
zot/pkg/extensions/sync/service.go
Line 311 in 0d0eae5
This is unfortunate, I think, in many use cases where Zot is used as a proxy cache.
Even if we couldn't make it on time this time, it isn't that we don't want to have this image cached.
If the client has a timeout limit and a retry loop (like common setups with Kubernetes and containerd do - at least in mine), the current implementation will not be able to make any progress.
This is because, as soon as the client hits the timeout, the sync job fails (because it makes an HTTP request with a canceled Context, I guess), and when the client retries, the sync has to start over again.
It would be better if the pull keeps running in the background, so that the next time there's a request to this image, they could wait for the remaining progress and have any chance of seeing it finish.
Note that the current logic already supports the joining (to wait) to an ongoing pull in another goroutine:
zot/pkg/extensions/sync/on_demand.go
Lines 54 to 61 in 0d0eae5
Describe the solution you'd like
Somewhere during the code flow, use a background Context instead of the request Context.
Note that we do not want to completely ignore the request Context; we need to halt the control flow and return when the request is canceled.
A good place could be here:
zot/pkg/extensions/sync/on_demand.go
Lines 76 to 81 in 0d0eae5
We could give
syncImage
acontext.Background()
, or a larger timeout (15 minutes?).Then, below, a
select {}
with bothsyncResult
andctx.Done()
would suffice.Describe alternatives you've considered
Since we're spawning a goroutine on every cache miss, this could lead to a large number of background goroutines.
In that prospect, having a full-fledged background process controller (like limiting the number, etc.) could be a long-term goal.
Additional context
The image size in question is ~10 GiB, and network speed is as low as 10 MiB/s. That would take 1k seconds (= 16m 40s).
The text was updated successfully, but these errors were encountered: