
Buffer the output of gzip.Writer to avoid stalling #923

Merged (1 commit, Jan 28, 2021)

Conversation

@jonjohnsonjr (Collaborator) commented on Jan 28, 2021:

Use a bufio.Writer to buffer gzipped output while we are reading from
the other end of an io.Pipe to allow gzip to keep compressing its input.

A 64K buffer was chosen for its humor value. The default size of
bufio.Writer was too small when testing against a local registry.
Increasing beyond 64K didn't seem to have any noticeable effect. It
might make sense to make this smaller, but I don't see a reason to worry
about it ATM.

Fixes #920 (Pushing large images to Dockerhub 504).

In local testing, this improved write speed from ~8MB/s to ~30MB/s. Against an actual registry (Docker Hub or GCR), this seems to improve things by ~30%.
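
For illustration, here is a minimal sketch of the approach described above; the package name, function name, and error handling are assumptions, not the actual code from this PR:

```go
package gzipbuf

import (
	"bufio"
	"compress/gzip"
	"io"
)

// ReadCloser returns an io.ReadCloser that yields the gzip-compressed
// contents of r. Compression runs in a goroutine that writes to one end of
// an io.Pipe; the caller reads from the other end.
func ReadCloser(r io.Reader) io.ReadCloser {
	pr, pw := io.Pipe()

	go func() {
		// Buffer the pipe writer so gzip can keep compressing while the
		// reader on the other end of the pipe is busy (e.g. uploading a
		// chunk). 64K covers the ~32K reads plus some slack.
		bw := bufio.NewWriterSize(pw, 64*1024)
		gw := gzip.NewWriter(bw)

		if _, err := io.Copy(gw, r); err != nil {
			pw.CloseWithError(err)
			return
		}

		// Flush gzip, then the buffer, then close the pipe so the
		// reader sees io.EOF.
		if err := gw.Close(); err != nil {
			pw.CloseWithError(err)
			return
		}
		if err := bw.Flush(); err != nil {
			pw.CloseWithError(err)
			return
		}
		pw.Close()
	}()

	return pr
}
```

The ordering at the end matters: gw.Close() flushes any remaining gzip output into the bufio.Writer, bw.Flush() pushes it into the pipe, and only then is the pipe closed so the reader sees a clean io.EOF.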

@jonjohnsonjr (Collaborator, Author) commented:

Thinking about it a little more, 64K is probably pretty good, actually. The Read()s seem to be happening in ~32K chunks, so having that plus some additional buffer should remove gzip as the bottleneck for a while.

@imjasonh (Collaborator) left a comment:

💯

@jonjohnsonjr (Collaborator, Author) commented:

We might want to look at using a sync.Pool if we find that this causes the garbage collector to churn too much: https://github.com/containerd/containerd/blob/4d7d63f390e73f62676cd4994233805a6f821ab7/archive/compression/compression.go#L58

I was poking around containerd and docker to see if they do anything like this, in the hopes of landing a fix against them too, and it seems they do even better :P
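
For reference, a minimal sketch of what pooling the buffer could look like; the names are illustrative and this is neither containerd's nor this repository's actual code:

```go
package gzipbuf

import (
	"bufio"
	"io"
	"sync"
)

// bufPool reuses 64K bufio.Writers across layers so each gzip stream
// doesn't allocate (and later garbage-collect) its own buffer.
var bufPool = sync.Pool{
	New: func() interface{} {
		// A nil destination is fine here; callers must Reset before use.
		return bufio.NewWriterSize(nil, 64*1024)
	},
}

// getBufioWriter hands out a pooled writer pointed at w.
func getBufioWriter(w io.Writer) *bufio.Writer {
	bw := bufPool.Get().(*bufio.Writer)
	bw.Reset(w)
	return bw
}

// putBufioWriter returns bw to the pool after detaching its destination
// so the pool doesn't keep the underlying writer alive.
func putBufioWriter(bw *bufio.Writer) {
	bw.Reset(nil)
	bufPool.Put(bw)
}
```

In this sketch, getBufioWriter would replace the bufio.NewWriterSize call in the compression goroutine, and putBufioWriter would be called once the copy finishes.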
