Speed up Docker builds, especially Docker builds in CI #292

Closed
briansmith opened this issue Feb 7, 2018 · 6 comments

@briansmith
Contributor

Docker builds are too slow, especially on CI. This causes us to try to avoid Docker for build performance reasons (see #280, for example).

I don't see anything inherent in what we're doing with our Docker builds that would make them necessarily slower than equivalent non-Docker builds. My hypothesis is that we can eliminate waste in the Docker builds and get them to be about the same performance as non-Docker builds. I'm experimenting with that now.

@briansmith briansmith self-assigned this Feb 7, 2018
@briansmith
Contributor Author

See #293.

I'm also experimenting with precompiling some of the Go dependencies, similar to how we precompile Rust dependencies now.

@briansmith
Contributor Author

I didn't succeed in making builds faster in Travis CI, beyond the trivial win in #297.

I experimented with doing the proxy build + test (the slowest part of the build) in Docker on Google Cloud Container Builder.

Using a 32-CPU machine dropped the release build + test time from >18 minutes to just over 7 minutes: https://console.cloud.google.com/gcr/builds/563ce218-c6f1-47fa-87e3-75b16bb419af?project=runconduit (Travis CI runtime was over 15 minutes.)

I rewrote some of the stuff to enable more incremental caching of the proxy build stage. Initial results show that the full build got ~2 minutes slower. I think we need to do a few iterations of the ">7 minute" build above and this ">9 minute" build to figure out what the real performance is.

With the incremental caching logic in place, I was able to do a <2min no-op (nothing in the proxy changed) rebuild using a cached proxy build image: https://console.cloud.google.com/gcr/builds/97ee2230-878b-4d39-8cd8-05fce409e0c7?project=runconduit

I also did a rebuild where one proxy Rust source file was changed, showing that the initial (deps) layers from the previous run were used: https://console.cloud.google.com/gcr/builds/1f12a1ec-d95b-4432-889a-a2cba370ac1a?project=runconduit. The intermediate layer caching saves about 2.5 minutes over the ">9 minute" build above. Again, we need to do a bunch of iterations to see if this caching is a significant win or not.

Regardless, all of these timings from Google Cloud Container Builder are significantly faster than the timings I got from doing the build on Travis CI. Thus, I think we should change the "Docker Deploy" stage of the Travis CI build to do the image building in Google Cloud Container Builder.
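
To illustrate why a one-file source change can still reuse the early (deps) layers, here is a minimal Python sketch of chained layer cache keys. It is a simplified model, not Docker's or Container Builder's actual algorithm, and the stage layout, instruction labels, and file names (`Cargo.lock`, `src/main.rs`) are assumptions made for illustration rather than the project's real Dockerfile.

```python
import hashlib


def layer_keys(steps):
    """Return one cache key per build step.

    Each key hashes the previous key, the step's instruction text, and the
    contents of the files the step reads, so a change invalidates that step
    and every step after it, but nothing before it.
    """
    keys = []
    parent = ""
    for instruction, inputs in steps:
        h = hashlib.sha256()
        h.update(parent.encode())
        h.update(instruction.encode())
        for name in sorted(inputs):
            h.update(name.encode())
            h.update(inputs[name])
        parent = h.hexdigest()
        keys.append(parent)
    return keys


def reused_layers(old_steps, new_steps):
    """Count the leading layers whose cache keys match between two builds."""
    count = 0
    for old_key, new_key in zip(layer_keys(old_steps), layer_keys(new_steps)):
        if old_key != new_key:
            break
        count += 1
    return count


def build_steps(lock, src):
    """Hypothetical five-step proxy build: deps are built before sources."""
    return [
        ("base image", {}),
        ("copy manifests", {"Cargo.lock": lock}),
        ("build dependencies", {}),
        ("copy sources", {"src/main.rs": src}),
        ("build proxy", {}),
    ]


BASE = build_steps(b"lock-v1", b"src-v1")
SRC_CHANGED = build_steps(b"lock-v1", b"src-v2")
DEPS_CHANGED = build_steps(b"lock-v2", b"src-v1")

print(reused_layers(BASE, BASE))          # 5: no-op rebuild, everything cached
print(reused_layers(BASE, SRC_CHANGED))   # 3: deps layers are reused
print(reused_layers(BASE, DEPS_CHANGED))  # 1: only the base image is reused
```

In this model the ordering of steps is what matters: putting the dependency build ahead of the source copy is what lets a source-only change skip the dependency compile.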

/cc @olix0r @klingerf

@briansmith
Contributor Author

> I rewrote some of the stuff to enable more incremental caching of the proxy build stage. Initial results show that the full build got ~2 minutes slower. I think we need to do a few iterations of the ">7 minute" build above and this ">9 minute" build to figure out what the real performance is.

I see that the Container Builder has two different fields for showing the build time, which are about 2 minutes apart. Thus the ">7 minute" build and the ">9 minute" build took about the same amount of time.

@briansmith
Contributor Author

Also, sadly, the 32-CPU builds weren't significantly faster than the 8-CPU builds. The 8-CPU builds were significantly faster than the default (1-CPU? 2-CPU?) builds, though.

@briansmith
Contributor Author

#332, #331, #329, #327, and #325 are incremental improvements on this that speed up incremental Docker builds of the controller by over 50% on my machine.

Further, I expect to have a change ready on Monday that should massively improve things.

@stale

stale bot commented Oct 4, 2018

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Oct 4, 2018
@stale stale bot closed this as completed Oct 18, 2018
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 18, 2021