Tweak Travis to use GCE #28500
Conversation
r? @pcwalton (rust_highfive has picked a reviewer for you, use r? to override)
cc me
Maybe I'm misremembering, but I thought we preferred building our patched up LLVM for maximal similarity to the buildbots?
Force-pushed from 27da22e to c4c61e7 (compare)
Yeah, that's nice to have, but we've also gotten more requests recently to ensure we work with a stock build of LLVM, and we may want to start gating on Travis CI as well soon (in addition to buildbot). The gating should fix the "accidentally broken" problem, and if possible we can also have a build on Travis which builds LLVM from source.

Looks like the travis build still had failures:
Retrying with some more diagnostics and hopefully some fixes.

@joshk interestingly it looks like all our calls to the …
Force-pushed from c4c61e7 to a1e8c82 (compare)
Oh wait, travis-ci/travis-ci#4751 indicates that …
Force-pushed from cb4dd2d to a6b9fe8 (compare)
Is it working better now?
Also, I see some things we can do to improve this further :)
Force-pushed from a6b9fe8 to 8fc41a3 (compare)
@joshk yeah, so far so good: the IPv6 tests are passing in the docker container, and the only remaining failure is due to a known bug in LLVM 3.6 which causes our tests to fail, so I just need to figure out how to install LLVM 3.7 instead!
@alexcrichton I would suggest creating a new Docker image (e.g. something like …).
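For reference, a derived image along those lines might look roughly like the sketch below; the base image and package list are assumptions for illustration, not the exact image this PR ends up using.

```dockerfile
# Hypothetical custom image providing LLVM 3.7 for the test runs.
FROM ubuntu:15.10

# Install a basic toolchain plus the LLVM 3.7 packages from the Ubuntu
# archives; --force-yes mirrors the style used in this PR's Dockerfile.
RUN apt-get update && \
    apt-get -y --force-yes install gcc g++ make python git curl \
        llvm-3.7 llvm-3.7-dev llvm-3.7-tools
```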
Force-pushed from f02f8ff to b284636 (compare)
Huzzah! That did the trick! So, to summarize the changes here to get the build working:
So I'm pretty comfortable with this:
So all in all, r? @brson
Wow, nice! Two quick questions:
How many cores do these machines have? e.g. is there a recommended -j for us to use?
The current way our makefiles are set up doesn't make this super easy, but logically this is totally plausible. We've got one ~30min build to produce a compiler followed by N test suites (which greatly vary in size), but all of the N test suites can be run in parallel (just gotta make sure they're all run).
The current instances use 2 CPUs, but it may be worth playing around with a higher -j. As for the test suite break-up, this is cool to know. This would mean, at least for now, that each job would also need to produce a new compiler. We have plans to improve this, but this might be some work which we include next year and work on together.
Ah ok, I think that @gankro played around with different values of -j.
You can use env vars to break up your build into N many jobs. But if we can partner on some work, then we can look into build pipeline support for Travis, allowing you to build a compiler once instead of N times. And maybe we can partner this year on the ability to opt in to larger VMs?
That sounds like it'd work for us! We'd love to put more of our testing on travis. We'd basically be producing N different compiler configurations, each of which needs to run M test suites, so we could manually encode the NxM matrix into .travis.yml, but for now we'll probably stick to just N lines :). If we could instead encode N+M steps into our configuration (e.g. how to build a compiler and then how to test any compiler), that'd be awesome! I'm sure we'd also be more than willing to help out wherever possible; ideally we'd be able to move off buildbot completely, but we're probably a ways out from that!
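To make that concrete, a minimal sketch of the env-var split in .travis.yml might look like the following; the variable name, make targets, and LLVM path are illustrative assumptions rather than what this PR actually configures.

```yaml
# Each env entry becomes an independent job; today every job still has to
# build its own compiler before running its slice of the test suites.
env:
  - RUST_CHECK_TARGET=check-stage2-std
  - RUST_CHECK_TARGET=check-stage2-rpass
  - RUST_CHECK_TARGET=check-stage2-cfail
script:
  - ./configure --llvm-root=/usr/lib/llvm-3.7
  - make -j2 rustc-stage2
  - make $RUST_CHECK_TARGET
```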
@joshk hm interesting, looks like the recent build we added timed out (enabling debug assertions and debuginfo in the compiler itself). It also looks like the normal build took ~15min longer than usual (perhaps normal?). In light of that, should we keep trying to optimize our build, or would it be possible to increase the timeout a bit more? It's totally reasonable to say a 3hr timeout is a bit unreasonable :)
I can increase it to 3hrs if you like, so you can test further, but this means that, of the 5 jobs you can run at once, you could have the queue blocked for 3 hours at a time due to long-running jobs.
@joshk hm ok, if it's alright we'll take the higher timeout for now and probably investigate how to parallelize more or just run fewer tests on our end.
Done! (sorry for the wait)
r=me
Hm, still waiting on a successful run from travis, haven't gotten one from the debug builder yet...
Force-pushed from c96a9ad to 0f09984 (compare)
OK, looks like we may not be able to run the debug builder on travis (just takes too long), so just updated back to only using the system LLVM + …
Travis CI has new infrastructure using the Google Compute Engine which has both faster CPUs and more memory, and we've been encouraged to switch as it should help our build times! The only downside currently, however, is that IPv6 is disabled, causing a number of standard library tests to fail. Consequently this commit tweaks our travis config in a few ways:

* ccache is disabled as it's not working on GCE just yet
* Docker is used to run tests inside, which reportedly will get IPv6 working
* A system LLVM installation is used instead of building LLVM itself. This is primarily done to reduce build times, but we want automation for this sort of behavior anyway and we can extend this in the future with building from source as well if needed.
* gcc-specific logic is removed as the docker image for Ubuntu gives us a recent-enough gcc by default.
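As a rough illustration of the Docker-based flow the commit message describes (the image tag, mount paths, and exact flags are guesses at the approach, not the literal commands in the .travis.yml):

```sh
# Build the image from the in-tree Dockerfile, then run the usual
# configure/make cycle inside a container, where IPv6 loopback works
# and the system LLVM 3.7 is already installed.
docker build -t rust-ci .
docker run -v "$(pwd)":/build -w /build rust-ci \
    sh -c "./configure --llvm-root=/usr/lib/llvm-3.7 && make -j2 && make check"
```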
💔 Test failed - auto-linux-64-x-android-t
Failure: http://buildbot.rust-lang.org/builders/auto-linux-64-x-android-t/builds/6536/steps/test/logs/stdio

Weird, that test is marked …
Force-pushed from 0f09984 to 27dd6dd (compare)
#[cfg(all(unix, not(target_os="android")))]
#[test]
@alexcrichton did you mean to remove this?
Gah oops! I did indeed not mean to!
RUN apt-get -y --force-yes install llvm-3.7-tools

RUN mkdir /build
WORKDIR /build
@alexcrichton IIRC, each RUN creates a new file system layer. Combining the RUNs into one (just using &&) might speed up your docker build.

If this image is cached anywhere (which might make sense), you might also want to append some cleanup calls to the RUN calling apt-get to reduce the size. I'm no authority on this, but I've often seen stuff like apt-get autoremove -y && apt-get clean all && rm -rf /var/lib/apt/lists/*.
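A sketch of that suggestion applied to the lines quoted above; the package name is taken from the diff, while the cleanup calls are the reviewer's proposal rather than something this PR currently does.

```dockerfile
# One RUN = one layer: install, then clean up apt metadata in the same
# layer so the intermediate files never end up baked into the image.
RUN apt-get update && \
    apt-get -y --force-yes install llvm-3.7-tools && \
    apt-get autoremove -y && apt-get clean all && \
    rm -rf /var/lib/apt/lists/* && \
    mkdir /build
WORKDIR /build
```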
Good point! Right now I don't think that this is anywhere near the limiting factor of our builds, however, so it may not be too bad one way or the other.
If building an image ends up taking too long in the future we'll probably want to just send it up to the hub and download it from there, but hopefully it won't be taking too too long!
This test was mysteriously messed with as part of rust-lang#28500 r? @alexcrichton