Use GitHub Actions for CI (attempt 2) #2465
🤞
Ironically, the only thing that failed was one of the long-running test buckets in Travis CI. Let me restart that one job.
It's still failing Travis CI... Since I'm not a frequent contributor to rocket-chip, I don't think I have a good sense of the current state of the build. Have other people noticed the seventh and final test bucket ( Comparing the running times between GitHub Actions and Travis CI, it seems like the GitHub Actions builds usually complete about 20 minutes faster, although I see that Test Bucket 3 took 1 hr 30 min on GH Actions and 1 hr 35 min on Travis CI, so sometimes they are close. Bucket 7 took 55 min on GH Actions, and is timing out after 1 hr 20 min on Travis. One thing that is notable is that with Travis CI, we are setting timeouts per
For another data point, the last build that I did on my personal fork of rocket-chip took 1 hr 5 min on Bucket 7: https://github.com/richardxia/rocket-chip/runs/663756436?check_suite_focus=true
I have had to restart Travis jobs 4-5 times to get past the download/timeout issues we've been having. So, your PR failing Travis is not an anomaly.
The Travis failure doesn't surprise me, either. We can ignore it if we want to give Actions a shot.
Ironically due to Travis CI continuing to time out on it.
I tried restarting it like three times today, to no avail. I am now bumping the Travis CI timeout on that last bucket from 80 to 100 minutes: 3145f71
Can we please merge this and disable Travis? 🥇
I will merge this in and let someone else handle disabling Travis.
I set the Travis web hook to inactive. Not sure if that was exactly the
right step to take. I can push more buttons if necessary and someone tells
me which buttons to push.
On Thu, May 14, 2020 at 5:24 PM Richard Xia wrote:
Merged #2465 into master.
Replaces #2459 to see if not having two simultaneous PRs open from the same branch fixes anything. This is still being submitted from my personal fork, since I want to test that this still works for people who don't have commit access to chipsalliance/rocket-chip.
Related issue: N/A
Type of change: other enhancement
Impact: API modification
Development Phase: implementation
Release Notes
I'm going to make up my own PR template to organize my thoughts:
Problem Statement
I've occasionally run into friction while using Travis CI, and I think it's worth trying out other CI services to see whether they offer a nicer experience. Some of my main issues with Travis CI are:
travis_wait is an annoying kludge, since it has the side effect of hiding all of the normal console output of your command until either it completes or the timeout kicks in.
Why GitHub Actions may be a better solution
Addressing each of my points above, one by one:
Drawbacks to using GitHub Actions
Although I have had a fairly positive experience with GitHub Actions, I have noticed some things that feel like a step back from Travis CI:
What this PR changes
I've done a heavy amount of rebasing and squashing so that my commits are ordered in a specific way.
Refactoring work that is probably useful to merge in even if we don't want to use GitHub Actions
45ef2f3 - I pulled out the actual make commands from the .travis.yml file into a bash script so that I could more easily run them in either Travis CI or GitHub Actions. This would affect anyone that wants to update the CI tests in both Travis CI and GitHub Actions, since they must now modify that bash script. I would want this to be easily modifiable by others, so please let me know if you find this more confusing or worse than what we had previously with Travis CI.
b5fca2d - I added a verilator.hash, modeled after riscv-tools.hash, so that I could share that version number between both Travis CI and GitHub Actions and so that I could use that hash to compute the cache key in GitHub Actions.
468da4e - I modified the regression/Makefile targets to invert the dependency between riscv-tests.stamp and rocket-tools_checkout.stamp. Previously, riscv-tests.stamp depended on rocket-tools_checkout.stamp, and riscv-tests.stamp was essentially a no-op, since it relied on rocket-tools_checkout.stamp to actually check out riscv-tests. I modified this because I noticed it was taking 30 minutes to just clone all of riscv-tools' submodules, particularly riscv-gnu-toolchain and fsf-binutils-gdb, even if you have a cache hit on a precompiled riscv-tools, since the GDB tests require the actual riscv-tests source code to be checked out.
I changed this so that now rocket-tools_checkout.stamp depends on riscv-tests.stamp, and riscv-tests.stamp is now responsible for doing the clone of rocket-tools.git and riscv-tests.git. rocket-tools_checkout.stamp is still responsible for cloning all the other submodules. If @aswaterman or someone could let me know if I've done this correctly, that would be great, since I don't think I understand how these targets work or how they are meant to be used in local development.
Adding the actual GitHub Actions workflow definitions
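As a rough illustration of how a file like verilator.hash can drive a cache key, the sketch below hashes the contents of the pinned-version files and appends the digest to a prefix. The function name cache_key, the prefix, and the 16-character truncation are assumptions made for illustration; the workflow in this PR may compute its key differently (GitHub Actions also provides a built-in hashFiles expression for this purpose). It also assumes sha256sum is available, as it is on typical Linux runners.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: derive a cache key from the contents of one or more
# pinned-version files, so the key changes whenever a pinned version changes.
# cache_key is an illustrative name, not something defined by this PR.
set -eu

# Usage: cache_key PREFIX FILE...
cache_key() {
  prefix=$1
  shift
  # Hash the concatenated file contents and keep a short hex digest.
  digest=$(cat "$@" | sha256sum | cut -c1-16)
  printf '%s-%s\n' "$prefix" "$digest"
}
```

Calling cache_key verilator verilator.hash prints a key made of the prefix, a dash, and 16 hex digits; the key stays stable until the file contents change.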
I broke up the commits such that each commit introduces a different job in the workflow file (e.g. wit submodule check, prepare riscv-tools cache, etc.). I think the main interesting one is the one that actually runs the main tests, which I set up as a matrix job: 015dc0b
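To make the matrix setup concrete, here is a minimal sketch of a bucket-dispatch script that a matrix job (or Travis CI) could call with a bucket number. It is not the script from this PR: the function names, the DRY_RUN variable, and the make targets are all placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not the script from 45ef2f3: the function names and
# make targets below are placeholders.
set -eu

# Print the commands that belong to one test bucket, one per line.
bucket_commands() {
  case "$1" in
    1) echo "make hypothetical-target-a" ;;
    2) echo "make hypothetical-target-b"
       echo "make hypothetical-target-c" ;;
    *) echo "unknown bucket: $1" >&2
       return 1 ;;
  esac
}

# Echo each command in the bucket, then run it unless DRY_RUN=1 is set.
run_bucket() {
  cmds=$(bucket_commands "$1") || return 1
  while IFS= read -r cmd; do
    echo "+ $cmd"
    if [ "${DRY_RUN:-0}" != "1" ]; then
      # Detach stdin so the command cannot consume the here-document.
      sh -c "$cmd" </dev/null || return 1
    fi
  done <<EOF
$cmds
EOF
}
```

With DRY_RUN=1 set, run_bucket 2 prints the two placeholder commands, each prefixed with "+ ", without executing them, which is a cheap way to check a bucket's contents locally. Keeping the bucket list in one script means both CI systems only need to pass in a bucket number.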
The other main commit of note is 7885c0f, where I added a README_GITHUB_ACTIONS.md.
Proposed rollout
I'm not sure if we want to make a big change to flip from Travis CI to GitHub Actions, so I tried to develop this PR to support running both CI systems simultaneously. I think it may even be advantageous to perpetually run both CI systems, since an outage in one service will (hopefully) not affect the other service, and we can more comfortably waive transient failures from one service with the passing results from the other.
I think we actually do get quite a lot of resilience from using both providers, since they even use different cloud vendors underneath: Travis CI uses Google Cloud Platform, while GitHub Actions uses Microsoft Azure. This means that we're more robust to datacenter-wide failures as well as entire GCP- or Azure-wide failures.
That said, it is more overhead to maintain two CI systems, and I think there is a social risk of letting one system rot if it starts deterministically failing, since we always have the other to rely on.
Special thanks
@aswaterman, for helping me with working out some issues with the compilation of the fesvr and Verilator when using mismatched versions of g++! That really helped unblock me and get me to the finish line!