Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

testsuite: revert travis-ci to vm builds and fix many broken &&-chains in tests #1167

Merged
merged 22 commits into from
Aug 29, 2017

Conversation

grondo
Copy link
Contributor

@grondo grondo commented Aug 29, 2017

This PR reverts Travis-CI to use the vm builder instead of container builders by replacing sudo: false with sudo: true in .travis.yml. In my testing, this change completely resolves the intermittent (but high frequency) failures under travis such as hangs the write error problem, and really slow builds (#1145, #1141, etc.)

WIth that change in place, I was able to rebase many (minor) testsuite fixes I had on a branch, which fix up all the broken &&-chains in the testsuite, as well as add timeouts to some tests which have the potential to hang.

I think because many people have moved to the container build infrastructure, the VM build queue is not quite as long as it was getting before, so I don't think this change will slow down the turnaround time on Travis reporting.

This PR also uses one of the builders (currently the gcc-4.9 build) to run the suite with --chain-lint so we catch any broken &&-chains that might creep back in. (Many of these are harmless, however some broken && chains could cause a false positive in a test, so best to keep the testsuite suite in this regard)

grondo and others added 22 commits August 28, 2017 10:04
For flux version to be properly detected, we need to at least fetch
tags and possibly also unshallow the git clone. Do this before running
configure in travis-ci.
Fix broken `&&`-chains in some tests
Fix some broken &&-chains in some sharness tests.

In some cases, the `&&` chain was broken presumably because a call
was expected to fail, e.g. in the valid/invalid rank flux-module
load/unload tests. These cases were fixed by either adding `!` or using
`test_must_fail`.
Where appropriate, fix broken &&-chain in tests.

The chain-lint check kludgery in sharness doesn't like when processes
are sent into background within the test, so add a flux_exec_bg()
shell function to wrap sending flux-exec into that background.
Fix broken &&-chain in tests so that --chain-lint passes on this test.
Add missing `&&`-chain in tests to make --chain-lint happy.
The check for flux-version output wasn't doing anything becuase of
leftover debugging in the test. In fact, the test itself had been
failing because of the use of extended regex pattern.

Remove the debugging from the test and fix the regex pattern.
Fix the broken &&-chain in the heaptrace tests by pulling out
the ||-based output checks into their own shell function.
Fix broken &&-chains in test so that --chain-lint doesn't complain.
Fix broken &&-chains in tests so that --chain-lint doesn't complain.
Fix broken &&-chain in tests so --chain-lint won't complain
Fix broken &&-chains in tests so that --chain-lint stops complaining
Fix a few broken &&-chains in tests so that --chain-lint doesn't
complain.
Fix broken &&-chain in tests so --chain-lint doesn't complain.

Some tests were checking for failed commands with

 `command; test $? -ne 0 && ...`

these were replaced with

 `test_must_fail command && ...`
Fix broken &&-chains in tests so --chain-lint doesn't complain.
Fix broken &&-chains in tests so that --chain-lint doesn't complain.
Fix a broken &&-chain in a test to make --chain-lint happy.
Fix a broken &&-chain in a test so that --chain-lint doesn't complain.
Timeout the test_under_flux test after 5s in case the test hangs
for unknown reason.
Since t0001-basic.t doesn't run under test_under_flux, it doesn't have
a test timeout. To ensure invidual tests using flux-start do not hang
the testsuite, add -o -Sinit.rc2_timeout option to each invocation
of flux-start.
Set `sudo: true` in .travis.yml to revert to VM based builds instead
of container/docker builds on Travis-CI. This is done because our
build appears to require more resources than are given to the
container-based Travis builds.
Add chain-lint check to the gcc-4.9 build in travis.
This way, broken &&-chains will be detected before merging.
@coveralls
Copy link

Coverage Status

Coverage decreased (-0.03%) to 78.025% when pulling 3341f7c on grondo:chain-lint into 1c99b02 on flux-framework:master.

@grondo
Copy link
Contributor Author

grondo commented Aug 29, 2017

failures under travis such as hangs the write error problem

Of course after I stated that one test failed with write error so I'm not sure this is a fix for that problem

@codecov-io
Copy link

Codecov Report

Merging #1167 into master will decrease coverage by 0.04%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #1167      +/-   ##
==========================================
- Coverage    77.7%   77.65%   -0.05%     
==========================================
  Files         158      158              
  Lines       29075    29041      -34     
==========================================
- Hits        22592    22552      -40     
- Misses       6483     6489       +6
Impacted Files Coverage Δ
src/common/libflux/ev_flux.c 97.56% <0%> (-2.44%) ⬇️
src/connectors/ssh/ssh.c 83.25% <0%> (-1.92%) ⬇️
src/common/libflux/request.c 88.46% <0%> (-1.29%) ⬇️
src/cmd/flux-event.c 67.74% <0%> (-1.08%) ⬇️
src/common/libflux/dispatch.c 84.81% <0%> (-0.75%) ⬇️
src/connectors/local/local.c 87.4% <0%> (-0.75%) ⬇️
src/common/libkvs/kvs_txn.c 76.47% <0%> (-0.66%) ⬇️
src/common/libflux/message.c 80.94% <0%> (-0.36%) ⬇️
src/broker/overlay.c 71.67% <0%> (-0.35%) ⬇️
src/common/libflux/reactor.c 93.18% <0%> (-0.29%) ⬇️
... and 11 more

@garlick
Copy link
Member

garlick commented Aug 29, 2017

These are good fixes, even if we don't have a 100 pct fix for the spurious travis errors. I'm for merging this if you're ready.

@grondo
Copy link
Contributor Author

grondo commented Aug 29, 2017

Yes, I think it is ready. Pretty straightforward really.

@garlick
Copy link
Member

garlick commented Aug 29, 2017

Thanks!

@garlick garlick merged commit 820301e into flux-framework:master Aug 29, 2017
@grondo grondo deleted the chain-lint branch October 25, 2017 02:20
@grondo grondo mentioned this pull request May 10, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants