testsuite: revert travis-ci to vm builds and fix many broken &&-chains in tests #1167

grondo · 2017-08-29T00:48:34Z

This PR reverts Travis-CI to use the vm builder instead of container builders by replacing sudo: false with sudo: true in .travis.yml. In my testing, this change completely resolves the intermittent (but high frequency) failures under travis such as hangs the write error problem, and really slow builds (#1145, #1141, etc.)

WIth that change in place, I was able to rebase many (minor) testsuite fixes I had on a branch, which fix up all the broken &&-chains in the testsuite, as well as add timeouts to some tests which have the potential to hang.

I think because many people have moved to the container build infrastructure, the VM build queue is not quite as long as it was getting before, so I don't think this change will slow down the turnaround time on Travis reporting.

This PR also uses one of the builders (currently the gcc-4.9 build) to run the suite with --chain-lint so we catch any broken &&-chains that might creep back in. (Many of these are harmless, however some broken && chains could cause a false positive in a test, so best to keep the testsuite suite in this regard)

For flux version to be properly detected, we need to at least fetch tags and possibly also unshallow the git clone. Do this before running configure in travis-ci.

Fix broken `&&`-chains in some tests

Fix some broken &&-chains in some sharness tests. In some cases, the `&&` chain was broken presumably because a call was expected to fail, e.g. in the valid/invalid rank flux-module load/unload tests. These cases were fixed by either adding `!` or using `test_must_fail`.

Where appropriate, fix broken &&-chain in tests. The chain-lint check kludgery in sharness doesn't like when processes are sent into background within the test, so add a flux_exec_bg() shell function to wrap sending flux-exec into that background.

Fix broken &&-chain in tests so that --chain-lint passes on this test.

Add missing `&&`-chain in tests to make --chain-lint happy.

The check for flux-version output wasn't doing anything becuase of leftover debugging in the test. In fact, the test itself had been failing because of the use of extended regex pattern. Remove the debugging from the test and fix the regex pattern.

Fix the broken &&-chain in the heaptrace tests by pulling out the ||-based output checks into their own shell function.

Fix broken &&-chains in test so that --chain-lint doesn't complain.

Fix broken &&-chains in tests so that --chain-lint doesn't complain.

Fix broken &&-chain in tests so --chain-lint won't complain

Fix broken &&-chains in tests so that --chain-lint stops complaining

Fix a few broken &&-chains in tests so that --chain-lint doesn't complain.

Fix broken &&-chain in tests so --chain-lint doesn't complain. Some tests were checking for failed commands with `command; test $? -ne 0 && ...` these were replaced with `test_must_fail command && ...`

Fix broken &&-chains in tests so --chain-lint doesn't complain.

Fix broken &&-chains in tests so that --chain-lint doesn't complain.

Fix a broken &&-chain in a test to make --chain-lint happy.

Fix a broken &&-chain in a test so that --chain-lint doesn't complain.

Timeout the test_under_flux test after 5s in case the test hangs for unknown reason.

Since t0001-basic.t doesn't run under test_under_flux, it doesn't have a test timeout. To ensure invidual tests using flux-start do not hang the testsuite, add -o -Sinit.rc2_timeout option to each invocation of flux-start.

Set `sudo: true` in .travis.yml to revert to VM based builds instead of container/docker builds on Travis-CI. This is done because our build appears to require more resources than are given to the container-based Travis builds.

Add chain-lint check to the gcc-4.9 build in travis. This way, broken &&-chains will be detected before merging.

coveralls · 2017-08-29T01:25:47Z

Coverage decreased (-0.03%) to 78.025% when pulling 3341f7c on grondo:chain-lint into 1c99b02 on flux-framework:master.

grondo · 2017-08-29T01:58:15Z

failures under travis such as hangs the write error problem

Of course after I stated that one test failed with write error so I'm not sure this is a fix for that problem

codecov-io · 2017-08-29T02:05:31Z

Codecov Report

Merging #1167 into master will decrease coverage by 0.04%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #1167      +/-   ##
==========================================
- Coverage    77.7%   77.65%   -0.05%     
==========================================
  Files         158      158              
  Lines       29075    29041      -34     
==========================================
- Hits        22592    22552      -40     
- Misses       6483     6489       +6

Impacted Files	Coverage Δ
src/common/libflux/ev_flux.c	`97.56% <0%> (-2.44%)`	⬇️
src/connectors/ssh/ssh.c	`83.25% <0%> (-1.92%)`	⬇️
src/common/libflux/request.c	`88.46% <0%> (-1.29%)`	⬇️
src/cmd/flux-event.c	`67.74% <0%> (-1.08%)`	⬇️
src/common/libflux/dispatch.c	`84.81% <0%> (-0.75%)`	⬇️
src/connectors/local/local.c	`87.4% <0%> (-0.75%)`	⬇️
src/common/libkvs/kvs_txn.c	`76.47% <0%> (-0.66%)`	⬇️
src/common/libflux/message.c	`80.94% <0%> (-0.36%)`	⬇️
src/broker/overlay.c	`71.67% <0%> (-0.35%)`	⬇️
src/common/libflux/reactor.c	`93.18% <0%> (-0.29%)`	⬇️
... and 11 more

garlick · 2017-08-29T02:22:57Z

These are good fixes, even if we don't have a 100 pct fix for the spurious travis errors. I'm for merging this if you're ready.

grondo · 2017-08-29T03:00:48Z

Yes, I think it is ready. Pretty straightforward really.

garlick · 2017-08-29T03:16:44Z

Thanks!

grondo and others added 22 commits August 28, 2017 10:04

travis-ci: unshallow git clone so git-describe works

170383a

For flux version to be properly detected, we need to at least fetch tags and possibly also unshallow the git clone. Do this before running configure in travis-ci.

t0001-basic.t: fix broken &&-chain in tests

7467d76

Fix broken `&&`-chains in some tests

t0007-ping.t: fix broken &&-chain in tests

9afdff5

Fix broken &&-chain in tests so that --chain-lint passes on this test.

t0009-dmesg.t: fix broken &&-chain in tests

91d0d5c

Add missing `&&`-chain in tests to make --chain-lint happy.

t0010-generic-utils.t: fix broken &&-chain in test

0d30426

Fix the broken &&-chain in the heaptrace tests by pulling out the ||-based output checks into their own shell function.

t0015-cron.t: fix broken &&-chain in test

917c8ea

Fix broken &&-chains in test so that --chain-lint doesn't complain.

t1000-kvs.t: fix broken &&-chains in tests

11c3877

Fix broken &&-chains in tests so that --chain-lint doesn't complain.

t0016-cron-faketime.t: fix broken &&-chain in tests

4619323

Fix broken &&-chain in tests so --chain-lint won't complain

t1001-barrier-basic.t: fix broken &&-chains in tests

bfb82ae

Fix broken &&-chains in tests so that --chain-lint stops complaining

t1002-kvs-extra.t: fix broken &&-chain in tests

119fb7a

Fix a few broken &&-chains in tests so that --chain-lint doesn't complain.

t1008-proxy.t: fix broken &&-chains in test

ed234c2

Fix broken &&-chain in tests so --chain-lint doesn't complain. Some tests were checking for failed commands with `command; test $? -ne 0 && ...` these were replaced with `test_must_fail command && ...`

t2000-wreck.t: fix broken &&-chains in tests

b5735d8

Fix broken &&-chains in tests so --chain-lint doesn't complain.

t2001-jsc.t: fix broken &&-chains in tests

5f86dc3

Fix broken &&-chains in tests so that --chain-lint doesn't complain.

t2005-hwloc-basic.t: fix broken &&-chain in test

db57044

Fix a broken &&-chain in a test to make --chain-lint happy.

t3000-mpi-basic.t: fix broken &&-chain in test

47b813e

Fix a broken &&-chain in a test so that --chain-lint doesn't complain.

t0001-basic.t: run test_under_flux test under timeout

b64856d

Timeout the test_under_flux test after 5s in case the test hangs for unknown reason.

t0001-basic.t: add timeouts to all flux-start invocations

62e28e0

Since t0001-basic.t doesn't run under test_under_flux, it doesn't have a test timeout. To ensure invidual tests using flux-start do not hang the testsuite, add -o -Sinit.rc2_timeout option to each invocation of flux-start.

travis-ci: revert to VM builds

8abd53f

Set `sudo: true` in .travis.yml to revert to VM based builds instead of container/docker builds on Travis-CI. This is done because our build appears to require more resources than are given to the container-based Travis builds.

travis-ci: run one build with --chain-lint

3341f7c

Add chain-lint check to the gcc-4.9 build in travis. This way, broken &&-chains will be detected before merging.

garlick merged commit 820301e into flux-framework:master Aug 29, 2017

grondo mentioned this pull request Aug 29, 2017

travis-ci: builds have become very slowwwww... #1141

Closed

grondo deleted the chain-lint branch October 25, 2017 02:20

grondo mentioned this pull request May 10, 2018

0.9.0 Release #1479

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

testsuite: revert travis-ci to vm builds and fix many broken &&-chains in tests #1167

testsuite: revert travis-ci to vm builds and fix many broken &&-chains in tests #1167

grondo commented Aug 29, 2017 •

edited

Loading

coveralls commented Aug 29, 2017

grondo commented Aug 29, 2017

codecov-io commented Aug 29, 2017

garlick commented Aug 29, 2017

grondo commented Aug 29, 2017

garlick commented Aug 29, 2017

testsuite: revert travis-ci to vm builds and fix many broken &&-chains in tests #1167

testsuite: revert travis-ci to vm builds and fix many broken &&-chains in tests #1167

Conversation

grondo commented Aug 29, 2017 • edited Loading

coveralls commented Aug 29, 2017

grondo commented Aug 29, 2017

codecov-io commented Aug 29, 2017

Codecov Report

garlick commented Aug 29, 2017

grondo commented Aug 29, 2017

garlick commented Aug 29, 2017

grondo commented Aug 29, 2017 •

edited

Loading