Single stress testing of parallel tests in parallel #3

santigimeno · 2016-01-14T22:10:48Z

When trying to reproduce a flaky test from the parallel folder, I've found it very useful to make some copies of the test and then run those copies in parallel, since that reproduces the error more easily. Something along these lines:

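#!/bin/bash

# Usage: <script> TEST COPIES   (TEST is the test name without the .js extension)
BASE_PATH=./test/parallel
TEST=$1
COPIES=$2

# Create COPIES numbered copies of the test next to the original.
for i in $(seq 1 "$COPIES"); do
  cp "$BASE_PATH/$TEST.js" "$BASE_PATH/$TEST-$i.js"
done

# Run the original and its copies together until one run fails.
while true; do
  /usr/bin/python tools/test.py --mode=release parallel/${TEST}* -J
  if [ $? -ne 0 ]; then
    break
  fi
done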

From what I understand the node-stress-single-test job from the CI runs the tests sequentially. Maybe trying to run them in parallel can be useful. What do you think?
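
One side effect of the script above is that the copied files stay in test/parallel after the loop exits. A minimal sketch of the copy step with a matching cleanup step, assuming the same directory layout; `make_copies` and `remove_copies` are hypothetical helper names for illustration, not part of the Node.js tooling:

```bash
#!/bin/bash
# Hypothetical helpers: create and remove numbered copies of a test file.

# make_copies DIR TEST COPIES
# Creates DIR/TEST-1.js .. DIR/TEST-COPIES.js from DIR/TEST.js.
make_copies() {
  local dir=$1 test=$2 copies=$3 i
  for ((i = 1; i <= copies; i++)); do
    cp "$dir/$test.js" "$dir/$test-$i.js" || return 1
  done
}

# remove_copies DIR TEST COPIES
# Deletes the files created by make_copies; the original is left alone.
remove_copies() {
  local dir=$1 test=$2 copies=$3 i
  for ((i = 1; i <= copies; i++)); do
    rm -f "$dir/$test-$i.js"
  done
}
```

In a script like the one above, calling `make_copies ./test/parallel "$TEST" "$COPIES"` before the loop and registering `trap 'remove_copies ./test/parallel "$TEST" "$COPIES"' EXIT` would remove the copies however the loop ends.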


Trott · 2016-01-26T06:12:44Z

I started to write up a lengthy pros-and-cons comment for this, but then realized that everything I was saying boiled down to a simple statement:

The two pieces of information needed on this would be "how valuable would this feature be" and "how difficult would it be to implement on Jenkins"? I don't know the answer to either of these, which makes me unqualified to have an opinion. All I know is that I've never personally needed this, but I can imagine there is value.

Anyway, mostly commenting to see if others have opinions.

orangemocha · 2016-01-26T13:11:39Z

While I see the value of being able to stress parallel test runs, I am not sure if this strategy would be very effective. For one, tests in the parallel folder are assumed to be safe to run in parallel with other tests, but likely not with themselves.

Does the test runner output tell us which tests are running concurrently (at the time of a failure)? If not, that's probably low hanging fruit.

santigimeno · 2016-01-26T22:36:39Z

Thanks for the feedback.
I understand that some tests in the parallel folder might not work well when several copies of them run concurrently, but shouldn't they? What I'm trying to say is that lots of tests are quite similar to each other in the sense that they handle system resources (create a socket and bind it to a port, create temporary files, etc.) and there are no collisions among them. Why shouldn't this work for the same test? If a test can't be run in parallel with itself, it should probably be moved to sequential, as there could be another similar test that tries to use the same resource. I hope I'm making some kind of sense...

orangemocha · 2016-01-27T11:06:14Z

> Why shouldn't this work for the same test?

I remember a few instances of tests that make an assumption on a resource (eg file name, pipe name) that is unique to the particular test, and I think we have considered that enough to make the test parallelizable. The goal of running tests in parallel was to speed up a test run, so we haven't had a need to run tests in parallel with themselves. If we had that need, I suppose we could make the 'parallelizable' criteria more stringent and move a few tests back to sequential.

Pinging @jbergstroem who knows more about parallel tests.

jbergstroem · 2016-01-27T21:41:23Z

If run in parallel, tests should use their own subfolders for sockets, temporary files and similar. Not sure about common.PORT tests though. I guess it would be a good way to flush out tests that don't support it?
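
The "own subfolder per running instance" idea can be illustrated at the shell level with `mktemp -d`, which creates a uniquely named directory on every call. This is only a sketch of the principle: `new_instance_dir` is a hypothetical helper, and how the Node.js test harness itself allocates temporary directories is not shown in this thread and is not assumed here.

```bash
#!/bin/bash
# Hypothetical helper: give each running instance its own scratch directory.

# new_instance_dir NAME
# Creates a fresh directory whose name starts with NAME and prints its path.
# Two concurrent callers never receive the same path.
new_instance_dir() {
  local name=$1
  mktemp -d "${TMPDIR:-/tmp}/$name.XXXXXX"
}
```

Two instances of the same test that each call `new_instance_dir test-foo` get different directories, so files and pipes created inside them cannot collide by name.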

santigimeno · 2016-02-08T19:02:29Z

I've read the last meeting minutes regarding this issue. I'd be really interested in doing this. How should we proceed from here? Alexis mentioned opening an issue in the build repository. Should I do that, or should someone else? Thanks

Trott · 2016-02-08T21:21:31Z

@santigimeno I don't know if it matters who opens the issue. I would say you should open it and others can chime in as needed.

orangemocha · 2016-02-09T15:54:37Z

👍

santigimeno · 2016-02-12T18:56:31Z

Issue @ build group opened here: nodejs/build#331

joaocgreis · 2016-02-26T18:26:07Z

Current scripts in use by https://ci.nodejs.org/job/node-stress-single-test/ :

Windows

call vcbuild.bat release nosign x64
if %errorlevel% neq 0 exit /b %errorlevel%

setlocal EnableDelayedExpansion

set OK=0
set NOK=0
for /l %%i in (1,1,%RUN_TIMES%) do (
  python tools\test.py -p tap %RUN_TESTS% && set /a OK=!OK!+1 || set /a NOK=!NOK!+1
  echo %%i   OK: !OK!   NOT OK: !NOK!
)

echo on
rm -rfv test/tmp* || true
git clean -fdx || true

if %NOK% NEQ 0 (
  echo The test is flaky.
  exit /b 1
)

All others:

#!/usr/bin/env bash -ex

python ./configure
# centos5 defaults to an old python if PYTHON is not defined
# freebsd needs NPROCESSORS_ONLN without underscore
# @TODO: jbergstroem - this should use the environment variable $JOBS
# once available on all slaves
PYTHON=python make -j $(getconf _NPROCESSORS_ONLN || getconf NPROCESSORS_ONLN)
PYTHON=python make build-addons

echo "Running test ${RUN_TIMES} times" 

OK=0
NOK=0
for i in `seq $RUN_TIMES`; do
  python tools/test.py -p tap --mode=release $RUN_TESTS && OK=$(($OK+1)) || NOK=$(($NOK+1))
  echo "$i   OK: $OK   NOT OK: $NOK"
done

if [ "$NOK" != "0" ]; then
  echo The test is flaky.
  exit 1
fi
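
The counting loop in the non-Windows script above can be factored into a function that takes the command to run as its arguments. A sketch under that assumption; `stress_run` is a hypothetical name, not something the CI job defines:

```bash
#!/bin/bash
# Hypothetical helper mirroring the OK / NOT OK tally of the CI script.

# stress_run TIMES CMD [ARGS...]
# Runs CMD TIMES times, prints a running tally after each run,
# and returns non-zero if at least one run failed.
stress_run() {
  local times=$1 ok=0 nok=0 i
  shift
  for ((i = 1; i <= times; i++)); do
    if "$@"; then
      ok=$((ok + 1))
    else
      nok=$((nok + 1))
    fi
    echo "$i   OK: $ok   NOT OK: $nok"
  done
  [ "$nok" -eq 0 ]
}
```

With this, the loop becomes `stress_run "$RUN_TIMES" python tools/test.py -p tap --mode=release $RUN_TESTS`, where `$RUN_TESTS` is left unquoted on purpose so that it splits into separate arguments.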

Trott · 2017-03-02T17:18:34Z

It's not particularly obvious I suppose, but if you pass something like this to RUN_TESTS, it will run the test 8 times in parallel:

-j8 parallel/test-https-agent-create-connection parallel/test-https-agent-create-connection parallel/test-https-agent-create-connection parallel/test-https-agent-create-connection parallel/test-https-agent-create-connection parallel/test-https-agent-create-connection parallel/test-https-agent-create-connection parallel/test-https-agent-create-connection
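
Typing the same path eight times is error-prone, so the argument string can be generated. A small sketch; `repeat_test_args` is a hypothetical helper, and it only builds the string: whether the runner then executes the duplicates concurrently is not something it verifies.

```bash
#!/bin/bash
# Hypothetical helper: build a RUN_TESTS value that repeats one test path.

# repeat_test_args COUNT TEST
# Prints "-jCOUNT" followed by TEST repeated COUNT times.
repeat_test_args() {
  local count=$1 test=$2 args i
  args="-j$count"
  for ((i = 0; i < count; i++)); do
    args+=" $test"
  done
  printf '%s\n' "$args"
}
```

For example, `repeat_test_args 8 parallel/test-https-agent-create-connection` prints the line shown above, ready to paste into RUN_TESTS.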

gibfahn · 2017-03-02T19:42:45Z

@Trott couldn't you also do this?

-J --repeat=1000 parallel/test-https-agent-create-connection

Trott · 2017-03-02T19:53:56Z

@gibfahn In my experience, no. It will use multiple processes (if you are on a multi-processor host) to run the tests you supply (in this case just one test), and after it is done, it will repeat until it runs 1000 times. But it will not run multiple repetitions at once, if that makes any sense. But feel free to verify. Stuff may have changed or I could just be flat out wrong.

Trott · 2017-03-02T19:56:16Z

@gibfahn Actually, I just tested it and it seems to work as you suggest and not as I describe. Woot!

Trott · 2017-03-02T19:57:23Z

@gibfahn One caveat is that it will only work for tests in parallel but that's all we're usually concerned about anyway. If you ever do want/need to run a test in (say) sequential in parallel (maybe to test if it might be OK to move it out of sequential), then the wonky approach I suggest is the way to go. Er, or just move the test to parallel and run it from there. :-D

Trott · 2017-03-02T20:02:09Z

Bleargh! I don't think my last comment is right either (in that I don't think my wonky suggestion will cause the test to run in parallel either). I'm going to stop now.

Trott · 2017-09-28T23:27:04Z

I think this can be closed but re-open if I'm wrong...

Trott added the wg-agenda label Jan 26, 2016

Trott mentioned this issue Jan 28, 2016: Meeting #2 #9 (closed)

santigimeno mentioned this issue Feb 12, 2016: New experimental stress test job / Access permission nodejs/build#331 (closed)

Trott mentioned this issue Feb 26, 2016: Meeting #3 #17 (closed)

santigimeno mentioned this issue Feb 26, 2016: Experimental Jenkins instance #20 (closed)

santigimeno mentioned this issue Mar 20, 2016: test: fix flaky test-cluster-shared-leak nodejs/node#5802 (closed, 2 tasks)

santigimeno mentioned this issue Jun 17, 2016: Investigate flaky test-fs-watch-encoding nodejs/node#7243 (closed)

santigimeno mentioned this issue Jul 24, 2016: test: fix test-vm-sigint flakiness nodejs/node#7854 (closed, 2 tasks)

Trott removed the wg-agenda label Sep 28, 2017

Trott closed this as completed Sep 28, 2017
