This repository has been archived by the owner on May 13, 2019. It is now read-only.

Single stress testing of parallel tests in parallel #3

Closed
santigimeno opened this issue Jan 14, 2016 · 17 comments

@santigimeno
Member

When trying to reproduce a flaky test from the parallel folder, I've found it very useful to make several copies of the test and then run those copies in parallel, since that reproduces the error more easily. Something along these lines:

#!/bin/bash

BASE_PATH=./test/parallel
TEST=$1
COPIES=$2

# Create $COPIES copies of the test next to the original.
for i in $(seq 1 "$COPIES"); do
  cp "$BASE_PATH/$TEST.js" "$BASE_PATH/$TEST-$i.js"
done

# Run the original and its copies in parallel until one run fails.
while true; do
  if ! /usr/bin/python tools/test.py --mode=release "parallel/${TEST}"* -J; then
    break
  fi
done

From what I understand, the node-stress-single-test job in the CI runs the tests sequentially. Running them in parallel might be useful too. What do you think?
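The script above leaves the copied test files in test/parallel once it exits. As a minimal sketch of one way to tidy up, the copy step could be paired with a matching cleanup step. The function names `make_copies` and `remove_copies` are hypothetical and are not part of the Node.js tooling.

```shell
#!/bin/bash
# Sketch only: make_copies and remove_copies are hypothetical helpers.

# Copy <base_path>/<test_name>.js to <test_name>-1.js ... <test_name>-N.js.
make_copies() {
  local base_path=$1 test_name=$2 copies=$3 i
  for i in $(seq 1 "$copies"); do
    cp "$base_path/$test_name.js" "$base_path/$test_name-$i.js"
  done
}

# Remove the numbered copies again, leaving the original test in place.
remove_copies() {
  local base_path=$1 test_name=$2 copies=$3 i
  for i in $(seq 1 "$copies"); do
    rm -f "$base_path/$test_name-$i.js"
  done
}
```

Calling `make_copies "$BASE_PATH" "$TEST" "$COPIES"` before the loop and registering `trap 'remove_copies "$BASE_PATH" "$TEST" "$COPIES"' EXIT` would then remove the copies even if the loop is interrupted.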

@Trott
Member

Trott commented Jan 26, 2016

I started to write up a lengthy pros-and-cons comment for this, but then realized that everything I was saying boiled down to a simple statement:

The two pieces of information needed on this would be "how valuable would this feature be" and "how difficult would it be to implement on Jenkins"? I don't know the answer to either of these, which makes me unqualified to have an opinion. All I know is that I've never personally needed this, but I can imagine there is value.

Anyway, mostly commenting to see if others have opinions.

@orangemocha
Contributor

While I see the value of being able to stress parallel test runs, I am not sure if this strategy would be very effective. For one, tests in the parallel folder are assumed to be safe to run in parallel with other tests, but likely not with themselves.

Does the test runner output tell us which tests are running concurrently (at the time of a failure)? If not, that's probably low hanging fruit.

@santigimeno
Member Author

Thanks for the feedback.
I understand that some tests in the parallel folder might not work well when run concurrently with copies of themselves, but shouldn't they? What I'm trying to say is that lots of tests are quite similar to each other in the sense that they handle system resources (create a socket and bind it to a port, create temporary files, etc.) and there are no collisions among them. Why shouldn't this work for the same test? If a test can't be run in parallel with itself, it should probably be moved to sequential, as there could be another similar test that tries to use the same resource. I hope I'm making some kind of sense...

@orangemocha
Contributor

Why shouldn't this work for the same test?

I remember a few instances of tests that make an assumption about a resource (e.g. file name, pipe name) that is unique to the particular test, and I think we have considered that enough to make the test parallelizable. The goal of running tests in parallel was to speed up a test run, so we haven't had a need to run tests in parallel with themselves. If we had that need, I suppose we could make the 'parallelizable' criteria more stringent and move a few tests back to sequential.

Pinging @jbergstroem who knows more about parallel tests.

@jbergstroem
Member

If run in parallel, tests should use their own subfolders for sockets, temporary files and similar. Not sure about common.PORT tests though. I guess it would be a good way to flush out tests that don't support it?

@Trott Trott mentioned this issue Jan 28, 2016
@santigimeno
Member Author

I've read the last meeting minutes regarding this issue. I'd be really interested in doing this. How should we proceed from here? Alexis mentioned opening an issue in the build repository. Should I do that, or should someone else? Thanks

@Trott
Member

Trott commented Feb 8, 2016

@santigimeno I don't know if it matters who opens the issue. I would say you should open it and others can chime in as needed.

@orangemocha
Contributor

👍

@santigimeno
Member Author

Issue @ build group opened here: nodejs/build#331

@joaocgreis
Member

Current scripts in use by https://ci.nodejs.org/job/node-stress-single-test/ :

Windows

call vcbuild.bat release nosign x64
if %errorlevel% neq 0 exit /b %errorlevel%

setlocal EnableDelayedExpansion

set OK=0
set NOK=0
for /l %%i in (1,1,%RUN_TIMES%) do (
  python tools\test.py -p tap %RUN_TESTS% && set /a OK=!OK!+1 || set /a NOK=!NOK!+1
  echo %%i   OK: !OK!   NOT OK: !NOK!
)

echo on
rm -rfv test/tmp* || true
git clean -fdx || true

if %NOK% NEQ 0 (
  echo The test is flaky.
  exit /b 1
)

All others:

#!/usr/bin/env bash -ex

python ./configure
# centos5 defaults to an old python if PYTHON is not defined
# freebsd needs NPROCESSORS_ONLN without underscore
# @TODO: jbergstroem - this should use the environment variable $JOBS
# once available on all slaves
PYTHON=python make -j $(getconf _NPROCESSORS_ONLN || getconf NPROCESSORS_ONLN)
PYTHON=python make build-addons

echo "Running test ${RUN_TIMES} times" 

OK=0
NOK=0
for i in `seq $RUN_TIMES`; do
  python tools/test.py -p tap --mode=release $RUN_TESTS && OK=$(($OK+1)) || NOK=$(($NOK+1))
  echo "$i   OK: $OK   NOT OK: $NOK"
done

if [ "$NOK" != "0" ]; then
  echo The test is flaky.
  exit 1
fi

@Trott
Member

Trott commented Mar 2, 2017

It's not particularly obvious I suppose, but if you pass something like this to RUN_TESTS, it will run the test 8 times in parallel:

-j8 parallel/test-https-agent-create-connection parallel/test-https-agent-create-connection parallel/test-https-agent-create-connection parallel/test-https-agent-create-connection parallel/test-https-agent-create-connection parallel/test-https-agent-create-connection parallel/test-https-agent-create-connection parallel/test-https-agent-create-connection
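Typing the same test path eight times by hand is error-prone, so the value could be generated instead. The following is a rough sketch, assuming RUN_TESTS is passed to tools/test.py unchanged as in the scripts above. `build_run_tests` is a hypothetical helper name, and whether the runner really executes the duplicated entries concurrently is questioned later in this thread.

```shell
#!/bin/bash
# Sketch only: build_run_tests is a hypothetical helper.

# Print "-j<N>" followed by the same test path repeated N times.
build_run_tests() {
  local copies=$1 test_path=$2 i
  local args="-j$copies"
  for i in $(seq 1 "$copies"); do
    args="$args $test_path"
  done
  printf '%s\n' "$args"
}
```

For example, `build_run_tests 8 parallel/test-https-agent-create-connection` prints a string of the same shape as the one above, which can be pasted into the RUN_TESTS field.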

@gibfahn
Member

gibfahn commented Mar 2, 2017

@Trott couldn't you also do this?

-J --repeat=1000 parallel/test-https-agent-create-connection

@Trott
Member

Trott commented Mar 2, 2017

@gibfahn In my experience, no. It will use multiple processes (if you are on a multi-processor host) to run the tests you supply (in this case just one test), and after it is done, it will repeat until it runs 1000 times. But it will not run multiple repetitions at once, if that makes any sense. But feel free to verify. Stuff may have changed or I could just be flat out wrong.

@Trott
Member

Trott commented Mar 2, 2017

@gibfahn Actually, I just tested it and it seems to work as you suggest and not as I describe. Woot!

@Trott
Member

Trott commented Mar 2, 2017

@gibfahn One caveat is that it will only work for tests in parallel but that's all we're usually concerned about anyway. If you ever do want/need to run a test in (say) sequential in parallel (maybe to test if it might be OK to move it out of sequential), then the wonky approach I suggest is the way to go. Er, or just move the test to parallel and run it from there. :-D

@Trott
Member

Trott commented Mar 2, 2017

Bleargh! I don't think my last comment is right either (in that I don't think my wonky suggestion will cause the test to run in parallel either). I'm going to stop now.

@Trott Trott removed the wg-agenda label Sep 28, 2017
@Trott Trott closed this as completed Sep 28, 2017
@Trott
Member

Trott commented Sep 28, 2017

I think this can be closed but re-open if I'm wrong...
