test: remove error message checking in test-worker-init-failure #34727

Merged on Aug 14, 2020

Conversation

Trott
Member

@Trott Trott commented Aug 11, 2020

Let the check for the error code suffice and don't check for a
particular string in the message.

Fixes: #33759

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • commit message follows commit guidelines

@nodejs-github-bot nodejs-github-bot added the test Issues and PRs related to the tests. label Aug 11, 2020
@Trott Trott added the request-ci Add this label to start a Jenkins CI on a PR. label Aug 11, 2020
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Aug 11, 2020

@@ -35,7 +35,6 @@ if (process.argv[2] === 'child') {
 // (i.e. single cpu) `ulimit` may not lead to such an error.

 worker.on('error', (e) => {
-  assert.match(e.message, /EMFILE/);
   assert.ok(e.code === 'ERR_WORKER_INIT_FAILED' || e.code === 'EMFILE');
Member

I am not sure if this will work: as in the error site in the related failure incident, e.code is going to be ENOENT?

Member

And the worker did not fail in its init but at runtime, so the assertion at line 38 would fail anyway?

Member Author

I guess putting Fixes: in the commit message was a bit hasty. I still think we should probably remove the message sniffing, even if it doesn't resolve the test flakiness.

@Trott
Member Author

Trott commented Aug 11, 2020

In trying to test this, the test fails when run in parallel with itself. I can make it fail with ENFILE that way pretty easily. I'm wondering if the real problem is that this test is somewhat resource-intensive and needs to be either dialed down a little bit or else moved to sequential.

@gireeshpunathil
Member

assert.ok(e.code === 'ERR_WORKER_INIT_FAILED' || e.code === 'EMFILE' || e.code === 'ENOENT');

@Trott - are you able to test (and cause it to fail) with this change to the test's assertion?

@Trott
Member Author

Trott commented Aug 11, 2020

assert.ok(e.code === 'ERR_WORKER_INIT_FAILED' || e.code === 'EMFILE' || e.code === 'ENOENT');

@Trott - are you able to test (and cause it to fail) with this change to the test's assertion?

Unfortunately, no. The errors are varied.

tools/test.py -j 64 --repeat 192 test/parallel/test-worker-init-failure.js
=== release test-worker-init-failure ===                   
Path: parallel/test-worker-init-failure
child stdout: 

child stderr: (libuv) kqueue(): Too many open files in system
net.js:328
      err = this._handle.open(fd);
                         ^

Error: ENFILE: file table overflow, uv_pipe_open
    at new Socket (net.js:328:26)
    at createWritableStdioStream (internal/bootstrap/switches/is_main_thread.js:72:18)
    at process.getStdout [as stdout] (internal/bootstrap/switches/is_main_thread.js:122:12)
    at new Worker (internal/worker.js:179:42)
    at Object.<anonymous> (/Users/trott/io.js/test/parallel/test-worker-init-failure.js:25:20)
    at Module._compile (internal/modules/cjs/loader.js:1265:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1286:10)
    at Module.load (internal/modules/cjs/loader.js:1114:32)
    at Function.Module._load (internal/modules/cjs/loader.js:976:14)
    at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:60:12) {
  errno: -23,
  code: 'ENFILE',
  syscall: 'uv_pipe_open'
}
assert.js:103
  throw new AssertionError(obj);
  ^

AssertionError [ERR_ASSERTION]: Expected values to be strictly equal:

1 !== 0

    at ChildProcess.<anonymous> (/Users/trott/io.js/test/parallel/test-worker-init-failure.js:63:12)
    at ChildProcess.<anonymous> (/Users/trott/io.js/test/common/index.js:365:15)
    at ChildProcess.emit (events.js:314:20)
    at Process.ChildProcess._handle.onexit (internal/child_process.js:276:12) {
  generatedMessage: true,
  code: 'ERR_ASSERTION',
  actual: 1,
  expected: 0,
  operator: 'strictEqual'
}
Command: out/Release/node /Users/trott/io.js/test/parallel/test-worker-init-failure.js
=== release test-worker-init-failure ===                   
Path: parallel/test-worker-init-failure
child stdout: 

child stderr: /Users/trott/io.js/out/Release/node[55368]: ../src/tracing/agent.cc:55:node::tracing::Agent::Agent(): Assertion `(uv_loop_init(&tracing_loop_)) == (0)' failed.
 1: 0x1000ae755 node::Abort() [/Users/trott/io.js/out/Release/node]
 2: 0x1000ae5c1 node::Assert(node::AssertionInfo const&) [/Users/trott/io.js/out/Release/node]
 3: 0x10016e028 node::tracing::Agent::Agent() [/Users/trott/io.js/out/Release/node]
 4: 0x1000839a6 node::V8Platform::Initialize(int) [/Users/trott/io.js/out/Release/node]
 5: 0x1000834c3 node::InitializeOncePerProcess(int, char**) [/Users/trott/io.js/out/Release/node]
 6: 0x100083bce node::Start(int, char**) [/Users/trott/io.js/out/Release/node]
 7: 0x7fff6ddbdcc9 start [/usr/lib/system/libdyld.dylib]
 8: 0x3 
/bin/sh: line 1: 55368 Abort trap: 6           /Users/trott/io.js/out/Release/node /Users/trott/io.js/test/parallel/test-worker-init-failure.js child
assert.js:103
  throw new AssertionError(obj);
  ^

AssertionError [ERR_ASSERTION]: Expected values to be strictly equal:

134 !== 0

    at ChildProcess.<anonymous> (/Users/trott/io.js/test/parallel/test-worker-init-failure.js:63:12)
    at ChildProcess.<anonymous> (/Users/trott/io.js/test/common/index.js:365:15)
    at ChildProcess.emit (events.js:314:20)
    at Process.ChildProcess._handle.onexit (internal/child_process.js:276:12) {
  generatedMessage: true,
  code: 'ERR_ASSERTION',
  actual: 134,
  expected: 0,
  operator: 'strictEqual'
}
Command: out/Release/node /Users/trott/io.js/test/parallel/test-worker-init-failure.js
=== release test-worker-init-failure ===                   
Path: parallel/test-worker-init-failure
Command: out/Release/node /Users/trott/io.js/test/parallel/test-worker-init-failure.js
--- CRASHED (Signal: 11) ---
=== release test-worker-init-failure ===                   
Path: parallel/test-worker-init-failure
out/Release/node[54987]: ../src/tracing/agent.cc:55:node::tracing::Agent::Agent(): Assertion `(uv_loop_init(&tracing_loop_)) == (0)' failed.
 1: 0x1000ae755 node::Abort() [out/Release/node]
 2: 0x1000ae5c1 node::Assert(node::AssertionInfo const&) [out/Release/node]
 3: 0x10016e028 node::tracing::Agent::Agent() [out/Release/node]
 4: 0x1000839a6 node::V8Platform::Initialize(int) [out/Release/node]
 5: 0x1000834c3 node::InitializeOncePerProcess(int, char**) [out/Release/node]
 6: 0x100083bce node::Start(int, char**) [out/Release/node]
 7: 0x7fff6ddbdcc9 start [/usr/lib/system/libdyld.dylib]
Command: out/Release/node /Users/trott/io.js/test/parallel/test-worker-init-failure.js
--- CRASHED (Signal: 6) ---
=== release test-worker-init-failure ===                   
Path: parallel/test-worker-init-failure
child stdout: 

child stderr: /bin/sh: line 1: 55006 Abort trap: 6           /Users/trott/io.js/out/Release/node /Users/trott/io.js/test/parallel/test-worker-init-failure.js child
assert.js:103
  throw new AssertionError(obj);
  ^

AssertionError [ERR_ASSERTION]: Expected values to be strictly equal:

134 !== 0

    at ChildProcess.<anonymous> (/Users/trott/io.js/test/parallel/test-worker-init-failure.js:63:12)
    at ChildProcess.<anonymous> (/Users/trott/io.js/test/common/index.js:365:15)
    at ChildProcess.emit (events.js:314:20)
    at Process.ChildProcess._handle.onexit (internal/child_process.js:276:12) {
  generatedMessage: true,
  code: 'ERR_ASSERTION',
  actual: 134,
  expected: 0,
  operator: 'strictEqual'
}
Command: out/Release/node /Users/trott/io.js/test/parallel/test-worker-init-failure.js
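
A note on the exit statuses in the log above: the child is launched through `/bin/sh`, and a shell conventionally reports a command killed by signal N as status 128 + N. That is why the `Abort trap: 6` lines are paired with `134 !== 0`. The sketch below decodes such a status; `describeExitStatus` is a hypothetical helper for illustration, not something from the test suite.

```javascript
'use strict';

// Hypothetical helper: interpret an exit status as reported by a shell.
// Assumption: statuses above 128 follow the 128 + signal-number convention.
// A program could also call exit() with such a value itself, so treat the
// result as a hint rather than proof.
function describeExitStatus(status) {
  if (status > 128) {
    return { kind: 'signal', signal: status - 128 };
  }
  return { kind: 'exit', code: status };
}

console.log(describeExitStatus(134)); // { kind: 'signal', signal: 6 }
console.log(describeExitStatus(1));   // { kind: 'exit', code: 1 }
```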

Trott added a commit to Trott/io.js that referenced this pull request Aug 11, 2020
Unfortunately, the test is sensitive to resource constraints and is
unreliable on macOS in CI when in parallel.

Fixes: nodejs#34727
@gireeshpunathil
Member

@Trott - thanks. While the test did not expect these failures, under resource-constrained execution they are absolutely meaningful.

The first one (Error: ENFILE: file table overflow, uv_pipe_open) is another manifestation of a libuv failure when fds run out.

The second one (../src/tracing/agent.cc:55:node::tracing::Agent::Agent(): Assertion `(uv_loop_init(&tracing_loop_)) == (0)' failed.) is a worker failure scenario that is not covered under #31621.

IMO the first one can be accommodated in the test, while the second one should be fixed in the tracing agent.
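
One possible way to accommodate the first failure, sketched under assumptions: in the stack trace posted earlier, the ENFILE error is thrown synchronously from `new Worker` rather than delivered through the `'error'` event, so a test would likely need a `try`/`catch` around construction. The names `isFdExhaustion` and `tryCreate` are hypothetical, and whether the real test should tolerate these codes is a call for the maintainers.

```javascript
'use strict';

// Assumption: error codes that indicate file-descriptor exhaustion.
const FD_EXHAUSTION_CODES = new Set(['EMFILE', 'ENFILE']);

// Hypothetical helper: true when the error looks like fd exhaustion.
function isFdExhaustion(err) {
  return err != null && FD_EXHAUSTION_CODES.has(err.code);
}

// Hypothetical wrapper: run `factory` (for example, () => new Worker(file))
// and return null if it throws synchronously because fds ran out.
// Any other error is rethrown unchanged.
function tryCreate(factory) {
  try {
    return factory();
  } catch (err) {
    if (isFdExhaustion(err)) return null;
    throw err;
  }
}

// Usage with a stand-in factory that simulates the failure from the log.
const result = tryCreate(() => {
  throw Object.assign(new Error('ENFILE: file table overflow, uv_pipe_open'), {
    code: 'ENFILE',
  });
});
console.log(result); // null
```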

@Trott Trott added commit-queue Add this label to land a pull request using GitHub Actions. and removed commit-queue Add this label to land a pull request using GitHub Actions. labels Aug 14, 2020
@Trott
Member Author

Trott commented Aug 14, 2020

I removed "Fixes: " from the commit message.

Landed in 9861962

@Trott Trott closed this Aug 14, 2020
@Trott Trott deleted the test-fix branch August 14, 2020 02:29
@Trott Trott merged commit 9861962 into nodejs:master Aug 14, 2020
Trott added a commit to Trott/io.js that referenced this pull request Aug 16, 2020
Refs: nodejs#34727 (comment)

PR-URL: nodejs#34769
Reviewed-By: Gireesh Punathil <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Denys Otrishko <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Ricky Zhou <[email protected]>
MylesBorins pushed a commit that referenced this pull request Aug 17, 2020
Let the check for the error code suffice and don't check for a
particular string in the message.

PR-URL: #34727
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: James M Snell <[email protected]>
MylesBorins pushed a commit that referenced this pull request Aug 17, 2020
Refs: #34727 (comment)

PR-URL: #34769
Reviewed-By: Gireesh Punathil <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Denys Otrishko <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Ricky Zhou <[email protected]>
@danielleadams danielleadams mentioned this pull request Aug 20, 2020
BethGriggs pushed a commit that referenced this pull request Aug 20, 2020
Let the check for the error code suffice and don't check for a
particular string in the message.

PR-URL: #34727
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: James M Snell <[email protected]>
BethGriggs pushed a commit that referenced this pull request Aug 20, 2020
Refs: #34727 (comment)

PR-URL: #34769
Reviewed-By: Gireesh Punathil <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Denys Otrishko <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Ricky Zhou <[email protected]>
addaleax pushed a commit that referenced this pull request Sep 22, 2020
Let the check for the error code suffice and don't check for a
particular string in the message.

PR-URL: #34727
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: James M Snell <[email protected]>
addaleax pushed a commit that referenced this pull request Sep 22, 2020
Refs: #34727 (comment)

PR-URL: #34769
Reviewed-By: Gireesh Punathil <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Denys Otrishko <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Ricky Zhou <[email protected]>
@codebytere codebytere mentioned this pull request Sep 28, 2020
Labels
test Issues and PRs related to the tests.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Investigate flaky test-worker-init-failure
5 participants