nixos/test-driver: Fix thread cleanup and execute #142560

dasJ · 2021-10-22T11:05:33Z

Motivation for this change

The unstable channel is blocked by the test runner hanging in cleanup.
This is not my change: it was written by @K900. I don't fully understand why it fixes the problem, but it seems to make the test runner exit, which it currently does not do.

Things done

K900 · 2021-10-22T11:07:52Z

This seems to resolve a deadlock when shutting down a test while the QEMU process is hanging (which it seems to be doing consistently on Hydra, and not anywhere else I've tested). Also a few assorted cleanups to ensure we can't get deadlocked receiving outputs from a command.

dasJ · 2021-10-22T11:10:06Z

@GrahamcOfBorg test simple switchTest

Synthetica9 · 2021-10-22T12:33:42Z

nixos/lib/test-driver/test-driver.py

-                output += match[1]
-                status_code = int(match[2])
-                return (status_code, output)
+            chunk = self.shell.recv(4096)
            output += chunk


Oof, we may also want to do something about the quadratic complexity here

Ideally we'd want some sort of an incremental matcher, but in my limited testing, it seems like it's not really an issue - outputs don't generally exceed 4k, and even when they do, they don't exceed 8k.

As a "solution" we could split this into two commands:

f'( set -euo pipefail; {command} ); echo "$?" > /.nixos-test-runner-exit-code\n'
"cat /.nixos-test-runner-exit-code"

This way we can get rid of the entire matching stuff

If we do that, we could just use echo $? as the second command.
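For illustration, the two-command idea from this thread could be sketched as below. The helper name `build_exit_code_commands` and its `exit_code_file` parameter are hypothetical and not part of the test driver; the sketch only builds the strings that would be sent to the guest shell, and it does not address how the reader knows that the first command's output is complete.

```python
from typing import Tuple


def build_exit_code_commands(
    command: str,
    exit_code_file: str = "/.nixos-test-runner-exit-code",
) -> Tuple[str, str]:
    """Build the two shell lines discussed above (hypothetical helper).

    The first line runs the command in a subshell and stores its exit
    status in a file; the second line reads that file back.
    """
    # Single quotes around the f-string so the inner "$?" needs no escaping.
    run_cmd = f'( set -euo pipefail; {command} ); echo "$?" > {exit_code_file}\n'
    read_cmd = f"cat {exit_code_file}\n"
    return run_cmd, read_cmd
```

With this shape the exit status travels through the guest's file system instead of being matched out of the command's own output stream.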

Synthetica9 · 2021-10-22T12:49:41Z

nixos/lib/test-driver/test-driver.py


        while True:
-            chunk = self.shell.recv(4096).decode(errors="ignore")


Also, is there a reason we are no longer ignoring errors?

It shouldn't be necessary if we can no longer end up on a character boundary.

I guess there's a technicality here: a command could potentially output complete garbage, but in that case I'd much rather have it crash than return something that makes no sense.
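A small sketch of the character-boundary point, assuming the new code accumulates raw bytes and decodes them once at the end (the diff above only shows the `recv` call, so that is an assumption). The function names are illustrative and not taken from the test driver.

```python
import codecs
from typing import Iterable


def decode_per_chunk(chunks: Iterable[bytes]) -> str:
    # Old approach: decode every recv() result on its own and ignore errors.
    # A multi-byte character split across two chunks is silently dropped.
    return "".join(chunk.decode(errors="ignore") for chunk in chunks)


def decode_whole(chunks: Iterable[bytes]) -> str:
    # Collect the raw bytes first and decode once. Chunk boundaries no
    # longer matter, and invalid UTF-8 raises UnicodeDecodeError.
    return b"".join(chunks).decode()


def decode_incremental(chunks: Iterable[bytes]) -> str:
    # Alternative: a stateful decoder that buffers an incomplete character
    # until the rest of it arrives, while still rejecting invalid input.
    decoder = codecs.getincrementaldecoder("utf-8")()
    text = "".join(decoder.decode(chunk) for chunk in chunks)
    return text + decoder.decode(b"", final=True)


data = "naïve ✓".encode()
# Split in the middle of the two-byte encoding of "ï".
split_chunks = [data[:3], data[3:]]
```

Here `decode_per_chunk(split_chunks)` loses the "ï", while the other two functions return the original string. Output that is not valid UTF-8 at all makes both strict variants raise, which matches the "rather have it crash" preference stated above.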

dasJ · 2021-10-23T16:49:44Z

The issue that is blocking the channel is fixed in #142675. We can refactor execute() later.

dasJ · 2021-10-24T13:50:38Z

Closing this since I have now reimplemented the execute() refactor in #142747. This time there should be no quadratic complexity because we use a shifting window.
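To illustrate the general idea of a shifting window, here is a minimal sketch. It is not the code from #142747: the marker format, the `MAX_MARKER_LEN` bound, and the `read_until_marker` name are all assumptions made for this example.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical end-of-command marker: "|!=EOF <status>\n".
MARKER = re.compile(rb"\|!=EOF\s+(\d+)\n")
# Assumed upper bound on the marker's length in bytes.
MAX_MARKER_LEN = 32


def read_until_marker(recv: Callable[[int], bytes]) -> Tuple[int, bytes]:
    """Read from recv() until the marker appears; return (status, output)."""
    done: List[bytes] = []  # output that can no longer contain a marker
    window = b""            # tail that may still hold a partial marker
    while True:
        chunk = recv(4096)
        if not chunk:
            raise ConnectionError("stream closed before the marker arrived")
        window += chunk
        match = MARKER.search(window)
        if match:
            done.append(window[: match.start()])
            return int(match.group(1)), b"".join(done)
        # Only the last MAX_MARKER_LEN - 1 bytes can be the start of a
        # marker that is still incomplete, so everything before them is
        # moved out of the window and never scanned again.
        keep = MAX_MARKER_LEN - 1
        if len(window) > keep:
            done.append(window[:-keep])
            window = window[-keep:]
```

Each search covers at most one chunk plus a short tail, so the total work grows linearly with the output size instead of rescanning the whole accumulated buffer after every `recv`.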

nixos/test-driver: Fix thread cleanup and execute

5ff3079

dasJ requested a review from tfc as a code owner October 22, 2021 11:05

github-actions bot added the 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS label Oct 22, 2021

dasJ mentioned this pull request Oct 22, 2021

nixos/test-runner: Make less flakey #142498

Closed

12 tasks

dasJ added 6.topic: testing Tooling for automated testing of packages and modules 1.severity: channel blocker Blocks a channel labels Oct 22, 2021

ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10 labels Oct 22, 2021

Synthetica9 reviewed Oct 22, 2021

dasJ mentioned this pull request Oct 23, 2021

nixos/test-runner: Fix thread cleanup #142675

Merged

12 tasks

dasJ removed the 1.severity: channel blocker Blocks a channel label Oct 23, 2021

dasJ closed this Oct 24, 2021

dasJ deleted the fix/test-runner-cleanup branch October 24, 2021 13:50
