Commit
37390: roachprod: remove monitor netcat command r=ajkr a=ajkr

`roachprod monitor` assumes `nc` will exit as soon as Cockroach server exits. This actually is not the case in later versions of netcat (tested on Ubuntu 18.04+). This PR changes to a polling approach calling `kill -0` once per second to monitor the Cockroach server's liveness. This should give us better portability and we verified the overhead is low (~0.65ms of a CPU core's time per `kill` invocation).

Tested by running `roachprod monitor` locally, gradually killing the nodes, and observing the output:

```
3: 28342
1: 28176
2: 28257
3: kill exited nonzero
3: dead
2: kill exited nonzero
2: dead
1: kill exited nonzero
1: dead
```

Fixes #37370.

Release note: None

Co-authored-by: Andrew Kryczka <[email protected]>
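
For illustration, the polling approach described above might look roughly like the sketch below. This is not the code from the PR: the function name `monitor_pid` and its optional interval argument are hypothetical, and the printed messages are modeled on the sample output above, without the per-node prefix.

```shell
#!/bin/sh
# Hypothetical sketch, not the actual roachprod implementation.
# Usage: monitor_pid PID [INTERVAL_SECONDS]
# Prints the PID, polls it until the process is gone, then reports it dead.
monitor_pid() {
  pid=$1
  interval=${2:-1}   # poll once per second by default, as in the commit message

  echo "$pid"

  # `kill -0` delivers no signal; it only checks whether the PID exists
  # and whether the caller is allowed to signal it.
  while kill -0 "$pid" 2>/dev/null; do
    sleep "$interval"
  done

  echo "kill exited nonzero"
  echo "dead"
}
```

One caveat with this sketch: `kill -0` also fails when the caller lacks permission to signal the process, so it treats "not permitted" the same as "dead". It also keeps succeeding for a zombie that its parent has not yet reaped.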