[TEST][NO-MERGE] Stress test sockets #381

Draft
wants to merge 8 commits into
base: master

@bell-db bell-db commented Sep 23, 2024

The nc-based communication introduced in #367 is more reliable than the named-pipe-based communication, but over 100k runs it can still occasionally get stuck or return mismatched messages.

Out of 100k runs, we had 8 timeouts, 20 empty responses, and 4 Bad Request responses (an example is shown below). This is better than with named pipes (70 timeouts out of 100k runs) but still pretty unreliable.

git clone git@github.com:bell-db/protoc-bridge.git
cd protoc-bridge
git fetch origin bell-db/v0.9.7-socket-stress-test:bell-db/v0.9.7-socket-stress-test
git checkout bell-db/v0.9.7-socket-stress-test
sbt "testOnly protocbridge.frontend.MacPluginFrontendSpec"
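
For scale, the counts reported above can be turned into failure rates. This is only a small arithmetic sketch; the dictionary keys and the `failure_rate` helper are illustrative names, not part of the test suite, and the named-pipe figure covers timeouts only because that is the only count reported for it.

```python
RUNS = 100_000

# Failure counts as reported in this description.
nc_failures = {"timeout": 8, "empty response": 20, "bad request": 4}
named_pipe_failures = {"timeout": 70}


def failure_rate(failures: int, runs: int) -> float:
    """Fraction of runs that failed."""
    return failures / runs


nc_total = sum(nc_failures.values())
pipe_total = sum(named_pipe_failures.values())

print(f"nc:          {nc_total} failures, {failure_rate(nc_total, RUNS):.3%}")
print(f"named pipes: {pipe_total} failures, {failure_rate(pipe_total, RUNS):.3%}")
```

So the nc-based approach fails in roughly 1 of every 3,000 runs, compared with roughly 1 of every 1,400 for named pipes.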

Moreover, sometimes the returned message is pretty confusing:

SSH-2.0-OpenSSH_9.3p1 Ubuntu-1ubuntu3.2
HTTP/1.1 400 Bad Request
Content-Type: text/plain; charset=utf-8
Connection: close

400 Bad Request

The same test (with nc -N) can pass on Linux (Ubuntu 20.04.6 LTS, Xeon(R) Platinum 8375C).

Port conflict

It turns out that on macOS (14.6.1, M2 Max), new ServerSocket(0) (or Python sock.bind(('', 0))) might return a port that is already in use by another process, even one that has been listening for a while, so this is not a race condition in allocation. This can be confirmed with:

sbt "testOnly protocbridge.frontend.SocketAllocationSpec"
python3 port_conflict.py

The same commands succeed with no conflicts found on Linux (Ubuntu 20.04.6 LTS, Xeon(R) Platinum 8375C).
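
As a rough illustration of how such a conflict could be detected, here is a minimal sketch. It is not the port_conflict.py script from the branch; the function names and the probing strategy are assumptions made for illustration. It binds with `sock.bind(('', 0))` as described above, then, before calling `listen()`, tries to connect to the allocated port on 127.0.0.1. Since the freshly bound socket is not listening yet, a successful connection suggests that some other socket is already accepting connections on that port.

```python
import socket


def allocate_and_probe(timeout: float = 0.2) -> tuple[int, bool]:
    """Bind to an OS-chosen port and check whether something else answers on it.

    Returns (port, conflict). Hypothetical helper, for illustration only.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("", 0))  # same allocation call as in the description
        port = server.getsockname()[1]
        # `server` has not called listen(), so a successful connect here
        # means another socket is accepting connections on this port.
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            probe.settimeout(timeout)
            conflict = probe.connect_ex(("127.0.0.1", port)) == 0
    return port, conflict


def find_conflicts(attempts: int = 1000) -> list[int]:
    """Repeat the allocation and collect every port that looked already in use."""
    conflicts = []
    for _ in range(attempts):
        port, conflict = allocate_and_probe()
        if conflict:
            conflicts.append(port)
    return conflicts


if __name__ == "__main__":
    found = find_conflicts()
    print(f"{len(found)} conflicting allocations: {sorted(set(found))}")
```

If the behaviour described above holds, a run on an affected macOS machine would be expected to report some conflicting ports, while a run on Linux would be expected to report none.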

Pop-up

There are reports that nc (invoked from Bazel) can trigger a macOS pop-up asking "Do you want the application "nc" to accept incoming network connections?", or can fail with nc: connectx to 127.0.0.1 port 59416 (tcp) failed: Operation not permitted. It is not yet clear how to reproduce this issue reliably.
