Process does not exit after https request with keepAlive enabled (now default in Node 19) #47228
Comments
Confirmed. |
@nodejs/http |
I should clarify; in my testing where I found this bug (in dprint's npm install script), it does eventually exit at exactly 2 minutes. So there is a timeout of some sort, somewhere, if that provides any guesses as to what's going on. (Of course, npm install taking an extra 2 minutes isn't so great 😄) |
The sockets in the keepalive pool should probably be "unreffed". https://nodejs.org/api/net.html#socketunref |
http.Agent#keepSocketAlive() does just that: Line 485 in 8200304
https.Agent inherits that behavior from http.Agent. There's #47137 (comment) (tl;dr wrong C++ method called for TLS sockets) although that issue is more or less the exact opposite of what's reported here. |
Node 20 recently branched off, so this bug seems like it's about to become LTS and cause a lot of hangs. Is there anything I can provide to help try and get this fixed? Any pointers as to what might be the problem in case I can go and try to fix it myself? |
Let me know if I can pick up this bug. |
@jagadeeshmeesala go for it. |
I've been trying to figure this one out. It turns out that Manually calling |
If anyone else is taking a look, I've gotten this to reproduce in a test:

```js
'use strict';
const common = require('../common');
const cp = require('child_process');
const http = require('http');

if (process.argv[2] === 'server') {
  // Child process: a minimal HTTP server that reports its port to the parent.
  process.on('disconnect', () => process.exit(0));

  const server = http.createServer((req, res) => {
    res.writeHead(200);
    res.end();
  });

  server.listen(0, () => {
    process.send(server.address().port);
  });
} else {
  // Parent process: make one request through a keep-alive agent.
  const serverProcess = cp.fork(__filename, ['server'], { stdio: ['ignore', 'ignore', 'ignore', 'ipc'] });

  serverProcess.once('message', common.mustCall((port) => {
    serverProcess.channel.unref();
    serverProcess.unref();

    const agent = new http.Agent({ keepAlive: true });
    http.get({ host: common.localhostIPv4, port, agent }, common.mustCall());
  }));

  // If any sockets are left open, we'll hit the below instead of exiting.
  setTimeout(common.mustNotCall(), common.platformTimeout(3000)).unref();
}
```

The test also confirms that simply |
After hitting my head against this all weekend, I don't really know the way forward here. Someone has to call I would have thought |
@ShogunPanda wdyt? I think we should have the Agent call unref() once the socket enters the pool. |
@mcollina I'm not sure about it. I think we should not do this automatically but rather let the user opt in to this behavior. My concern is that this change would make applications exit at the wrong time if people are mismanaging events. |
Started looking into this. |
This bug is also affecting many GitHub Actions that run on, or are switching to, Node 20, silently slowing them down by approximately 2 minutes.
The default for keepAlive on https request changed from false to true in Node 19 nodejs/node#47228
It looks like the behavior differs depending on the remote server (maybe the Keep-Alive hint header is the culprit?). Given
On my laptop (mac, m1, node 20): Requests to neverssl.com cause node to exit after 5 seconds:
Whereas requests to example.com don't exit for 180 seconds:
If I remove the if condition within Edit: Trying to resolve this was driving me mad. I was just forgetting to resume the request. |
The problem is that the response is not consumed.
exits right away. MDN says:
But the node http.ClientRequest docs say:
... without a carve-out for HEAD. Should HEAD requests always have resume() called on the agent's incoming response? |
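To illustrate the point about unconsumed responses, here is a minimal sketch of a HEAD request that drains its response explicitly. It runs against a throwaway local server rather than a real site. The `head` helper, the `demo` promise, and the loopback address are names made up for this example, not anything from the thread or from Node itself.

```javascript
'use strict';
const http = require('node:http');

// Hypothetical helper: send a HEAD request and resolve with the status code.
function head(options) {
  return new Promise((resolve, reject) => {
    const req = http.request({ ...options, method: 'HEAD' }, (res) => {
      // As discussed above, the response is consumed explicitly so that
      // 'end' fires and the agent can take the socket back.
      res.resume();
      res.on('end', () => resolve(res.statusCode));
    });
    req.on('error', reject);
    req.end();
  });
}

// Demo against a throwaway local server with a keep-alive agent.
const server = http.createServer((req, res) => res.end());
const agent = new http.Agent({ keepAlive: true });

const demo = new Promise((resolve, reject) => {
  server.listen(0, '127.0.0.1', () => {
    const { port } = server.address();
    head({ host: '127.0.0.1', port, agent })
      .then(resolve, reject)
      .finally(() => {
        agent.destroy(); // close pooled sockets so nothing lingers
        server.close();
      });
  });
});

demo.then((status) => console.log('status', status));
```

The explicit `agent.destroy()` only keeps the demo tidy; the part relevant to this issue is the `res.resume()` call in the response handler.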
Similarly to HEAD, what about 3xx's (or other requests with no body)? Should the socket return to freeSockets pool and get unref'd automatically? Pulling out relevant parts of the dprint install script referenced by @jakebailey at the time:
hangs for about 5 seconds after "done" is printed. But if the https.get response handler "consumes" the initial redirect response:
then node exits immediately after printing "done". Edit: 3xx may return a response body. So either automatic resume would need to be based on Content-Length -- or it should be up to the caller to remember. |
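The redirect case can be sketched as follows, assuming a caller that follows `Location` headers by hand. This is not the dprint install script. `getFollowingRedirects`, the `/file` path, and the local test server are hypothetical and exist only to make the example self-contained.

```javascript
'use strict';
const http = require('node:http');

// Hypothetical helper: GET a URL and follow redirects. Each intermediate
// 3xx response is drained with resume() before the next request is made.
function getFollowingRedirects(url, agent, redirectsLeft = 5) {
  return new Promise((resolve, reject) => {
    http.get(url, { agent }, (res) => {
      const { statusCode, headers } = res;
      if (statusCode >= 300 && statusCode < 400 && headers.location) {
        res.resume(); // discard the redirect body so the socket is released
        if (redirectsLeft === 0) {
          reject(new Error('too many redirects'));
          return;
        }
        const next = new URL(headers.location, url);
        resolve(getFollowingRedirects(next, agent, redirectsLeft - 1));
        return;
      }
      // Final response: collect the body.
      const chunks = [];
      res.on('data', (chunk) => chunks.push(chunk));
      res.on('end', () => resolve(Buffer.concat(chunks).toString()));
      res.on('error', reject);
    }).on('error', reject);
  });
}

// Throwaway local server: "/" redirects to "/file", which serves a payload.
const server = http.createServer((req, res) => {
  if (req.url === '/') {
    res.writeHead(302, { Location: '/file' });
    res.end('moved');
  } else {
    res.end('payload');
  }
});
const agent = new http.Agent({ keepAlive: true });

const demo = new Promise((resolve, reject) => {
  server.listen(0, '127.0.0.1', () => {
    const { port } = server.address();
    getFollowingRedirects(`http://127.0.0.1:${port}/`, agent)
      .then(resolve, reject)
      .finally(() => {
        agent.destroy();
        server.close();
      });
  });
});

demo.then((body) => console.log(body));
```

Because a 3xx may carry a body, the sketch drains it unconditionally instead of inspecting Content-Length.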
In accordance with https://www.rfc-editor.org/rfc/rfc9112#name-message-body-length: HEAD, 1xx, 204, and 304 responses cannot contain a message body. If a socket will be kept-alive, resume the socket during parsing so that it may be returned to the free pool. Fixes nodejs#47228
According to RFC9112 section 6.3.1: responses to HEAD requests, and responses with status 204 or 304, cannot contain a message body. If a socket will be kept alive, resume the socket during parsing so that it may be returned to the free pool. Fixes nodejs#47228
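The rule quoted in these commit messages can be written as a small predicate. This is an illustrative sketch only, not the code from the linked change; `responseCannotHaveBody` is a hypothetical name.

```javascript
'use strict';

// Hypothetical helper (not Node's internal implementation): returns true
// when, per the RFC 9112 message body length rules quoted above, the
// response cannot carry a message body.
function responseCannotHaveBody(requestMethod, statusCode) {
  if (requestMethod.toUpperCase() === 'HEAD') return true; // any response to HEAD
  if (statusCode >= 100 && statusCode < 200) return true;  // 1xx informational
  return statusCode === 204 || statusCode === 304;         // No Content, Not Modified
}

console.log(responseCannotHaveBody('HEAD', 200)); // true
console.log(responseCannotHaveBody('GET', 302));  // false
```

Other 3xx responses are deliberately excluded, since, as noted earlier in the thread, they may return a body.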
Running the postinstall script takes around 30s. It looks like nodejs/node#47228 is the root cause. Adding a `response.resume()` is a working workaround.
Version
v19.8.1
Platform
Linux Jake-Framework 6.2.7-arch1-1 #1 SMP PREEMPT_DYNAMIC Sat, 18 Mar 2023 01:06:36 +0000 x86_64 GNU/Linux
Subsystem
https
What steps will reproduce the bug?
Run a script containing:
This will hang.
How often does it reproduce? Is there a required condition?
100% of the time; downgrade to Node 18 and it no longer hangs.
What is the expected behavior? Why is that the expected behavior?
No hang. Run this:
And it will succeed.
What do you see instead?
Hang; the request succeeds and the response is given, but the process never exits.
Additional information
This is a change I saw in the Node 19 release blog post; keep-alive is now enabled by default, but it seems to cause a hang.
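One way to check whether the keep-alive default is involved is to pass an agent with `keepAlive` disabled for a single request. The sketch below is an assumption about how that could look, not the elided snippet from this report. `noKeepAliveAgent` and `withoutKeepAlive` are hypothetical names, and the URL in the usage comment is a placeholder.

```javascript
'use strict';
const https = require('node:https');

// An agent that does not keep idle sockets around after requests complete.
const noKeepAliveAgent = new https.Agent({ keepAlive: false });

// Hypothetical helper: return request options that use the agent above.
function withoutKeepAlive(options = {}) {
  return { ...options, agent: noKeepAliveAgent };
}

// Usage (not executed here; the URL is a placeholder):
// https.get('https://example.com/', withoutKeepAlive(), (res) => res.resume());
```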