Fix some EOF-handling issues in TLS #44563

davidben · 2022-09-07T19:46:26Z

This fixes a pair of issues with EOF handling in TLSWrap. This originally came up because @codebytere noticed an incompatibility with Node and a recent BoringSSL change, but the root cause is a bug in Node that also impacts how it uses OpenSSL. I've split it into two commits in case you all want to backport only one of them, as the second changes how some errors are surfaced. Node was previously incorrectly representing a class of TLS errors are ECONNRESET, dropping the true error on the ground.

The changes are as follows:

tls: fix re-entrancy issue with TLS close_notify

Like errno, OpenSSL's API requires SSL_get_error and error queue be checked immediately after the failing operation, otherwise the error queue or SSL object may have changed state and no longer report information about the operation the caller wanted.

TLSWrap almost heeds this rule, except in TLSWrap::ClearOut. If SSL_read picks up a closing alert (detected by checking SSL_get_shutdown), Node calls out to JS with EmitRead(UV_EOF) and only afterwards proceeds to dispatch on the error. But, by this point, Node has already re-entered JS, which may change the error.

In particular, I've observed that, on close_notify, JS seems to sometimes call back into TLSWrap::DoShutdown, calling SSL_shutdown. (I think this comes from onStreamRead in stream_base_commons.js?)

Instead, SSL_get_error and the error queue should be sampled earlier. Back in #1661, Node needed to account for GetSSLError being called after ssl_ was destroyed. This was the real cause. With this fixed, there's no need to account for this. (Any case where ssl_ may be destroyed before SSL_get_error is a case where ssl_ or the error queue could change state, so it's a bug either way.)

I've done this by just moving the calls up a bit, to preserve the existing C++ => JS behavior, but that revealed a bigger issue about EmitRead(UV_EOF), which is the second commit.

tls: don't treat fatal TLS alerts as EOF

SSL_RECEIVED_SHUTDOWN means not just close_notify but also fatal alert. From what I can tell, treating fatal alert as EOF was just a mistake? OnStreamRead's comment suggests eof_ was intended to be for close_notify.

This fixes a bug in TLSSocket error reporting that seems to have made it into existing tests. If we receive a fatal alert, EmitRead(UV_EOF) would, via onConnectEnd in _tls_wrap.js, synthesize an ECONNRESET before the alert itself is surfaced. As a result, TLS alerts received during the handshake are misreported by Node.

See the tests that had to be updated as part of this.

I should note, while I'm familiar with TLS and OpenSSL, I'm not at all familiar with Node's streams machinery or sockets APIs. Do you all actually expect to emit EOF on error? This change removes an EmitRead(UV_EOF) on fatal TLS alert. Is that consistent with JS state? I'm assuming so, because there are plenty of other fatal errors which TLSWrap did not emit EOF on (anything case where we reject something from the peer, rather than seeing an alert message, will not set SSL_RECEIVED_SHUTDOWN). So I assume this change just makes peer fatal alerts go through the same codepaths as other peer errors. But hopefully you all will be able to review this better.

I suspect there's also a way to construct a test which demonstrates the first fix standalone. We ran into it in BoringSSL because, to fix some other bugs in OpenSSL's error-handling, we had to treat SSL_ERROR_ZERO_RETURN more like the other SSL_ERROR_* constants. OpenSSL is still quite lax about it, which is masking this bug in the particular scenario where we hit it. But there's probably a more interesting case where reentrancy causes Node to lose a TLS alert. Except you all seem to already be losing TLS alerts until the second commit anyway, so I'm not sure. :-)

Like errno, OpenSSL's API requires SSL_get_error and error queue be checked immediately after the failing operation, otherwise the error queue or SSL object may have changed state and no longer report information about the operation the caller wanted. TLSWrap almost heeds this rule, except in TLSWrap::ClearOut. If SSL_read picks up a closing alert (detected by checking SSL_get_shutdown), Node calls out to JS with EmitRead(UV_EOF) and only afterwards proceeds to dispatch on the error. But, by this point, Node has already re-entered JS, which may change the error. In particular, I've observed that, on close_notify, JS seems to sometimes call back into TLSWrap::DoShutdown, calling SSL_shutdown. (I think this comes from onStreamRead in stream_base_commons.js?) Instead, SSL_get_error and the error queue should be sampled earlier. Back in nodejs#1661, Node needed to account for GetSSLError being called after ssl_ was destroyed. This was the real cause. With this fixed, there's no need to account for this. (Any case where ssl_ may be destroyed before SSL_get_error is a case where ssl_ or the error queue could change state, so it's a bug either way.) This is the first of two fixes in error-handling here. The EmitRead(UV_EOF) seems to additionally swallow fatal alerts from the peer. Some of the ECONNRESET expectations in the tests aren't actually correct. The next commit will fix this as well.

nodejs-github-bot · 2022-09-07T19:46:30Z

Review requested:

@nodejs/crypto

nodejs-github-bot · 2022-09-08T16:00:14Z

CI: https://ci.nodejs.org/job/node-test-pull-request/46488/

SSL_RECEIVED_SHUTDOWN means not just close_notify or fatal alert. From what I can tell, this was just a mistake? OnStreamRead's comment suggests eof_ was intended to be for close_notify. This fixes a bug in TLSSocket error reporting that seems to have made it into existing tests. If we receive a fatal alert, EmitRead(UV_EOF) would, via onConnectEnd in _tls_wrap.js, synthesize an ECONNRESET before the alert itself is surfaced. As a result, TLS alerts received during the handshake are misreported by Node. See the tests that had to be updated as part of this.

davidben · 2022-09-08T19:10:56Z

(Updated the second commit to hopefully fix the lint errors.)

nodejs-github-bot · 2022-09-17T19:01:41Z

CI: https://ci.nodejs.org/job/node-test-pull-request/46639/

lpinca

RSLGTM

lpinca · 2022-11-03T19:19:58Z

This should be reviewed by someone who is more familiar with this code and TLS. @bnoordhuis @jasnell?

davidben · 2022-12-05T23:02:36Z

Friendly ping. Is this waiting on anything from my end?

lpinca · 2022-12-06T21:08:30Z

Is this waiting on anything from my end?

No. I'll try to ping @nodejs/crypto again.

nodejs-github-bot · 2022-12-12T19:46:00Z

Landed in ef8aa88...8b89d4d

Like errno, OpenSSL's API requires SSL_get_error and error queue be checked immediately after the failing operation, otherwise the error queue or SSL object may have changed state and no longer report information about the operation the caller wanted. TLSWrap almost heeds this rule, except in TLSWrap::ClearOut. If SSL_read picks up a closing alert (detected by checking SSL_get_shutdown), Node calls out to JS with EmitRead(UV_EOF) and only afterwards proceeds to dispatch on the error. But, by this point, Node has already re-entered JS, which may change the error. In particular, I've observed that, on close_notify, JS seems to sometimes call back into TLSWrap::DoShutdown, calling SSL_shutdown. (I think this comes from onStreamRead in stream_base_commons.js?) Instead, SSL_get_error and the error queue should be sampled earlier. Back in #1661, Node needed to account for GetSSLError being called after ssl_ was destroyed. This was the real cause. With this fixed, there's no need to account for this. (Any case where ssl_ may be destroyed before SSL_get_error is a case where ssl_ or the error queue could change state, so it's a bug either way.) This is the first of two fixes in error-handling here. The EmitRead(UV_EOF) seems to additionally swallow fatal alerts from the peer. Some of the ECONNRESET expectations in the tests aren't actually correct. The next commit will fix this as well. PR-URL: #44563 Reviewed-By: Luigi Pinca <[email protected]>

SSL_RECEIVED_SHUTDOWN means not just close_notify or fatal alert. From what I can tell, this was just a mistake? OnStreamRead's comment suggests eof_ was intended to be for close_notify. This fixes a bug in TLSSocket error reporting that seems to have made it into existing tests. If we receive a fatal alert, EmitRead(UV_EOF) would, via onConnectEnd in _tls_wrap.js, synthesize an ECONNRESET before the alert itself is surfaced. As a result, TLS alerts received during the handshake are misreported by Node. See the tests that had to be updated as part of this. PR-URL: #44563 Reviewed-By: Luigi Pinca <[email protected]>

Like errno, OpenSSL's API requires SSL_get_error and error queue be checked immediately after the failing operation, otherwise the error queue or SSL object may have changed state and no longer report information about the operation the caller wanted. TLSWrap almost heeds this rule, except in TLSWrap::ClearOut. If SSL_read picks up a closing alert (detected by checking SSL_get_shutdown), Node calls out to JS with EmitRead(UV_EOF) and only afterwards proceeds to dispatch on the error. But, by this point, Node has already re-entered JS, which may change the error. In particular, I've observed that, on close_notify, JS seems to sometimes call back into TLSWrap::DoShutdown, calling SSL_shutdown. (I think this comes from onStreamRead in stream_base_commons.js?) Instead, SSL_get_error and the error queue should be sampled earlier. Back in #1661, Node needed to account for GetSSLError being called after ssl_ was destroyed. This was the real cause. With this fixed, there's no need to account for this. (Any case where ssl_ may be destroyed before SSL_get_error is a case where ssl_ or the error queue could change state, so it's a bug either way.) This is the first of two fixes in error-handling here. The EmitRead(UV_EOF) seems to additionally swallow fatal alerts from the peer. Some of the ECONNRESET expectations in the tests aren't actually correct. The next commit will fix this as well. PR-URL: #44563 Reviewed-By: Luigi Pinca <[email protected]>

SSL_RECEIVED_SHUTDOWN means not just close_notify or fatal alert. From what I can tell, this was just a mistake? OnStreamRead's comment suggests eof_ was intended to be for close_notify. This fixes a bug in TLSSocket error reporting that seems to have made it into existing tests. If we receive a fatal alert, EmitRead(UV_EOF) would, via onConnectEnd in _tls_wrap.js, synthesize an ECONNRESET before the alert itself is surfaced. As a result, TLS alerts received during the handshake are misreported by Node. See the tests that had to be updated as part of this. PR-URL: #44563 Reviewed-By: Luigi Pinca <[email protected]>

Like errno, OpenSSL's API requires SSL_get_error and error queue be checked immediately after the failing operation, otherwise the error queue or SSL object may have changed state and no longer report information about the operation the caller wanted. TLSWrap almost heeds this rule, except in TLSWrap::ClearOut. If SSL_read picks up a closing alert (detected by checking SSL_get_shutdown), Node calls out to JS with EmitRead(UV_EOF) and only afterwards proceeds to dispatch on the error. But, by this point, Node has already re-entered JS, which may change the error. In particular, I've observed that, on close_notify, JS seems to sometimes call back into TLSWrap::DoShutdown, calling SSL_shutdown. (I think this comes from onStreamRead in stream_base_commons.js?) Instead, SSL_get_error and the error queue should be sampled earlier. Back in #1661, Node needed to account for GetSSLError being called after ssl_ was destroyed. This was the real cause. With this fixed, there's no need to account for this. (Any case where ssl_ may be destroyed before SSL_get_error is a case where ssl_ or the error queue could change state, so it's a bug either way.) This is the first of two fixes in error-handling here. The EmitRead(UV_EOF) seems to additionally swallow fatal alerts from the peer. Some of the ECONNRESET expectations in the tests aren't actually correct. The next commit will fix this as well. PR-URL: #44563 Reviewed-By: Luigi Pinca <[email protected]>

SSL_RECEIVED_SHUTDOWN means not just close_notify or fatal alert. From what I can tell, this was just a mistake? OnStreamRead's comment suggests eof_ was intended to be for close_notify. This fixes a bug in TLSSocket error reporting that seems to have made it into existing tests. If we receive a fatal alert, EmitRead(UV_EOF) would, via onConnectEnd in _tls_wrap.js, synthesize an ECONNRESET before the alert itself is surfaced. As a result, TLS alerts received during the handshake are misreported by Node. See the tests that had to be updated as part of this. PR-URL: #44563 Reviewed-By: Luigi Pinca <[email protected]>

nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run. labels Sep 7, 2022

lpinca added the request-ci Add this label to start a Jenkins CI on a PR. label Sep 8, 2022

github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Sep 8, 2022

davidben force-pushed the tls-error-handling branch from bfa2092 to 030b8c2 Compare September 8, 2022 19:10

lpinca added the request-ci Add this label to start a Jenkins CI on a PR. label Sep 17, 2022

github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Sep 17, 2022

lpinca approved these changes Nov 3, 2022

View reviewed changes

Trott force-pushed the main branch from 2d76238 to ca3ed36 Compare November 12, 2022 01:49

lpinca added commit-queue-rebase Add this label to allow the Commit Queue to land a PR in several commits. commit-queue Add this label to land a pull request using GitHub Actions. labels Dec 12, 2022

nodejs-github-bot removed the commit-queue Add this label to land a pull request using GitHub Actions. label Dec 12, 2022

nodejs-github-bot closed this Dec 12, 2022

RafaelGSS mentioned this pull request Jan 2, 2023

v19.4.0 proposal #46061

Merged

juanarbol mentioned this pull request Jan 28, 2023

V18.14.0 proposal #46396

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix some EOF-handling issues in TLS #44563

Fix some EOF-handling issues in TLS #44563

davidben commented Sep 7, 2022 •

edited

Loading

nodejs-github-bot commented Sep 7, 2022

nodejs-github-bot commented Sep 8, 2022

davidben commented Sep 8, 2022

nodejs-github-bot commented Sep 17, 2022

lpinca left a comment

lpinca commented Nov 3, 2022 •

edited

Loading

davidben commented Dec 5, 2022

lpinca commented Dec 6, 2022

nodejs-github-bot commented Dec 12, 2022

Fix some EOF-handling issues in TLS #44563

Fix some EOF-handling issues in TLS #44563

Conversation

davidben commented Sep 7, 2022 • edited Loading

nodejs-github-bot commented Sep 7, 2022

nodejs-github-bot commented Sep 8, 2022

davidben commented Sep 8, 2022

nodejs-github-bot commented Sep 17, 2022

lpinca left a comment

Choose a reason for hiding this comment

lpinca commented Nov 3, 2022 • edited Loading

davidben commented Dec 5, 2022

lpinca commented Dec 6, 2022

nodejs-github-bot commented Dec 12, 2022

davidben commented Sep 7, 2022 •

edited

Loading

lpinca commented Nov 3, 2022 •

edited

Loading