-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PoC TLS resume on Linux client #64369
Conversation
Tagging subscribers to this area: @dotnet/ncl, @vcsjones Issue DetailsThis complements #57079 and contributes to #22977 Unlike the server part, this is somewhat more complicated. It may still need some work but I would like to get early feedback for the overall strategy. Lets start with some numbers:
When This creates first complication for client. It also needs to use same To make it more complicated, OpenSSL pushes responsibility for setting the sessions to caller e.g. The rest is mostly technicality. In order to process the sessions, we need to register callbacks and OpenSSL will inform us when it wants to add or remove session. There is some plumbing to map that to .NET and original I added some methods to SafeSslContextHandle to allocate cache and do operations. The
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some comments, I am not confident in how TLS resumption works so it would be best if somebody else added their review as well
src/libraries/Common/src/Interop/Unix/System.Security.Cryptography.Native/Interop.OpenSsl.cs
Outdated
Show resolved
Hide resolved
src/libraries/Common/src/Interop/Unix/System.Security.Cryptography.Native/Interop.OpenSsl.cs
Outdated
Show resolved
Hide resolved
src/libraries/Common/src/Interop/Unix/System.Security.Cryptography.Native/Interop.SslCtx.cs
Show resolved
Hide resolved
src/libraries/Common/src/Interop/Unix/System.Security.Cryptography.Native/Interop.SslCtx.cs
Outdated
Show resolved
Hide resolved
src/libraries/Common/src/Interop/Unix/System.Security.Cryptography.Native/Interop.SslCtx.cs
Outdated
Show resolved
Hide resolved
src/libraries/Common/src/Interop/Unix/System.Security.Cryptography.Native/Interop.OpenSsl.cs
Outdated
Show resolved
Hide resolved
src/libraries/Common/src/Interop/Unix/System.Security.Cryptography.Native/Interop.OpenSsl.cs
Show resolved
Hide resolved
src/libraries/Common/src/Interop/Unix/System.Security.Cryptography.Native/Interop.OpenSsl.cs
Outdated
Show resolved
Hide resolved
src/libraries/Common/src/Interop/Unix/System.Security.Cryptography.Native/Interop.SslCtx.cs
Outdated
Show resolved
Hide resolved
src/libraries/Common/src/Interop/Unix/System.Security.Cryptography.Native/Interop.SslCtx.cs
Outdated
Show resolved
Hide resolved
src/libraries/Common/src/Interop/Unix/System.Security.Cryptography.Native/Interop.SslCtx.cs
Show resolved
Hide resolved
src/libraries/Common/src/Interop/Unix/System.Security.Cryptography.Native/Interop.SslCtx.cs
Outdated
Show resolved
Hide resolved
string? name = Marshal.PtrToStringAnsi(serverName); | ||
if (!string.IsNullOrEmpty(name)) | ||
{ | ||
Interop.Ssl.SessionSetHostname(session, serverName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels really weird to me that SetHostname would be inside TryAddSession.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need something to find the session in removal. Associating name with it allows me to get the string and then do lookup. It would be great if we can come up with something that allows to lookup by both IntPtr and Name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure... it just feels like calling SetHostname'd be the responsibility of the caller. Doesn't it have to be done in the case where TryAddSession returns false?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is ony only for the lookup - there is no functional difference. And I tried to hide all this inside the handle.
If we can lookup/remove the entry just from the IntPtr session, we would not need to do that. But I'm not sure if there is good way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I put in comment and made changes to make it clear. I also moved complementing removal so both parts are done inside the SafeHandle.
{ | ||
Interop.Ssl.SessionSetHostname(session, serverName); | ||
|
||
lock (_sslSessions) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The locking around _sslSessions makes sense, since you're manipulating state depending on how the dictionary performed.
But, since you're already locking it, it feels like you want a non-Concurrent dictionary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed. I was also thinking about grabbing extra reference on the session. That would allow me to use ConcurrentDictionary without locking as the session would never be released in the middle.
Do you have preference/recommendation @bartonjs ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
grabbing extra reference on the session
Something like
Interop.Ssl.SslSessionUpRef(session);
if (!_sslSessions.TryAdd(...))
{
// Undo the upref since it's not in the dictionary
Interop.Ssl.SslSessionFree(session);
}
? (Upref inside has a race condition with the cleanup in ReleaseHandle)
That would get a little weird since in the cleanup you'd need to call free twice, I think?
The fact that we wrote ConcurrentDictionary suggests that it gives better perf (on average) than manual locking... but if the code to interact with it is doing memory/lifetime management and it becomes unreadable with the gymnastics... then locking is better for maintainability. (If it's clean code and more performant, than by all means use upref+ConcurrentDictionary)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes. and then call free twice. I'm inclined to good with the lock and better maintainability as the perf does not depend on this. This happens one in while - not even for each SSL session. I started with ConcurrentDictionary but you are right - we don't need it at this point.
src/libraries/Common/src/Interop/Unix/System.Security.Cryptography.Native/Interop.SslCtx.cs
Outdated
Show resolved
Hide resolved
src/native/libs/System.Security.Cryptography.Native/opensslshim.h
Outdated
Show resolved
Hide resolved
Test failures are related. It seems like bug in old OpenSSL. The public async Task ServerAsyncAuthenticate_SniSetVersion_Success(SslProtocols version)
{
var serverOptions = new SslServerAuthenticationOptions() { ServerCertificate = _serverCertificate, EnabledSslProtocols = version };
var clientOptions = new SslClientAuthenticationOptions() { TargetHost = _serverCertificate.GetNameInfo(X509NameType.SimpleName, forIssuer: false), EnabledSslProtocols = SslProtocols.Tls11 | SslProtocols.Tls12 }; the PAL shim also turned out more complicated for old OpenSSL than I really wanted. Any thoughts on that @bartonjs? (and support for 1.0.1 in general?) |
The OpenSSL project no longer considers 1.0.x to be in support. Ubuntu 16.04 ESM, and other distros older than 2018 that are in a mode other than full-EoL, presumably is custom-patching their source base to try to maintain parity with 1.1.1's security patches. Ubuntu 18.04 for x86 and x64 upgraded from 1.1.0 to 1.1.1 in Dec 2018. https://packages.ubuntu.com/bionic/libssl1.1 says that other architectures of Ubuntu 18.04 still use 1.1.0... but I don't know what our support story is for .NET on any of those architectures on 18.04. That said, I think it's fine to say that a feature only works on an OS with a new-enough version of OpenSSL. |
This should be ready for another review pass with following changes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, only minor suggestions.
src/libraries/Common/src/Interop/Unix/System.Security.Cryptography.Native/Interop.OpenSsl.cs
Outdated
Show resolved
Hide resolved
src/libraries/Common/src/Interop/Unix/System.Security.Cryptography.Native/Interop.OpenSsl.cs
Outdated
Show resolved
Hide resolved
/azp run runtime-extra-platforms |
Azure Pipelines successfully started running 1 pipeline(s). |
@bartonjs can you please finalize your Code Review? Is there something you consider blocking? |
if (newSessionCb != NULL || removeSessionCb != NULL) | ||
if (newSessionCb != NULL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logically nested (but not indented) if is really just the latter condition. One of these two lines should be deleted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed 603. I probably started something I end up not needing.
|
||
const char* CryptoNative_SslSessionGetHostname(SSL_SESSION *session) | ||
{ | ||
#ifdef SSL_SESSION_get0_hostname |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe the ifdef
will evaluate to true on a non-portable build. (Since it's a function identifier then)
This should probably instead be based on the NEED_OPENSSL_x_y values.
Try building with ./build-native.sh -portableBuild=false
and see if it works as expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did build on Ubuntu 16.04 with OpenSSL 1.0.1e with portableBuild true
and false
. And it did not fail. Same on Ubuntu 20.04 with OpenSSL 1.1.1.
Would NEED_OPENSSL_1_1 cover also 3.0? I can still change that if you prefer it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did build on Ubuntu 16.04 with OpenSSL 1.0.1e with portableBuild true and false. And it did not fail.
That's... really surprising. Looking at the headers over time, it looks like it was added in 1.1.0, and has always been a function (vs a macro acting like a function), so what I'd expect is
1.0.1e in non-portable:
The compile doesn't see any definition of SSL_SESSION_get0_hostname at all, turns this into #if false
and removes the code.
1.1.1 in non-portable:
SSL_SESSION_get0_hostname is a function, so the preprocessor doesn't see it, so this still turns into #if false
and goes away.
In both cases the compile would succeed, but I'd expect 1.1.1-direct to fail to run (remember that ./build-native.sh
doesn't update the testhost, so you need to manually copy over it or make it a symlink -- or just load up the compile output in a debugger and see if there's a function body or not)
Would NEED_OPENSSL_1_1 cover also 3.0?
No, you'd need #if defined NEED_OPENSSL_1_1 || defined NEED_OPENSSL_3_0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are right. updated. It is interesting that we don't really have any test coverage for non-portable builds, right? (aside from compilation)
The build stuff should be resolved @bartonjs. can you please take another look? |
return 1; | ||
} | ||
|
||
// OpenSSL will destroy session. | ||
return 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these interesting from a logging perspective (if so, that can easily be in a future change)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about logging but there are conditions that make it normal. If we hit this, there should be no functional change as we simply won't do the caching & resume.
* TLS resume on client * shim functions introduced in 1.1.1 * add missing struct * disable resume on old OpenSSL * feedback from review * fix source build * update comment * feedback from review * feedback from review * avoild client resume on old OpenSSL
This complements #57079 and contributes to #22977
Unlike the server part, this is somewhat more complicated. It may still need some work but I would like to get early feedback for the overall strategy.
Lets start with some numbers:
When
SslStreamCertificateContext
is used on server side, TLS resume is possible and we get ~ 5.5x boost.This should get us on par with Windows but I don't have to same machines to verify this.
On server side #57079 linked the cache to the
SslStreamCertificateContext
. So when server will reuse it it will also provide ability to resume previous session. When the context is Disposed, it will also dispose the SSL_CTX and the cache.This creates first complication for client. It also needs to use same
SSL_CTX
but since there is typically no certificate, there is nothing to hook to.To solve this I added separate dictionary to hold contexts for client. That allows to reuse previous context and save native allocations. But the native contexts will be preserved during duration of the process. Complete test runs for
SslStream
creates 13 instance, in ideal case there will be one e.g.SslProtocols.None
. e.g. it depends on unique combinations ofSslProtocols
used by the client.To make it more complicated, OpenSSL pushes responsibility for setting the sessions to caller e.g.
SslStream
.So we use the
TargetHost
e.g. SNI host to lookup previous session. Since there may be multiple series running on same name this is not perfect. But since this is just offer to the server, it is up to the server to decide if the session is valid or not. Based on my testing, this is how SCHANNEL operates. Some implementations use IP/Port info for the lookup but SslStream does not have access to it in general case. Even with that, this can be just load balancer and the SSL may be terminated or different servers. Once again, if the session is offered to "wrong" server or if the server was restored or cleaned up the session, it should proceed with full handshake.The rest is mostly technicality. In order to process the sessions, we need to register callbacks and OpenSSL will inform us when it wants to add or remove session. There is some plumbing to map that to .NET and original
TargetHost
.There can be multiple sessions for given host but the current strategy is to keep the last one.
I added some methods to SafeSslContextHandle to allocate cache and do operations. The
GCHandle
can possibly be stored in native SSL_CTX but I felt it would be easier for debug to keep it on managed side. Since the sessions are not directly used anywhere in managed code, I useIntPtr
instead of SafeHandle.