Teardown tunnel automatically if peer's certificate expired #370
Can we please also have a notice "Node X's certificate has expired, tearing down connection" in the logs?
@virtadpt added more logging regarding disconnection.
So, monitor output for "Invalid certificate status", and then suss out the value of Works for me. I'll give it a try this weekend.
@rawdigits @nbrownus any chance to review this?
@ton31337 Not yet. I'm still working 90-hour weeks.
I finally went off call yesterday. I'll give it a try when I have a moment.
Made a few comments. Using `connectionManager` for this is nice, as it won't really punish the hot path and should give a fairly quick reaction to changes in CA trust/certificate expiry.
Handle this with periodic keepalive ticks. This is needed to avoid keeping a connection open after the peer's certificate has expired.

In case you use short-lived certificates (let's say 2 hours), the current behavior is a bit wrong, because the tunnel stays UP and RUNNING until the nebula service is restarted.

This is the log showing how it's reflected:

```
time="2021-05-12T10:15:51Z" level=debug msg="Tunnel status" tunnelCheck="map[method:passive state:alive]" vpnIp=172.17.90.241
time="2021-05-12T10:15:59Z" level=debug msg="Tunnel status" tunnelCheck="map[method:passive state:alive]" vpnIp=172.17.90.241
time="2021-05-12T10:16:06Z" level=debug msg="Invalid certificate status" certName=ton31337 vpnIp=172.17.90.241
time="2021-05-12T10:16:06Z" level=debug msg="Tunnel status" tunnelCheck="map[method:passive state:alive]" vpnIp=172.17.90.241
time="2021-05-12T10:16:06Z" level=debug msg="Tunnel status" certName=ton31337 tunnelCheck="map[method:active state:testing]" vpnIp=172.17.90.241
time="2021-05-12T10:16:20Z" level=debug msg="Tunnel status" tunnelCheck="map[method:active state:alive]" vpnIp=172.17.90.241
time="2021-05-12T10:16:20Z" level=info msg="Tunnel status" certName=ton31337 tunnelCheck="map[method:active state:dead]" vpnIp=172.17.90.241
time="2021-05-12T10:16:20Z" level=debug msg="deleting 172.17.90.241 from lighthouse."
time="2021-05-12T10:16:20Z" level=debug msg="Hostmap hostInfo deleted" hostMap="map[indexNumber:3347248980 mapName:main mapTotalSize:844 remoteIndexNumber:3605974939 vpnIp:172.17.90.241]"
```

Signed-off-by: Donatas Abraitis <[email protected]>
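For readers who want the idea in isolation, here is a minimal sketch of checking certificate expiry on each tick and tearing down the matching tunnel. It is not the code from this PR: the `peer` type, the `sweep` function, the second peer (`172.17.90.242`, `other`), and the hourly simulated ticks are all assumptions made up for illustration, and a real implementation would drive the loop from a `time.Ticker` rather than a counter.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// peer is a hypothetical stand-in for per-tunnel state; it is not nebula's type.
type peer struct {
	certName string
	notAfter time.Time // end of the peer certificate's validity window
}

// sweep deletes every tunnel whose peer certificate has expired at "now"
// and returns the VPN IPs that were torn down, sorted for stable output.
func sweep(tunnels map[string]peer, now time.Time) []string {
	var closed []string
	for ip, p := range tunnels {
		// A certificate is still valid at exactly notAfter, so use After.
		if now.After(p.notAfter) {
			delete(tunnels, ip) // deleting during range is safe in Go
			closed = append(closed, ip)
		}
	}
	sort.Strings(closed)
	return closed
}

func main() {
	start := time.Date(2021, 5, 12, 8, 0, 0, 0, time.UTC)
	tunnels := map[string]peer{
		// Short-lived certificate, as in the 2-hour example above.
		"172.17.90.241": {certName: "ton31337", notAfter: start.Add(2 * time.Hour)},
		// Hypothetical second peer with a longer-lived certificate.
		"172.17.90.242": {certName: "other", notAfter: start.Add(24 * time.Hour)},
	}

	// Simulate hourly ticks instead of waiting on a real time.Ticker.
	for i := 0; i <= 3; i++ {
		now := start.Add(time.Duration(i) * time.Hour)
		for _, ip := range sweep(tunnels, now) {
			fmt.Printf("%s: certificate expired, tearing down tunnel to %s\n",
				now.Format(time.RFC3339), ip)
		}
	}
	fmt.Println("tunnels still up:", len(tunnels))
}
```

Doing the check on the periodic tick rather than per packet keeps the cost off the data path, at the price of a tunnel surviving for up to one tick interval after expiry.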
Looks good to me. Let's just put a note in the CHANGELOG that this won't recognize if the remote side used SIGHUP to refresh to a new certificate (and we should fix that in a follow-up PR).
This restores the hostMap.QueryVpnIP block to how it looked before #370 was merged. I'm not sure why the patch from #370 wanted to continue on if there was no match found in the hostmap, since there isn't anything to do at that point (the tunnel has already been closed). This was causing a crash because the handleInvalidCertificate check expects the hostinfo to be passed in (but it is nil since there was no hostinfo in the hostmap). Fixes: #657
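The shape of that fix can be sketched as an early return when the lookup finds nothing, so the certificate check never sees a nil host. This is an illustration only: `hostInfo`, `hostMap`, `query`, and `checkTunnel` below are hypothetical stand-ins and do not reproduce nebula's actual types or signatures.

```go
package main

import (
	"errors"
	"fmt"
)

// hostInfo and hostMap are hypothetical stand-ins, not nebula's real types.
type hostInfo struct {
	certName  string
	certValid bool
}

type hostMap struct {
	hosts map[string]*hostInfo
}

var errNoHost = errors.New("host not found in hostmap")

// query returns the host for vpnIP, or a nil host and an error if absent.
func (hm *hostMap) query(vpnIP string) (*hostInfo, error) {
	hi, ok := hm.hosts[vpnIP]
	if !ok {
		return nil, errNoHost
	}
	return hi, nil
}

// checkTunnel returns "gone" if the tunnel no longer exists, "closed" if it
// was torn down because of an invalid certificate, and "alive" otherwise.
func checkTunnel(hm *hostMap, vpnIP string) string {
	hi, err := hm.query(vpnIP)
	if err != nil {
		// The tunnel was already closed. hi is nil here, so return before
		// any code that would dereference it.
		return "gone"
	}
	if !hi.certValid {
		delete(hm.hosts, vpnIP)
		return "closed"
	}
	return "alive"
}

func main() {
	hm := &hostMap{hosts: map[string]*hostInfo{
		"172.17.90.241": {certName: "ton31337", certValid: false},
	}}
	fmt.Println(checkTunnel(hm, "172.17.90.241")) // invalid certificate
	fmt.Println(checkTunnel(hm, "172.17.90.241")) // already removed
}
```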