Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pkg/tinydtls: Multiple issues #16108

Open
5 tasks
janosbrodbeck opened this issue Feb 28, 2021 · 4 comments
Open
5 tasks

pkg/tinydtls: Multiple issues #16108

janosbrodbeck opened this issue Feb 28, 2021 · 4 comments
Assignees
Labels
Area: pkg Area: External package ports Area: security Area: Security-related libraries and subsystems Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) Type: tracking The issue tracks and organizes the sub-tasks of a larger effort

Comments

@janosbrodbeck
Copy link
Member

janosbrodbeck commented Feb 28, 2021

Description

My approach to support the sock_dtls API in gcoap(#15549) showed some issues with tinydtls and the implementation for the sock_dtls API.

Issues:

  • sock_dtls_session_init(): Even when the peer is already connected, the function initiates a session re-negotiation. That may be fine - but the documentation does not make that super clear. Is this intended behavior?
  • Session re-negotiation/re-connect doesn't send any messages. When sock_dtls_session_init() triggers session re-negotiation or session re-connecting, no messages are sent and the process never successful finishes.
  • sock_dtls_send_aux(): When a session was previously established with a server and the session has been destroyed by the client, BUT the server never sent a Encrypted Alert (Close Notify) (or the client just did not receive it), this function does not even try to re-establish a session. It thinks a session is still established. Send always fails then.
  • Session closing: tinydtls always waits endlessly for the Close Notify msg of the other peer. In that time the peer slot is still occupied. I described this problem also in pkg/tinydtls: DTLS process gets completly borked #15842. It can bring tinydtls in an unusable state, since the sock_dtls API does not provide a way to recover or force delete such sessions.
  • Handshake process: Async sockets are used and a DTLS server and client are running over the same socket. For tinydtls concurrent handshakes are activated (>1). The server is async and receives handshakes in a DTLS event handler. Meanwhile the user wants to use the DTLS client to connect to a remote server - it makes a handshake in a way it is described in the API: calling sock_dtls_session_init() and then sock_dtls_recv(). Right now, in that moment: another node wants to connect to the DTLS server via handshake. Who receives which handshake? What happens?

Steps to reproduce the issue

  • Problem 2: Can be reproduced by removing the two tinydtls commits in my gcoap DTLS PR. Send a message to a server, which is not running, wait till it fails. Start the server and try again. CONFIG_DTLS_DEBUG=1 activates tinydtls debug messages. It should indicate that a session re-connect is triggered. I could never sniff packets via wireshark in such scenario.

  • Problem 3: Can be tested with this branch and remove the tinydtls commit which is currently fixing that problem. Can be triggered by sending a message to a server successfully, but close the server immediately after. Then try again to send a message.

  • Problem 4: Can be reproduced by using my gcoap DTLS pull request without the latest tinydtls hotfix. Send a request to a client and close the server or client before they can close the session. Restart the closed node again ant re-try. A new session cannot be established.

The handshake process and session closing process is very error prone at the moment. If a session has been established it works fine, but especially a completely stuck tinydtls is a problem that must never occur.

I have currently no idea what's the problem with session re-negotiation/re-connecting. Maybe @pokgak has an idea. Problem 3 could be fixed by checking that scenario first and reset the peer in that case or re-connect the peer.

@benpicco benpicco added the Area: security Area: Security-related libraries and subsystems label Feb 28, 2021
@pokgak
Copy link
Contributor

pokgak commented Feb 28, 2021

  • sock_dtls_session_init(): Even when the peer is already connected, the function initiates a session re-negotiation. That may be fine - but the documentation does not make that super clear. Is this intended behavior?

Problem 1: This is a bug. We rely on dtls_connect to detect if a peer is already connected. It should return 0 if a peer already exists and skips the session init as shown here. It seems like the function does not detect the peer is already exists here. I suspect this might be something to do with endpoints port and interface not configured correctly when initialized.

  • Session re-negotiation/re-connect doesn't send any messages. When sock_dtls_session_init() triggers session re-negotiation or session re-connecting, no messages are sent and the process never successful finishes.

Problem 2: have you looked at the DTLS debug log and see if there's anything interesting there when this happens?

Problem 3 & 4: this is related to the behavior when closing connections. Currently, tinydtls will not free the peer slot until both sides have sent and received the CloseNotify message, just like described in the RFC. A solution that is suggested in #15842 is to reset the peer on the server each time the session timed out to forcefully free the peer slots even without receiving a CloseNotify back from the client. The RFC allows this forceful reset but only for the read side of the connection but for write, both side MUST send a CloseNotify.

@janosbrodbeck
Copy link
Member Author

janosbrodbeck commented Mar 1, 2021

  • sock_dtls_session_init(): Even when the peer is already connected, the function initiates a session re-negotiation. That may be fine - but the documentation does not make that super clear. Is this intended behavior?

Problem 1: This is a bug. We rely on dtls_connect to detect if a peer is already connected. It should return 0 if a peer already exists and skips the session init as shown here. It seems like the function does not detect the peer is already exists here. I suspect this might be something to do with endpoints port and interface not configured correctly when initialized.

Yes, but dtls_connect() does re-connect even when it's already connected. Tinydtls jumps from dtls_connect() here into dtls_connect_peer(), which then always re-negotiate when the peer is found (here). If we don't want this behavior we need to check it before and prevent calling dtls_connect().

  • Session re-negotiation/re-connect doesn't send any messages. When sock_dtls_session_init() triggers session re-negotiation or session re-connecting, no messages are sent and the process never successful finishes.

Problem 2: have you looked at the DTLS debug log and see if there's anything interesting there when this happens?

Nothing interesting. The gcoap_cli: msg send failed indicates the timeout (5 seconds). Wireshark shows no packets.

coap get fe80::88ba:fff:fed6:46c6 5684 /riot/board
gcoap_cli: sending msg ID 53326, 17 bytes
Mar 01 01:02:23 DEBG found peer, try to re-connect
Mar 01 01:02:26 DEBG send header: (13 bytes):
00000000 15 FE FD 00 00 00 00 00 00 00 02 00 02  
Mar 01 01:02:26 DEBG send unencrypted: (2 bytes):
00000000 02 00 
gcoap_cli: msg send failed

Problem 3 & 4: this is related to the behavior when closing connections. Currently, tinydtls will not free the peer slot until both sides have sent and received the CloseNotify message, just like described in the RFC. A solution that is suggested in #15842 is to reset the peer on the server each time the session timed out to forcefully free the peer slots even without receiving a CloseNotify back from the client. The RFC allows this forceful reset but only for the read side of the connection but for write, both side MUST send a CloseNotify.

Sorry, should have written that the solution for this was already discussed in the other issue. Just wanted to note it again, because it can lead to problems.

The force delete you described is what I did in my hotfix commits, when a node tries to initiate a handshake and the peer is still found but not connected. Since I assume, that if sock_dtls_send_aux() needs to perform a handshake - and the peer is still found, which means the session was not closed completely(=no close notify of the other peer was received) - the closer of the session was the node itself. The same applies to sock_dtls_session_init(). But I was not confident enough to pull request this. Is my assumption always correct?

In general: It's sane that sessions should have a expiration time. Then the session can been reset forcefully. But we do not provide a function for that right now.

From looking into wolfSSL: wolfSSL seems to implement session expiration itself. There is a wolfSSL_flush_sessions() function (API link). WolfSSL seems to make session management a bit easier from looking into the API.

@miri64
Copy link
Member

miri64 commented Mar 1, 2021

Mind if we make this a tracking issue, since you have multiple issues literally in the name? ;-)

@miri64 miri64 added Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) Type: tracking The issue tracks and organizes the sub-tasks of a larger effort labels Mar 1, 2021
@janosbrodbeck
Copy link
Member Author

janosbrodbeck commented Mar 2, 2021

Added another issue that stems out of a discussion between @pokgak and me. But more investigation is needed to make any statement about it. I just want to keep track of it.

@aabadie aabadie added the Area: pkg Area: External package ports label May 20, 2021
@MrKevinWeiss MrKevinWeiss added this to the Release 2021.07 milestone Jun 21, 2021
@MrKevinWeiss MrKevinWeiss removed this from the Release 2021.07 milestone Jul 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: pkg Area: External package ports Area: security Area: Security-related libraries and subsystems Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) Type: tracking The issue tracks and organizes the sub-tasks of a larger effort
Projects
None yet
Development

No branches or pull requests

7 participants