sending a coap request to an unreachable server: missing notification? #249

Closed
vdaele opened this issue Aug 25, 2018 · 11 comments

Comments

@vdaele

vdaele commented Aug 25, 2018

As a test case, I want to send a coap request to an unreachable server.
Precondition: my coap client app is started up correctly and I've been able to toggle the (Ikea) light on and off a few times.
Next, I unplug the network cable from the Ikea hub and try to toggle the light again.
I notice that the packet is retransmitted 4 times (as expected I guess) and afterwards, I see in the logs tid=44815: give up after 4 attempts

However, there seems to be no notification/callback to the coap-client, right?
What would be the best way to get this information? An extra call to coap_handle_event where the "give up" message is logged?

Edit: I just noted that the comment before coap_event_t states

Scalar type to represent different events, e.g. DTLS events or retransmission timeouts

Thanks in advance,

Marc

@vdaele
Author

vdaele commented Aug 26, 2018

And on a similar note: if I send a request to a server using an old/wrong IP address, I see "retransmit handshake packet" multiple times in the logs, and after 7 retries I see "removed transaction" logged from within dtls_retransmit.
However, I don't seem to get an error/notification in my client.

I do note a number of lines containing DTLS_ALERT_HANDSHAKE_FAILURE in the code. Would it be sufficient to add a statement like CALL(context, event, &node->peer->session, DTLS_ALERT_LEVEL_FATAL, DTLS_ALERT_HANDSHAKE_FAILURE) right before the remove transaction logline in dtls.c?

@mrdeep1
Copy link
Collaborator

mrdeep1 commented Aug 26, 2018

Do you have a NACK handler registered with coap_register_nack_handler()? I would expect it to be called in your first case if you are using confirmable messages, and in your second case regardless of message type.

@vdaele
Author

vdaele commented Aug 26, 2018

I don't have this handler defined (and I noticed some other handlers that might be interesting as well!).
I'll try adding one and update this issue accordingly!
Thanks!

@vdaele
Author

vdaele commented Aug 27, 2018

I registered a handler with coap_register_nack_handler() and it indeed gets called in the first case (I do use confirmable messages).

In the second case (when using an invalid IP), I would expect a callback when "removed transaction" is printed. Instead, the handler seems to be called only after a timeout of about 80 seconds.

The log below was produced with #define DTLS_DEFAULT_MAX_RETRANSMIT 7 in tinydtls/global.h.

Aug 27 19:48:52 DEBG *** new session 0x1830d70
...
Aug 27 19:50:20 DEBG ** DTLS global timeout set to 38073ms
Aug 27 19:50:21 DEBG ** DTLS global timeout set to 37073ms
Aug 27 19:50:22 INFO timeout
Aug 27 19:50:22 DEBG send header: (13 bytes):
00000000 15 FE FD 00 00 00 00 00 00 00 06 00 02
Aug 27 19:50:22 DEBG send unencrypted: (2 bytes):
00000000 02 00
Aug 27 19:50:22 DEBG * 192.168.1.12:46603 <-> 192.168.1.55:5684 DTLS: sent 15 bytes
Aug 27 19:50:22 DEBG removed peer: 192.168.1.55:5684
Aug 27 19:50:22 DEBG *** removed session 0x1830d70
Aug 27 19:50:22 MyLog: coap_nack_handler_static 3***
Aug 27 19:50:22 DEBG *** 192.168.1.12:46603 <-> 192.168.1.55:5684 DTLS: session closed

The log below was produced with #define DTLS_DEFAULT_MAX_RETRANSMIT 3 (instead of 7) in tinydtls/global.h. It shows that "removed transaction" comes after 30s, while the timeout and the call to the nack_handler come after 90s.
Hence my question: shouldn't the nack_handler be called earlier (when "removed transaction" is printed)?

Aug 27 19:44:05 DEBG *** new session 0x744d70
...
Aug 27 19:44:35 DEBG ** removed transaction
Aug 27 19:45:35 INFO timeout
Aug 27 19:45:35 DEBG send header: (13 bytes):
00000000 15 FE FD 00 00 00 00 00 00 00 04 00 02
Aug 27 19:45:35 DEBG send unencrypted: (2 bytes):
00000000 02 00
Aug 27 19:45:35 DEBG * 192.168.1.12:55611 <-> 192.168.1.55:5684 DTLS: sent 15 bytes
Aug 27 19:45:35 DEBG removed peer: 192.168.1.55:5684
Aug 27 19:45:35 DEBG *** removed session 0x744d70
Aug 27 19:45:35 MyLog: coap_nack_handler_static 3***
Aug 27 19:45:35 DEBG *** 192.168.1.12:55611 <-> 192.168.1.55:5684 DTLS: session closed

@mrdeep1
Collaborator

mrdeep1 commented Aug 28, 2018

In your first case, encryption has been set up and then you drop the traffic. In the second case, encryption has not been agreed between client and server - so a NACK is not so appropriate here.

Tinydtls is a separate project - libcoap is just using it for one of the DTLS type options, and so changes in the tinydtls code will not be integrated into libcoap. The only possibility is changes to src/coap_tinydtls.c - the glue between libcoap and tinydtls.

The coap-client is setting a global timeout of 90 seconds - which when expired closes the session and triggers the NACK.

With DTLS_DEFAULT_MAX_RETRANSMIT 7, the 90 second timeout expires before the re-transmit count is exhausted. With DTLS_DEFAULT_MAX_RETRANSMIT 3, the re-transmits time out before the global 90 second timeout.
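
The arithmetic behind these two cases can be sketched as follows. This is a minimal model, not tinydtls code: it assumes an initial handshake retransmission timeout of 2 seconds that doubles after every retransmit, which is consistent with the 30 seconds between "new session" and "removed transaction" in the second log above. The function names and the ASSUMED_INITIAL_TIMEOUT_S constant are hypothetical.

```c
#include <assert.h>

/* Assumption (not confirmed in this thread): the first wait is 2 s and
 * every following wait is twice as long as the previous one. */
#define ASSUMED_INITIAL_TIMEOUT_S 2u

/* Global timeout set by coap-client, as stated in the thread. */
#define GLOBAL_TIMEOUT_S 90u

/* Seconds from the first send until the transaction is given up:
 * one wait after the initial send plus one after each retransmit. */
unsigned give_up_after_s(unsigned max_retransmit)
{
    assert(max_retransmit < 16u); /* keep the shift well within range */
    unsigned total = 0;
    for (unsigned i = 0; i <= max_retransmit; i++)
        total += ASSUMED_INITIAL_TIMEOUT_S << i;
    return total;
}

/* Returns 1 if the retransmit budget runs out before the global timeout,
 * 0 if the global timeout fires first. */
int retransmits_expire_first(unsigned max_retransmit)
{
    return give_up_after_s(max_retransmit) < GLOBAL_TIMEOUT_S;
}
```

Under these assumptions, a limit of 3 gives up after 2 + 4 + 8 + 16 = 30 seconds, well before the 90 second global timeout, while a limit of 7 would need 510 seconds, so the global timeout fires first.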

PR #183 has many event/nack fixes, and it looks like this may already fix the issue you are highlighting.

@vdaele
Author

vdaele commented Aug 28, 2018

Is it correct that this PR #183 is not yet merged to the develop branch (I'm no git expert at all)? If so, when do you expect that this merge will happen? I certainly prefer using your lib to modifying coap_tinydtls.c myself.

@mrdeep1
Collaborator

mrdeep1 commented Aug 28, 2018

Yes, we are waiting on PR #183 to get merged in. I am not sure when, but I would expect it to be soon, based on the last update at https://sourceforge.net/p/libcoap/mailman/message/36346758/.

@obgm
Owner

obgm commented Oct 4, 2018

As PR #183 has been merged, can you check if this has solved this issue?

@vdaele
Author

vdaele commented Oct 7, 2018

Current status with the latest 4.2.0 prerelease:

  • when disconnecting the hub and using CON messages, the nack_handler gets called ("give up after 4 attempts") after about 1'20" with reason=COAP_NACK_TOO_MANY_RETRIES.
    • I guess I can get a faster notification by reducing COAP_DEFAULT_MAX_RETRANSMIT in coap_session.h to e.g. 1 or 2?
  • when disconnecting the hub and using NON messages, no handler seems to get called, right?
    • Is there some way to detect this? Can you somehow detect that the remote peer closed the socket?
  • when trying to connect to an invalid hub, the nack_handler gets called immediately with reason=COAP_NACK_TLS_FAILED
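
For the first bullet, the effect of a lower retransmit count can be estimated from RFC 7252, which defines MAX_TRANSMIT_WAIT = ACK_TIMEOUT * (2^(MAX_RETRANSMIT + 1) - 1) * ACK_RANDOM_FACTOR, with defaults of 2 seconds, 4 retransmits and a factor of 1.5. The sketch below only evaluates that formula; the function name is hypothetical, and it is an assumption that libcoap reports the failure at roughly this point.

```c
#include <assert.h>

/* RFC 7252 default transmission parameters. */
#define ACK_TIMEOUT_S     2.0
#define ACK_RANDOM_FACTOR 1.5

/* Upper bound, in seconds, on the time between the first transmission of
 * a CON message and the moment the sender gives up (MAX_TRANSMIT_WAIT).
 * Pass random_factor = 1.0 to get the lower bound instead. */
double transmit_wait_s(double ack_timeout_s, double random_factor,
                       unsigned max_retransmit)
{
    assert(max_retransmit < 31u); /* keep the shift well within range */
    unsigned waits = (1u << (max_retransmit + 1u)) - 1u;
    return ack_timeout_s * (double)waits * random_factor;
}
```

With the defaults this gives a window of 62 to 93 seconds, which matches the observed 1'20". Calling transmit_wait_s(ACK_TIMEOUT_S, ACK_RANDOM_FACTOR, 2) gives at most 21 seconds, and a count of 1 gives at most 9 seconds.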

@mrdeep1
Collaborator

mrdeep1 commented Oct 7, 2018

  1. Instead of changing COAP_DEFAULT_MAX_RETRANSMIT, use coap_session_set_max_retransmit() to set the value for the local session. See the coap_recovery(3) man page.
  2. Correct - UDP is unreliable, and the use of NON does not guarantee any response, so libcoap will not generate any failure events. However, the CoAP Ping is designed for this purpose of testing link connectivity by sending an Empty Confirmable Message - see RFC7252 4.3. Messages Transmitted without Reliability.
  3. This is likely down to the fact that the local device is unable to transmit the packet to the non-existent address (as no ARP entry could be created to send the packet to).
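
To make point 2 concrete, here is a minimal sketch of what a CoAP Ping looks like on the wire according to the RFC 7252 message format: a 4-byte header with version 1, type CON, token length 0 and code 0.00, to which a live peer replies with a Reset carrying the same message ID. The function names are hypothetical and this is not libcoap code; with DTLS in use, these bytes would travel inside an encrypted record rather than directly over UDP.

```c
#include <stddef.h>
#include <stdint.h>

#define COAP_HEADER_LEN 4u

/* Writes an Empty Confirmable message (the "CoAP ping") into buf.
 * Returns the number of bytes written, or 0 if buf is unusable. */
size_t build_coap_ping(uint8_t *buf, size_t buf_len, uint16_t message_id)
{
    if (buf == NULL || buf_len < COAP_HEADER_LEN)
        return 0;
    buf[0] = 0x40;                         /* Ver=1, Type=CON, TKL=0 */
    buf[1] = 0x00;                         /* Code 0.00 (Empty) */
    buf[2] = (uint8_t)(message_id >> 8);   /* Message ID, high byte */
    buf[3] = (uint8_t)(message_id & 0xFF); /* Message ID, low byte */
    return COAP_HEADER_LEN;
}

/* Returns 1 if msg is an Empty Reset that matches message_id,
 * i.e. the reply that shows the peer is reachable; 0 otherwise. */
int is_ping_reply(const uint8_t *msg, size_t len, uint16_t message_id)
{
    if (msg == NULL || len != COAP_HEADER_LEN)
        return 0;
    return msg[0] == 0x70 /* Ver=1, Type=RST, TKL=0 */
        && msg[1] == 0x00
        && msg[2] == (uint8_t)(message_id >> 8)
        && msg[3] == (uint8_t)(message_id & 0xFF);
}
```

Because the ping is Confirmable, it is retransmitted like any other CON message, so a missing reply surfaces through the same retry mechanism as in the first case.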

@vdaele
Author

vdaele commented Oct 8, 2018

Thanks a lot for your input (again!)

Marc

3 participants