Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

not sending keep alive ping packets? #164

Open
brianmay opened this issue Aug 5, 2022 · 3 comments
Open

not sending keep alive ping packets? #164

brianmay opened this issue Aug 5, 2022 · 3 comments

Comments

@brianmay
Copy link

brianmay commented Aug 5, 2022

From all my clients using this library, and only clients using this library, I regularly get these messages from my mosquito mqtt broker:

1659682346: Client robotica-nerves-f447 has exceeded timeout, disconnecting.
1659682730: Client robotica-nerves-f447 has exceeded timeout, disconnecting.
1659683078: Client robotica-nerves-f447 has exceeded timeout, disconnecting.
1659683672: Client robotica-nerves-f447 has exceeded timeout, disconnecting.
1659684140: Client robotica-nerves-c775 has exceeded timeout, disconnecting.
1659684236: Client robotica-nerves-3739 has exceeded timeout, disconnecting.
1659684254: Client robotica-nerves-f447 has exceeded timeout, disconnecting.
1659684326: Client robotica-nerves-b034 has exceeded timeout, disconnecting.
1659684350: Client robotica-nerves-9be0 has exceeded timeout, disconnecting.
1659684434: Client robotica-nerves-c775 has exceeded timeout, disconnecting.
1659684470: Client robotica-nerves-3739 has exceeded timeout, disconnecting.
1659684518: Client robotica-nerves-f447 has exceeded timeout, disconnecting.
1659684602: Client robotica-nerves-3739 has exceeded timeout, disconnecting.

I have tried different values for the keepalive parameter, and it hasn't helped. I tried 60, currently it is set to 30.

I can only think that this library is not reliably sending the PINGREQ packets as required.

@brianmay
Copy link
Author

brianmay commented Aug 20, 2022

On further investigate, I found that the library sometimes is late to to sending PINGREQ packets after it receives a PUBLISH packet. As in around 60 seconds can elapse from receiving the PUBLISH packet before a PINGREQ is sent, when keepalive is set to 30. This can result in mosquitto disconnecting the client, as above.

Setting force_ping to true seems to solves this.

@zmstone
Copy link
Member

zmstone commented Aug 22, 2022

Hi @brianmay
I had a look at the code, the timer seems to be OK (which also confirmed by your observation that force_ping helped).
The only suspicious part seems to be the ping or not-ping decision is perhaps not quite accurate, or not quite aligned with mosquitto's expectation.
emqtt does not send a ping when there were packets sent within the last ping interval.

though, it checks the number of bytes sent, instead of the actual number of packets sent.
--- however, this is only an issue if it takes more than 30 seconds to send one MQTT packet.

@brianmay
Copy link
Author

brianmay commented Aug 22, 2022

Yes, my suspicion was in that code also. Which I don't entirely understand.

Here is an example log excerpt from mosquitto:

[...]
2022-08-17T12:26:32: Received PINGREQ from robotica-nerves-c775
2022-08-17T12:26:32: Sending PINGRESP to robotica-nerves-c775
2022-08-17T12:27:02: Received PINGREQ from robotica-nerves-c775
2022-08-17T12:27:02: Sending PINGRESP to robotica-nerves-c775
2022-08-17T12:27:02: Sending PUBLISH to robotica-nerves-c775 (d0, q1, r0, m6, 'command/Passage/Light/scene/auto', ... (128 bytes))
2022-08-17T12:27:02: Received PUBACK from robotica-nerves-c775 (Mid: 6, RC:0)
2022-08-17T12:27:02: Received PUBLISH from robotica-nerves-c775 (d0, q0, r1, m0, 'state/Passage/Light/colors', ... (1985 bytes))
2022-08-17T12:27:02: Received PUBLISH from robotica-nerves-c775 (d0, q0, r1, m0, 'state/Passage/Light/power', ... (2 bytes))
2022-08-17T12:27:52: Client robotica-nerves-c775 has exceeded timeout, disconnecting.
2022-08-17T12:27:53: New client connected from 2001:44b8:4112:8a10:ba27:ebff:fe5a:9220:37620 as robotica-nerves-c775 (p5, c1, k30, u'robotica').

You can see here that the PINGs were working perfectly. Until we received the PUBLISH packets. Then mosquitto disconnected us after not receiving anything for 50 seconds.

I believe I read somewhere that because we negotiated a keep alive period of 30 seconds, the server can kill the connection after 30*1.5 = 45 seconds. But in practise it can be longer then that.

I imagine what happened is we decided to skip the 12:27:32 ping packet. Because we received traffic at 12:27:02. But this means the next ping would not have been sent until 12:28:02 which is too late.

Instead of skipping scheduled packets, we should be adjusting the timer we send the next keepalive to +30 seconds from when we sent/received a packet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants