[WiFi] read all lines from slow servers #1720

TD-er · 2018-09-08T13:19:24Z

As suggested by @clumsy-stefan here: #1669 (comment)

The suggested fix is based on this PR in the ESP8266 Arduino examples. (PR 5113)

I would like to know if someone who is experiencing issues with slow servers, if they see this line in the logs:

"Timeout while reading input data!"

If so, please report it, so I can make the timeout setting configurable.
Later I will make the function safeReadStringUntil use a callback construction and let it operate async.
For now the function is blocking, so it may cause all kind of issues.

The text was updated successfully, but these errors were encountered:

See letscontrolit#1720

sassod · 2018-09-08T13:25:32Z

I saw a lot of these when thingspeak plugin was active and mandatory 15s interval was applied, the data was sent successfully and after that the warning appeared. But next data sent was again successful and again the warning appeared.

…

On Sat, Sep 8, 2018, 15:19 Gijs Noorlander ***@***.***> wrote: As suggested by @clumsy-stefan <https://github.com/clumsy-stefan> here: #1669 (comment) <#1669 (comment)> The suggested fix is based on this PR <esp8266/Arduino@4fdff07> in the ESP8266 Arduino examples. (PR 5113 <esp8266/Arduino#5113>) I would like to know if someone who is experiencing issues with slow servers, if they see this line in the logs: "Timeout while reading input data!" If so, please report it, so I can make the timeout setting configurable. Later I will make the function safeReadStringUntil use a callback construction and let it operate async. For now the function is blocking, so it may cause all kind of issues. — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#1720>, or mute the thread <https://github.com/notifications/unsubscribe-auth/ADPy_58jLgO0CHcqDEj1yOJloFt-MhPlks5uY8PdgaJpZM4Wf3kW> .

TD-er · 2018-09-08T13:30:40Z

@sassod Was this with the controller queue included? (PR merged yesterday, but I made a lot of test builds.)

sassod · 2018-09-08T13:43:20Z

I haven't updated all the nodes some are running builds dated May or April but some have been updated to early September releases, I will need to check if the most recent releases are still producing this info. Unfortunately I'm not gonna be able to check because almost all have deep sleep enabled and I'm away from home till 16th of September. Maybe someone else can check the status if thingspeak plugin is sending multiple data one after ankther with 15s delay

…

On Sat, Sep 8, 2018, 15:30 Gijs Noorlander ***@***.***> wrote: @sassod <https://github.com/sassod> Was this with the controller queue included? (PR merged yesterday, but I made a lot of test builds.) — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#1720 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/ADPy_2_Dus7A50_B74rlxa-oXQdOzvVPks5uY8aCgaJpZM4Wf3kW> .

TD-er · 2018-09-08T13:49:59Z

The change introduced with yesterdays merge is meant to be used for controllers like Thingspeak.
The controllers themselves manage their own queue and only send new values when the minimal interval has passed since last send.
So you don't have to use the Message Delay value anymore. (that one blocks the whole ESP)

Only thing I haven't checked is if the result of millis() increases when in deep sleep. (this just occurred to me)

clumsy-stefan · 2018-09-08T15:42:32Z

I only have "HTTP : C009 connection failed" in the logs (on firsth sight) and no TImeouts as described... but probably the same root cause?!

TD-er · 2018-09-08T16:08:39Z

@clumsy-stefan Looking at the code, that error will occur when

client.connect returns something other than 1
WiFi is not connected
Hostname cannot be resolved (when using DNS)

As far as I can see, the timeout cannot be set, but is 5000 msec for WiFiClient.

[#1720] WiFi read all lines from slow servers

clumsy-stefan · 2018-09-09T07:47:31Z

I've upgraded my server to a newer and faster HW/SW.... interestingly since then the units run without issues and reboots.... it really seems, that when the server is really slow or has tcp/network problems the client can get into some strange state... eg. having to many parallel open sessions or waiting too long for a result...

clumsy-stefan · 2018-09-12T06:25:50Z

quick update, all units runing with latest commit from yesterday (self compiled):

nb: now I also do get the timeouts in the logs ... I only see reboots, when the server is not running/answering... other than this they seem to work fine..

Name / date / UPTIME (over 30 hours, most of them)....
sonoff_basic_06_sysinfo 2018-09-12 08:20:08 2125
sonoff_basic_08_sysinfo 2018-09-12 08:20:21 2125
sonoff_dual_01_sysinfo 2018-09-09 16:37:51 1594
sonoff_pow_01_sysinfo 2018-09-12 08:20:17 1364
sonoff_pow_02_sysinfo 2018-09-12 08:20:23 2126
wemos_mini_01_sysinfo 2018-09-12 04:24:45 1896
wemos_mini_02_sysinfo 2018-09-12 08:19:59 2131
wemos_mini_03_sysinfo 2018-09-12 08:20:14 2131
wemos_mini_04_sysinfo 2018-09-12 08:20:07 1991
wemos_mini_05_sysinfo 2018-09-11 19:40:42 1371
wemos_mini_06_sysinfo 2018-09-12 08:20:07 2130
wemos_mini_07_sysinfo 2018-09-12 08:20:19 1585
wemos_mini_08_sysinfo 2018-09-12 08:20:16 2129
wemos_mini_09_sysinfo 2018-09-12 08:20:30 2129
wemos_mini_10_sysinfo 2018-09-12 08:19:55 2128
wemos_mini_11_sysinfo 2018-09-12 08:20:14 2128
wemos_mini_12_sysinfo 2018-09-12 08:20:31 2128
wemos_mini_13_sysinfo 2018-09-12 08:19:49 2127
wemos_mini_14_sysinfo 2018-09-12 08:20:03 2127
wemos_mini_15_sysinfo 2018-09-12 08:20:15 2127
wemos_mini_16_sysinfo 2018-09-12 08:20:26 2127
wemos_pro_01_sysinfo 2018-09-12 08:20:14 2126

clumsy-stefan · 2018-09-14T08:30:33Z

Another quick update... As per list below lots of nodes have >5000min uptime!! great... As I had to shutdown and work on the server it seems some of the nodes (quite randomly) rebooted while others survived without issues...

However, for my part the current version(s) seem to be really stable!! appreciate it!! Good work!!

Name / date / UPTIME
sonoff_basic_02_sysinfo 2018-09-14 05:21:41 27
sonoff_basic_03_sysinfo 2018-09-14 10:27:28 1665
sonoff_basic_06_sysinfo 2018-09-14 10:27:06 5131
sonoff_basic_08_sysinfo 2018-09-14 10:27:19 5132
sonoff_pow_01_sysinfo 2018-09-14 10:27:15 4371
sonoff_pow_02_sysinfo 2018-09-14 10:27:20 5133
wemos_mini_01_sysinfo 2018-09-14 10:26:49 4528
wemos_mini_02_sysinfo 2018-09-14 10:27:01 5137
wemos_mini_03_sysinfo 2018-09-14 10:27:29 2745
wemos_mini_04_sysinfo 2018-09-14 10:27:08 4997
wemos_mini_05_sysinfo 2018-09-14 10:26:47 3933
wemos_mini_06_sysinfo 2018-09-14 10:27:05 5136
wemos_mini_07_sysinfo 2018-09-14 10:27:11 760
wemos_mini_08_sysinfo 2018-09-14 10:27:12 5136
wemos_mini_09_sysinfo 2018-09-14 10:27:27 5135
wemos_mini_10_sysinfo 2018-09-14 10:27:01 793
wemos_mini_11_sysinfo 2018-09-14 10:27:20 5135
wemos_mini_12_sysinfo 2018-09-14 10:26:37 5133
wemos_mini_13_sysinfo 2018-09-14 10:26:57 5134
wemos_mini_14_sysinfo 2018-09-14 10:27:11 5079
wemos_mini_15_sysinfo 2018-09-14 10:27:19 5134
wemos_mini_16_sysinfo 2018-09-14 10:27:23 5134
wemos_pro_01_sysinfo 2018-09-14 10:26:57 1994

TD-er · 2018-09-14T08:34:43Z

OK, then for now, I will close it

giig1967g · 2018-09-14T08:36:56Z

ok to close it, but still the fact that some units rebooted just because the server was shutdown needs further investigation, I think.

clumsy-stefan · 2018-09-14T08:50:54Z

problem is, since updating my server it seems to be fast enough so the "slow response" issue does not occur anymore ;) so I'm having troubles to test it..

giig1967g · 2018-09-14T08:58:42Z

but if you shutdown the server some units do reboot?

TD-er · 2018-09-14T09:00:35Z

And what is the reboot reason then?
Watchdog?

clumsy-stefan · 2018-09-14T09:10:33Z

@giig1967g: yes, true, but can't be sure it's because of that
@TD-er: as an example units 7 & 10: both of them:
Boot Manual reboot (2)
Reset Reason Hardware Watchdog
Free Mem 16504 (13504 - addToLog) (Unit 7)
Free Mem 13368 (5800 - LoadTaskSettings) (Unit 10)

I think it's a difference if the server doesn't answer at all or if it answer's opens the socket and is too slow or even doesn't respond at all... I guess these are different situations from the unit's pov...

it's easier to deal with if the server is not reachable at all, than having open sockets to deal with..

giig1967g · 2018-09-14T09:19:32Z

I am asking this because I also was having serious issues that stopped when I swapped my router (from Mikrotik to Lynksys).
With Mikrotik I was also getting some reboots (HW watchdog) if the router rebooted or even if just the wifi on the router was set to off. With Linksys this does not happen.
So I just wanted to highlight that the fact that for me and for @clumsy-stefan the problem disappeared, is not because the root cause has been identified, but because we changed hardware.
Many other could face this problem until the root cause is found.

What do you think?

clumsy-stefan · 2018-09-15T17:26:29Z

had a really strange issue today: after nearly all units ran for over 6'000Min. I updated all of them with the newst commit this morning. After that they all ran fine for about 4 hours until in the afternoon I had to fix some things on the server and shut it down for about an hour... after it restartet all units were "absent". Looking at the log (via UDP syslog or webpage) they all said "Timeout while reading input data!"... even after a (software) reboot, the problem persisted.
I finally solved it by going to the controller tab editing the controller and just pressed "save"... then the units rebooted once again and all ran fine after that....
Anyone any clue what happened here?

TD-er · 2019-10-27T16:34:14Z

These timeout patches are already present in the code for a while now.
Core 2.6.0 also has a lot of fixes related to proper handling of timeouts.

So I will close this one for now.
Please create a new issue and refer to this one if it is still an issue.

TD-er added Type: Bug Considered a bug Category: Controller Related to interaction with other platforms Category: Stabiliy Things that work, but not as long as desired Category: Wifi Related to the network connectivity labels Sep 8, 2018

TD-er added a commit to TD-er/ESPEasy that referenced this issue Sep 8, 2018

[letscontrolit#1720] WiFi read all lines from slow servers

e7d28e8

See letscontrolit#1720

TD-er mentioned this issue Sep 8, 2018

[#1720] WiFi read all lines from slow servers #1721

Merged

TD-er added a commit that referenced this issue Sep 8, 2018

Merge pull request #1721 from TD-er/bugfix/wifi_connection_slow_server

c8f02d2

[#1720] WiFi read all lines from slow servers

TD-er closed this as completed Sep 14, 2018

TD-er reopened this Sep 14, 2018

clumsy-stefan mentioned this issue Sep 15, 2018

Boot loop with 20180915 release #1748

Closed

TD-er closed this as completed Oct 27, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WiFi] read all lines from slow servers #1720

[WiFi] read all lines from slow servers #1720

TD-er commented Sep 8, 2018

sassod commented Sep 8, 2018 via email

TD-er commented Sep 8, 2018

sassod commented Sep 8, 2018 via email

TD-er commented Sep 8, 2018

clumsy-stefan commented Sep 8, 2018

TD-er commented Sep 8, 2018

clumsy-stefan commented Sep 9, 2018

clumsy-stefan commented Sep 12, 2018

clumsy-stefan commented Sep 14, 2018

TD-er commented Sep 14, 2018

giig1967g commented Sep 14, 2018

clumsy-stefan commented Sep 14, 2018

giig1967g commented Sep 14, 2018

TD-er commented Sep 14, 2018

clumsy-stefan commented Sep 14, 2018

giig1967g commented Sep 14, 2018

clumsy-stefan commented Sep 15, 2018

TD-er commented Oct 27, 2019

[WiFi] read all lines from slow servers #1720

[WiFi] read all lines from slow servers #1720

Comments

TD-er commented Sep 8, 2018

sassod commented Sep 8, 2018 via email

TD-er commented Sep 8, 2018

sassod commented Sep 8, 2018 via email

TD-er commented Sep 8, 2018

clumsy-stefan commented Sep 8, 2018

TD-er commented Sep 8, 2018

clumsy-stefan commented Sep 9, 2018

clumsy-stefan commented Sep 12, 2018

clumsy-stefan commented Sep 14, 2018

TD-er commented Sep 14, 2018

giig1967g commented Sep 14, 2018

clumsy-stefan commented Sep 14, 2018

giig1967g commented Sep 14, 2018

TD-er commented Sep 14, 2018

clumsy-stefan commented Sep 14, 2018

giig1967g commented Sep 14, 2018

clumsy-stefan commented Sep 15, 2018

TD-er commented Oct 27, 2019