DHCPDISCOVER every 5 minutes, ignores server lease time #7020
Comments
This is interesting, as I am also looking into DHCP issues here on my network. The reason I am asking is that the Ziggo Connect box I have here seems to reply quite slowly to DHCP renewals (lease time set to 3600 sec here), and it clearly shows that the ESP reacts terribly slowly to some other network requests sent to the ESP itself (5 seconds longer than usual). N.B. the DHCP server on this Connect box is a bit strange/buggy, as some other devices, like my Windows laptop, sometimes cannot get an IP. Edit: |
This is what I am seeing in my dnsmasq log: This is a renewal request; note the DHCPREQUEST that includes the address the device wishes to renew. The server responds with the DHCPACK, but the device starts over with a full request. It is important to note that although, in the log, the server ACKs before the device resends the DISCOVER, in a PCAP I can see that the device resends the DISCOVER just a moment before the server sends the ACK. From this, I am only guessing that the device is not waiting long enough for its renewal ACK to come back. I am reviewing the code, trying to follow the processing thread. So far, I don't see where there are any problems... |
Does your DHCP server set the T1 and T2 values in the response? |
Other devices running different code might change or ignore the T1 and T2 values passed by the server, resulting in different behaviour from the ESP8266 devices. The DHCPREQUEST is being broadcast from 0.0.0.0, which should only happen in response to an OFFER. When sent to renew a lease (after T1), it should be sent from the device's own IP address directly to the DHCP server that gave it the lease. When sent to rebind (after T2), it should be broadcast from its own IP address. I've had issues with devices running Tasmota that would reboot every 5 minutes and produce similar output in the DHCP logs. I moved the devices to a different AP on the same network and the problem disappeared. I suspect that I was hitting some sort of limit on the original AP, due to the number of WiFi devices I had connected, and this was causing the ESP device to reboot. |
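To make the distinction above easier to check against a packet capture, here is a minimal Python sketch that classifies a DHCPREQUEST by how it was addressed, following the client states in RFC 2131. The `DhcpRequest` class and `classify_request` function are hypothetical names made up for this illustration; they are not part of the ESP8266 core, lwIP, or dnsmasq. It also assumes the `ciaddr` header field matches the source IP of the packet.

```python
from dataclasses import dataclass


@dataclass
class DhcpRequest:
    """Hypothetical summary of one captured DHCPREQUEST (illustration only)."""
    ciaddr: str          # client IP address field from the DHCP header
    broadcast: bool      # True if the packet was sent to 255.255.255.255
    has_server_id: bool  # True if option 54 (server identifier) is present


def classify_request(req: DhcpRequest) -> str:
    """Guess which RFC 2131 client state produced this DHCPREQUEST."""
    if req.ciaddr == "0.0.0.0":
        if not req.broadcast:
            # A client without an address cannot sensibly unicast a request.
            return "UNEXPECTED"
        # With a server identifier the client is answering an OFFER;
        # without one it is re-checking a remembered address after a restart.
        return "SELECTING" if req.has_server_id else "INIT-REBOOT"
    # The client already has an address: unicast after T1, broadcast after T2.
    return "REBINDING" if req.broadcast else "RENEWING"


if __name__ == "__main__":
    # Made-up example: a request broadcast from 0.0.0.0 with a server identifier.
    print(classify_request(DhcpRequest("0.0.0.0", True, True)))
```

If a device that should be renewing shows up as `SELECTING` or `INIT-REBOOT` here, it has most likely dropped its lease state (for example after an interface reset) instead of performing a normal renewal.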
Similar issue here with NodeMCU running WLED that uses user_interface.h. No other ESP8266 on network has this issue and I'm assuming it may be related. Issue continues with Node despite blank bin flash and different sketch not using extern "C" { |
I ran into this issue with a WLED device. My situation may be different, so I'll outline it here. I started with a network with an ISC DHCP server using a 24h TTL for leases. I needed to reallocate address spaces in my network, so I set the TTL to 600s (10m) and left it for over a day to ensure old leases had either expired or active hosts were on the 10m cycle. After reallocating the address spaces, I monitored the network to ensure everyone moved to their new addresses. Once that was complete, I changed the TTL back to 24h and monitored the logs to ensure everyone transitioned to the new TTL. It was at this point that the WLED device started to misbehave. It started to ask for a lease renewal every 300s. Neither a warm restart nor a cold restart fixed the behavior. The eventual workaround was to configure the WLED device in its GUI to a static allocation and then back to a dynamic allocation. At that point, the device did a DISCOVER and accepted the lease it was given, with the 24h TTL, and it hasn't reverted to the 300s renewal timing. I will continue to monitor it. As an aside, there are two other ESP8266 devices on the network. One is a simple clock written in the Arduino environment (with some older version of this code) and the other is running esp-link v3.0.14-g963ffbb. Neither of these devices exhibited this issue. It's unfortunate that I picked 10m as my 'short' TTL value, as that means I can't tell if the 5m time is a product of my TTL (half of 10m) or something innate in the library, like the original issue poster observed. |
I have WLED, Tasmota and WThermostatBeca devices and all 3 suffer the same problem of dropping from WiFi when using DHCP. ESP32 seems to suffer this less, but the ESP8266 is more problematic. Things seem to have got worse since upgrading my APs to WiFi 6E ones. |
I am thinking that having a slow DHCP server may be part of it. I've moved my DHCP service from the very slow NAS box it was on to a different (more powerful) system. So far things are looking more stable, but only time will tell. Maybe the part of the Arduino code that asks for and waits for a DHCP response is part of the issue? Does anyone know if there is a short timeout in that area? In case the WiFi and Ethernet DHCP clients do not share the same code: for me the issue is on WiFi and not Ethernet. |
Lease time is defined by your DHCP server. Try starting one of your devices with debug output enabled, so that more informative messages can help us understand what is happening. |
@d-a-v I wasn't talking about the lease time, but the time between a dhcp client asking for a lease and it getting sent one from the dhcp server. Surely there must be a timeout for that? Anyway, I've figured out that my devices only like a 2min lease time. Anything higher than that and they disconnect after a while. It is strange that they manage the first few lease renewals, and then at some point during the day, they will drop off the network. This has only started happening since I upgraded to a WiFi 6E mesh network. I have TP-Link Deco XE75s, and I've also tried Eero 6E devices with the same problems. Currently I have the XE75s with Beamforming off, and Fastroam off, and a 2m dhcp lease time and this is the most stable I can get it. |
The DHCP RFC is here. The DHCPREQUEST and DHCPACK are repeated when it comes to renewal time. By default the renewal time (known as T1) is half the lease time, but it can be specified by the server. If the client hasn't renewed by time T2 (usually 0.875 * lease time), it starts broadcasting DHCPREQUEST messages to rebind with any server, and only if the lease expires without an ACK does it fall back to broadcasting DHCPDISCOVER. If you can capture the traffic and you see this pattern, then your DHCP is working. What you may be experiencing is the WiFi connection dropping for a couple of seconds, forcing the device to reset the interface and causing DHCP to start again. If I am remembering correctly, the WiFi library on the ESP devices used to save some metrics for the WiFi connection so as to improve the connection and help with stability. Of course, if you change the AP then the old metrics are no longer valid and could maybe cause instability (I don't know enough to be sure). On the devices that allow it, you could try wiping the permanent storage and reconfiguring from scratch to see if that resolves the issue.
|
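As a rough aid for reading capture timestamps, the timer arithmetic from RFC 2131 can be sketched in Python as follows. The function names `dhcp_timers` and `expected_state` are hypothetical and exist only for this illustration; the defaults (T1 = 0.5 × lease, T2 = 0.875 × lease) apply only when the server does not send its own T1/T2 options.

```python
def dhcp_timers(lease_seconds, t1=None, t2=None):
    """Return (T1, T2) in seconds for a lease.

    Server-supplied values win; otherwise the RFC 2131 defaults are used.
    """
    if t1 is None:
        t1 = lease_seconds * 0.5    # default renewal time
    if t2 is None:
        t2 = lease_seconds * 0.875  # default rebinding time
    return t1, t2


def expected_state(elapsed, lease_seconds, t1=None, t2=None):
    """State a well-behaved client should be in `elapsed` seconds after the ACK."""
    t1, t2 = dhcp_timers(lease_seconds, t1, t2)
    if elapsed < t1:
        return "BOUND"      # no DHCP traffic expected
    if elapsed < t2:
        return "RENEWING"   # unicast DHCPREQUEST to the leasing server
    if elapsed < lease_seconds:
        return "REBINDING"  # broadcast DHCPREQUEST to any server
    return "INIT"           # lease expired, start over with DHCPDISCOVER


if __name__ == "__main__":
    for lease in (600, 3600, 86400):
        print(lease, dhcp_timers(lease))
```

For a 24h lease this gives a first renewal after 12h, and for a 10m lease a renewal every 5m, so a device asking every 5 minutes against a 24h lease is not following the server's timers.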
If you suspect this to be related to newer APs, which do in some way support WiFi mesh (doesn't have to be enabled), then you should really try connecting the ESP using 802.11g mode and not 802.11n. |
Basic Infos
Platform
Settings in IDE
Problem Description
I was sent here from the Tasmota wiki.
I am still investigating, but figured I would post while I worked on it.
I am using Tasmota-lite (v8.1) on a Gosund WP3. When I clear settings (RESET 5), the device gets a DHCP address, and seems to follow the server lease time (24h in my case). However, after the first lease renewal approx 12h later, it requests a lease approx every 5 minutes. Rebooting does not go back to the server config lease time, it immediately starts with the 5 minute renewals...
One weird thing may be that it sends a DHCPREQUEST and immediately a DHCPDISCOVER. Then the handshake proceeds. It looks like it might be requesting a lease renewal (without going through the whole DORA process), but it doesn't wait long enough for the server to answer the initial request. The server ACKs the request, but the device has already sent the DISCOVER.
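One way to confirm this pattern in a capture is to look for a DISCOVER that follows a REQUEST before any ACK or NAK has arrived. The Python sketch below assumes the capture has already been reduced to time-ordered `(timestamp, message_type)` pairs for a single client; the function name `premature_discovers` and the sample timestamps are made up for illustration and are not taken from a real trace.

```python
def premature_discovers(events):
    """Find DISCOVERs sent while a REQUEST was still unanswered.

    `events` is a time-ordered iterable of (timestamp_seconds, message_type)
    pairs for one client. Returns a list of (request_time, discover_time).
    """
    pending_request = None  # time of the last REQUEST with no ACK/NAK yet
    hits = []
    for timestamp, kind in events:
        if kind == "DHCPREQUEST":
            pending_request = timestamp
        elif kind in ("DHCPACK", "DHCPNAK"):
            pending_request = None
        elif kind == "DHCPDISCOVER" and pending_request is not None:
            hits.append((pending_request, timestamp))
            pending_request = None
    return hits


if __name__ == "__main__":
    # Made-up sample: the client gives up 4 ms after its renewal REQUEST.
    sample = [
        (0.000, "DHCPREQUEST"),
        (0.004, "DHCPDISCOVER"),
        (0.006, "DHCPACK"),
        (0.010, "DHCPOFFER"),
        (0.012, "DHCPREQUEST"),
        (0.015, "DHCPACK"),
    ]
    print(premature_discovers(sample))
```

The gap between the two timestamps in each hit shows how long the client actually waited, which can then be compared with how long the server takes to answer.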
I have verified with wireshark that the lease time being sent to the device is correct. The device just seems to ignore it after the first renewal...
I am using dnsmasq (v2.80) on a Raspberry Pi running Arch (Linux dns 4.19.76-1-ARCH #1 SMP PREEMPT Tue Oct 8 03:34:09 UTC 2019 armv7l GNU/Linux). My Tasmota devices are the only ones exhibiting this behavior. All other devices wait the standard "one half of lease time" before they start requesting again. I have verified with wireshark that the devices are being given the proper initial lease time.
I suspect that not a lot of people are checking their DHCP logs (probably on a networking device) so not sure how many even know this is happening...
MCVE Sketch
Debug Messages