ACS SCT crash in SetWatchdogTimer_Func RaiseTPL #216
Comments
From past experience, if a Watchdog Timer is throwing more errors than it solves, it should be switched off, and a redundant always-on machine or dedicated power management circuit should be used instead for when the machine stops responding. If you absolutely still need the Watchdog to work: on slower machines the drivers are expected to take longer to initialise, so an increased timeout may be an alternative solution (then see which other driver, if any, brings up the assert). There could also be a better approach that hasn't come to mind yet. Your workaround seems to effectively perform part of option 1, and the variable for option 2 is in a different place and named differently every time (proprietary).
@TheMindVirus Thank you for your response! The Watchdog above is used as part of a test unit in the ACS EBBR or SBBR to test for compliance. I am not sure I understand the timeout reference, my apologies. AFAIU, the assert will always trigger on the Pi 4 when running any BBR test sequence since the These ones reliably trigger (Ipv6ConfigGetData after I change UsbBusControllerDriverStop to run at
The Watchdog timeout can be thought of like setting an alarm clock, and there are Raspberry Pi EEPROM/firmware settings for such values: https://www.raspberrypi.com/documentation/computers/raspberry-pi.html#DHCP_TIMEOUT As for the priority levels, if they are anything like Windows driver IRQLs and IRQL_NOT_LESS_OR_EQUAL, they also cause more issues than they solve. A routine can and should check that the priority level it wants to set is not less than or equal to the current priority level before trying to raise it. On multiprocessing systems, the routine can also wait in a loop (with a timeout counter) for another thread to complete its higher-priority task before continuing with the lower-priority task. An ever-increasing priority level is not a solution for this, and it has to be reset at some point anyway to be useful again. Task completion flags can bloat the system, so a simple Busy status flag (or sometimes a spinlock) is preferred instead.
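The "check before raising" idea in the comment above could look roughly like the sketch below. It is a standalone model, not edk2 code: the `sim_` functions and `g_current_tpl` are hypothetical stand-ins for `RaiseTPL`/`RestoreTPL` and the firmware's current level, and only the numeric TPL values are taken from the UEFI specification.

```c
#include <assert.h>

/* TPL values as defined by the UEFI specification. */
enum {
    TPL_APPLICATION = 4,
    TPL_CALLBACK    = 8,
    TPL_NOTIFY      = 16,
    TPL_HIGH_LEVEL  = 31
};

typedef unsigned long sim_tpl_t;

/* Hypothetical stand-in for the firmware's current task priority level. */
sim_tpl_t g_current_tpl = TPL_APPLICATION;

/* Models RaiseTPL(): asking for a level below the current one is invalid. */
sim_tpl_t sim_raise_tpl(sim_tpl_t new_tpl)
{
    sim_tpl_t old_tpl = g_current_tpl;
    assert(new_tpl >= old_tpl);
    g_current_tpl = new_tpl;
    return old_tpl;
}

/* Models RestoreTPL(): may only lower (or keep) the current level. */
void sim_restore_tpl(sim_tpl_t old_tpl)
{
    assert(old_tpl <= g_current_tpl);
    g_current_tpl = old_tpl;
}

/*
 * Guarded raise: if the caller is already at or above the wanted level,
 * leave the level alone. The returned value is what must later be passed
 * to sim_restore_tpl(); in the "already high enough" case that restore
 * is a no-op.
 */
sim_tpl_t sim_raise_tpl_if_lower(sim_tpl_t wanted_tpl)
{
    if (g_current_tpl >= wanted_tpl) {
        return g_current_tpl;
    }
    return sim_raise_tpl(wanted_tpl);
}
```

A routine that wants at least `TPL_CALLBACK` would call `sim_raise_tpl_if_lower(TPL_CALLBACK)` on entry and `sim_restore_tpl()` with the returned value on exit. Whether such a guard is acceptable in real driver code is a separate question, since it lets the routine run at a higher level than it was written for.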
Yeah, I think the core edk2 TPL raise/lower code isn't correct. I have a fix around here somewhere, because PhatFree pointed out that a whole bunch of errors trip in ACS when run against a debug build, most of which were core edk2 problems (like this TPL code, and there are issues around page zeroing, etc.). I'm sure the edk2 people are open to patches, but at this point, if you're trying to run ACS to test the platform, it's probably better to start with a RELEASE build.
@jlinton Thank you for the answer! I have not looked at this in some time. The error was with a RELEASE build (assuming it is the same error, since the hang was silent without the ASSERT output); I then rebuilt as DEBUG to get the output provided. Or are you referring to pre-built ACS binaries?
Hi,

I'm seeing an assert trigger when running the ACS SCT BBR test `SetWatchdogTimer_Func`:

Looking at the backtrace, `DisconnectAll` calls `DisconnectController` callbacks and ends up invoking `UsbBusControllerDriverStop`. IIUC the WatchdogTimerEvent runs at `TPL_NOTIFY`, which corroborates the assert message.

There is a mention that `UsbBusControllerDriverStop` may not be at the right priority level. Trying to use `USB_BUS_TPL` (== `TPL_NOTIFY`) just moves the issue to the next driver (`EfiIp6ConfigGetData`, `Ip4ServiceBindingDestroyChild`, ...), which reaches the same assert for what appears to be the same reason.

I tried a naive work-around to confirm this. It ran the ACS SCT BBR tests without triggering the assert, which might as well run the `WatchdogTimerEvent` at `TPL_CALLBACK`, since `DisconnectAll` is also reached through other paths with nothing to do with the `WatchdogTimerEvent`?

How should this be approached for a proper fix?
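For readers following along, the failure described above can be modelled in a few lines of standalone C. This is only a sketch under assumptions: the `model_` names are hypothetical, the driver stop routine is assumed to protect itself by raising to a fixed TPL (the thread suggests `TPL_CALLBACK` for `UsbBusControllerDriverStop`), and the invalid raise is reported as a `false` return instead of an assert.

```c
#include <assert.h>
#include <stdbool.h>

/* TPL values as defined by the UEFI specification. */
enum {
    TPL_APPLICATION = 4,
    TPL_CALLBACK    = 8,
    TPL_NOTIFY      = 16,
    TPL_HIGH_LEVEL  = 31
};

/* Hypothetical model state: the level the simulated firmware runs at. */
int model_tpl = TPL_APPLICATION;

/* A raise is only legal when the new level is not below the current one. */
bool model_raise_is_valid(int new_tpl)
{
    return new_tpl >= model_tpl;
}

/*
 * Stand-in for a driver Stop() routine that wraps its work in
 * RaiseTPL(stop_tpl) / RestoreTPL(). Returns false at the point where
 * the invalid raise would be reported.
 */
bool model_driver_stop(int stop_tpl)
{
    if (!model_raise_is_valid(stop_tpl)) {
        return false;
    }
    int old_tpl = model_tpl;
    model_tpl = stop_tpl;
    /* ... driver tear-down work would happen here ... */
    model_tpl = old_tpl;
    return true;
}

/*
 * Stand-in for an event notification dispatched at event_tpl whose
 * handler disconnects a driver that stops at stop_tpl.
 */
bool model_event_disconnect(int event_tpl, int stop_tpl)
{
    int old_tpl = model_tpl;
    assert(event_tpl >= old_tpl);   /* events are dispatched by raising */
    model_tpl = event_tpl;
    bool ok = model_driver_stop(stop_tpl);
    model_tpl = old_tpl;
    return ok;
}
```

In this model, `model_event_disconnect(TPL_NOTIFY, TPL_CALLBACK)` fails, while running the event at `TPL_CALLBACK` or having the stop routine raise to `TPL_NOTIFY` succeeds. That matches both observations in the report: changing one driver's level only helps until the next driver that raises to `TPL_CALLBACK` is reached, whereas lowering the event's level avoids the problem for every such driver.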