Allow for selective deferment of device initialization (aka manual init) #39896
I don't think it would require extra init_entry information here; another init level could just do: #define _SYS_INIT_LEVEL_MANUAL in init.h, for instance (thus a MANUAL level to use in drivers). kernel/device.c would need a relevant update of course. And a device_init() function exposed, obviously (which would check that the device is in the MANUAL level, else it would return -EALREADY or the current init status). |
I believe there should not be a check for whether the driver is already initialized. I faced an issue where the LCD had to be reinitialized after cutting power to the LCD (and other devices) to save battery life, and there was no way to reinitialize the screen driver. Your idea is good, but if Zephyr checks whether the driver was initialized, it won't allow reinitializing the driver. As an option, there could be a function to unload/destroy the driver beforehand; in that case the mentioned check makes sense. |
The issue with no checking is that it's then possible for the system to get into an initialization loop where driver A depends on driver B, both require manual initialization, and each calls the other's init function from its own init function. Admittedly this is a corner case, but so is the case of needing to re-init hardware that was powered off. As an alternative, I'd propose that the function that is called to manually invoke an init function take an argument to allow a "forced" initialization. A forced init would run the routine regardless of the current init state. This would be useful not just for re-initializing when the device is known to be uninitialized, but also for situations in which a device seems unresponsive and needs to be recovered. I like the idea of a MANUAL init level. That's cleaner than a separate flag. |
That's being discussed in this issue #20012 |
This would also fall within issue #20012 |
In my scenario the ST7785R LCD first needs to be powered up by toggling GPIO pins in the main function. But the ST7785R has POST_KERNEL initialization priority, so initialization fails because the LCD does not respond on the SPI bus due to the lack of power. |
Architecture WG:
|
From the Arch WG, there were 3 possible approaches discussed. Summaries of the three options are presented below. All approaches will need to consider dependencies for any deferred nodes. For instance, if a node is deferred, children/dependent nodes should also be deferred. We need to decide whether the application is responsible for initialization of all dependent nodes, or whether this will be handled automatically.

Option 1: new devicetree property
Add a new property
Advantages
Disadvantages
Option 2: Set
|
@keith-zephyr thank you for the summary. after grepping for
Not sure how acceptable it would be introduce a new zephyr specific status in this case though. |
I doubt option 3 is a viable option unless we convince dt spec maintainers to add a new valid status to the standard. From the latest spec (0.4):
|
Reading this, there's actually no status that matches what Zephyr uses the disabled status for today. Maybe what's needed is a new Zephyr-specific property that determines inclusion in or exclusion from the build, instead of relying on the status being disabled to do that. |
exactly, we have been using "disabled" the wrong way, which IMO is worse than using our own status I guess. |
It would be a very big ball of twine to untangle at this point to follow the spec the way, say, Linux does. "not presently operational, but might become operational in the future" could be interpreted a few ways. Maybe we just need our own property saying: yes, disabled means not presently operational, but if this other property is set, then it is possible for the node to become operational later via some init/deinit (module load/unload like) functionality. I think option 1 makes the most sense; disabled could still be interpreted correctly as-is, but Zephyr, with its need to reduce image size, takes the default to mean "never built in" unless the additional property is there.
|
Well, our usage of "disabled" doesn't contradict the spec (device not presently operational). The only thing is that in Zephyr, the "might become operational" will never happen (not technically wrong). The thing is that disabled, as defined, seems to cover the deferred initialization case. For us, this is really a problem because disabled implies no device resource allocation at all. Also, "okay" + "zephyr,deferred-init" seems wrong; we'd be better off with "disabled" + "zephyr,deferred-init" (the flag would be the mechanism to enable the "might become operational" part). |
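A minimal sketch of what this combination could look like in a board devicetree, assuming the property ends up named zephyr,deferred-init as discussed here (the uart0 node label is purely illustrative):

```dts
&uart0 {
	/* Not operational at boot, per the DT spec's "disabled" meaning... */
	status = "disabled";

	/* ...but the proposed flag keeps the driver in the build, so the
	 * application can bring the device up later via a manual init call. */
	zephyr,deferred-init;
};
```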
yes, that works better and there will not be any need for a Kconfig.... Let's call this "option 4".
I am sure many of the thousands of disabled nodes we have are leftovers and things that will NEVER be operational, no way for us to verify because they are just being skipped and ignored.... |
I like option 4. |
Option 4: new devicetree property with disabled status
Add a new property
Advantages
Disadvantages
|
updating my vote. Option 4 |
ditto |
I'm good with option 4 as well. It's debatable as to whether option 1 or option 4 is "easiest to implement", but the other advantages make it the clear winner. @galak - What are your thoughts on these options? |
I like option 4, but with a slight rewording of the option name and a slight change to its meaning. Consider a scenario where we have a DT file for a development board that lists all peripherals as "available", but some of these peripherals are mutually exclusive (because they use the same HW resources under-the-hood). E.g.
Now, in my app-specific DT overlay file, I want to specify that I'm using UART0 for logging:
What should be the outcome in this situation? The UART0 node has both I would suggest that the property we add be called
With this change, the following combinations of DT properties are valid:
IMHO this makes it easier to adopt usage of the new property. DT overlay files that change only a node's status will continue to "do the right thing" rather than potentially putting the node into an invalid state. The property name then also makes it clear that the property is only affecting what code is included/excluded from the build, and that the init timing is still exclusively dictated by the status property. |
Or, as a third option for the name: |
I'd say that would either be a build-time error or a build-time warning (along with ignoring the |
I agree with this.
I disagree with this. This makes overlay files more complicated, as changing a node from "available, but not automatically initialized" to "available, automatically initialized" requires both setting the status and deleting the I think the |
Note that |
Note: option 4 will force us to: (1) change Kconfig |
Another note: I think |
I think a better usage of devicetree here would be (assuming this is a kind of "multi-serial" IP):
But I guess that would complicate things a bit more. The reason is that you cannot instantiate > 1 device per DT node, so instantiating both (setting both compatibles?) + deferring init would not work. |
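One hypothetical devicetree shape for such a "multi-serial" IP, with per-function child nodes instead of one node carrying multiple compatibles (all node names and compatibles here are illustrative, using the vnd, example vendor prefix, not an actual binding):

```dts
serial@40002000 {
	compatible = "vnd,multi-serial";
	reg = <0x40002000 0x1000>;

	/* Each mutually exclusive function gets its own child node,
	 * so exactly one can be enabled at a time. */
	uart {
		compatible = "vnd,multi-serial-uart";
		status = "okay";
	};

	spi {
		compatible = "vnd,multi-serial-spi";
		status = "disabled";
	};
};
```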
Regarding the |
That's what the
This is probably a good idea. Though it should still be a compile time warning if a node marked with The |
The example provided is actually derived from how the DT is written for Nordic's nRF9160. See here for an example. |
Architecture WG: Slide shown:
|
There are three major problems I see with the callback-based solution:

1) Disproportionate run-time overhead (in current RFC implementation)
The runtime overhead (in the current RFC implementation) is terrible. For each device in the entire system (not just the devices that are optionally present) the init code iterates through the entire list of "before init" callbacks. So for a system with ~30 devices total (remember, this includes all of the devices in the system, including all of the peripherals built in to the SoC) and 5 optional devices, the init stage will have to make around 150 additional (indirect) function calls. And this is just to get the same information that in the best case could be obtained from five function calls.

2) The stated benefits are marginal or non-existent
The motivating benefit for the callback-based approach was that it potentially maintains the initialization sequence computed at compile time based on devicetree dependencies. The claim was that this would then let devices that are present (but might not have been) benefit from the "priority validation" performed during the compile-time devicetree processing. Except that it doesn't always. To understand why, consider a scenario where the system has an optionally present UART bus and, attached to that bus, an optionally present GPS module. The presence of the UART bus depends on a board stuffing option (detectable via a GPIO), and the presence of the GPS module depends on the board stuffing as well as whether the GPS data lines are connected (detected via a non-break condition on the UART). This system requires two "before init" callbacks, one for the UART and one for the GPS. The presence-detect callback for the UART seems pretty straightforward to implement. All it has to do is configure a GPIO as an input, read the GPIO, and then decide whether the UART is present or not.
Except that the GPIO controller may or may not have been initialized yet, because there is no dependency in devicetree between the GPIO controller and the UART bus. Why not? Because the SoC vendor that wrote both drivers couldn't have predicted this situation. This dependency is unique to this particular product. So now the UART presence-detect callback has to check if the GPIO controller is already initialized, and if it isn't, then what? Moving on, consider the presence-detect callback for the GPS unit. This callback needs the UART to be present and initialized in order to detect the presence of the GPS unit. That's fine. Except that the callback doesn't automatically know which UART to check (for presence and init), so it has to pull that information from devicetree. So the GPS callback has to look at devicetree, determine what UART the GPS is on, and then determine if that UART is present. This means that the callback reproduces the devicetree dependency information. Also, the GPS needs to check the board stuffing options. And, same as with the UART, checking the stuffing options depends on the GPIO controller. Furthermore, it's now possible for the stuffing-option detection to be duplicated (and thus deviate) in two different callbacks. Or the stuffing-option detection is pulled out to a separate utility function, but one that has to be carefully maintained because it might be called before kernel init (as opposed to during normal "thread" context).

3) Devicetree no longer accurately models the hardware in the system
One of the key strengths of devicetree is that it allows downstream consumers of a generic piece of hardware to customize the software to match a specific usage and configuration. Consider an organization where there is a central "platforms team" that produces a development board with the main SoC and memory and several connectors for plugging in interchangeable daughterboards for the various peripherals.
This platforms team also produces a BSP for the main dev board, and that BSP registers all of the "before init" callbacks to detect the daughterboard configuration. Fast forward six months to the second rev of the development board, and by this time some of the peripherals have been finalized. These peripherals are moved off of the daughterboards onto the main board. Now the BSP for the rev 2 hardware needs a way to "unregister" some of the "before init" callbacks that are used for the rev 1 hardware. Or, the BSP needs to be changed to optionally register callbacks based on the board revision. This makes for two different places where different information is stored about which hardware is present on a given board. The devicetree for rev 2 largely mirrors that of rev 1, as it's the same peripherals on the same buses (just some are now always present on rev 2). But elsewhere, in a source file in the BSP, is the preprocessor logic that spells out which peripherals are always present in rev 2 and which remain optionally present (aka auto-detected). This problem becomes worse when the team that is consuming the development board (and BSP) is not in the same organization as the team building the dev board (and writing the BSP). In such a situation the downstream team will be burdened with maintaining a fork of the upstream BSP if the downstream team wants to take any device that the BSP treats as "optionally present" and make that device "always present". There is no way for the downstream team to "unregister" a "before init" callback that the BSP has registered, unless the BSP provides a Kconfig option to control registration of each callback. The BSP has to resort to using Kconfig options to model parts of the hardware instead of devicetree, which is confusing. In devicetree, a downstream consumer can provide an overlay that completely overrides any property set in an upstream file.
The callback approach does not provide such override flexibility, making it less general and more difficult to maintain over time and between teams.

Coda
The name "before init" callback is pretty terrible. "Pre-init" callback is at least a little better. The best, IMHO, would be to reframe the question from "should this device be initialized" to "is this device present", and thus have a "device present" callback. I also think this reframing applies to the naming for the devicetree property as proposed above. That is, rather than a |
The specific use case can be covered by both approaches: have a generic devicetree node that just creates the callback for skipping, then init manually later, no difference. The run-time overhead, though, I think is way more interesting, so let's dig into that a bit more. I've seen it brought up a few times at this point with no data backing the argument, so let's see. I'd expect callbacks to be structured something like "if this is not my device, return; else do something". Say we have 1000 devices and 3 callbacks:

#include <zephyr/kernel.h>
#include <zephyr/drivers/gpio.h>
#include <zephyr/logging/log.h>

LOG_MODULE_REGISTER(main, LOG_LEVEL_INF);

#define SLEEP_TIME_MS 100

/* LED spec used by check_1, from the board's led0 alias */
static const struct gpio_dt_spec led = GPIO_DT_SPEC_GET(DT_ALIAS(led0), gpios);

static bool check_1(const struct device *dev)
{
int ret;
if (dev != (void *)100) {
return false;
}
ret = gpio_pin_get_dt(&led);
LOG_INF("pin is: %d", ret);
return true;
}
static bool check_2(const struct device *dev)
{
if (dev != (void *)110) {
return false;
}
return true;
}
static bool check_3(const struct device *dev)
{
if (dev != (void *)120) {
return false;
}
return true;
}
static bool (*checks[])(const struct device *dev) = {
check_1, check_2, check_3,
};
int main(void)
{
while (1) {
uint32_t ts = k_uptime_get_32();
for (uint32_t i = 0; i < 1000; i++) {
for (uint8_t j = 0; j < ARRAY_SIZE(checks); j++) {
bool out;
bool (* fn)(const struct device *dev) = checks[j];
out = fn((const struct device *)i);
if (out) {
LOG_INF("check %d", j);
}
}
}
LOG_INF("delta: %dms", k_uptime_get_32() - ts);
k_msleep(SLEEP_TIME_MS);
}
return 0;
}

[00:00:03.667,968] <inf> main: pin is: 0
[00:00:03.667,968] <inf> main: check 0
[00:00:03.667,999] <inf> main: check 1
[00:00:03.668,029] <inf> main: check 2
[00:00:03.669,036] <inf> main: delta: 2ms
[00:00:03.769,256] <inf> main: pin is: 0
[00:00:03.769,287] <inf> main: check 0
[00:00:03.769,317] <inf> main: check 1
[00:00:03.769,317] <inf> main: check 2
[00:00:03.770,324] <inf> main: delta: 1ms

This is on an nRF52 DK. Is this really a "disproportionate overhead"? Remember that we are talking about a one-off sequence at boot time, one that is interleaved with driver init calls which not uncommonly include sleeps of tens if not hundreds of milliseconds, not to mention that this specific SoC in the default configuration does not even hit main in less than 250ms (I think it's waiting for some clock to stabilize). Is my test flawed or unrepresentative? |
I'm assuming that you're referring to the BSP use case described in point 3? No, this cannot be covered by both approaches. The callback approach has no proposed or implemented explicit interaction with devicetree. There is no mechanism, again proposed or implemented, by which a callback can be added (or removed) by adding or removing a corresponding devicetree node. The only proposed feature of the callback approach is that any source file may add a callback. There is no mechanism for removing a callback in a downstream project except changing the upstream source.
So the argument is that something else will surely be a longer pole in the tent, so it doesn't matter what the performance is of this as long as it's not the worst performing step? I can't agree with this type of thinking. Bad performance is almost always the accumulation of many small decisions that were made with the mentality of "this is just slightly wasteful, so it won't matter".
You didn't measure a proportion, you measured the absolute overhead. What you then need to measure is the overhead of calling each of those three callbacks exactly once. This will give you the performance of the ideal solution. Then you compare those two numbers. This is where you see the disproportion in the overhead.
That certainly would be the most efficient structure for a callback, but there's nothing to guarantee this. And these callbacks will be written not by the folks who are often writing low-level driver code, nor by folks who are likely even familiar with the init machinery and the significant penalty (1000x in the provided example) for inefficiency. Rather, these callbacks will be written by developers more focused on "gluing together" the right pieces of Zephyr to get something that boots on their custom boards. It's a big, well-hidden footgun. And speaking of footguns, the provided example perfectly demonstrates another one: no features that depend on the kernel may be used in any of these callbacks. Thus, even a seemingly benign `printk` call or a direct write to a memory-mapped register to toggle a GPIO for debugging could hose a system. The reason is that these callbacks have no knowledge of what init level they're running at. This means they could be running at `EARLY` or `PRE_KERNEL_{1,2}`. And there's not even a way for a callback to check what the current init level is. There are multiple variants of this footgun; I described a different variant in my point 2. That is, within a callback the state of the system is relatively opaque. The callback doesn't know what init level(s) have been completed, whether the kernel is available, whether other hardware drivers that might be needed (such as a GPIO controller or an internal EEPROM controller) have been initialized, or even whether the clock has been configured for the peripheral bus hosting a particular memory-mapped register. Driver authors don't have to consider all these subtleties, because drivers benefit from the guarantee that their dependencies will be initialized before the driver is called to initialize. |
good point, I wonder how all of this would work with userspace enabled and the fact that callbacks in userspace are never trusted. @edersondisouza @dcpleung does this apply here? |
IIRC, there was a PR introducing syscalls to set callbacks for a driver, so userspace can manipulate callbacks. However, it was agreed that it was a heavily frowned upon behavior, as the callback would be running in privileged mode (... which bypasses all userspace guards). So, in general, never let userspace do driver callbacks. |
At #67877 the callback is set up statically - there's a syscall to run the device initialisation, not add the callback, so I don't think this specific issue applies here. |
If we're fearing people abusing/misusing the callback, I guess the issue is insurmountable, unless we have some sort of restricted environment - eBPF? |
@edersondisouza Yes, you can count me as preferring the direction of #67335, though the implementation needs more iteration. As it stands it has its own footguns (silent failure of init if dependency was deferred) and sharp corners (no de-init). |
Cool - and yes, it's not like the PR doesn't need more reviews (although de-init is maybe a next step) - it's more about the direction. Thanks for the feedback @palchak-google =D |
@fabiobaltieri any more inputs here? |
@edersondisouza I think I voiced all my arguments, happy to expand on anything specific if you feel like there's some open point but otherwise at this stage there's probably more value in hearing for other people opinions. |
Can this be closed now that #67335 has been merged? |
Is your enhancement proposal related to a problem? Please describe.
Zephyr currently assumes that all devices available in the system should be automatically initialized at boot. This assumption does not account for the following common embedded scenarios:
Describe the solution you'd like
Add an optional parameter to the Devicetree bindings for device nodes that allows application developers to specify whether initialization for that device should be deferred. This information should be stored in the init_entry struct for each device and checked in the z_sys_init_runlevel function before calling the init function for the device. Devices with deferred initialization would then need to be manually initialized by the application code prior to use; Zephyr would provide a system call for manually invoking a device's initialization function (after checking that the device was not already initialized).
Describe alternatives you've considered
An alternative solution would be to add a "deferred" status enumeration for devices, alongside "okay" and "disabled". This status indication would serve the same purpose as described above, namely inhibiting automatic boot-time execution of the device's initialization function. At compile time the build system could also check if any "okay" nodes appear to depend on other nodes marked as "deferred". This would help catch potential initialization defects.
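As a sketch of this alternative, a hypothetical status value (non-standard: the Devicetree specification does not define it, which is the concern raised earlier in this thread) might look like the following, with lcd0 as an illustrative node label:

```dts
&lcd0 {
	/* Hypothetical status: driver is built and resources are allocated,
	 * but the init function is not run automatically at boot. */
	status = "deferred";
};
```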