-
Notifications
You must be signed in to change notification settings - Fork 6.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Flash API: road forward and the no_explicit_erase
capability
#71270
Comments
The proposed solution in stage0 does not scale well. The first "capability" that is added is forcing drivers to be split up in drivers with and without the capability. When applying this methodology to each capability that will be added will force drivers to be split up in multiple parts. |
There is possibility of a single split now, and it not even sure whether it happens. This has been explained many times, and you perfectly understand that, because you have followed the pattern in your alternate PRs. For these interested the link to discussion is on top of this issue. |
You seem to be missing the point. The solution in stage0 is setting the Kconfig on the driver level for the |
ECC is not a good example for that as in most cases everybody just implements proven ECC algorithm or recommended by manufacturer directly in a driver, and misses the main reason why the ECC in the hardware is in the first place; if somebody would like to implement ECC in software stack, that rather would be an addition on top of what storage offers in hardware, but lets go. So somebody implements the additional ECC in software stack to amend lack of it in hardware (so
Scalability is multi-dimensional issue. Ability to remove code that makes no sense when system lack something, is scalability because it gives back resources that other systems demand. Does that require extra work by maintainers (affects scalability of organization): probably, but I think users would like system to scale itself as much as possible without having to do extra steps in Kconfigs/dts, the occasional spike in maintainer involvement will happen, but we grow too. I also like this scalability: customer pays $ for N size storage and does not ask me why there is 0.1N used does nothing for their configuration. I like when time I spend cutting, by hand, things that are not needed scales down. |
Regarding the ECC, this is about hardware ecc like used on some nxp and st devices. These generate a nmi when a uncorrectable error occurs and allow retrieving the data as is. When used for xip the nmi is ok, when used for storage this nmi should be converted in a different error. The ECC is as such a hardware feature (capability). Next to the fact that the proposed solution does not scale well, there is another limitation of the solution: on boards where both "explicit-erase" and "no-explicit-erase" devices are used the solution does not allow subsystem to reduce the code size. This is a result of not reflecting the capabilities in dts. I really don't understand why I need to repeat myself. If you as a maintainer decide that this is the way to go you should make clear to all your users what the implications of this decision are. Be open about the limitations and the work they will create. Don't complain about someone else doing this. |
Have you looked through our drivers? The any XIP stuff that is done is either initialization or marking that device is working with XIP. XIP does not do If you want different NMI behavior depending whether XIP causes it or
The Kconfigs expose what drivers offer. Subsystems have to present what they offer according to what they work with. User has to choose what they finally want. DTS capability works here even worse, because it does not control how device behaves (device is as it is), but rather how multiple independent subsystems see it: the device will not change. This means that there is property introduced into device that controls something other than the device itself - that is bad modular design. Lets assume that the property is attached to partition: so this is property exposed to subsystems, lets say you have two subsystems using, interchangeably, that partition at-run time; now you have a property that tunes two independent subsystems, that is not defined for these subsystems. I have started my "already explained" stuff here https://github.com/zephyrproject-rtos/zephyr/files/14795769/QA.no-erase-flash.pdf, slides 8, 9 and 12 (will be updating it, with more, didn't have time).
Same complain on my side. DTS limitations explained above. It is easy to advertise a solution without going into details. I did go into details, and I have started the work with DTS flag and have abandoned the idea.
OK. Show reviewers where I complain about deficiencies in my proposal being pointed out rather then being forced to repeat myself. |
Regarding the ECC, these devices should have a Regarding the dts capability: why add information to a partition, this clearly belongs to the flash node. When a compatible is added to the flash node that describes the capability the driver can pick this up, verify it (error out if needed) and use it (if the driver can support multiple capabilities). This is the way dts is used for all of our driver parameters, the capability could be just the same. |
Sorry about the closing, pushed the wrong button.... |
Nothing stops driver from doing so.
That was your idea at some point, just bringing it up why it does not solve a problem with below:
https://github.com/zephyrproject-rtos/zephyr/files/14795769/QA.no-erase-flash.pdf, 12 explaining
Explained here: #71270 (comment), and it is not about "wrong capability" but how it affects sub-systems above it (multiple at once).
Then subsystems are figuring out their own property from parameter that does not belong to them. Explained here: #71270 (comment), |
This is exactly why the proposal does not scale well: Add a capability
It was never proposed this way, what was proposed was to add information about the backend (by using a compatible) to a partitions (the parent of a partition) definition. This is a way to allow defining partitions over a range of device types and allowing devices to be a true representation of what they are. In #71158 an alternative was proposed that would be limited to flash devices, the disadvantage of #71158 over #70979 is that it forces all storage devices to use the flash interface.
There is nothing wrong or broken about
As drivers and subsystems are getting their information from the same source, the dts file, there should not be any difference between getting the information from the driver or utilizing the information from devicetree. Allowing subsystems to retrieve this information at compile time instead of runtime reduces codesize. If the driver does not honor the specifications from devicetree this would be driver error. In general the power of the |
The response is not relevant to my response.
In the beginning everything is addition over the base. If at some point users find that they want to have a way to figure out whether there are no devices without a feature supported in their build, then we will probably add the negative Kconfig (I have answered this here already #71270 (comment))
Making Flash API generic storage device access API does the same.
We are slowly turning Flash API into generic storage API, yes. So pointless addition of code to verify something that driver provides or not, and can easily advertise without checking DTS.
If it is not a global property and if it is optional then subsystem can not use it reliably, and again they keep on picking their own property from the wrong node.
Subsystems should not poke around nodes that do not directly belong to them.
"For a device driver selection" And, it is possible to create device driver without even touching DTS, at run time, and attach it to a bus, for example SPI. Yes, you would probably have to add Kconfigs to add the |
@de-nordic, the recently proposed RFC #71773 when applied to storage devices would make it unnecessary to extend the flash api to other storage devices. Whatever storage device could be defined with a minimal (fit to purpose) api and it is possible to retrieve what the device type is. This makes it possible to create a solution similar to A layer for working with partitions (that is still needed on top of the currect proposal) could then be used to provide a unified way of working with these devices, and their This method would also allow support for nand-flash with their own peculiarities to be a separate type of devices with dedicated There would be no Kconfig options (other than the selection of a type of device) to describe device properties and also there would be no addition to the dts. Devices like The |
This does not change anything. This would only land flash class devices in specific class section and check whether devices passed to that class API syscalls have
I am not hiding my intention, for more than three years now, that I am willing to make There is no need for
NOR and other also can have those, in the end we will end up with multiple API having
Both PRs are addressing different issues.
One API for raw devices of storage type is the goal.
Disk access is for, kinda, FTL layer, not raw access. |
This should not be closed, till the other stages are followed. |
Introduction
Following discussions here:
Original issue: #62228
Original PR #67687
https://discord.com/channels/720317445772017664/883445484923011172/threads/1220424912829681846
I have decided that slowly extending Flash API into general code/storage access API is the way to go.
Therefore I will be adding the capability flag which will be informing code at run-time on what kind device it is working, and additional Kconfig options, in a HAS form, to inform software stack, and build system, on capabilities provided by compiled in drivers.
Problem description
We are in need of support for devices that do not require device, in a way that allows software stack to work efficiently but also not take in code that would never be used. As I have already explained in mentioned discussions software does work better when it does not have to erase, so it is happy to know that; I also do understand that our current solutions have been mainly focused around Flash - and that is where new storage come.
This means that there is no quick and easy way to transition to new kind of API and the path is to slowly move Flash API from Flash oriented API into general raw storage access type API, by breaking device into the features they represent, and adapting software to work on these features, rather than on types of devices - as every storage device can be described by various properties, often common with other technologies.
Adding more layers do not fix the issue, it will just add additional layers on top of what is already becoming obsolete. Adding more layers will just add opportunity to further diverge software stack to rely on co-existing APIs doing similar task, making it even harder to transition in the future. And the layers will have to break devices to features or recognize them by technology anyway.
Proposed change
Slowly adapt Flash API into generic storage API based on device capabilities rather than device types.
Proposed change (Detailed)
The change will happen in stages.
Stage 0
no_explicit_erase
capability to, currently, limited number of devices that have no erase requirement prior to write (MRAM,FRAM,RRAM, and so on, and yes I have agreed that addingexplicit_erase
flag to all flash drivers may be unneeded change);FLASH_HAS_NO_EXPLICIT_ERASE
andFLASH_HAS_EXPLICIT_ERASE
Kconfig options for drivers to advertise support or lack thereof to software stack and subsystems;FLASH_CAPS_EXPLICIT_ERASE
andFLASH_CAPS_NO_EXPLICIT_ERASE
macros to be used onflash_parameters
to check for the capability; these macros are provided in an attempt to blackbox theflash_parameters
;flash_erase
can now return-ENOSYS
as drivers do not have to implement erase (they actually never had to, by the API Design Guidelines anyway)flash_flatten
andflash_fill
, where the former is used as replacement forflash_erase
where erase has been used for cleaning storage, rather than enforced by used device; the later is just addition to allow users quickly fill devices with value with no need to repeat code between units;And no, adding new layer would not save us from doing all the subsystem, test and sample changes, it could just allow us to postpone them for some time.
Stage 1
Once we merge PR for stage 0 we go into advertisement phase, where Developers, via mailing list, will be informed and reminded that the
flash_erase
andflash_area_erase
are able now to return-ENOSYS
and there is intention to remove the code from drivers, including support for pages layout and so on, as this is needles code. This will limit dead code for, or needlessly existing code, in software solutions that are build around device that are not providing them anyway.Stage 2
Modules adaptation: all (three of them, and all based on MCUboot), will have to adapt. I take responsibility for adapting MCUboot.
They will have to work with devices that do not provide erase and do not have paged layout, require erase and have paged layout, or work with both types of devices at once.
Stage 3
This is rather distant stage from Stage 2.
This will be making
erase
callback and pages support optional for devices and will follow with removal of the code, depending on vendor approach to that.Stage N
After years of adapting Flash API to common storage API it will be renamed. Or not. We write to EEPROMs and they are still ROM, so I guess we will just get used to Flash API being the storage raw access API.
Dependencies
Users will slowly adapt to new reality. Dividing the change into stages should not end up in shocking the community.
Projects that have been, to this point, sure that they work with Flash type (progra-erase, erase required, explicit-erase) devices will have to slowly adapt to reality that this is no longer the case. We will give them a chance and opportunity.
Concerns and Unresolved Questions
I had to modify some subsystems, often with by best knowledge and assistance from samples and tests that have been provided for them. I have run a lot of these configurations on nrf54l15 RRAM base device in my possession and on Flash Simulator altered to support no-explicit-erase devices.
Alternatives
Adding new layer that is based on device type: FLASH, FRAM, RRAM, MRAM, EEPROM, etc:
Additional layer based on capabilities:
Additional layer that takes capabilities from drivers and represents them in higher layer:
DTS property that will state what kind of device there is (I have tried that actually at the beginning):
The text was updated successfully, but these errors were encountered: