-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Misleading API in netdev driver #9805
Comments
I agree that this is a problem, it is also easy for the maintainers to overlook missing functionality when reviewing a new network driver because dropping the packet is a corner case and is not used very often in many simple lab tests. I don't know which solution is the best though. It would take some effort to refactor all drivers to an API change, so it's best if we discuss the potential solutions some more before beginning to change the code. |
Also, there may be more than the three options that came to my mind. Maybe in the course of a discussion a better solution will be proposed than the three suggestions. The suggestion 2 would be relatively cheap in terms of work required. I personally favor suggestion 1, as it is the most obvious API. This would make it easier for newbies to understand and for reviews to review. But that one would require the most work :-( |
My personal preference is splitting the API into three functions. The drivers can still use the same implementation behind the scenes, but it will be more clear when one function is missing. |
I also think so! It improves code readability |
To push the discussion further I opened a PR implementing three separate functions approach. By adding a two glue functions driver implementing the old API are "converted" to the new one. This allows to gradually port the drivers to the new API. |
I agree that the design of the recv function is not the most beautiful API we have in RIOT. But as I think option 5 would be quite a nice addition (at is completely transparent both to usage and resource usage). Tackling the issue of driver not fully implementing these three behaviors, adding explicit checks for this on the |
@haukepeterson: I think you got some facts wrong. I assume by efficiency you mean how much cpu time a driver spents on handling recv/drop/size features of the netdev_driver_t interface. There is practically no difference between all approaches. (Theoretically splitting the recv function implementing the recv, drop and size feature as separate functions is faster by dropping the if-statement and the additional parameter of approach 3 will be slower. But practically it will be challenging to just measure the difference.) So efficiency as criteria is not helpful here, as there is practically no difference. Regarding memory (= RAM) usage: Again, there is no difference. (Theoretically approach 3. needs 1-4 bytes more stack which could lead to the stack size needing to be increased, but practically this won't be the case.) So again, a good criteria, but not helpful here. The really relevant tradeoff here is ROM size vs. (un-)clean API (and the security and maintainability implications of an (un-)clean API).
The majority of the ROM increase is not due to the function pointers, but the duplicated machine code (as the compiler inlines static helper functions used in all three cases). For instance, the ROM increase is 146 Bytes in case of the enc28j60 driver (see discussion in #9832 for details).
This argument is often brought up, as its logic flaw is easily made. If internal code is hard to maintain/read/debug/review, all code using it directly or indirectly is affected by all the easily avoided bugs introduced this way. Also, fewer eyes reviews internal, low-level code, so that corner case bugs easily go unnoticed for quite some time. And from many, many hours of debugging I know for sure that internal low level functions that are in use for ages are the very last place to look at when debugging, so avoiding bugs there is much more worth than in more exposed positions.
You missed the point. The recv API is a textbook example of an misleading API. As result, a lot of bugs (missing drop implementation and missing size implementation without appropriate documentation/asserts) were introduced in RIOT (see all the FIXME comments in #9832). The missing drop implementation is not only easily overlooked by the person implementing a new driver, but also by the one reviewing it. And to put this very clearly: The missing drop implementation is trivially and remotely exploitable denial of service vulnerability. (E.g. see the "ping of death" @gschorcht describes here #9806.) Similar bugs in Linux would get CVE numbers and the whole insane security circus (a catchy name, a logo, a web page and maybe even a franchise shop). My point is: Those bugs are extremely relevant for any serious use of RIOT and preventing that this type of bug reappears in any future netdev_driver implementation has a huge value. A clean API is a huge step to make this kind of vulnerability less likely to pop up again in the future and I personally would not hastitate to "pay" 146 Bytes of ROM size for that. Also, let me proof the following sentence wrong:
We could easily save more ROM size replacing all function pointers in |
@OlegHahm, @miri64, @rousselk, @haukepetersen, @LudwigKnuepfer, @bergzand, @thomaseichinger, @smlng, @kaspar030, @PeterKietzmann, @mehlis, @BytesGalore, @kYc0o: You are listed as contributors to the cc2420 netdev_driver, which contains a critical DoS vulnerability which can easily be triggered by bringing RIOT in a low-memory state:
Roughly every second RIOT netdev_driver has the same security hole. I reported this issue 6 weeks ago. I think its about time to attend to it and finally fix it: The very least that should be done is fixing the bugs in all affected drivers. Better would be to fix the underlying problem: The misleading netdev_driver API should be fixed by introducing a separate |
Also: In the decision is made to fix the API by splitting |
Maybe to settle this argument once and for all, first provide a port to the API to one device (e.g. cc2420) so we can compare the actual code size. I agree that we should fix this issue one way or the other ASAP. I also noticed while reviewing #9942 that not only the current API is misleading, its documentation is lacking to outright wrong in its current form. I'm preparing a PR for that at the moment. |
While reviewing RIOT-OS#9942 I noticed that the documentation on the netdev driver API is unclear and in some cases outright contradicting itself: > ``` > @return number of bytes used from @p value > @return `< 0` on error, 0 on success > ``` IMHO this is unacceptable for such a central API where communication This fixes a few things and also clarifies preconditions: - Specifies negative `errno`s clearly so all drivers return the same when required - Re-iterates parameter preconditions and special cases within the parameter documentation itself (might also help towards RIOT-OS#9805?) - Fixes contradictions within return value documentation - Adds missing parameter documentation on `init()`.
Already did that on the 25th of August for the enc28j60, which also hat the same bug: #9832 (comment) ;-) |
Ok, sorry did not know about that :-/ |
While reviewing RIOT-OS#9942 I noticed that the documentation on the netdev driver API is unclear and in some cases outright contradicting itself: > ``` > @return number of bytes used from @p value > @return `< 0` on error, 0 on success > ``` IMHO this is unacceptable for such a central API where communication This fixes a few things and also clarifies preconditions: - Specifies negative `errno`s clearly so all drivers return the same when required - Re-iterates parameter preconditions and special cases within the parameter documentation itself (might also help towards RIOT-OS#9805?) - Fixes contradictions within return value documentation - Adds missing parameter documentation on `init()`.
While reviewing RIOT-OS#9942 I noticed that the documentation on the netdev driver API is unclear and in some cases outright contradicting itself: > ``` > @return number of bytes used from @p value > @return `< 0` on error, 0 on success > ``` IMHO this is unacceptable for such a central API where communication This fixes a few things and also clarifies preconditions: - Specifies negative `errno`s clearly so all drivers return the same when required - Re-iterates parameter preconditions and special cases within the parameter documentation itself (might also help towards RIOT-OS#9805?) - Fixes contradictions within return value documentation - Adds missing parameter documentation on `init()`.
Closed in favor of fixing the drivers against current interface #10410. Maybe paying more attention and providing better documentation will solve this issue and comes for zero ROM cost compared to API changes. If more attention and better documentation does not stop the same issue from being re-introduced, an API change can still be discussed. If the bug does not re-appear, we didn't waste ROM size on an API change that turned out to be unnecessary. |
I want to re-open this issue. Considering that:
So, I think a variant of proposal 1. is feasible with:
I wrote this for the Patch
I didn't introduce any functions (e.g SRAM access) to have a fair comparison. These are the results: samr21-xpro (at86rf2xx): Default example on master
samr21-xpro (at86rf2xx): Default example with patch
z1 (cc2420): Default example on master
z1 (cc2420): Default example with patch
Haven't tried on AVR yet, but this shows that the overloaded solution is not necessarily better in terms of ROM consumption and the API is very unclean. Also there are some optimization that could be done with a static recv function that are hard to write now |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you want me to ignore this issue, please mark it with the "State: don't stale" label. Thank you for your contributions. |
Problem Description
The
int (*recv )(netdev_t *dev, void *buf, size_t len, void *info)
member innetdev_driver_t
(see Doxygen) is a textbook example of a misleading API:buf != NULL
(buf == NULL) && (len == 0)
(buf == NULL) && (len != 0)
Symptopms
This misleading API already lead to a bug #9784. I predict similar bugs will show up in the future, if the API is not changed. Update: Many similar bugs were found, see #9832
Suggestions to Address the Problem
recv()
,get_size()
anddrop()
). This would be the cleanest API, but the ROM size of the implementation is likely to increase, as all three will share some common code. (I assume the compiler will inline the common code when implemented in separate functions, so even when no duplicate C code is present, the ROM size will likely grow.)recv_or_size_or_drop()
. While this name is very clumsy, at least it is very obvious that this functions has to implement three different things.enum
) to specify which of the three things the function should do. This would be much more obvious to the programmer, even though a bit wastefulstatic inline
functions to increase readability: In the upper layers add wrappers to call the size and drop feature od the recv function more explicitly; in the lower layers to test for the drop/size/recv mode more readableThe text was updated successfully, but these errors were encountered: