Fix int32_t microseconds overflow in navigation. #6806
Merged
Bug and reproduction
There is a severe bug in navigation. To reproduce: wait about 40 minutes, then arm the quad in althold mode; the motors will instantly spin up to full throttle. Needless to say, do not try this with the props on (and don't ask how I found out about this issue).
Cause
The navigation code uses a value called deltaMicrosPositionUpdate, which is measured from zero at the first update, or otherwise since the last update. This delta can get large if you wait before the first arm or between arms. The value is held in an int32_t, and 2^31 microseconds is about 36 minutes, so for waits of roughly 36 to 72 minutes the value overflows to a negative number. There is a check that the delta is not too big, but a negative number always passes that check. The negative delta is then used in the computations, which makes a mess of them and causes the altitude controller to output full throttle.
To be precise, deltaMicrosPositionUpdate is of type timeDelta_t and its definition looks like so:
Well, not always sufficient.
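The wraparound can be reproduced off-target with a minimal sketch. It assumes 32-bit unsigned microsecond timestamps and an upper-bound-only check; sketchDelta, sketchDeltaLooksValid and SKETCH_MAX_INTERVAL_US are hypothetical names and values chosen for illustration, not the firmware's own code.

```c
#include <stdbool.h>
#include <stdint.h>

typedef int32_t timeDelta_t;   // 32-bit signed delta, as described above

// Hypothetical limit for this sketch: 0.5 s expressed in microseconds.
#define SKETCH_MAX_INTERVAL_US 500000

// Difference of two 32-bit microsecond timestamps, stored in a signed 32-bit delta.
// The unsigned-to-signed conversion is implementation-defined in C; on common
// two's-complement targets, differences of 2^31 us or more come out negative.
static timeDelta_t sketchDelta(uint32_t nowUs, uint32_t lastUs)
{
    return (timeDelta_t)(nowUs - lastUs);
}

// An upper-bound-only check: every negative delta passes it.
static bool sketchDeltaLooksValid(timeDelta_t deltaUs)
{
    return deltaUs < SKETCH_MAX_INTERVAL_US;
}
```

With a 40 minute wait (2 400 000 000 us), sketchDelta returns a negative value on such targets, and sketchDeltaLooksValid still accepts it.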
Naive solution
The naive solution would be to just change timeDelta_t to int64_t, but since this type is widely used in the project, that would degrade performance. For example, this shows that a 32-bit timeDelta_t value is handled by the FPU, while a 64-bit value is delegated to software emulation!
Almost all uses of timeDelta_t deal with very small changes, which carry little risk of overflow. Moreover, in places such as the scheduler, performance is quite important. Therefore just changing timeDelta_t to int64_t does not seem right.
This solution
Important changes are at the top of time.h and in the lines where timeDeltaLarge_t is added in navigation_*.
Firstly, the comment on timeDelta_t was changed to accurately reflect the capabilities of the type, and a TIMEDELTA_MAX define was added. There is also a new timeDeltaLarge_t, which typedefs to int64_t; that overflows only after about 300 000 years, so it can definitely hold any practical interval.
Secondly, in navigation_fixedwing.c, navigation_multicopter.c and navigation_rover_boat.c the relevant deltaMicrosPositionUpdate variables are changed to timeDeltaLarge_t. Since they are checked against the maximum allowed interval anyway, they will not be passed on as timeDelta_t if they are big enough to cause overflow.
The last somewhat important change is in navigation_private.h, where the maximum interval is asserted to fit in timeDelta_t.
By the way
I also looked at other occurrences of timeDelta_t to figure out where overflow is possible or dangerous. Apart from navigation, I haven't found any cases requiring more than 32 bits, so I only did some minor refactors of related code here and there.
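For readers who want the shape of the change from "This solution" without opening the diff, here is a rough sketch of the idea: keep the delta in a wide type, range-check it, and only then narrow it. The value given to TIMEDELTA_MAX, the SKETCH_MAX_INTERVAL_US limit, the explicit negative check and the helper sketchNarrowDelta are assumptions made for illustration; the actual definitions live in time.h and navigation_private.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef int32_t timeDelta_t;        // small deltas, cheap on 32-bit targets
typedef int64_t timeDeltaLarge_t;   // wide deltas that may span long waits

// Assumed value for this sketch: the largest number an int32_t can hold.
#define TIMEDELTA_MAX INT32_MAX

// Hypothetical maximum update interval: 0.5 s expressed in microseconds.
#define SKETCH_MAX_INTERVAL_US 500000

// Compile-time guarantee that any delta passing the range check fits in timeDelta_t.
static_assert(SKETCH_MAX_INTERVAL_US <= TIMEDELTA_MAX,
              "maximum interval must fit in timeDelta_t");

// Hypothetical helper: accept the wide delta only if it lies in
// [0, SKETCH_MAX_INTERVAL_US], then narrow it. Returns false and leaves
// *out untouched otherwise.
static bool sketchNarrowDelta(timeDeltaLarge_t deltaUs, timeDelta_t *out)
{
    if (deltaUs < 0 || deltaUs > SKETCH_MAX_INTERVAL_US) {
        return false;
    }
    *out = (timeDelta_t)deltaUs;   // safe: the range was checked above
    return true;
}
```

Because the check runs on the 64-bit value, a 40 minute gap is rejected as too large instead of wrapping to a negative number first.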