-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CFE_PSP_GetTime Loses Resolution in Conversion to Microseconds (GSFC DCR 14952) #76
Comments
Imported from trac issue 72. Created by sstrege on 2016-08-08T16:39:59, last modified: 2019-08-14T14:13:14 |
@avan989 This would be a good one to look into if the issue still exists. If it does I'd suggest:
for simplest solution, and check if there's a test case to confirm this works as expected. Maybe even run on the target and confirm. |
I am not familiar with the MCP750 hardware timer register that is being read here, but I would strongly discourage using any floating-point arithmetic in the calculation. This will almost certainly result in reduced performance. Note that suggested literal numbers like "8.333" or even "8333.0" or "1000.0" are of type The complication of the existing calculation is almost certainly done with the intent of keeping the calculation entirely within the integer domain. Integer ops are usually an order of magnitude faster. |
Good performance point. Really the two lines of concern are: PSP/fsw/mcp750-vxworks/src/cfe_psp_timer.c Lines 102 to 103 in a82f7c9
How about:
So it promotes to long and multiplies (never overflows because DecCount is always <8333311), does the full precision integer divide, then back into the uint32? |
But the intermediate value (DecCount*1000000L) may overflow.... It also looks like the original code has another problem where if the DecCount ends up being within the range of [8333000, 8333310], then the division will result in 1000, which will produce a microsecs value of 1000000, which is invalid, as it is supposed to be in range of [0,999999]. So I concur that this code definitely needs a fix. My recommendation is to do the multiply before the divide, but limit your multiplier to avoid uint32 overrun. So long as DecCount is guaranteed to be <8333311, you should be able to multiply by up to 512 (or do a 9 bit shift) first. (8333311 * 512) < 2^32. So either:
OR:
Neither are perfect because 83333311 doesn't have any common factors with 1000000. But both of these approaches will get you a much better microseconds approximation than the existing code does. The left shift might be slightly faster. |
Edit, just thought why not So, Simple and probably the most accurate approximation of all. |
Whoops, I meant LL (or ULL). Linux/gcc seems to promote it to 64 bit either way (I tested before I commented), but I haven't had a chance to figure out if that behavior is in the standard... either way probably more obvious as ULL. So I'd prefer the exact conversion, vs 3/25 approximation, since they do give different answers. Biggest error is at max value to rollover your approximation would skip from 999997 to 0, where as ULL solution smoothly goes from 999999 to 0. |
Although C99 requires long long of [at least] 64 bits, the 64-bit ops are not necessarily implemented in hardware. On a 32 bit processor the 64 bit operation it is likely to be emulated in software. At that point the floating-point solution might be more performant. All this is just speculation though - you might want to write a test program and benchmark all three. But I do think if you do, you'll find that the 3/25 ratio is the best compromise of speed and accuracy. Note that in my experience control loops do often call some form of polling of "gettime" from high priority threads. So its worth making it as efficient/lightweight as possible. |
Lacking context, how do you suggest we evaluate accuracy vs speed? |
Just had one more thought on this. How about:
This is an effective conversion ratio of ((300 + (1/1244) / 2500), or approx 1/8.33331100399, so its almost exact, with probably less performance penalty than either of the other options, as it f just an extra integer division. It should roll over smoothly with no "skipped" values. |
You could test for speed by simply writing a small C program that implements all three of the conversion styles (32 bit int, 64 bit int, double floating point) and just run it in a tight loop for a few seconds and count iterations. You need to run it on the actual target hardware, of course. |
I wasn't questioning how to test the speed of the solutions, but how do you suggest trading accuracy vs speed (3/25 is faster by x, but at the "cost" of up to 2 additional usec of error). Note (300*DecCount) overflows on Linux w/ gcc withough declaring 300 as unsigned. So:
Gives the exact same results as the ULL solution over the full range, on Linux. Definitely worth a speed check on all three. Since it's fun to guess, I bet this solution takes ~2x as long as 3/25, and is within 10% of ULL. |
I think for the tradeoff, it is just a subjective value decision at this point whether 2-3us of possible error is worth the extra CPU cycles that a more arithmetically-correct implementation produces. Obviously existing code was running OK with up to 1000us (or more!) of error in the existing implementation. To put it in context, most things in CFE apps don't get scheduled at a frequency higher than 100Hz or so anyway, usually more like 1Hz or less. It does speak to the need for PSP requirements though. Otherwise one cannot say whether CFE_PSP_GetTime is behaving within requirements or not here, or be able to quantify if a slight loss of precision for a faster implementation is "worth it" or not. It is a judgement call for the platform's users to make. This is also an area that IMO could use clean up because the MPC750 PSP passes a somewhat hw-specific timer up the stack that "rolls over" at weird intervals and there is a lot of corresponding complexity in both CFE_TIME and the PSP API to deal with that rollover, which is really specific to this PSP. But that is a separate issue, a discussion to be had when reviewing PSP requirements. I would definitely advocate for simplification here. Pedantic note: you really shouldn't need the "u" suffix on literals in the code, because the |
Definitely agree, which is why I lean towards just picking a solution, and documenting the other options in a comment. I'd prefer to avoid implying there is one right answer for all use cases, rather pick one present the trade and move on rather than spend too much time trying to pick the "right" one since it depends on context.
User submitted a bug report and a fix. I'd bet they "cloned and owned" the fix for their use rather than living with 1000us of error. I don't think your assumption here that 1000us of error was "OK".
I suggest we take an approach similar to OSAL, where the API and behavior gets documented. Basically this is what the code is expected to do and why, but allow projects the freedom to make the judgement calls (avoid expressing "requirements" that aren't really "required" without additional project context, like a timer shall be accurate to Xus or similar). The PSP's we provide as part of the open source cFS Framework are example implementations, so I'd prefer to scope our effort to describing the choices made vs trying to generate requirements that work for all use cases.
Agreed, but I'd consider it pretty low priority (at least from GSFC point of view) since no future projects have targeted this PSP for use. I certainly encourage a ticket to document clean-up opportunities/suggestions, but I'd consider it a low priority enhancement (unless funded I likely won't focus resources on it).
Agreed! I found my bug. I was defining DecCount as uint32_t, but then I used an iterator to check accuracy and defined it as int. Good catch. |
The only reason I bring it up is because the issue is not confined to this particular PSP. If only this PSP had to deal with it, then things would be fine. But its an area where the abstraction is lacking, which in turn increases complexity in code on top of it (specifically CFE_TIME which uses this API) which IMO is the real issue. The CFE_TIME has all sorts of complexity to handle the frequent wraparound because the PSP isn't actually handling it, its just passing basically a "raw" value up the stack. There are other related PSP calls - see CFE_PSP_Get_Dec, CFE_PSP_Get_Timebase, CFE_PSP_GetTimerLow32Rollover, etc. that are related to the insufficient abstraction, but I couldn't tell you what they are for or how one is supposed to implement them on other platforms. The CFE_TIME has way too much complexity and there are a number of bugs against this. Fixing this should entail simplifying and bringing some sanity/consistency to the PSP time access functions, which would enable much of the complex wraparound logic to be removed from CFE_TIME, which really exists there only for this platform - other platforms don't have this wraparound issue. Sorry if this is a diversion from the original ticket, this is a longer term fix but it should be on the to-do list somewhere because it will greatly improve CFE_TIME and other apps that need timing info. |
Definitely agree. The following tickets are in at the cFE level under @acudmore review, addressing these and your comments above from an architectural design level is definitely on the to-do list. A review of abstraction across the entire cFS for time makes sense to me. That said, I don't have identified resources to focus on it this FY. |
I did push a branch, but didn't do any associated testing. Really not all that useful without doing all the tests, but at least an example of what could be done. |
Fix #76, Microsec conversion lost resolution
There's a 'bug' in the implementation of the mcp750/vxworks PSP which causes the loss of timing resolution in the function CFE_PSP_GetTime(). for example:
{{{
void CFE_PSP_GetTime( OS_time_t *LocalTime)
{
uint32 DecCount;
/* Reads the time from the hardware register, then converts it
* into usable seconds and microseconds */
sysPciRead32(0xFC0011C0, (UINT32 )(&DecCount));
DecCount = DecCount & 0x7FFFFFFF;
DecCount = ((uint32) 0x0D6937E5) - DecCount;
LocalTime->seconds = DecCount / 8333311;
DecCount = DecCount % 8333311;
LocalTime->microsecs = (DecCount/8333) * 1000;
}/ end CFE_PSP_GetLocalTime */
}}}
In this case, (DecCount/8333) * 1000 is performed as an integer calculation (as DecCount and LocalTime->microsecs are integers), basically reducing resolution from microseconds (which this function can calculated from fractions of microseconds) to miliseconds.
NOTE: GSFC's rad750 version has a similar calculation ((DecCount / 2062) * 1000)
Should this be changed to something like (for mcp750):
{{{
LocalTime->microsecs = (uint32) (( ((double)DecCount) /8333.0) * 1000.0);
}}}
The text was updated successfully, but these errors were encountered: