Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strange math in timespec_to_double #1131

Closed
morrone opened this issue Jul 28, 2017 · 9 comments
Closed

Strange math in timespec_to_double #1131

morrone opened this issue Jul 28, 2017 · 9 comments
Assignees

Comments

@morrone
Copy link
Contributor

morrone commented Jul 28, 2017

The addition of half of a nanosecond in the below function has me puzzled. Can you explain why that makes sense?

src/modules/cron/cron.c:

static double timespec_to_double (struct timespec *tm)
{
    double s = tm->tv_sec;
    double ns = tm->tv_nsec/1.0e9 + .5e-9; // round 1/2 epsilon
    return (s + ns);
}
@grondo
Copy link
Contributor

grondo commented Jul 28, 2017

I think that epsilon is there to offset round off error from the conversion of nanoseconds to double. I honestly do not remember the issue that prompted me to add the rounding (wish I'd made a better comment sorry) , but I'm assuming it was something like a timestamp that should have been run on a round second but instead the timestamp came out something like 1501211501.9999999999999.

If it is truly concerning I could remove the rounding and see if I still see any problem. If so, I'd add a much better comment?

@grondo
Copy link
Contributor

grondo commented Jul 28, 2017

Sorry, obviously the whole seconds could not underflow, but I meant something more like 5000000000 nanoseconds gets annoyingly converted to .499999.... By adding half a nanosecond you avoid this (IIRC what the problem was here)

@morrone
Copy link
Contributor Author

morrone commented Jul 28, 2017

Sure, both the seconds and nanoseconds conversion will be slightly inaccurate because of the machine epsilon inherent to floating point representation in binary. But the machine epsilon of a double is much smaller than the .5e-9, more like 2.22e-16. So how is adding .5e-9 "rounding"? It looks like it is just a consistent addition of error.

For all of the instances that you "correct" something like 500000000 becoming 499999999, aren't you creating just as many instances where the value will now be wrong?

@grondo
Copy link
Contributor

grondo commented Jul 28, 2017

So how is adding .5e-9 "rounding"?

It is rounding to the nearest nanosecond, or 1e-9. Just like you'd round to an integer from a float by adding .5 and truncating, e.g. 1.8 + .5 = 2.3 = 2 and 1.3 + .5 = 1.8 = 1.

Perhaps the use of the term epsilon here was confusing? (it has nothing to do with machine epsilon, but the precision of this specific calculation, which can't be any better than 1e-9)

For all of the instances that you "correct" something like 500000000 becoming 499999999, aren't you creating just as many instances where the value will now be wrong?

I guess define "wrong"? Keep in mind we probably don't care about nanosecond precision in these timestamps (if clock_gettime() really even has it), and certainly have no right to try to keep sub-nanosecond precision. The timestamps will almost always be truncated at either ms or us when printed, and I can't think of a case here where rounding the value would be construed as wrong, but maybe I missed something?

@morrone
Copy link
Contributor Author

morrone commented Jul 28, 2017

It is rounding to the nearest nanosecond, or 1e-9. Just like you'd round to an integer from a float by adding .5 and truncating, e.g. 1.8 + .5 = 2.3 = 2 and 1.3 + .5 = 1.8 = 1.

The key words there are and truncating. There is no truncation in your function, only addition.

I think that addition should be removed from an internal function like that, and left up to presentation code to do any rounding that is desired.

@grondo
Copy link
Contributor

grondo commented Jul 28, 2017

I think that addition should be removed from an internal function like that, and left up to presentation code to do any rounding that is desired.

Did you have a reason why for this use case? It takes the convenience rounding from one spot to various places in front end commands, logging functions, etc.

@morrone
Copy link
Contributor Author

morrone commented Jul 28, 2017

Because it is basically not-so-hot encapsulation.

As implemented, the function splits rounding between distant parts of the code. timespec_to_double() performs the addition, and then all the users of time stored in a double need to know that addition has already been applied and only do a truncation, not try to complete the full rounding operation on their own.

This means that any user of timespec_to_double() must read the implementation of the function to be able to use it properly. This is surprising behavior.

And that information is buried two levels deep. The public function is "get_timestamp()". So to understand how to use the double, the user needs to read both get_timestamp(), and then go another level deep to read timespec_to_double().

Furthermore, one would assume that the reason we want time to be in floating point format in the first place, rather than leaving it in unambiguous integer form, is so that we can perform easy arithmetic on the single floating point value rather than dealing with math on two integers dealing with overflow, etc.

So how does the error compound when we do the various arithmetic?

A better approach would be to have a separate clear function that does the entire rounding and conversion to presentation form.

@grondo
Copy link
Contributor

grondo commented Jul 28, 2017

This is not encapsulation it is a static function and not exported to anyone but private users. (in fact I probably shouldn't mention this but it is duplicated in cron/task.c as well)

However, point taken on the function naming basically telling a lie -- I guess if it was called round_timespec_to_double() this issue would not have come up.

However, the rounding is just to correct timestamps which are not critical to the code and is only for aesthetics as far as I can remember, so I'm fine with removing the addition if it means we can close this issue 😉

@morrone
Copy link
Contributor Author

morrone commented Jul 28, 2017

Yes, but get_timestamp() reexports the output of timespec_to_double() making it public again.

grondo added a commit to grondo/flux-core that referenced this issue Aug 4, 2017
Avoid implicit rounding in timespec_to_double in both cron.c and task.c.

For the cron task timestamps, rename the local function
round_timespec_to_double() to be honest about what we're doing,
and expand the comment to explain why.

For other cron timestamps, just drop the rounding for now. If it
an inconvenience in the future (unlikely), the rounding can be
copied from task.c as an example.

Resolves flux-framework#1131
grondo added a commit to grondo/flux-core that referenced this issue Aug 7, 2017
Avoid implicit rounding in timespec_to_double in both cron.c and task.c.

For the cron task timestamps, rename the local function
round_timespec_to_double() to be honest about what we're doing,
and expand the comment to explain why.

For other cron timestamps, just drop the rounding for now. If it
an inconvenience in the future (unlikely), the rounding can be
copied from task.c as an example.

Resolves flux-framework#1131
grondo added a commit to grondo/flux-core that referenced this issue Aug 7, 2017
Avoid implicit rounding in timespec_to_double in both cron.c and task.c.

For the cron task timestamps, rename the local function
round_timespec_to_double() to be honest about what we're doing,
and expand the comment to explain why.

For other cron timestamps, just drop the rounding for now. If it
an inconvenience in the future (unlikely), the rounding can be
copied from task.c as an example.

Resolves flux-framework#1131
grondo added a commit to grondo/flux-core that referenced this issue Aug 8, 2017
Avoid implicit rounding in timespec_to_double in both cron.c and task.c.

For the cron task timestamps, rename the local function
round_timespec_to_double() to be honest about what we're doing,
and expand the comment to explain why.

For other cron timestamps, just drop the rounding for now. If it
an inconvenience in the future (unlikely), the rounding can be
copied from task.c as an example.

Resolves flux-framework#1131
grondo added a commit to grondo/flux-core that referenced this issue Aug 10, 2017
Avoid implicit rounding in timespec_to_double in both cron.c and task.c.

For the cron task timestamps, rename the local function
round_timespec_to_double() to be honest about what we're doing,
and expand the comment to explain why.

For other cron timestamps, just drop the rounding for now. If it
an inconvenience in the future (unlikely), the rounding can be
copied from task.c as an example.

Resolves flux-framework#1131
grondo added a commit to grondo/flux-core that referenced this issue Aug 14, 2017
Avoid implicit rounding in timespec_to_double in both cron.c and task.c.

For the cron task timestamps, rename the local function
round_timespec_to_double() to be honest about what we're doing,
and expand the comment to explain why.

For other cron timestamps, just drop the rounding for now. If it
an inconvenience in the future (unlikely), the rounding can be
copied from task.c as an example.

Resolves flux-framework#1131
grondo added a commit to grondo/flux-core that referenced this issue Aug 14, 2017
Avoid implicit rounding in timespec_to_double in both cron.c and task.c.

For the cron task timestamps, rename the local function
round_timespec_to_double() to be honest about what we're doing,
and expand the comment to explain why.

For other cron timestamps, just drop the rounding for now. If it
an inconvenience in the future (unlikely), the rounding can be
copied from task.c as an example.

Resolves flux-framework#1131
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants