Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

services/ticker: fix flakey spec comparing time with postgres stored time #1774

Merged
merged 3 commits into from
Sep 24, 2019
Merged

services/ticker: fix flakey spec comparing time with postgres stored time #1774

merged 3 commits into from
Sep 24, 2019

Conversation

leighmcculloch
Copy link
Member

PR Checklist

PR Structure

  • This PR has reasonably narrow scope (if not, break it down into smaller PRs).
  • This PR avoids mixing refactoring changes with feature changes (split into two PRs
    otherwise).
  • This PR's title starts with name of package that is most changed in the PR, ex.
    services/friendbot, or all or doc if the changes are broad or impact many
    packages.

Thoroughness

  • This PR adds tests for the most critical parts of the new functionality or fixes.
  • I've updated any docs (developer docs, .md
    files, etc... affected by this change). Take a look in the docs folder for a given service,
    like this one.

Release planning

  • I've updated the relevant CHANGELOG (here for Horizon) if
    needed with deprecations, added features, breaking changes, and DB schema changes.
  • I've decided if this PR requires a new major/minor version according to
    semver, or if it's mainly a patch change. The PR is targeted at the next
    release branch if it's not a patch change.

Summary

Change the TestInsertOrUpdateAsset test to Round instead of Truncate time values that are being compared with time that was stored in Postgres.

Goal and scope

Intermittently the TestInsertOrUpdateAsset test will fail because a time value that's been stored in Postgres doesn't match the time value we gave to Postgres to store. We attempt to do our best to make them match by using the Truncate value on both the input and output values, but that doesn't work for some cases.

I managed to reproduce the failure 100% of the time by using a specific input time of 12:08:50.9419997 where the milliseconds component will be rounded up because of the values in the nanosecond component by Postgres when it reduces the value to microseconds. To see the exact failure case, take a look at the first commit where I hardcoded it for consistent failures. This isn't the only failure case though. When I ran the tests thousands of times I had plenty of times where it failed. Because we use the Truncate function, we are comparing a rounded down value with a rounded up value in any cases where the millisecond component gets rounded up when the microsecond component is rounded up.

To illustrate exactly what is happening here, this is our original time:

12:08:50.9419997

When Postgres is truncating it to microseconds, which is six decimal places, it is rounding to the nearest, which in this case is up:

12:08:50.942000

In our test we use the truncate function to truncate to milliseconds:

12:08:50.941

Using Round instead causes us to round up to the same value.

Close #1733

Summary of changes

  • Replace use of Truncate with Round.

Known limitations & issues

The fix in this change isn't perfect according to discussion at lib/pq#227 (comment), where a commenter states that Postgres' rounding function rounds to nearest and in the event of a tie rounds to even, which is different to Go's Round function which rounds to nearest and in the event of a tie rounds away from zero. This would cause a problem in this situation:

However, in the tests I ran with time 12:08:50.9419995 and rounding to the nearest microsecond, both Postgres and Go had rounded in the same directions, so I don't see evidence that this is an issue.

Even if this was still an issue I think this change is the best simple change we can make to the test and it would significantly reduce the number of cases this test fails.

What shouldn't be reviewed

N/A

@leighmcculloch
Copy link
Member Author

It'd be great to prevent us from using Truncate in tests in this way, and we could try write a checker compatible with go vet to prevent us from making the same mistake, but I think the only way to do that would be to ban us from ever using Truncate which isn't practical.

Copy link
Member

@accordeiro accordeiro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the thorough analysis, this looks great!

I agree it's the simplest change we can do, and hopefully this fixes the problem once and for all. I think it's worth keeping an eye to see if the problem persists, though, as I've already had similar time comparison issues in the past (see in #1267 and #1421).

@leighmcculloch
Copy link
Member Author

leighmcculloch commented Sep 24, 2019

👍 Using WithinDuration that I see was used in those other PRs is also another option, but I think from the testing I've done that there's a good chance Round will fix our issues because from what I can see Postgres is actually rounding in the same way as Go does. If this starts failing again we can give that pattern a whirl here too. I'll keep an ear out for these type of issues.

@leighmcculloch leighmcculloch merged commit 3e23070 into stellar:master Sep 24, 2019
@leighmcculloch leighmcculloch deleted the issue-1733 branch September 24, 2019 16:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Intermittent failing test TestInsertOrUpdateAsset in services/ticker/internal/tickerdb
2 participants